Name | hadam3p_pnw_c3rk_1975_1_007937839_1 |
Workunit | 8092951 |
Created | 3 May 2012, 19:32:37 UTC |
Sent | 3 May 2012, 19:53:26 UTC |
Report deadline | 16 Apr 2013, 1:13:26 UTC |
Received | 9 Jul 2012, 23:25:24 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 1215735 |
Run time | 7 days 7 hours 18 min 18 sec |
CPU time | 2 days 1 hours 37 min 50 sec |
Validate state | Invalid |
Credit | 753.03 |
Device peak FLOPS | 1.76 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Pacific North West v6.09 windows_intelx86 |
Stderr | <core_client_version>6.10.58</core_client_version> <![CDATA[ <stderr_txt> Global Worker:: CPDN process is not rSuspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 14:53:37 (5128): No heartbeat from core client for 30 sec - exiting 14:53:38 (5128): No heartbeat from core client for 30 sec - exiting 14:53:40 (5128): No heartbeat from core client for 30 sec - exiting 14:53:41 (5128): No heartbeat from core client for 30 sec - exiting 14:53:42 (5128): No heartbeat from core client for 30 sec - exiting 14:53:43 (5128): No heartbeat from core client for 30 sec - exiting 14:53:44 (5128): No heartbeat from core client for 30 sec - exiting 14:53:45 (5128): No heartbeat from core client for 30 sec - exiting 14:53:46 (5128): No heartbeat from core client for 30 sec - exiting 14:53:47 (5128): No heartbeat from core client for 30 sec - exiting 14:53:48 (5128): No heartbeat from core client for 30 sec - exiting 14:53:49 (5128): No heartbeat from core client for 30 sec - exiting 14:53:51 (5128): No heartbeat from core client for 30 sec - exiting 14:53:52 (5128): No heartbeat from core client for 30 sec - exiting 14:53:53 (5128): No heartbeat from core client for 30 sec - exiting 14:53:54 (5128): No heartbeat from core client for 30 sec - exiting 14:53:55 (5128): No heartbeat from core client for 30 sec - exiting 14:53:56 (5128): No heartbeat from core client for 30 sec - exiting 14:53:57 (5128): No heartbeat from core client for 30 sec - exiting 14:53:58 (5128): No heartbeat from core client for 30 sec - exiting 14:53:59 (5128): No heartbeat from core client for 30 sec - exiting 14:54:00 (5128): No heartbeat from core client for 30 sec - exiting 14:54:01 (5128): No heartbeat from core client for 30 sec - exiting 14:54:02 (5128): No heartbeat from core client for 30 sec - exiting 14:54:04 (5128): No heartbeat from core client for 30 sec - exiting 14:54:05 (5128): No heartbeat from core client for 30 sec - exiting 
14:54:06 (5128): No heartbeat from core client for 30 sec - exiting 14:54:07 (5128): No heartbeat from core client for 30 sec - exiting 14:54:08 (5128): No heartbeat from core client for 30 sec - exiting 14:54:09 (5128): No heartbeat from core client for 30 sec - exiting 14:54:10 (5128): No heartbeat from core client for 30 sec - exiting 14:54:11 (5128): No heartbeat from core client for 30 sec - exiting 14:54:12 (5128): No heartbeat from core client for 30 sec - exiting 14:54:13 (5128): No heartbeat from core client for 30 sec - exiting 14:54:14 (5128): No heartbeat from core client for 30 sec - exiting 14:54:16 (5128): No heartbeat from core client for 30 sec - exiting 14:54:17 (5128): No heartbeat from core client for 30 sec - exiting 14:54:18 (5128): No heartbeat from core client for 30 sec - exiting 14:54:19 (5128): No heartbeat from core client for 30 sec - exiting 14:54:20 (5128): No heartbeat from core client for 30 sec - exiting 14:54:21 (5128): No heartbeat from core client for 30 sec - exiting 14:54:22 (5128): No heartbeat from core client for 30 sec - exiting 14:54:23 (5128): No heartbeat from core client for 30 sec - exiting 14:54:24 (5128): No heartbeat from core client for 30 sec - exiting 14:54:25 (5128): No heartbeat from core client for 30 sec - exiting 14:54:26 (5128): No heartbeat from core client for 30 sec - exiting 14:54:28 (5128): No heartbeat from core client for 30 sec - exiting 14:54:29 (5128): No heartbeat from core client for 30 sec - exiting 14:54:30 (5128): No heartbeat from core client for 30 sec - exiting 14:54:31 (5128): No heartbeat from core client for 30 sec - exiting 14:54:32 (5128): No heartbeat from core client for 30 sec - exiting 14:54:33 (5128): No heartbeat from core client for 30 sec - exiting 14:54:34 (5128): No heartbeat from core client for 30 sec - exiting 14:54:35 (5128): No heartbeat from core client for 30 sec - exiting 14:54:36 (5128): No heartbeat from core client for 30 sec - exiting 14:54:37 (5128): No 
heartbeat from core client for 30 sec - exiting 14:54:38 (5128): No heartbeat from core client for 30 sec - exiting 14:54:40 (5128): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:54:41 (5128): No heartbeat from core client for 30 sec - exiting 14:54:42 (5128): No heartbeat from core client for 30 sec - exiting 14:54:43 (5128): No heartbeat from core client for 30 sec - exiting 14:57:50 (4972): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:57:53 (4972): No heartbeat from core client for 30 sec - exiting 14:57:54 (4972): No heartbeat from core client for 30 sec - exiting 14:57:55 (4972): No heartbeat from core client for 30 sec - exiting 14:13:35 (7120): No heartbeat from core client for 30 sec - exiting 14:13:36 (7120): No heartbeat from core client for 30 sec - exiting 14:13:37 (7120): No heartbeat from core client for 30 sec - exiting 14:13:38 (7120): No heartbeat from core client for 30 sec - exiting 14:13:40 (7120): No heartbeat from core client for 30 sec - exiting 14:13:41 (7120): No heartbeat from core client for 30 sec - exiting 14:13:42 (7120): No heartbeat from core client for 30 sec - exiting 14:13:43 (7120): No heartbeat from core client for 30 sec - exiting 14:13:44 (7120): No heartbeat from core client for 30 sec - exiting 14:13:45 (7120): No heartbeat from core client for 30 sec - exiting 14:13:46 (7120): No heartbeat from core client for 30 sec - exiting 14:13:47 (7120): No heartbeat from core client for 30 sec - exiting 14:13:49 (7120): No heartbeat from core client for 30 sec - exiting 14:13:50 (7120): No heartbeat from core client for 30 sec - exiting 14:13:51 (7120): No heartbeat from core client for 30 sec - exiting 14:13:52 (7120): No heartbeat from core client for 30 sec - exiting 14:13:53 (7120): No heartbeat from core client for 30 sec - exiting 14:13:54 (7120): No heartbeat from core client for 30 sec - exiting 14:13:55 (7120): No 
heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:02:51 (4192): No heartbeat from core client for 30 sec - exiting 21:02:52 (4192): No heartbeat from core client for 30 sec - exiting 21:02:53 (4192): No heartbeat from core client for 30 sec - exiting 21:02:54 (4192): No heartbeat from core client for 30 sec - exiting 21:02:56 (4192): No heartbeat from core client for 30 sec - exiting 21:02:57 (4192): No heartbeat from core client for 30 sec - exiting 21:02:58 (4192): No heartbeat from core client for 30 sec - exiting 21:02:59 (4192): No heartbeat from core client for 30 sec - exiting 21:03:00 (4192): No heartbeat from core client for 30 sec - exiting 21:03:01 (4192): No heartbeat from core client for 30 sec - exiting 21:03:02 (4192): No heartbeat from core client for 30 sec - exiting 21:03:03 (4192): No heartbeat from core client for 30 sec - exiting 21:03:04 (4192): No heartbeat from core client for 30 sec - exiting 21:03:05 (4192): No heartbeat from core client for 30 sec - exiting 21:03:06 (4192): No heartbeat from core client for 30 sec - exiting 21:03:08 (4192): No heartbeat from core client for 30 sec - exiting 21:03:09 (4192): No heartbeat from core client for 30 sec - exiting 21:03:10 (4192): No heartbeat from core client for 30 sec - exiting 21:03:11 (4192): No heartbeat from core client for 30 sec - exiting 21:03:12 (4192): No heartbeat from core client for 30 sec - exiting 21:03:13 (4192): No heartbeat from core client for 30 sec - exiting 21:03:14 (4192): No heartbeat from core client for 30 sec - exiting 21:03:15 (4192): No heartbeat from core client for 30 sec - exiting 21:03:16 (4192): No heartbeat from core client for 30 sec - exiting 21:03:17 (4192): No heartbeat from core client for 30 sec - exiting 21:03:18 (4192): No heartbeat from core client for 30 sec - exiting 21:03:20 (4192): No heartbeat from core client for 30 sec - exiting 21:03:21 (4192): No heartbeat from core client for 30 sec - exiting 
21:03:22 (4192): No heartbeat from core client for 30 sec - exiting 21:03:23 (4192): No heartbeat from core client for 30 sec - exiting 21:03:24 (4192): No heartbeat from core client for 30 sec - exiting 21:03:25 (4192): No heartbeat from core client for 30 sec - exiting 21:03:26 (4192): No heartbeat from core client for 30 sec - exiting 21:03:27 (4192): No heartbeat from core client for 30 sec - exiting 21:03:28 (4192): No heartbeat from core client for 30 sec - exiting 21:03:29 (4192): No heartbeat from core client for 30 sec - exiting 21:03:31 (4192): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 21:17:39 (6860): No heartbeat from core client for 30 sec - exiting 21:17:41 (6860): No heartbeat from core client for 30 sec - exiting 21:17:42 (6860): No heartbeat from core client for 30 sec - exiting 21:17:43 (6860): No heartbeat from core client for 30 sec - exiting 21:17:44 (6860): No heartbeat from core client for 30 sec - exiting 21:17:45 (6860): No heartbeat from core client for 30 sec - exiting 21:17:46 (6860): No heartbeat from core client for 30 sec - exiting 21:17:47 (6860): No heartbeat from core client for 30 sec - exiting 21:17:48 (6860): No heartbeat from core client for 30 sec - exiting 21:17:49 (6860): No heartbeat from core client for 30 sec - exiting 21:17:50 (6860): No heartbeat from core client for 30 sec - exiting 21:17:51 (6860): No heartbeat from core client for 30 sec - exiting 21:17:53 (6860): No heartbeat from core client for 30 sec - exiting 21:17:54 (6860): No heartbeat from core client for 30 sec - exiting 21:17:55 (6860): No heartbeat from core client for 30 sec - exiting 21:17:56 (6860): No heartbeat from core client for 30 sec - exiting 21:17:57 (6860): No heartbeat from core client for 30 sec - exiting 21:17:58 (6860): No heartbeat from core client for 30 sec - exiting 21:17:59 (6860): No heartbeat from core client for 30 sec - 
exiting 21:18:00 (6860): No heartbeat from core client for 30 sec - exiting 21:18:01 (6860): No heartbeat from core client for 30 sec - exiting 21:18:02 (6860): No heartbeat from core client for 30 sec - exiting 21:18:04 (6860): No heartbeat from core client for 30 sec - exiting 21:18:05 (6860): No heartbeat from core client for 30 sec - exiting 21:18:06 (6860): No heartbeat from core client for 30 sec - exiting 21:18:07 (6860): No heartbeat from core client for 30 sec - exiting 21:18:08 (6860): No heartbeat from core client for 30 sec - exiting 21:18:09 (6860): No heartbeat from core client for 30 sec - exiting 21:18:10 (6860): No heartbeat from core client for 30 sec - exiting 21:18:09 (6860): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:55:18 (6868): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:31:53 (5888): No heartbeat from core client for 30 sec - exiting 21:31:54 (5888): No heartbeat from core client for 30 sec - exiting 21:31:55 (5888): No heartbeat from core client for 30 sec - exiting 21:31:56 (5888): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:31:57 (5888): No heartbeat from core client for 30 sec - exiting 21:31:59 (5888): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... 22:59:06 (7160): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:59:07 (7160): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... 04:01:56 (5848): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... GSuspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4592, iMonCtr=2 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5876, selfPID=4416, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 1 Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5412, selfPID=5008, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5072, selfPID=4808, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 1 Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3696, selfPID=4772, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 1 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6892, selfPID=4996, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 2 Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6164, selfPID=4460, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 2 Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5008, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
Regional yearly means requires 12 input files got 2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6168, selfPID=5428, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7144, selfPID=5068, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6020, selfPID=4640, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 2 Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5076, iMonCtr= 2 del crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 3 Called boinc_finish 07:27:03 (5548): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1232, selfPID=5488, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 0 Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6476, selfPID=6252, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4856, selfPID=5616, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5944, selfPID=3488, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6896, selfPID=464, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=396, selfPID=4640, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 0 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4048, selfPID=2608, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2844, selfPID=6148, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3696, selfPID=3708, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6792, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6884, selfPID=3588, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5500, selfPID=6096, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 1 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3972, selfPID=3436, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6200, iMonCtr=2 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4472, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
Regional yearly means requires 12 input files got 2 Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1492, selfPID=5920, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 2 Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2808, selfPID=1576, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 2 C23:15:54 (4252): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... </stderr_txt> <message> <file_xfer_error> <file_name>hadam3p_pnw_c3rk_1975_1_007937839_1_4.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_pnw_c3rk_1975_1_007937839_1_5.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_pnw_c3rk_1975_1_007937839_1_6.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_pnw_c3rk_1975_1_007937839_1_7.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_pnw_c3rk_1975_1_007937839_1_8.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_pnw_c3rk_1975_1_007937839_1_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_pnw_c3rk_1975_1_007937839_1_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_pnw_c3rk_1975_1_007937839_1_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_pnw_c3rk_1975_1_007937839_1_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
12 Jun 2012 23:57:17 | 1215735 | 14625454 | hadam3p_pnw_c3rk_1975_1_007937839_1 | 34,656 | 199,315 | 5.7512 |
31 May 2012 00:34:04 | 1215735 | 14625454 | hadam3p_pnw_c3rk_1975_1_007937839_1 | 23,136 | 130,482 | 5.6398 |
14 May 2012 03:00:32 | 1215735 | 14625454 | hadam3p_pnw_c3rk_1975_1_007937839_1 | 11,616 | 60,689 | 5.2246 |
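
The Average (sec/TS) column in the trickle table looks like cumulative CPU time divided by the cumulative timestep count. The page does not state the formula, so that reading is an assumption. Below is a minimal sketch that checks it against the three rows above; the function names and the tolerance are illustrative and are not part of BOINC or CPDN.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
TRICKLES = [
    (34656, 199315, 5.7512),
    (23136, 130482, 5.6398),
    (11616, 60689, 5.2246),
]


def seconds_per_timestep(cpu_time_sec: float, timestep: int) -> float:
    """Assumed formula: cumulative CPU seconds / cumulative timesteps."""
    return cpu_time_sec / timestep


def matches_reported(cpu_time_sec: float, timestep: int,
                     reported: float, tol: float = 1e-4) -> bool:
    """True if the assumed formula reproduces the reported value within tol."""
    return abs(seconds_per_timestep(cpu_time_sec, timestep) - reported) <= tol


if __name__ == "__main__":
    for timestep, cpu_time, reported in TRICKLES:
        computed = seconds_per_timestep(cpu_time, timestep)
        print(f"{timestep:>6} timesteps: computed {computed:.4f}, reported {reported}")
```

Under this assumption all three rows agree with the reported averages to four decimal places, which suggests the per-timestep cost rose slightly as the run progressed.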