Name | hadam3p_pnw_c3zz_1998_1_007938142_0 |
Workunit | 8093254 |
Created | 18 Apr 2012, 17:30:33 UTC |
Sent | 28 Apr 2012, 19:18:30 UTC |
Report deadline | 11 Apr 2013, 0:38:30 UTC |
Received | 6 May 2012, 4:07:41 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 1051974 |
Run time | 3 days 20 hours 43 min 15 sec |
CPU time | 3 days 10 hours 35 min 31 sec |
Validate state | Invalid |
Credit | 2,007.05 |
Device peak FLOPS | 2.60 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Pacific North West v6.09 windows_intelx86 |
Stderr | <core_client_version>6.10.18</core_client_version> <![CDATA[ <stderr_txt> 00:19:50 (7928): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:19:51 (7928): No heartbeat from core client for 30 sec - exiting 00:19:52 (7928): No heartbeat from core client for 30 sec - exiting 00:19:53 (7928): No heartbeat from core client for 30 sec - exiting 00:19:54 (7928): No heartbeat from core client for 30 sec - exiting 00:19:55 (7928): No heartbeat from core client for 30 sec - exiting 00:19:56 (7928): No heartbeat from core client for 30 sec - exiting 00:19:57 (7928): No heartbeat from core client for 30 sec - exiting 00:19:58 (7928): No heartbeat from core client for 30 sec - exiting 00:19:59 (7928): No heartbeat from core client for 30 sec - exiting 00:20:00 (7928): No heartbeat from core client for 30 sec - exiting 00:20:01 (7928): No heartbeat from core client for 30 sec - exiting 00:20:02 (7928): No heartbeat from core client for 30 sec - exiting 00:20:03 (7928): No heartbeat from core client for 30 sec - exiting 00:20:04 (7928): No heartbeat from core client for 30 sec - exiting 00:20:05 (7928): No heartbeat from core client for 30 sec - exiting 00:20:06 (7928): No heartbeat from core client for 30 sec - exiting 00:20:07 (7928): No heartbeat from core client for 30 sec - exiting Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=988, selfPID=988, iMonCtr=1 Regional Worker:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=10912, selfPID=12176, iMonCtr=1 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=10912, selfPID=10912, iMonCtr=1 CPDN Monitor - Quit request from BOINC... 
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8396, selfPID=13084, iMonCtr=1 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8396, selfPID=8396, iMonCtr=1 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5336, selfPID=5248, iMonCtr=1 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5336, selfPID=5336, iMonCtr=1 01:53:47 (5668): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 01:53:48 (5668): No heartbeat from core client for 30 sec - exiting 01:53:49 (5668): No heartbeat from core client for 30 sec - exiting 01:53:50 (5668): No heartbeat from core client for 30 sec - exiting 01:53:51 (5668): No heartbeat from core client for 30 sec - exiting 01:53:52 (5668): No heartbeat from core client for 30 sec - exiting 01:53:53 (5668): No heartbeat from core client for 30 sec - exiting Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7824, selfPID=4892, iMonCtr=1 Global Worker:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0 CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8600, selfPID=8600, iMonCtr=1 Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8600, selfPID=9048, iMonCtr=1 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4824, selfPID=5464, iMonCtr=1 CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7860, selfPID=7860, iMonCtr=1 Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7860, selfPID=7844, iMonCtr=1 CPDN Monitor - Quit request from BOINC... 
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8412, selfPID=8728, iMonCtr=1 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8412, selfPID=8412, iMonCtr=1 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=10000, selfPID=9836, iMonCtr=1 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=10000, selfPID=10000, iMonCtr=1 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=10200, selfPID=8088, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5916, selfPID=6040, iMonCtr=1 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5916, selfPID=5916, iMonCtr=1 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=9500, selfPID=9168, iMonCtr=1 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=9500, selfPID=9500, iMonCtr=1 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=9284, selfPID=9160, iMonCtr=1 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=9284, selfPID=9284, iMonCtr=1 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6532, selfPID=10248, iMonCtr=1 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6532, selfPID=6532, iMonCtr=1 01:37:28 (10396): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
01:37:29 (10396): No heartbeat from core client for 30 sec - exiting 01:37:30 (10396): No heartbeat from core client for 30 sec - exiting 01:37:31 (10396): No heartbeat from core client for 30 sec - exiting 01:37:32 (10396): No heartbeat from core client for 30 sec - exiting 01:37:33 (10396): No heartbeat from core client for 30 sec - exiting 01:37:34 (10396): No heartbeat from core client for 30 sec - exiting 01:37:35 (10396): No heartbeat from core client for 30 sec - exiting 01:37:36 (10396): No heartbeat from core client for 30 sec - exiting 01:37:37 (10396): No heartbeat from core client for 30 sec - exiting 01:37:38 (10396): No heartbeat from core client for 30 sec - exiting 01:37:40 (10396): No heartbeat from core client for 30 sec - exiting 01:37:41 (10396): No heartbeat from core client for 30 sec - exiting 01:37:42 (10396): No heartbeat from core client for 30 sec - exiting 01:37:43 (10396): No heartbeat from core client for 30 sec - exiting 01:37:44 (10396): No heartbeat from core client for 30 sec - exiting 01:37:46 (10396): No heartbeat from core client for 30 sec - exiting 01:37:47 (10396): No heartbeat from core client for 30 sec - exiting 01:37:48 (10396): No heartbeat from core client for 30 sec - exiting Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=10592, selfPID=10592, iMonCtr=1 Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=10592, selfPID=2428, iMonCtr=1 02:39:04 (9248): No heartbeat from core client for 30 sec - exiting 02:39:05 (9248): No heartbeat from core client for 30 sec - exiting 02:39:06 (9248): No heartbeat from core client for 30 sec - exiting 02:39:07 (9248): No heartbeat from core client for 30 sec - exiting 02:39:08 (9248): No heartbeat from core client for 30 sec - exiting 02:39:09 (9248): No heartbeat from core client for 30 sec - exiting 02:39:10 (9248): No heartbeat from core client for 30 sec - exiting 02:39:11 (9248): No heartbeat from core client for 30 sec - 
exiting 02:39:12 (9248): No heartbeat from core client for 30 sec - exiting 02:39:13 (9248): No heartbeat from core client for 30 sec - exiting 02:39:14 (9248): No heartbeat from core client for 30 sec - exiting 02:39:15 (9248): No heartbeat from core client for 30 sec - exiting 02:39:16 (9248): No heartbeat from core client for 30 sec - exiting 02:39:17 (9248): No heartbeat from core client for 30 sec - exiting 02:39:18 (9248): No heartbeat from core client for 30 sec - exiting 02:39:19 (9248): No heartbeat from core client for 30 sec - exiting 02:39:20 (9248): No heartbeat from core client for 30 sec - exiting 02:39:21 (9248): No heartbeat from core client for 30 sec - exiting 02:39:22 (9248): No heartbeat from core client for 30 sec - exiting 02:39:23 (9248): No heartbeat from core client for 30 sec - exiting 02:39:24 (9248): No heartbeat from core client for 30 sec - exiting 02:39:25 (9248): No heartbeat from core client for 30 sec - exiting 02:39:26 (9248): No heartbeat from core client for 30 sec - exiting 02:39:27 (9248): No heartbeat from core client for 30 sec - exiting 02:39:28 (9248): No heartbeat from core client for 30 sec - exiting 02:39:29 (9248): No heartbeat from core client for 30 sec - exiting 02:39:30 (9248): No heartbeat from core client for 30 sec - exiting 02:39:31 (9248): No heartbeat from core client for 30 sec - exiting 02:39:32 (9248): No heartbeat from core client for 30 sec - exiting 02:39:33 (9248): No heartbeat from core client for 30 sec - exiting 02:39:34 (9248): No heartbeat from core client for 30 sec - exiting 02:39:35 (9248): No heartbeat from core client for 30 sec - exiting 02:39:36 (9248): No heartbeat from core client for 30 sec - exiting 02:39:37 (9248): No heartbeat from core client for 30 sec - exiting 02:39:38 (9248): No heartbeat from core client for 30 sec - exiting 02:39:39 (9248): No heartbeat from core client for 30 sec - exiting 02:39:40 (9248): No heartbeat from core client for 30 sec - exiting 02:39:41 (9248): No 
heartbeat from core client for 30 sec - exiting 02:39:42 (9248): No heartbeat from core client for 30 sec - exiting 02:39:43 (9248): No heartbeat from core client for 30 sec - exiting 02:39:44 (9248): No heartbeat from core client for 30 sec - exiting 02:39:45 (9248): No heartbeat from core client for 30 sec - exiting 02:39:46 (9248): No heartbeat from core client for 30 sec - exiting 02:39:47 (9248): No heartbeat from core client for 30 sec - exiting 02:39:48 (9248): No heartbeat from core client for 30 sec - exiting 02:39:49 (9248): No heartbeat from core client for 30 sec - exiting 02:39:50 (9248): No heartbeat from core client for 30 sec - exiting 02:39:51 (9248): No heartbeat from core client for 30 sec - exiting 02:39:52 (9248): No heartbeat from core client for 30 sec - exiting 02:40:23 (9248): No heartbeat from core client for 30 sec - exiting 02:40:24 (9248): No heartbeat from core client for 30 sec - exiting 02:40:25 (9248): No heartbeat from core client for 30 sec - exiting 02:40:26 (9248): No heartbeat from core client for 30 sec - exiting 02:40:27 (9248): No heartbeat from core client for 30 sec - exiting 02:40:28 (9248): No heartbeat from core client for 30 sec - exiting 02:40:29 (9248): No heartbeat from core client for 30 sec - exiting 02:40:30 (9248): No heartbeat from core client for 30 sec - exiting 02:40:31 (9248): No heartbeat from core client for 30 sec - exiting 02:40:32 (9248): No heartbeat from core client for 30 sec - exiting 02:40:33 (9248): No heartbeat from core client for 30 sec - exiting 02:40:34 (9248): No heartbeat from core client for 30 sec - exiting 02:40:35 (9248): No heartbeat from core client for 30 sec - exiting 02:40:36 (9248): No heartbeat from core client for 30 sec - exiting 02:40:37 (9248): No heartbeat from core client for 30 sec - exiting 02:40:38 (9248): No heartbeat from core client for 30 sec - exiting 02:40:39 (9248): No heartbeat from core client for 30 sec - exiting 02:40:40 (9248): No heartbeat from core client 
for 30 sec - exiting 02:40:41 (9248): No heartbeat from core client for 30 sec - exiting 02:41:18 (9248): No heartbeat from core client for 30 sec - exiting 02:41:19 (9248): No heartbeat from core client for 30 sec - exiting 02:41:20 (9248): No heartbeat from core client for 30 sec - exiting 02:41:21 (9248): No heartbeat from core client for 30 sec - exiting 02:41:22 (9248): No heartbeat from core client for 30 sec - exiting 02:41:23 (9248): No heartbeat from core client for 30 sec - exiting 02:41:24 (9248): No heartbeat from core client for 30 sec - exiting 02:41:25 (9248): No heartbeat from core client for 30 sec - exiting 02:41:26 (9248): No heartbeat from core client for 30 sec - exiting 02:41:27 (9248): No heartbeat from core client for 30 sec - exiting 02:41:28 (9248): No heartbeat from core client for 30 sec - exiting 02:41:29 (9248): No heartbeat from core client for 30 sec - exiting 02:41:30 (9248): No heartbeat from core client for 30 sec - exiting 02:41:31 (9248): No heartbeat from core client for 30 sec - exiting 02:41:32 (9248): No heartbeat from core client for 30 sec - exiting 02:41:33 (9248): No heartbeat from core client for 30 sec - exiting 02:41:34 (9248): No heartbeat from core client for 30 sec - exiting 02:41:35 (9248): No heartbeat from core client for 30 sec - exiting 02:41:36 (9248): No heartbeat from core client for 30 sec - exiting 02:41:37 (9248): No heartbeat from core client for 30 sec - exiting 02:41:38 (9248): No heartbeat from core client for 30 sec - exiting 02:41:39 (9248): No heartbeat from core client for 30 sec - exiting 02:41:40 (9248): No heartbeat from core client for 30 sec - exiting 02:41:41 (9248): No heartbeat from core client for 30 sec - exiting 02:41:42 (9248): No heartbeat from core client for 30 sec - exiting 02:41:43 (9248): No heartbeat from core client for 30 sec - exiting 02:41:44 (9248): No heartbeat from core client for 30 sec - exiting 02:41:45 (9248): No heartbeat from core client for 30 sec - exiting 
02:41:46 (9248): No heartbeat from core client for 30 sec - exiting 02:41:47 (9248): No heartbeat from core client for 30 sec - exiting 02:41:48 (9248): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=9420, selfPID=9420, iMonCtr=1 Regional Worker:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0 03:39:07 (9788): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:39:08 (9788): No heartbeat from core client for 30 sec - exiting 03:39:09 (9788): No heartbeat from core client for 30 sec - exiting 03:39:10 (9788): No heartbeat from core client for 30 sec - exiting 03:39:11 (9788): No heartbeat from core client for 30 sec - exiting 03:39:12 (9788): No heartbeat from core client for 30 sec - exiting 03:39:13 (9788): No heartbeat from core client for 30 sec - exiting Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2620, selfPID=10356, iMonCtr=1 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2620, selfPID=2620, iMonCtr=1 03:59:45 (6164): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:59:46 (6164): No heartbeat from core client for 30 sec - exiting 03:59:47 (6164): No heartbeat from core client for 30 sec - exiting Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=288, selfPID=288, iMonCtr=1 Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=288, selfPID=8896, iMonCtr=1 05:06:49 (4972): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
05:06:50 (4972): No heartbeat from core client for 30 sec - exiting 05:06:51 (4972): No heartbeat from core client for 30 sec - exiting 05:06:52 (4972): No heartbeat from core client for 30 sec - exiting 05:06:53 (4972): No heartbeat from core client for 30 sec - exiting 05:06:54 (4972): No heartbeat from core client for 30 sec - exiting 05:06:55 (4972): No heartbeat from core client for 30 sec - exiting Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=9684, selfPID=8476, iMonCtr=1 Atmos Restart file copy failed on atmos_restart.day Global Worker:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0 Model crashed: Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10000, iMonCtr=2 Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 8 Called boinc_finish </stderr_txt> <message> <file_xfer_error> <file_name>hadam3p_pnw_c3zz_1998_1_007938142_0_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_pnw_c3zz_1998_1_007938142_0_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_pnw_c3zz_1998_1_007938142_0_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_pnw_c3zz_1998_1_007938142_0_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
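
The `<message>` block at the end of the Stderr field lists the upload files that failed to transfer. As a rough sketch (not part of the BOINC or CPDN tooling), the entries can be pulled out with a regular expression. The function name `parse_xfer_errors` and the variable `message` are hypothetical; the sample text is copied from the listing above.

```python
import re

# Sample copied from the <message> block of the Stderr field above.
message = (
    "<message> "
    "<file_xfer_error> <file_name>hadam3p_pnw_c3zz_1998_1_007938142_0_9.zip</file_name> "
    "<error_code>-161</error_code> </file_xfer_error> "
    "<file_xfer_error> <file_name>hadam3p_pnw_c3zz_1998_1_007938142_0_10.zip</file_name> "
    "<error_code>-161</error_code> </file_xfer_error> "
    "<file_xfer_error> <file_name>hadam3p_pnw_c3zz_1998_1_007938142_0_11.zip</file_name> "
    "<error_code>-161</error_code> </file_xfer_error> "
    "<file_xfer_error> <file_name>hadam3p_pnw_c3zz_1998_1_007938142_0_12.zip</file_name> "
    "<error_code>-161</error_code> </file_xfer_error> "
    "</message>"
)

# One match per <file_xfer_error> element: (file name, error code).
XFER_ERROR = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(.*?)</file_name>\s*"
    r"<error_code>(-?\d+)</error_code>\s*"
    r"</file_xfer_error>"
)

def parse_xfer_errors(text):
    """Return a list of (file_name, error_code) tuples found in text."""
    return [(name, int(code)) for name, code in XFER_ERROR.findall(text)]

errors = parse_xfer_errors(message)
for name, code in errors:
    print(f"{name}: error {code}")
```

A regular expression is used here instead of an XML parser because the block is embedded in free-form stderr text inside a CDATA section, so the field as a whole is not guaranteed to parse as XML.
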
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
06 May 2012 01:44:35 | 1051974 | 14480495 | hadam3p_pnw_c3zz_1998_1_007938142_0 | 92,368 | 292,705 | 3.1689 |
06 May 2012 01:44:35 | 1051974 | 14480495 | hadam3p_pnw_c3zz_1998_1_007938142_0 | 92,256 | 291,787 | 3.1628 |
05 May 2012 07:11:30 | 1051974 | 14480495 | hadam3p_pnw_c3zz_1998_1_007938142_0 | 80,736 | 258,226 | 3.1984 |
04 May 2012 19:45:50 | 1051974 | 14480495 | hadam3p_pnw_c3zz_1998_1_007938142_0 | 69,216 | 219,622 | 3.1730 |
03 May 2012 20:36:11 | 1051974 | 14480495 | hadam3p_pnw_c3zz_1998_1_007938142_0 | 57,696 | 187,435 | 3.2487 |
02 May 2012 21:16:36 | 1051974 | 14480495 | hadam3p_pnw_c3zz_1998_1_007938142_0 | 46,176 | 149,464 | 3.2368 |
01 May 2012 04:06:21 | 1051974 | 14480495 | hadam3p_pnw_c3zz_1998_1_007938142_0 | 34,656 | 110,468 | 3.1876 |
29 Apr 2012 23:42:46 | 1051974 | 14480495 | hadam3p_pnw_c3zz_1998_1_007938142_0 | 23,136 | 71,667 | 3.0976 |
29 Apr 2012 10:59:35 | 1051974 | 14480495 | hadam3p_pnw_c3zz_1998_1_007938142_0 | 11,616 | 33,279 | 2.8649 |
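
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that reading against the rows in the table; the name `sec_per_timestep` is hypothetical, the tolerance is an assumption that allows for rounding in the displayed values, and the data are copied from the table above.

```python
# (timestep, cpu_time_sec, reported_average) copied from the trickle table.
trickles = [
    (92368, 292705, 3.1689),
    (92256, 291787, 3.1628),
    (80736, 258226, 3.1984),
    (69216, 219622, 3.1730),
    (57696, 187435, 3.2487),
    (46176, 149464, 3.2368),
    (34656, 110468, 3.1876),
    (23136, 71667, 3.0976),
    (11616, 33279, 2.8649),
]

def sec_per_timestep(timestep, cpu_time_sec):
    """Average CPU seconds spent per model timestep."""
    return cpu_time_sec / timestep

# Assumed tolerance: the table shows four decimals and whole-second CPU times.
TOLERANCE = 1e-4

for timestep, cpu_time, reported in trickles:
    computed = sec_per_timestep(timestep, cpu_time)
    matches = abs(computed - reported) < TOLERANCE
    print(f"{timestep:>6} TS  computed {computed:.4f}  reported {reported:.4f}  match={matches}")
```

Under this reading, the per-timestep cost stayed close to 3.2 seconds for most of the run, after a lower figure of about 2.86 seconds at the first trickle.
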