Name | hadam3p_pnw_8oz4_2000_1_008000792_2 |
Workunit | 8155906 |
Created | 8 Jul 2012, 8:00:13 UTC |
Sent | 8 Jul 2012, 8:00:29 UTC |
Report deadline | 20 Jun 2013, 13:20:29 UTC |
Received | 30 Jul 2012, 17:35:32 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 1129572 |
Run time | 2 days 10 hours 3 min 57 sec |
CPU time | 1 day 19 hours 59 min 11 sec |
Validate state | Invalid |
Credit | 252.40 |
Device peak FLOPS | 1.55 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Pacific North West v6.09 windows_intelx86 |
Stderr | <core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7772, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3836, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6244, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7512, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3736, iMonCtr=2
Model crash detected, will try to restart...
23:12:11 (6800): No heartbeat from core client for 30 sec - exiting
23:12:12 (6800): No heartbeat from core client for 30 sec - exiting
23:12:13 (6800): No heartbeat from core client for 30 sec - exiting
23:12:14 (6800): No heartbeat from core client for 30 sec - exiting
23:12:17 (6800): No heartbeat from core client for 30 sec - exiting
23:12:18 (6800): No heartbeat from core client for 30 sec - exiting
23:12:19 (6800): No heartbeat from core client for 30 sec - exiting
23:12:20 (6800): No heartbeat from core client for 30 sec - exiting
23:12:22 (6800): No heartbeat from core client for 30 sec - exiting
23:12:23 (6800): No heartbeat from core client for 30 sec - exiting
23:12:24 (6800): No heartbeat from core client for 30 sec - exiting
23:12:25 (6800): No heartbeat from core client for 30 sec - exiting
23:12:26 (6800): No heartbeat from core client for 30 sec - exiting
23:12:27 (6800): No heartbeat from core client for 30 sec - exiting
23:12:28 (6800): No heartbeat from core client for 30 sec - exiting
23:12:29 (6800): No heartbeat from core client for 30 sec - exiting
23:12:30 (6800): No heartbeat from core client for 30 sec - exiting
23:12:32 (6800): No heartbeat from core client for 30 sec - exiting
23:12:33 (6800): No heartbeat from core client for 30 sec - exiting
23:12:34 (6800): No heartbeat from core client for 30 sec - exiting
23:12:35 (6800): No heartbeat from core client for 30 sec - exiting
23:12:36 (6800): No heartbeat from core client for 30 sec - exiting
23:12:37 (6800): No heartbeat from core client for 30 sec - exiting
23:12:40 (6800): No heartbeat from core client for 30 sec - exiting
23:12:41 (6800): No heartbeat from core client for 30 sec - exiting
23:12:42 (6800): No heartbeat from core client for 30 sec - exiting
23:12:43 (6800): No heartbeat from core client for 30 sec - exiting
23:12:44 (6800): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:12:45 (6800): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6260, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6532, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
16:39:41 (5180): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
12:04:09 (1192): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:04:11 (1192): No heartbeat from core client for 30 sec - exiting
12:04:12 (1192): No heartbeat from core client for 30 sec - exiting
12:04:13 (1192): No heartbeat from core client for 30 sec - exiting
12:04:14 (1192): No heartbeat from core client for 30 sec - exiting
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6436, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7024, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5928, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7004, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6100, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 1
Suspended CPDN Monitor - Suspend request from BOINC...
Model crashed: REPLANCA: PP HEADERS ON ANCILLARY FILE DO NOT MATCH tmp/xaakm.pipe_dummy 2048
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 1
Called boinc_finish
</stderr_txt>
<message>
<file_xfer_error> <file_name>hadam3p_pnw_8oz4_2000_1_008000792_2_2.zip</file_name> <error_code>-161</error_code> </file_xfer_error>
<file_xfer_error> <file_name>hadam3p_pnw_8oz4_2000_1_008000792_2_3.zip</file_name> <error_code>-161</error_code> </file_xfer_error>
<file_xfer_error> <file_name>hadam3p_pnw_8oz4_2000_1_008000792_2_4.zip</file_name> <error_code>-161</error_code> </file_xfer_error>
<file_xfer_error> <file_name>hadam3p_pnw_8oz4_2000_1_008000792_2_5.zip</file_name> <error_code>-161</error_code> </file_xfer_error>
<file_xfer_error> <file_name>hadam3p_pnw_8oz4_2000_1_008000792_2_6.zip</file_name> <error_code>-161</error_code> </file_xfer_error>
<file_xfer_error> <file_name>hadam3p_pnw_8oz4_2000_1_008000792_2_7.zip</file_name> <error_code>-161</error_code> </file_xfer_error>
<file_xfer_error> <file_name>hadam3p_pnw_8oz4_2000_1_008000792_2_8.zip</file_name> <error_code>-161</error_code> </file_xfer_error>
<file_xfer_error> <file_name>hadam3p_pnw_8oz4_2000_1_008000792_2_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error>
<file_xfer_error> <file_name>hadam3p_pnw_8oz4_2000_1_008000792_2_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error>
<file_xfer_error> <file_name>hadam3p_pnw_8oz4_2000_1_008000792_2_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error>
<file_xfer_error> <file_name>hadam3p_pnw_8oz4_2000_1_008000792_2_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error>
</message>
]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
28 Jul 2012 11:59:40 | 1129572 | 14876078 | hadam3p_pnw_8oz4_2000_1_008000792_2 | 11,616 | 81,389 | 7.0066 |
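
The Average (sec/TS) figure in the trickle row can be cross-checked by hand. The sketch below assumes the column is simply CPU time divided by the timestep count; that is an inference from the numbers, not something the page states. The helper names `to_seconds` and `average_sec_per_timestep` are hypothetical and exist only for this illustration.

```python
# Values copied from the task summary and the trickle row on this page.
TRICKLE_TIMESTEPS = 11_616
TRICKLE_CPU_SEC = 81_389


def to_seconds(days: int, hours: int, minutes: int, seconds: int) -> int:
    """Convert a 'D days H hours M min S sec' duration to whole seconds."""
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def average_sec_per_timestep(cpu_sec: int, timesteps: int) -> float:
    """Assumed definition of the 'Average (sec/TS)' column."""
    return cpu_sec / timesteps


# Run time and CPU time as reported in the summary fields.
run_time_sec = to_seconds(2, 10, 3, 57)
cpu_time_sec = to_seconds(1, 19, 59, 11)

avg = average_sec_per_timestep(TRICKLE_CPU_SEC, TRICKLE_TIMESTEPS)
print(f"average sec/TS: {avg:.4f}")  # 7.0066, matching the table
print(f"reported CPU time: {cpu_time_sec} s, run time: {run_time_sec} s")
```

Under that assumption the computed value agrees with the 7.0066 shown in the table.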