climateprediction.net home page
Task 16424746

Task 16424746

Name hadam3p_anz_n73b_2012_1_008596807_1
Workunit 8743319
Created 28 Mar 2014, 22:32:19 UTC
Sent 28 Mar 2014, 22:57:48 UTC
Report deadline 11 Mar 2015, 4:17:48 UTC
Received 5 Apr 2014, 13:18:54 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1154356
Run time 4 days 5 hours 51 min 27 sec
CPU time 3 days 22 hours 5 min
Validate state Invalid
Credit 2,000.18
Device peak FLOPS 3.16 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Australia New Zealand v6.10
windows_intelx86
Stderr
<core_client_version>7.0.64</core_client_version>
<![CDATA[
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2392, iMonCtr=2
Model crash detected, will try to restart...
13:58:20 (3776): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:56:41 (2692): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:46:02 (600): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:53:46 (732): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:52:44 (3660): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
22:41:38 (3836): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:25:11 (1084): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:46:27 (3256): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
21:40:13 (3828): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
22:21:12 (4736): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3552, selfPID=3552, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3520, iMonCtr=2
Model crash detected, will try to restart...
21:11:50 (2176): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=364, iMonCtr=2
12:03:26 (2100): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:57:52 (4560): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3664, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3672, selfPID=3928, iMonCtr=1
Model crash detected, will try to restart...
17:23:36 (3512): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
19:07:13 (2344): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
20:51:07 (2984): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=5040, iMonCtr=1
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1224, selfPID=1224, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1224, selfPID=2828, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_anz_n73b_2012_1_008596807_1_5.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n73b_2012_1_008596807_1_6.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n73b_2012_1_008596807_1_7.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n73b_2012_1_008596807_1_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n73b_2012_1_008596807_1_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n73b_2012_1_008596807_1_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n73b_2012_1_008596807_1_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n73b_2012_1_008596807_1_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
03 Apr 2014 21:37:37 1154356 16424746 hadam3p_anz_n73b_2012_1_008596807_1 46,379 283,243 6.1071
02 Apr 2014 00:35:26 1154356 16424746 hadam3p_anz_n73b_2012_1_008596807_1 34,859 213,270 6.1181
31 Mar 2014 03:16:22 1154356 16424746 hadam3p_anz_n73b_2012_1_008596807_1 23,339 141,798 6.0756
29 Mar 2014 22:10:15 1154356 16424746 hadam3p_anz_n73b_2012_1_008596807_1 11,819 72,140 6.1037


©2024 cpdn.org