Task 16446781

Name hadam3p_anz_a79a_2012_1_008615742_0
Workunit 8762254
Created 2 Apr 2014, 15:40:19 UTC
Sent 29 Apr 2014, 0:04:56 UTC
Report deadline 11 Apr 2015, 5:24:56 UTC
Received 17 May 2014, 7:25:36 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1067463
Run time 7 days 18 hours 14 min 43 sec
CPU time 7 days 10 hours 0 min 54 sec
Validate state Invalid
Credit 3,490.64
Device peak FLOPS 2.53 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Australia New Zealand v6.10 (windows_intelx86)
Stderr
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10212, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4976, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6680, selfPID=3196, iMonCtr=1
Model crash detected, will try to restart...
09:33:08 (6196): No heartbeat from core client for 30 sec - exiting
09:33:09 (6196): No heartbeat from core client for 30 sec - exiting
09:33:10 (6196): No heartbeat from core client for 30 sec - exiting
09:33:11 (6196): No heartbeat from core client for 30 sec - exiting
09:33:12 (6196): No heartbeat from core client for 30 sec - exiting
09:33:13 (6196): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5428, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5580, selfPID=2536, iMonCtr=1
Model crash detected, will try to restart...
12:57:32 (4544): No heartbeat from core client for 30 sec - exiting
12:57:33 (4544): No heartbeat from core client for 30 sec - exiting
12:57:34 (4544): No heartbeat from core client for 30 sec - exiting
12:57:35 (4544): No heartbeat from core client for 30 sec - exiting
12:57:36 (4544): No heartbeat from core client for 30 sec - exiting
12:57:37 (4544): No heartbeat from core client for 30 sec - exiting
12:57:38 (4544): No heartbeat from core client for 30 sec - exiting
12:57:39 (4544): No heartbeat from core client for 30 sec - exiting
12:57:40 (4544): No heartbeat from core client for 30 sec - exiting
12:57:41 (4544): No heartbeat from core client for 30 sec - exiting
12:57:42 (4544): No heartbeat from core client for 30 sec - exiting
12:57:43 (4544): No heartbeat from core client for 30 sec - exiting
12:57:44 (4544): No heartbeat from core client for 30 sec - exiting
12:57:45 (4544): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5196, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6868, selfPID=5712, iMonCtr=1
Model crash detected, will try to restart...
14:16:11 (1956): No heartbeat from core client for 30 sec - exiting
14:16:12 (1956): No heartbeat from core client for 30 sec - exiting
14:16:13 (1956): No heartbeat from core client for 30 sec - exiting
14:16:14 (1956): No heartbeat from core client for 30 sec - exiting
14:16:15 (1956): No heartbeat from core client for 30 sec - exiting
14:16:16 (1956): No heartbeat from core client for 30 sec - exiting
14:16:17 (1956): No heartbeat from core client for 30 sec - exiting
14:16:18 (1956): No heartbeat from core client for 30 sec - exiting
14:16:19 (1956): No heartbeat from core client for 30 sec - exiting
14:16:20 (1956): No heartbeat from core client for 30 sec - exiting
14:16:21 (1956): No heartbeat from core client for 30 sec - exiting
14:16:22 (1956): No heartbeat from core client for 30 sec - exiting
14:16:23 (1956): No heartbeat from core client for 30 sec - exiting
14:16:24 (1956): No heartbeat from core client for 30 sec - exiting
14:16:25 (1956): No heartbeat from core client for 30 sec - exiting
14:16:26 (1956): No heartbeat from core client for 30 sec - exiting
14:16:27 (1956): No heartbeat from core client for 30 sec - exiting
14:16:28 (1956): No heartbeat from core client for 30 sec - exiting
14:16:29 (1956): No heartbeat from core client for 30 sec - exiting
14:16:30 (1956): No heartbeat from core client for 30 sec - exiting
14:16:31 (1956): No heartbeat from core client for 30 sec - exiting
14:16:32 (1956): No heartbeat from core client for 30 sec - exiting
14:16:33 (1956): No heartbeat from core client for 30 sec - exiting
14:16:34 (1956): No heartbeat from core client for 30 sec - exiting
14:16:35 (1956): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7684, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6512, selfPID=7584, iMonCtr=1
Model crash detected, will try to restart...
20:19:45 (1520): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:26:21 (23176): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
GCGntroller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6256, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5928, selfPID=10044, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6212, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6916, selfPID=3688, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=420, selfPID=4208, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10752, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7016, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6528, selfPID=4184, iMonCtr=1
Model crash detected, will try to restart...
10:19:07 (6448): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
19:16:41 (3700): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:27:57 (14288): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4616, selfPID=4616, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6424, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6716, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8812, iMonCtr=2
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6808, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6852, selfPID=7160, iMonCtr=1
Model crash detected, will try to restart...
16:40:35 (1200): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CGnlobal Worker:: CPDN process is not running, exiting, bRetVal = 1, cIh=0, selfPID=6980, iMonCtr=2
Model crash detected, will try to restart...
eckPID=0, selfPID=1264, iMonCtr=2
Leaving CPDN_Main::Monitor...
08:58:26 (6300): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7572, iMonCtr=2
Model crash detected, will try to restart...
07:16:19 (4900): No heartbeat from core client for 30 sec - exiting
07:16:20 (4900): No heartbeat from core client for 30 sec - exiting
07:16:21 (4900): No heartbeat from core client for 30 sec - exiting
07:16:22 (4900): No heartbeat from core client for 30 sec - exiting
07:16:23 (4900): No heartbeat from core client for 30 sec - exiting
07:16:24 (4900): No heartbeat from core client for 30 sec - exiting
07:16:25 (4900): No heartbeat from core client for 30 sec - exiting
07:16:26 (4900): No heartbeat from core client for 30 sec - exiting
07:16:27 (4900): No heartbeat from core client for 30 sec - exiting
07:16:28 (4900): No heartbeat from core client for 30 sec - exiting
07:16:29 (4900): No heartbeat from core client for 30 sec - exiting
07:16:30 (4900): No heartbeat from core client for 30 sec - exiting
07:16:31 (4900): No heartbeat from core client for 30 sec - exiting
07:16:32 (4900): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7872, selfPID=11692, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_anz_a79a_2012_1_008615742_0_8.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_a79a_2012_1_008615742_0_9.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_a79a_2012_1_008615742_0_10.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_a79a_2012_1_008615742_0_11.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_a79a_2012_1_008615742_0_12.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                          Timestep  CPU Time (sec)  Average (sec/TS)
15 May 2014 18:41:22  1067463  16446781   hadam3p_anz_a79a_2012_1_008615742_0  80,939    609,644         7.5321
14 May 2014 01:10:05  1067463  16446781   hadam3p_anz_a79a_2012_1_008615742_0  69,419    523,133         7.5359
11 May 2014 16:41:21  1067463  16446781   hadam3p_anz_a79a_2012_1_008615742_0  57,899    436,595         7.5406
08 May 2014 21:03:05  1067463  16446781   hadam3p_anz_a79a_2012_1_008615742_0  46,379    349,647         7.5389
07 May 2014 08:33:41  1067463  16446781   hadam3p_anz_a79a_2012_1_008615742_0  34,859    263,133         7.5485
04 May 2014 23:31:57  1067463  16446781   hadam3p_anz_a79a_2012_1_008615742_0  23,339    176,696         7.5708
02 May 2014 02:37:28  1067463  16446781   hadam3p_anz_a79a_2012_1_008615742_0  11,819    90,512          7.6582
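
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep reached at each trickle. The page does not state this, so treat it as an assumption. A minimal Python sketch that recomputes the column from the rows above; `TRICKLES` and `sec_per_timestep` are illustrative names, not part of BOINC or CPDN:

```python
# Trickle rows copied from the table above:
# (timestep, cumulative CPU time in seconds, reported average sec/TS).
TRICKLES = [
    (80939, 609644, 7.5321),
    (69419, 523133, 7.5359),
    (57899, 436595, 7.5406),
    (46379, 349647, 7.5389),
    (34859, 263133, 7.5485),
    (23339, 176696, 7.5708),
    (11819, 90512, 7.6582),
]


def sec_per_timestep(timestep: int, cpu_seconds: int) -> float:
    """Assumed definition: cumulative CPU seconds / timesteps completed."""
    return cpu_seconds / timestep


# Print the recomputed average next to the value shown on the page.
for timestep, cpu_seconds, reported in TRICKLES:
    computed = sec_per_timestep(timestep, cpu_seconds)
    print(f"{timestep:>6}  {cpu_seconds:>7}  computed={computed:.4f}  reported={reported:.4f}")

# Timestep gap between consecutive trickles (rows are newest first).
gaps = [newer[0] - older[0] for newer, older in zip(TRICKLES, TRICKLES[1:])]
print("timesteps between trickles:", sorted(set(gaps)))
```

Under that assumption, every recomputed value agrees with the reported column to four decimal places, and consecutive trickles are 11,520 timesteps apart.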

