climateprediction.net home page
Task 16969714

Name hadam3p_anz_rm8e_2012_1_008955393_1
Workunit 9099568
Created 2 Sep 2014, 0:30:16 UTC
Sent 2 Sep 2014, 0:37:50 UTC
Report deadline 15 Aug 2015, 5:57:50 UTC
Received 4 Oct 2014, 23:29:18 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1143523
Run time 3 days 18 hours 18 min 28 sec
CPU time 3 days 10 hours 15 min 55 sec
Validate state Invalid
Credit 1,503.36
Device peak FLOPS 2.56 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Australia New Zealand v6.10 (windows_intelx86)
Stderr
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
22:02:28 (5148): No heartbeat from core client for 30 sec - exiting
22:02:29 (5148): No heartbeat from core client for 30 sec - exiting
22:02:30 (5148): No heartbeat from core client for 30 sec - exiting
22:02:31 (5148): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
22:07:48 (5160): No heartbeat from core client for 30 sec - exiting
22:07:49 (5160): No heartbeat from core client for 30 sec - exiting
22:07:50 (5160): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6128, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3256, selfPID=5324, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5624, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5560, selfPID=2308, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5452, selfPID=4684, iMonCtr=1
Model crash detected, will try to restart...
20:36:23 (5136): No heartbeat from core client for 30 sec - exiting
20:36:24 (5136): No heartbeat from core client for 30 sec - exiting
20:36:25 (5136): No heartbeat from core client for 30 sec - exiting
20:36:26 (5136): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5632, selfPID=4816, iMonCtr=1
Model crash detected, will try to restart...
18:54:54 (4296): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
22:07:11 (4772): No heartbeat from core client for 30 sec - exiting
22:07:12 (4772): No heartbeat from core client for 30 sec - exiting
22:07:13 (4772): No heartbeat from core client for 30 sec - exiting
22:07:14 (4772): No heartbeat from core client for 30 sec - exiting
22:07:15 (4772): No heartbeat from core client for 30 sec - exiting
22:07:16 (4772): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5960, selfPID=5204, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5380, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6012, selfPID=5036, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5380, selfPID=5356, iMonCtr=1
Model crash detected, will try to restart...
19:59:22 (5028): No heartbeat from core client for 30 sec - exiting
19:59:23 (5028): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5444, selfPID=5720, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2440, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5068, iMonCtr=2
Model crash detected, will try to restart...
21:35:08 (4564): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6060, selfPID=6060, iMonCtr=2
10:23:48 (5308): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5180, selfPID=5916, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5764, selfPID=3316, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1488, selfPID=4784, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_anz_rm8e_2012_1_008955393_1_4.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_rm8e_2012_1_008955393_1_5.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_rm8e_2012_1_008955393_1_6.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_rm8e_2012_1_008955393_1_7.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_rm8e_2012_1_008955393_1_8.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_rm8e_2012_1_008955393_1_9.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_rm8e_2012_1_008955393_1_10.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_rm8e_2012_1_008955393_1_11.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_rm8e_2012_1_008955393_1_12.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>

</message>
]]>
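The stderr text and upload messages above repeat a small number of event types: lost heartbeats from the core client, model restart attempts, and upload failures with error code -161. As a rough aid for reading logs like this one, the following is a minimal sketch that tallies those events. The function name `summarize` and the regular expressions are assumptions written for this illustration, matched only against the line shapes visible in this log; they are not part of BOINC or the CPDN application.

```python
import re
from collections import Counter

# Patterns are assumptions based on the line shapes seen in this log only.
HEARTBEAT = re.compile(r"^\d{2}:\d{2}:\d{2} \((\d+)\): No heartbeat from core client")
RESTART = "Model crash detected, will try to restart..."
UPLOAD_ERR = re.compile(r"<error_code>(-?\d+) \(([^)]*)\)</error_code>")


def summarize(log_text):
    """Count heartbeat losses, restart attempts, and upload errors in a stderr dump."""
    counts = Counter()
    pids = set()
    for raw in log_text.splitlines():
        line = raw.strip()
        match = HEARTBEAT.match(line)
        if match:
            counts["heartbeat_lost"] += 1
            pids.add(match.group(1))  # process ID shown in parentheses
        elif line == RESTART:
            counts["restart_attempts"] += 1
        elif UPLOAD_ERR.search(line):
            counts["upload_errors"] += 1
    counts["heartbeat_pids"] = len(pids)  # distinct processes that reported a loss
    return dict(counts)


# A short excerpt copied from the log above.
sample = """22:02:28 (5148): No heartbeat from core client for 30 sec - exiting
22:02:29 (5148): No heartbeat from core client for 30 sec - exiting
Model crash detected, will try to restart...
  <error_code>-161 (not found)</error_code>
"""

if __name__ == "__main__":
    print(summarize(sample))
```

Counting distinct process IDs separately from raw message lines is a deliberate choice here: the same process often prints the heartbeat message several times in a row, so the line count alone overstates the number of interruptions.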
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
23 Sep 2014 01:49:50 | 1143523 | 16969714 | hadam3p_anz_rm8e_2012_1_008955393_1 | 34,859 | 233,514 | 6.6988
17 Sep 2014 14:56:19 | 1143523 | 16969714 | hadam3p_anz_rm8e_2012_1_008955393_1 | 23,339 | 156,274 | 6.6958
08 Sep 2014 01:28:14 | 1143523 | 16969714 | hadam3p_anz_rm8e_2012_1_008955393_1 | 11,819 | 78,962 | 6.6809
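The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that reading against the three trickle rows above; the function name `avg_sec_per_timestep` is hypothetical, and the relationship itself is an inference from the numbers rather than something the page states.

```python
# (timestep, cpu_time_sec, reported_average) copied from the trickle rows above.
trickles = [
    (34859, 233514, 6.6988),
    (23339, 156274, 6.6958),
    (11819, 78962, 6.6809),
]


def avg_sec_per_timestep(timestep, cpu_sec):
    """Cumulative CPU seconds divided by the number of timesteps completed."""
    return cpu_sec / timestep


if __name__ == "__main__":
    for timestep, cpu_sec, reported in trickles:
        computed = round(avg_sec_per_timestep(timestep, cpu_sec), 4)
        print(timestep, cpu_sec, computed, computed == reported)
```

All three computed values agree with the reported column to four decimal places, which supports the interpretation, at least for this task.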