climateprediction.net home page
Task 16451183

Task 16451183

Name hadam3p_anz_aal0_2012_1_008620052_0
Workunit 8766564
Created 2 Apr 2014, 16:21:57 UTC
Sent 22 Apr 2014, 10:21:50 UTC
Report deadline 4 Apr 2015, 15:41:50 UTC
Received 14 Dec 2014, 20:22:30 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1316929
Run time 7 days 5 hours 48 min 39 sec
CPU time 5 days 18 hours 15 min 18 sec
Validate state Invalid
Credit 2,993.82
Device peak FLOPS 2.48 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Australia New Zealand v6.10
windows_intelx86
Stderr
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8512, selfPID=6908, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1920, selfPID=4996, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6036, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4876, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4928, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6112, iMonCtr=2
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4592, selfPID=4376, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2592, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4544, selfPID=1516, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3684, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5072, selfPID=2100, iMonCtr=1
Model crash detected, will try to restart...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2648, selfPID=4704, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4776, selfPID=4332, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4736, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2124, iMonCtr=2
CGlobal Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5984, iMonCtr=2
ontroller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4764, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
19:12:56 (2964): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:12:57 (2964): No heartbeat from core client for 30 sec - exiting
19:12:58 (2964): No heartbeat from core client for 30 sec - exiting
19:12:59 (2964): No heartbeat from core client for 30 sec - exiting
19:13:00 (2964): No heartbeat from core client for 30 sec - exiting
19:13:01 (2964): No heartbeat from core client for 30 sec - exiting
19:13:03 (2964): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4500, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4580, selfPID=328, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4876, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5596, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5812, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5808, iMonCtr=2
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5064, iMonCtr=2
Model crash detected, will try to restart...
CGCPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CGGCPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4932, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not rController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5528, selfPID=4760, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=1
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=0, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4604, selfPID=3416, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_anz_aal0_2012_1_008620052_0_7.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_aal0_2012_1_008620052_0_8.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_aal0_2012_1_008620052_0_9.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_aal0_2012_1_008620052_0_10.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_aal0_2012_1_008620052_0_11.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_aal0_2012_1_008620052_0_12.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
15 Aug 2014 17:28:10 1316929 16451183 hadam3p_anz_aal0_2012_1_008620052_0 69,419 455,783 6.5657
28 Jun 2014 20:53:45 1316929 16451183 hadam3p_anz_aal0_2012_1_008620052_0 57,899 381,680 6.5922
21 May 2014 21:30:02 1316929 16451183 hadam3p_anz_aal0_2012_1_008620052_0 46,379 307,316 6.6262
08 May 2014 21:23:10 1316929 16451183 hadam3p_anz_aal0_2012_1_008620052_0 34,859 232,523 6.6704
29 Apr 2014 20:54:08 1316929 16451183 hadam3p_anz_aal0_2012_1_008620052_0 23,339 155,849 6.6776
25 Apr 2014 13:53:20 1316929 16451183 hadam3p_anz_aal0_2012_1_008620052_0 11,819 79,230 6.7036


©2024 climateprediction.net