climateprediction.net home page
Task 15737445

Task 15737445

Name hadam3p_pnw_q7xg_2043_1_008353677_0
Workunit 8504536
Created 19 Apr 2013, 16:28:33 UTC
Sent 19 Apr 2013, 16:28:51 UTC
Report deadline 1 Apr 2014, 21:48:51 UTC
Received 3 Jul 2013, 20:55:40 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1097161
Run time 4 days 20 hours 35 min 42 sec
CPU time 4 days 12 hours 14 min 47 sec
Validate state Invalid
Credit 2,004.63
Device peak FLOPS 1.65 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Pacific North West v6.09
windows_intelx86
Stderr
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3496, selfPID=3056, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5280, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4368, selfPID=6052, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CSuspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3976, selfPID=2416, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3892, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4112, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3444, selfPID=5520, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2688, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7016, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4088, selfPID=5160, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4484, selfPID=5604, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5376, selfPID=5256, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6016, selfPID=3952, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1340, selfPID=4420, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5868, selfPID=5388, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3116, selfPID=5304, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 5
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4500, selfPID=5104, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5980, selfPID=5036, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
10:25:22 (5180): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2664, selfPID=5268, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4308, selfPID=2500, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3276, selfPID=4244, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3156, selfPID=3340, iMonCtr=1
Model crash detected, will try to restart...
CSignal 11 received, exiting...
Called boinc_finish
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5612, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5072, selfPID=5260, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 8
Called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
  <file_name>hadam3p_pnw_q7xg_2043_1_008353677_0_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_q7xg_2043_1_008353677_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_q7xg_2043_1_008353677_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_q7xg_2043_1_008353677_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
04 Jul 2013 14:00:29 1097161 15737445 hadam3p_pnw_q7xg_2043_1_008353677_0 92,257 388,729 4.2135
03 Jul 2013 18:45:15 1097161 15737445 hadam3p_pnw_q7xg_2043_1_008353677_0 92,256 388,162 4.2074
02 Jul 2013 11:36:44 1097161 15737445 hadam3p_pnw_q7xg_2043_1_008353677_0 80,736 340,197 4.2137
02 Jul 2013 10:20:15 1097161 15737445 hadam3p_pnw_q7xg_2043_1_008353677_0 69,216 291,977 4.2183
23 Jun 2013 19:05:21 1097161 15737445 hadam3p_pnw_q7xg_2043_1_008353677_0 57,696 242,631 4.2053
21 Jun 2013 19:38:25 1097161 15737445 hadam3p_pnw_q7xg_2043_1_008353677_0 46,176 194,697 4.2164
07 Jun 2013 21:47:42 1097161 15737445 hadam3p_pnw_q7xg_2043_1_008353677_0 34,656 146,202 4.2187
14 May 2013 18:47:14 1097161 15737445 hadam3p_pnw_q7xg_2043_1_008353677_0 23,136 98,128 4.2414
02 May 2013 17:28:31 1097161 15737445 hadam3p_pnw_q7xg_2043_1_008353677_0 11,616 50,011 4.3054


©2024 cpdn.org