climateprediction.net home page
Task 12120458

Name hadam3p_eu_wkgj_1978_1_006850667_0
Workunit 7053983
Created 18 Nov 2010, 18:25:05 UTC
Sent 19 Mar 2011, 3:36:07 UTC
Report deadline 29 Feb 2012, 8:56:07 UTC
Received 7 Apr 2011, 0:05:37 UTC
Server state Over
Outcome Didn't need
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1044542
Run time 9 days 5 hours 13 min 52 sec
CPU time 5 days 23 hours 57 min 40 sec
Validate state Invalid
Credit 1,591.48
Device peak FLOPS 1.99 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Europe v6.08 (windows_intelx86)
Stderr
<core_client_version>6.10.18</core_client_version>
<![CDATA[
<stderr_txt>
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1564, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5360, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4900, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1360, iMonCtr=2
Leaving CPDN_Main::Monitor...
07:47:56 (4804): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4332, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2248, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2396, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5928, selfPID=2944, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5132, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=156, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2080, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5372, selfPID=5088, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4888, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=172, selfPID=2012, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5892, selfPID=172, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1780, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4880, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5240, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3400, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7316, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5260, iMonCtr=2
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5524, selfPID=2732, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4088, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5448, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4488, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3408, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5420, selfPID=672, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4592, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7600, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5464, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6064, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5824, selfPID=5328, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
07:51:39 (5916): No heartbeat from core client for 30 sec - exiting
07:51:43 (5916): No heartbeat from core client for 30 sec - exiting
07:51:44 (5916): No heartbeat from core client for 30 sec - exiting
07:51:45 (5916): No heartbeat from core client for 30 sec - exiting
07:51:46 (5916): No heartbeat from core client for 30 sec - exiting
07:51:47 (5916): No heartbeat from core client for 30 sec - exiting
07:51:48 (5916): No heartbeat from core client for 30 sec - exiting
07:51:49 (5916): No heartbeat from core client for 30 sec - exiting
07:51:50 (5916): No heartbeat from core client for 30 sec - exiting
07:51:51 (5916): No heartbeat from core client for 30 sec - exiting
07:51:52 (5916): No heartbeat from core client for 30 sec - exiting
07:51:53 (5916): No heartbeat from core client for 30 sec - exiting
07:51:54 (5916): No heartbeat from core client for 30 sec - exiting
07:51:55 (5916): No heartbeat from core client for 30 sec - exiting
07:51:56 (5916): No heartbeat from core client for 30 sec - exiting
07:51:57 (5916): No heartbeat from core client for 30 sec - exiting
07:51:58 (5916): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5644, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4188, selfPID=5708, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6072, selfPID=3520, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1300, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4484, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5884, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4280, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4288, selfPID=2112, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4324, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4568, selfPID=4524, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4800, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4624, selfPID=3336, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2664, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2120, selfPID=2328, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5748, selfPID=5748, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5956, selfPID=1100, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3520, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4832, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2188, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6368, selfPID=6368, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9696, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4944, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5532, selfPID=5144, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5016, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3416, selfPID=3580, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
16:01:58 (3300): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5952, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4664, selfPID=868, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
03:09:41 (868): called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
  <file_name>hadam3p_eu_wkgj_1978_1_006850667_0_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_wkgj_1978_1_006850667_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_wkgj_1978_1_006850667_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_wkgj_1978_1_006850667_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
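The stderr output above is long and repetitive, so a tally of message types can make it easier to read. The following is a minimal sketch, assuming the log has been saved as plain text. The category labels (`process_not_running`, `restart_attempt`, and so on) are illustrative names chosen for this example, not BOINC or CPDN terminology, and the sample lines are copied from the log above.

```python
from collections import Counter

# Substring markers copied from the stderr text; the labels on the right
# are hypothetical names for this sketch, not official BOINC/CPDN terms.
MARKERS = [
    ("CPDN process is not running", "process_not_running"),
    ("Model crash detected", "restart_attempt"),
    ("No heartbeat from core client", "client_heartbeat_lost"),
    ("Quit request from BOINC", "quit_request"),
    ("called boinc_finish", "finished"),
]


def classify(line):
    """Return the label of the first marker found in the line, else 'other'."""
    for marker, label in MARKERS:
        if marker in line:
            return label
    return "other"


def summarize(stderr_text):
    """Count non-empty stderr lines per category."""
    return Counter(
        classify(line) for line in stderr_text.splitlines() if line.strip()
    )


# A few lines taken verbatim from the stderr output of this task.
sample = """\
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1564, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5360, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
07:47:56 (4804): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
03:09:41 (868): called boinc_finish
"""

if __name__ == "__main__":
    for label, count in summarize(sample).most_common():
        print(f"{label}: {count}")
```

Running `summarize` over the full stderr text instead of `sample` would give the per-category totals for the whole task.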
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                         Timestep  CPU Time (sec)  Average (sec/TS)
04 Apr 2011 08:07:50  1044542  12120458   hadam3p_eu_wkgj_1978_1_006850667_0  92,256    473,043         5.1275
02 Apr 2011 15:24:31  1044542  12120458   hadam3p_eu_wkgj_1978_1_006850667_0  80,736    415,928         5.1517
31 Mar 2011 07:29:38  1044542  12120458   hadam3p_eu_wkgj_1978_1_006850667_0  69,216    355,151         5.1311
29 Mar 2011 10:29:32  1044542  12120458   hadam3p_eu_wkgj_1978_1_006850667_0  57,696    294,976         5.1126
27 Mar 2011 23:24:18  1044542  12120458   hadam3p_eu_wkgj_1978_1_006850667_0  46,177    235,635         5.1029
27 Mar 2011 17:26:17  1044542  12120458   hadam3p_eu_wkgj_1978_1_006850667_0  46,176    234,858         5.0861
25 Mar 2011 18:03:26  1044542  12120458   hadam3p_eu_wkgj_1978_1_006850667_0  34,656    177,457         5.1205
23 Mar 2011 15:51:34  1044542  12120458   hadam3p_eu_wkgj_1978_1_006850667_0  23,136    119,463         5.1635
21 Mar 2011 15:51:03  1044542  12120458   hadam3p_eu_wkgj_1978_1_006850667_0  11,616    60,402          5.1999
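The "Average (sec/TS)" column in the trickle table appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that reading against the rows above. The function name `seconds_per_timestep` and the tuple layout are assumptions made for this example, and the figures are copied from the table.

```python
# (timestep, cpu_time_sec, reported_average) copied from the trickle table.
trickles = [
    (92256, 473043, 5.1275),
    (80736, 415928, 5.1517),
    (69216, 355151, 5.1311),
    (57696, 294976, 5.1126),
    (46177, 235635, 5.1029),
    (46176, 234858, 5.0861),
    (34656, 177457, 5.1205),
    (23136, 119463, 5.1635),
    (11616, 60402, 5.1999),
]


def seconds_per_timestep(cpu_time_sec, timestep):
    """Cumulative CPU seconds divided by the number of completed timesteps."""
    return cpu_time_sec / timestep


if __name__ == "__main__":
    for timestep, cpu_time_sec, reported in trickles:
        computed = seconds_per_timestep(cpu_time_sec, timestep)
        # Each computed value should agree with the table to four decimals.
        print(f"{timestep:>6}  computed={computed:.4f}  reported={reported}")
```

Under this assumption every row agrees with the reported average to within rounding at the fourth decimal place.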

