Task 15324458

Name hadam3p_eu_2rio_1993_1_008208687_0
Workunit 8363811
Created 3 Oct 2012, 17:09:39 UTC
Sent 3 Oct 2012, 17:09:55 UTC
Report deadline 15 Sep 2013, 22:29:55 UTC
Received 7 Nov 2012, 21:40:30 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1092573
Run time 3 days 10 hours 43 min 24 sec
CPU time 2 days 15 hours 56 min 39 sec
Validate state Invalid
Credit 1,591.55
Device peak FLOPS 2.81 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Europe v6.09 (windows_intelx86)
Stderr
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5484, selfPID=5016, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3444, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5296, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
21:06:09 (4440): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:06:10 (4440): No heartbeat from core client for 30 sec - exiting
21:06:11 (4440): No heartbeat from core client for 30 sec - exiting
21:06:12 (4440): No heartbeat from core client for 30 sec - exiting
21:06:13 (4440): No heartbeat from core client for 30 sec - exiting
21:01:08 (2172): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1924, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5124, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4100, selfPID=4696, iMonCtr=1
Model crash detected, will try to restart...
07:30:57 (5212): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:01:04 (4368): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:43:45 (5008): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1584, selfPID=4644, iMonCtr=1
Model crash detected, will try to restart...
21:01:31 (4880): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:06:34 (7172): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1640, selfPID=1640, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7160, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
07:44:31 (5188): No heartbeat from core client for 30 sec - exiting
07:44:32 (5188): No heartbeat from core client for 30 sec - exiting
07:44:33 (5188): No heartbeat from core client for 30 sec - exiting
07:44:34 (5188): No heartbeat from core client for 30 sec - exiting
07:44:35 (5188): No heartbeat from core client for 30 sec - exiting
07:44:37 (5188): No heartbeat from core client for 30 sec - exiting
07:44:38 (5188): No heartbeat from core client for 30 sec - exiting
07:44:39 (5188): No heartbeat from core client for 30 sec - exiting
07:44:40 (5188): No heartbeat from core client for 30 sec - exiting
07:44:41 (5188): No heartbeat from core client for 30 sec - exiting
07:44:42 (5188): No heartbeat from core client for 30 sec - exiting
07:44:43 (5188): No heartbeat from core client for 30 sec - exiting
07:44:44 (5188): No heartbeat from core client for 30 sec - exiting
07:44:45 (5188): No heartbeat from core client for 30 sec - exiting
07:44:46 (5188): No heartbeat from core client for 30 sec - exiting
07:44:47 (5188): No heartbeat from core client for 30 sec - exiting
07:44:49 (5188): No heartbeat from core client for 30 sec - exiting
07:44:50 (5188): No heartbeat from core client for 30 sec - exiting
07:44:51 (5188): No heartbeat from core client for 30 sec - exiting
07:44:52 (5188): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2272, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5100, selfPID=3308, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5004, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5968, selfPID=928, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1316, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3304, selfPID=5328, iMonCtr=1
Model crash detected, will try to restart...
10:32:34 (4352): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1092, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4400, selfPID=2832, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4604, iMonCtr=2
Model crash detected, will try to restart...
21:01:01 (4628): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:37:22 (4948): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6068, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2440, iMonCtr=2
Model crash detected, will try to restart...

</stderr_txt>
<message>
<file_xfer_error>
  <file_name>hadam3p_eu_2rio_1993_1_008208687_0_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_2rio_1993_1_008208687_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_2rio_1993_1_008208687_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_2rio_1993_1_008208687_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
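The stderr text above repeats a small number of message types, so a tally is easier to read than the raw dump. The following is a minimal sketch in Python, assuming the log is available as a plain string. The category labels (`no_heartbeat`, `process_not_running`, `restart_attempt`) and the function name `summarize_stderr` are illustrative names, not part of BOINC or the CPDN application, and the patterns are copied only from lines that appear in this task's output.

```python
import re
from collections import Counter

# Patterns taken from message shapes seen in this task's stderr.
# The dictionary keys are hypothetical labels chosen for this sketch.
PATTERNS = {
    "no_heartbeat": re.compile(
        r"^\d{2}:\d{2}:\d{2} \(\d+\): No heartbeat from core client"
    ),
    "process_not_running": re.compile(
        r"^(Controller|Global Worker|Regional Worker):: CPDN process is not running"
    ),
    "restart_attempt": re.compile(r"^Model crash detected, will try to restart"),
}


def summarize_stderr(text: str) -> Counter:
    """Count how many lines match each known message pattern."""
    counts = Counter()
    for line in text.splitlines():
        line = line.strip()
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
                break  # each line is counted under at most one label
    return counts


# A short excerpt copied from the stderr section of this page.
sample = """\
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5484, selfPID=5016, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
21:06:09 (4440): No heartbeat from core client for 30 sec - exiting
"""

if __name__ == "__main__":
    for label, n in summarize_stderr(sample).items():
        print(f"{label}: {n}")
```

Lines that match none of the patterns (for example the suspend request) are ignored rather than reported, which keeps the sketch short; a fuller version might collect them under an "other" label.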
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                         Timestep  CPU Time (sec)  Average (sec/TS)
06 Nov 2012 12:01:04  1092573  15324458   hadam3p_eu_2rio_1993_1_008208687_0  92,260    215,300         2.3336
06 Nov 2012 07:05:27  1092573  15324458   hadam3p_eu_2rio_1993_1_008208687_0  92,256    214,942         2.3298
03 Nov 2012 21:18:24  1092573  15324458   hadam3p_eu_2rio_1993_1_008208687_0  80,736    188,169         2.3307
03 Nov 2012 12:40:19  1092573  15324458   hadam3p_eu_2rio_1993_1_008208687_0  69,216    161,680         2.3359
01 Nov 2012 09:56:44  1092573  15324458   hadam3p_eu_2rio_1993_1_008208687_0  57,696    134,419         2.3298
29 Oct 2012 20:28:34  1092573  15324458   hadam3p_eu_2rio_1993_1_008208687_0  46,176    107,572         2.3296
27 Oct 2012 22:32:52  1092573  15324458   hadam3p_eu_2rio_1993_1_008208687_0  34,656    80,774          2.3307
26 Oct 2012 18:44:41  1092573  15324458   hadam3p_eu_2rio_1993_1_008208687_0  23,136    54,115          2.3390
25 Oct 2012 19:26:35  1092573  15324458   hadam3p_eu_2rio_1993_1_008208687_0  11,619    27,771          2.3901
25 Oct 2012 09:39:55  1092573  15324458   hadam3p_eu_2rio_1993_1_008208687_0  11,617    27,413          2.3597
25 Oct 2012 06:54:38  1092573  15324458   hadam3p_eu_2rio_1993_1_008208687_0  11,616    27,071          2.3305
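The "Average (sec/TS)" column in the trickle table appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The short Python sketch below checks that reading against three rows copied from the table. The function name `seconds_per_timestep` is a hypothetical helper for this illustration, and the formula is inferred from the figures on this page rather than taken from CPDN documentation.

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Average CPU seconds per model timestep, rounded as in the table."""
    return round(cpu_time_sec / timestep, 4)


# (timestep, CPU time in seconds, reported average) copied from the table.
rows = [
    (92_260, 215_300, 2.3336),
    (11_619, 27_771, 2.3901),
    (11_616, 27_071, 2.3305),
]

if __name__ == "__main__":
    for timestep, cpu_time, reported in rows:
        computed = seconds_per_timestep(cpu_time, timestep)
        print(f"timestep {timestep}: computed {computed:.4f}, reported {reported:.4f}")
```

If the inferred formula is right, the computed and reported values agree for every row, which also suggests the CPU time column is cumulative since the start of the task.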

