Task 13197863

Name hadam3p_eu_2ljv_1981_1_007384444_2
Workunit 7581874
Created 3 Aug 2011, 22:53:16 UTC
Sent 3 Aug 2011, 22:58:20 UTC
Report deadline 16 Jul 2012, 4:18:20 UTC
Received 29 Dec 2011, 0:20:37 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1143523
Run time 3 days 0 hours 32 min 26 sec
CPU time 2 days 19 hours 53 min 50 sec
Validate state Invalid
Credit 1,392.75
Device peak FLOPS 2.56 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Europe v6.09 (windows_intelx86)
Stderr
<core_client_version>6.10.60</core_client_version>
<![CDATA[
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1524, selfPID=1524, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
20:52:21 (5440): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5188, iMonCtr=2
Model crash detected, will try to restart...
17:38:54 (4676): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4640, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2780, selfPID=1532, iMonCtr=1
Model crash detected, will try to restart...
23:14:43 (3080): No heartbeat from core client for 30 sec - exiting
23:14:44 (3080): No heartbeat from core client for 30 sec - exiting
20:59:27 (3104): No heartbeat from core client for 30 sec - exiting
20:59:28 (3104): No heartbeat from core client for 30 sec - exiting
20:59:30 (3104): No heartbeat from core client for 30 sec - exiting
20:59:31 (3104): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4804, selfPID=4804, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
11:43:56 (2956): No heartbeat from core client for 30 sec - exiting
11:43:57 (2956): No heartbeat from core client for 30 sec - exiting
11:43:58 (2956): No heartbeat from core client for 30 sec - exiting
11:43:59 (2956): No heartbeat from core client for 30 sec - exiting
11:44:00 (2956): No heartbeat from core client for 30 sec - exiting
11:44:01 (2956): No heartbeat from core client for 30 sec - exiting
11:44:03 (2956): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
11:44:04 (2956): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
RegionalGlobal Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3712, iMonCtr=2
13:53:26 (2796): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2424, selfPID=2424, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3868, selfPID=3632, iMonCtr=1
Model crash detected, will try to restart...
18:42:32 (2748): No heartbeat from core client for 30 sec - exiting
18:42:33 (2748): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:29:18 (2708): No heartbeat from core client for 30 sec - exiting
21:29:20 (2708): No heartbeat from core client for 30 sec - exiting
21:29:21 (2708): No heartbeat from core client for 30 sec - exiting
21:29:22 (2708): No heartbeat from core client for 30 sec - exiting
21:29:23 (2708): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6096, selfPID=4048, iMonCtr=1
Model crash detected, will try to restart...
13:55:07 (2576): No heartbeat from core client for 30 sec - exiting
13:55:08 (2576): No heartbeat from core client for 30 sec - exiting
13:55:09 (2576): No heartbeat from core client for 30 sec - exiting
13:55:10 (2576): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1784, selfPID=3700, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
20:23:52 (2704): No heartbeat from core client for 30 sec - exiting
20:23:53 (2704): No heartbeat from core client for 30 sec - exiting
20:23:54 (2704): No heartbeat from core client for 30 sec - exiting
20:23:56 (2704): No heartbeat from core client for 30 sec - exiting
20:23:57 (2704): No heartbeat from core client for 30 sec - exiting
20:23:58 (2704): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2508, selfPID=2740, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2400, selfPID=2400, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2256, selfPID=5776, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2388, selfPID=2676, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2256, selfPID=2256, iMonCtr=2
18:36:35 (2704): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:41:50 (2720): No heartbeat from core client for 30 sec - exiting
19:41:51 (2720): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
21:43:20 (2732): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...

Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO                                                                                                                                                                                           tmp/xaakm.pipe_dummy                                                            2048    
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
  <file_name>hadam3p_eu_2ljv_1981_1_007384444_2_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_2ljv_1981_1_007384444_2_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_2ljv_1981_1_007384444_2_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_2ljv_1981_1_007384444_2_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_2ljv_1981_1_007384444_2_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                         Timestep  CPU Time (sec)  Average (sec/TS)
27 Dec 2011 23:12:27  1143523  13197863   hadam3p_eu_2ljv_1981_1_007384444_2  80,736    232,684         2.8820
25 Dec 2011 23:08:13  1143523  13197863   hadam3p_eu_2ljv_1981_1_007384444_2  69,216    199,248         2.8786
24 Dec 2011 15:31:13  1143523  13197863   hadam3p_eu_2ljv_1981_1_007384444_2  57,696    166,248         2.8814
16 Dec 2011 23:31:25  1143523  13197863   hadam3p_eu_2ljv_1981_1_007384444_2  46,176    132,460         2.8686
08 Sep 2011 01:28:14  1143523  13197863   hadam3p_eu_2ljv_1981_1_007384444_2  34,656    98,262          2.8354
06 Sep 2011 00:47:18  1143523  13197863   hadam3p_eu_2ljv_1981_1_007384444_2  23,136    65,687          2.8392
20 Aug 2011 23:33:02  1143523  13197863   hadam3p_eu_2ljv_1981_1_007384444_2  11,616    32,544          2.8017
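
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. The short Python sketch below checks that reading against the trickle rows listed above. The names `TRICKLES` and `sec_per_timestep` are illustrative only and are not part of any climateprediction.net or BOINC tooling, and the formula is an assumption inferred from the figures rather than documented behaviour.

```python
# Trickle rows copied from the table above:
# (timestep, cumulative CPU time in seconds, reported average in sec/TS)
TRICKLES = [
    (80736, 232684, 2.8820),
    (69216, 199248, 2.8786),
    (57696, 166248, 2.8814),
    (46176, 132460, 2.8686),
    (34656, 98262, 2.8354),
    (23136, 65687, 2.8392),
    (11616, 32544, 2.8017),
]


def sec_per_timestep(cpu_time_sec, timestep):
    """Assumed formula: average CPU seconds spent per model timestep."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in TRICKLES:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    # Print the recomputed value next to the one shown on the page.
    print(f"{timestep:>6}  computed={computed:.4f}  reported={reported:.4f}")
```

Under this assumption, each recomputed value agrees with the reported one to four decimal places.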