Task 15295914

Name hadam3p_eu_67lg_2009_1_007471857_2
Workunit 7669360
Created 21 Sep 2012, 19:20:36 UTC
Sent 21 Sep 2012, 19:20:43 UTC
Report deadline 4 Sep 2013, 0:40:43 UTC
Received 6 Oct 2012, 0:50:09 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1185147
Run time 3 days 15 hours 21 min 52 sec
CPU time 3 days 11 hours 30 min 7 sec
Validate state Invalid
Credit 1,790.21
Device peak FLOPS 2.32 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Europe v6.09 (windows_intelx86)
Stderr
<core_client_version>7.0.25</core_client_version>
<![CDATA[
<stderr_txt>
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1636, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5908, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=400, selfPID=676, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5132, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5112, selfPID=3408, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4072, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5324, selfPID=5060, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
09:07:19 (2628): No heartbeat from core client for 30 sec - exiting
09:07:20 (2628): No heartbeat from core client for 30 sec - exiting
09:07:21 (2628): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=904, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3612, selfPID=1840, iMonCtr=1
Model crash detected, will try to restart...
19:32:11 (3020): No heartbeat from core client for 30 sec - exiting
19:32:12 (3020): No heartbeat from core client for 30 sec - exiting
19:32:13 (3020): No heartbeat from core client for 30 sec - exiting
19:32:14 (3020): No heartbeat from core client for 30 sec - exiting
19:32:15 (3020): No heartbeat from core client for 30 sec - exiting
19:32:16 (3020): No heartbeat from core client for 30 sec - exiting
19:32:18 (3020): No heartbeat from core client for 30 sec - exiting
19:32:19 (3020): No heartbeat from core client for 30 sec - exiting
19:32:20 (3020): No heartbeat from core client for 30 sec - exiting
19:32:21 (3020): No heartbeat from core client for 30 sec - exiting
19:32:22 (3020): No heartbeat from core client for 30 sec - exiting
19:32:23 (3020): No heartbeat from core client for 30 sec - exiting
19:32:24 (3020): No heartbeat from core client for 30 sec - exiting
19:32:25 (3020): No heartbeat from core client for 30 sec - exiting
19:32:26 (3020): No heartbeat from core client for 30 sec - exiting
19:32:27 (3020): No heartbeat from core client for 30 sec - exiting
19:32:28 (3020): No heartbeat from core client for 30 sec - exiting
19:32:30 (3020): No heartbeat from core client for 30 sec - exiting
19:32:31 (3020): No heartbeat from core client for 30 sec - exiting
19:32:32 (3020): No heartbeat from core client for 30 sec - exiting
19:32:33 (3020): No heartbeat from core client for 30 sec - exiting
19:32:34 (3020): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
07:06:05 (3148): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5400, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5480, iMonCtr=2
11:44:36 (4148): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5796, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5592, selfPID=2540, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1120, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3192, selfPID=5204, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5388, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3404, selfPID=3404, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5672, selfPID=2692, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5348, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3080, selfPID=4200, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_eu_67lg_2009_1_007471857/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_eu_67lg_2009_1_007471857/dataout/region_restart.day after 11 attempts
18:34:51 (2576): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_eu_67lg_2009_1_007471857/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_eu_67lg_2009_1_007471857/dataout/region_restart.day after 11 attempts

Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO                                                                                                                                                                                           tmp/xaakm.pipe_dummy                                                            2048    

Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO                                                                                                                                                                                           tmp/xaakg.pipe_dummy                                                            2048    
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_eu_67lg_2009_1_007471857_2_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_67lg_2009_1_007471857_2_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_67lg_2009_1_007471857_2_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
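The stderr output above repeats a small set of message types many times, so a tally can be easier to read than the raw log. The following is a minimal sketch, assuming the log text has been saved as a plain string; the `PATTERNS` labels and the `tally` helper are hypothetical names introduced here for illustration and are not part of BOINC or the CPDN monitor. The sample input is a short excerpt of lines copied from the log, not the full output.

```python
from collections import Counter

# Hypothetical labels mapped to substrings that appear in the stderr lines above.
PATTERNS = {
    "suspend": "Suspend request from BOINC",
    "quit": "Quit request from BOINC",
    "no_heartbeat": "No heartbeat from core client",
    "crash_restart": "Model crash detected",
    "restart_file_missing": "cannot open input file",
    "model_crashed": "Model crashed:",
}


def tally(stderr_text):
    """Count how many lines match each message type (first match wins)."""
    counts = Counter()
    for line in stderr_text.splitlines():
        for label, needle in PATTERNS.items():
            if needle in line:
                counts[label] += 1
                break
    return counts


# A short excerpt of lines copied from the log above (not the full output).
sample = """\
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Model crash detected, will try to restart...
09:07:19 (2628): No heartbeat from core client for 30 sec - exiting
Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO
"""

if __name__ == "__main__":
    for label, n in tally(sample).most_common():
        print(f"{label}: {n}")
```

Running `tally` over the full stderr text instead of the excerpt would give the per-type counts for this task.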
Latest Trickles Received
Time Sent (UTC)        Host ID   Result ID  Result Name                         Timestep  CPU Time (sec)  Average (sec/TS)
05 Oct 2012 19:01:39   1185147   15295914   hadam3p_eu_67lg_2009_1_007471857_2  103,776   288,921         2.7841
04 Oct 2012 12:28:59   1185147   15295914   hadam3p_eu_67lg_2009_1_007471857_2  92,256    256,004         2.7749
04 Oct 2012 02:26:44   1185147   15295914   hadam3p_eu_67lg_2009_1_007471857_2  80,736    224,494         2.7806
02 Oct 2012 16:09:07   1185147   15295914   hadam3p_eu_67lg_2009_1_007471857_2  69,216    193,769         2.7995
27 Sep 2012 20:22:38   1185147   15295914   hadam3p_eu_67lg_2009_1_007471857_2  57,696    161,498         2.7991
25 Sep 2012 19:16:55   1185147   15295914   hadam3p_eu_67lg_2009_1_007471857_2  46,176    130,333         2.8225
24 Sep 2012 21:55:19   1185147   15295914   hadam3p_eu_67lg_2009_1_007471857_2  34,656    98,770          2.8500
22 Sep 2012 16:19:36   1185147   15295914   hadam3p_eu_67lg_2009_1_007471857_2  23,136    66,969          2.8946
22 Sep 2012 07:02:37   1185147   15295914   hadam3p_eu_67lg_2009_1_007471857_2  11,616    33,607          2.8932
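
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below checks that reading against the rows above; it is an inference from the numbers, not a documented definition, and the variable and function names are hypothetical.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
trickles = [
    (103776, 288921, 2.7841),
    (92256, 256004, 2.7749),
    (80736, 224494, 2.7806),
    (69216, 193769, 2.7995),
    (57696, 161498, 2.7991),
    (46176, 130333, 2.8225),
    (34656, 98770, 2.8500),
    (23136, 66969, 2.8946),
    (11616, 33607, 2.8932),
]


def seconds_per_timestep(timestep, cpu_time_sec):
    """Assumed definition: cumulative CPU seconds per cumulative timestep."""
    return cpu_time_sec / timestep


def max_deviation(rows):
    """Largest absolute gap between the computed and the reported average."""
    return max(abs(seconds_per_timestep(ts, cpu) - avg) for ts, cpu, avg in rows)


if __name__ == "__main__":
    for ts, cpu, avg in trickles:
        print(f"{ts:>7}  computed={seconds_per_timestep(ts, cpu):.4f}  reported={avg:.4f}")
    print(f"max deviation: {max_deviation(trickles):.6f}")
```

For these nine rows the computed ratio agrees with the reported value to within rounding at four decimal places, which supports the assumed definition.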

