Task 16344601

Name hadam3p_eu_k2j7_2013_1_008554525_1
Workunit 8702037
Created 5 Mar 2014, 19:24:41 UTC
Sent 5 Mar 2014, 19:27:16 UTC
Report deadline 16 Feb 2015, 0:47:16 UTC
Received 5 Apr 2014, 15:50:11 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1312373
Run time 3 days 5 hours 8 min 52 sec
CPU time 23 hours 15 min 50 sec
Validate state Invalid
Credit 1,194.02
Device peak FLOPS 2.72 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Europe v6.09 (windows_intelx86)
Stderr
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4228, selfPID=4876, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3084, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4452, selfPID=4452, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4780, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2752, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1656, selfPID=4936, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
23:22:22 (3496): No heartbeat from core client for 30 sec - exiting
23:22:23 (3496): No heartbeat from core client for 30 sec - exiting
23:22:24 (3496): No heartbeat from core client for 30 sec - exiting
23:22:25 (3496): No heartbeat from core client for 30 sec - exiting
23:22:26 (3496): No heartbeat from core client for 30 sec - exiting
23:22:28 (3496): No heartbeat from core client for 30 sec - exiting
23:22:29 (3496): No heartbeat from core client for 30 sec - exiting
23:22:30 (3496): No heartbeat from core client for 30 sec - exiting
23:22:31 (3496): No heartbeat from core client for 30 sec - exiting
23:22:32 (3496): No heartbeat from core client for 30 sec - exiting
23:22:33 (3496): No heartbeat from core client for 30 sec - exiting
23:22:34 (3496): No heartbeat from core client for 30 sec - exiting
23:22:35 (3496): No heartbeat from core client for 30 sec - exiting
23:22:36 (3496): No heartbeat from core client for 30 sec - exiting
23:22:37 (3496): No heartbeat from core client for 30 sec - exiting
23:22:39 (3496): No heartbeat from core client for 30 sec - exiting
23:22:40 (3496): No heartbeat from core client for 30 sec - exiting
23:22:41 (3496): No heartbeat from core client for 30 sec - exiting
23:22:42 (3496): No heartbeat from core client for 30 sec - exiting
23:22:43 (3496): No heartbeat from core client for 30 sec - exiting
23:22:44 (3496): No heartbeat from core client for 30 sec - exiting
23:22:45 (3496): No heartbeat from core client for 30 sec - exiting
23:22:46 (3496): No heartbeat from core client for 30 sec - exiting
23:22:47 (3496): No heartbeat from core client for 30 sec - exiting
23:22:48 (3496): No heartbeat from core client for 30 sec - exiting
23:22:49 (3496): No heartbeat from core client for 30 sec - exiting
23:22:51 (3496): No heartbeat from core client for 30 sec - exiting
23:22:52 (3496): No heartbeat from core client for 30 sec - exiting
23:22:53 (3496): No heartbeat from core client for 30 sec - exiting
23:22:54 (3496): No heartbeat from core client for 30 sec - exiting
23:22:55 (3496): No heartbeat from core client for 30 sec - exiting
23:22:56 (3496): No heartbeat from core client for 30 sec - exiting
23:22:57 (3496): No heartbeat from core client for 30 sec - exiting
23:22:58 (3496): No heartbeat from core client for 30 sec - exiting
23:22:59 (3496): No heartbeat from core client for 30 sec - exiting
23:23:00 (3496): No heartbeat from core client for 30 sec - exiting
23:23:01 (3496): No heartbeat from core client for 30 sec - exiting
23:23:03 (3496): No heartbeat from core client for 30 sec - exiting
23:23:04 (3496): No heartbeat from core client for 30 sec - exiting
23:23:05 (3496): No heartbeat from core client for 30 sec - exiting
23:23:06 (3496): No heartbeat from core client for 30 sec - exiting
23:23:07 (3496): No heartbeat from core client for 30 sec - exiting
23:23:08 (3496): No heartbeat from core client for 30 sec - exiting
23:23:09 (3496): No heartbeat from core client for 30 sec - exiting
23:23:10 (3496): No heartbeat from core client for 30 sec - exiting
23:23:11 (3496): No heartbeat from core client for 30 sec - exiting
23:23:12 (3496): No heartbeat from core client for 30 sec - exiting
23:23:13 (3496): No heartbeat from core client for 30 sec - exiting
23:23:15 (3496): No heartbeat from core client for 30 sec - exiting
23:23:16 (3496): No heartbeat from core client for 30 sec - exiting
23:23:17 (3496): No heartbeat from core client for 30 sec - exiting
23:23:18 (3496): No heartbeat from core client for 30 sec - exiting
23:23:19 (3496): No heartbeat from core client for 30 sec - exiting
23:23:20 (3496): No heartbeat from core client for 30 sec - exiting
23:23:21 (3496): No heartbeat from core client for 30 sec - exiting
23:23:22 (3496): No heartbeat from core client for 30 sec - exiting
23:23:23 (3496): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:23:24 (3496): No heartbeat from core client for 30 sec - exiting
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3068, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
11:46:26 (4064): No heartbeat from core client for 30 sec - exiting
11:46:27 (4064): No heartbeat from core client for 30 sec - exiting
11:46:28 (4064): No heartbeat from core client for 30 sec - exiting
11:46:29 (4064): No heartbeat from core client for 30 sec - exiting
11:46:30 (4064): No heartbeat from core client for 30 sec - exiting
11:46:31 (4064): No heartbeat from core client for 30 sec - exiting
11:46:32 (4064): No heartbeat from core client for 30 sec - exiting
11:46:33 (4064): No heartbeat from core client for 30 sec - exiting
11:46:35 (4064): No heartbeat from core client for 30 sec - exiting
11:46:36 (4064): No heartbeat from core client for 30 sec - exiting
11:46:37 (4064): No heartbeat from core client for 30 sec - exiting
11:46:38 (4064): No heartbeat from core client for 30 sec - exiting
11:46:39 (4064): No heartbeat from core client for 30 sec - exiting
11:46:40 (4064): No heartbeat from core client for 30 sec - exiting
11:46:41 (4064): No heartbeat from core client for 30 sec - exiting
11:46:42 (4064): No heartbeat from core client for 30 sec - exiting
11:46:43 (4064): No heartbeat from core client for 30 sec - exiting
11:46:44 (4064): No heartbeat from core client for 30 sec - exiting
11:46:45 (4064): No heartbeat from core client for 30 sec - exiting
11:46:47 (4064): No heartbeat from core client for 30 sec - exiting
11:46:48 (4064): No heartbeat from core client for 30 sec - exiting
11:46:49 (4064): No heartbeat from core client for 30 sec - exiting
11:46:50 (4064): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1412, selfPID=5540, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1724, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=396, selfPID=2076, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=608, selfPID=608, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
21:18:52 (4344): No heartbeat from core client for 30 sec - exiting
21:18:53 (4344): No heartbeat from core client for 30 sec - exiting
21:18:54 (4344): No heartbeat from core client for 30 sec - exiting
21:18:55 (4344): No heartbeat from core client for 30 sec - exiting
21:18:56 (4344): No heartbeat from core client for 30 sec - exiting
21:18:57 (4344): No heartbeat from core client for 30 sec - exiting
21:18:58 (4344): No heartbeat from core client for 30 sec - exiting
21:18:59 (4344): No heartbeat from core client for 30 sec - exiting
21:19:00 (4344): No heartbeat from core client for 30 sec - exiting
21:19:02 (4344): No heartbeat from core client for 30 sec - exiting
21:19:03 (4344): No heartbeat from core client for 30 sec - exiting
21:19:04 (4344): No heartbeat from core client for 30 sec - exiting
21:19:05 (4344): No heartbeat from core client for 30 sec - exiting
21:19:06 (4344): No heartbeat from core client for 30 sec - exiting
21:19:07 (4344): No heartbeat from core client for 30 sec - exiting
21:19:08 (4344): No heartbeat from core client for 30 sec - exiting
21:19:09 (4344): No heartbeat from core client for 30 sec - exiting
21:19:10 (4344): No heartbeat from core client for 30 sec - exiting
21:19:11 (4344): No heartbeat from core client for 30 sec - exiting
21:19:12 (4344): No heartbeat from core client for 30 sec - exiting
21:19:13 (4344): No heartbeat from core client for 30 sec - exiting
21:19:14 (4344): No heartbeat from core client for 30 sec - exiting
21:19:15 (4344): No heartbeat from core client for 30 sec - exiting
21:19:16 (4344): No heartbeat from core client for 30 sec - exiting
21:19:18 (4344): No heartbeat from core client for 30 sec - exiting
21:19:19 (4344): No heartbeat from core client for 30 sec - exiting
21:19:20 (4344): No heartbeat from core client for 30 sec - exiting
21:19:21 (4344): No heartbeat from core client for 30 sec - exiting
21:19:22 (4344): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5560, selfPID=3860, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
14:52:18 (3808): No heartbeat from core client for 30 sec - exiting
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_eu_k2j7_2013_1_008554525/dataout/atmos_restart.day after 11 attempts
14:52:19 (3808): No heartbeat from core client for 30 sec - exiting
14:52:21 (3808): No heartbeat from core client for 30 sec - exiting
14:52:22 (3808): No heartbeat from core client for 30 sec - exiting
14:52:23 (3808): No heartbeat from core client for 30 sec - exiting
14:52:24 (3808): No heartbeat from core client for 30 sec - exiting
14:52:25 (3808): No heartbeat from core client for 30 sec - exiting
14:52:26 (3808): No heartbeat from core client for 30 sec - exiting
14:52:27 (3808): No heartbeat from core client for 30 sec - exiting
14:52:28 (3808): No heartbeat from core client for 30 sec - exiting
14:52:29 (3808): No heartbeat from core client for 30 sec - exiting
14:52:30 (3808): No heartbeat from core client for 30 sec - exiting
14:52:31 (3808): No heartbeat from core client for 30 sec - exiting
14:52:33 (3808): No heartbeat from core client for 30 sec - exiting
14:52:34 (3808): No heartbeat from core client for 30 sec - exiting
14:52:35 (3808): No heartbeat from core client for 30 sec - exiting
14:52:36 (3808): No heartbeat from core client for 30 sec - exiting
14:52:37 (3808): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_eu_k2j7_2013_1_008554525/dataout/atmos_restart.day after 11 attempts
forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_eu_k2j7_2013_1_008554525\tmp\xaakm.namelists

Image              PC        Routine            Line        Source             
hadam3p_eu_um_6.0  0150A39A  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  014B2CD0  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  014B1E9A  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  01492819  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  01392287  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  0142E7B2  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  0142F2DA  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  011A9BD2  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  014EE638  Unknown               Unknown  Unknown
kernel32.dll       769A336A  Unknown               Unknown  Unknown
ntdll.dll          776E9F72  Unknown               Unknown  Unknown
ntdll.dll          776E9F45  Unknown               Unknown  Unknown
forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_eu_k2j7_2013_1_008554525\tmp\xaakg.namelists

Image              PC        Routine            Line        Source             
hadrm3p_eu_um_6.0  0133C52A  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  012E4460  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  012E362A  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  012C2469  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  011C66EB  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  01262AE2  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  012635AF  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  01009860  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  01320893  Unknown               Unknown  Unknown
kernel32.dll       769A336A  Unknown               Unknown  Unknown
ntdll.dll          776E9F72  Unknown               Unknown  Unknown
ntdll.dll          776E9F45  Unknown               Unknown  Unknown
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2452, selfPID=3152, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_eu_k2j7_2013_1_008554525_1_7.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_k2j7_2013_1_008554525_1_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_k2j7_2013_1_008554525_1_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_k2j7_2013_1_008554525_1_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_k2j7_2013_1_008554525_1_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_k2j7_2013_1_008554525_1_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                         Timestep  CPU Time (sec)  Average (sec/TS)
23 Mar 2014 12:39:27  1312373  16344601   hadam3p_eu_k2j7_2013_1_008554525_1  69,216    134,655         1.9454
22 Mar 2014 18:16:36  1312373  16344601   hadam3p_eu_k2j7_2013_1_008554525_1  57,696    112,404         1.9482
19 Mar 2014 21:58:47  1312373  16344601   hadam3p_eu_k2j7_2013_1_008554525_1  46,176    89,626          1.9410
16 Mar 2014 12:02:30  1312373  16344601   hadam3p_eu_k2j7_2013_1_008554525_1  34,656    67,414          1.9452
15 Mar 2014 17:23:04  1312373  16344601   hadam3p_eu_k2j7_2013_1_008554525_1  23,136    44,841          1.9381
14 Mar 2014 21:33:26  1312373  16344601   hadam3p_eu_k2j7_2013_1_008554525_1  11,631    23,023          1.9795
09 Mar 2014 13:42:57  1312373  16344601   hadam3p_eu_k2j7_2013_1_008554525_1  11,616    22,712          1.9552
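
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. The page does not document this; it is an inference from the figures. A minimal Python sketch that recomputes the column from the values above (the names `TRICKLES` and `avg_sec_per_timestep` are illustrative, not part of any CPDN or BOINC tool):

```python
# (timestep, cpu_time_sec, reported_average) copied from the trickle table.
TRICKLES = [
    (69216, 134655, 1.9454),
    (57696, 112404, 1.9482),
    (46176, 89626, 1.9410),
    (34656, 67414, 1.9452),
    (23136, 44841, 1.9381),
    (11631, 23023, 1.9795),
    (11616, 22712, 1.9552),
]


def avg_sec_per_timestep(cpu_time_sec, timestep):
    """Assumed definition: cumulative CPU seconds per model timestep."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in TRICKLES:
    computed = avg_sec_per_timestep(cpu_time_sec, timestep)
    print(f"{timestep:>6} {cpu_time_sec:>7} {computed:.4f} (reported {reported:.4f})")
```

Under that assumption, every recomputed value agrees with the reported one to four decimal places.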