climateprediction.net home page
Task 16800323

Name hadam3p_eu_p6q9_2013_1_008880266_0
Workunit 9026195
Created 9 Jul 2014, 17:14:09 UTC
Sent 10 Jul 2014, 16:45:42 UTC
Report deadline 22 Jun 2015, 22:05:42 UTC
Received 11 Sep 2014, 17:11:29 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1330426
Run time 6 days 23 hours 14 min 1 sec
CPU time 5 days 21 hours 56 min 59 sec
Validate state Invalid
Credit 1,988.94
Device peak FLOPS 1.30 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Europe v6.09
windows_intelx86
Stderr
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5984, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5120, selfPID=2288, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1624, selfPID=5524, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5324, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
19:30:14 (5796): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1540, selfPID=5948, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7968, selfPID=7256, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4360, selfPID=4160, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=696, selfPID=5720, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4664, selfPID=4288, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2984, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2604, selfPID=4912, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2592, selfPID=4760, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5068, selfPID=4304, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
10:35:17 (6104): No heartbeat from core client for 30 sec - exiting
10:35:18 (6104): No heartbeat from core client for 30 sec - exiting
10:35:19 (6104): No heartbeat from core client for 30 sec - exiting
10:35:20 (6104): No heartbeat from core client for 30 sec - exiting
10:35:21 (6104): No heartbeat from core client for 30 sec - exiting
10:35:22 (6104): No heartbeat from core client for 30 sec - exiting
10:35:23 (6104): No heartbeat from core client for 30 sec - exiting
10:35:24 (6104): No heartbeat from core client for 30 sec - exiting
10:35:25 (6104): No heartbeat from core client for 30 sec - exiting
10:35:26 (6104): No heartbeat from core client for 30 sec - exiting
10:35:27 (6104): No heartbeat from core client for 30 sec - exiting
10:35:28 (6104): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3968, selfPID=4528, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7700, selfPID=6672, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2012, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5292, selfPID=2692, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=168, selfPID=4648, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3828, selfPID=3332, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3684, selfPID=3812, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7720, selfPID=6336, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
20:00:29 (3800): No heartbeat from core client for 30 sec - exiting
20:00:30 (3800): No heartbeat from core client for 30 sec - exiting
20:00:31 (3800): No heartbeat from core client for 30 sec - exiting
20:00:32 (3800): No heartbeat from core client for 30 sec - exiting
20:00:33 (3800): No heartbeat from core client for 30 sec - exiting
20:00:34 (3800): No heartbeat from core client for 30 sec - exiting
20:00:35 (3800): No heartbeat from core client for 30 sec - exiting
20:00:36 (3800): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5488, selfPID=4556, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_eu_p6q9_2013_1_008880266_0_11.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_p6q9_2013_1_008880266_0_12.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                         Timestep  CPU Time (sec)  Average (sec/TS)
09 Sep 2014 20:23:16  1330426  16800323   hadam3p_eu_p6q9_2013_1_008880266_0  115,296   507,148         4.3987
05 Sep 2014 12:59:53  1330426  16800323   hadam3p_eu_p6q9_2013_1_008880266_0  103,776   457,104         4.4047
04 Sep 2014 09:07:13  1330426  16800323   hadam3p_eu_p6q9_2013_1_008880266_0  92,256    407,192         4.4137
02 Sep 2014 14:23:57  1330426  16800323   hadam3p_eu_p6q9_2013_1_008880266_0  80,736    357,528         4.4284
01 Sep 2014 11:33:22  1330426  16800323   hadam3p_eu_p6q9_2013_1_008880266_0  69,216    307,116         4.4371
29 Aug 2014 11:46:13  1330426  16800323   hadam3p_eu_p6q9_2013_1_008880266_0  57,696    256,629         4.4480
29 Aug 2014 09:25:46  1330426  16800323   hadam3p_eu_p6q9_2013_1_008880266_0  46,176    204,836         4.4360
18 Aug 2014 19:12:00  1330426  16800323   hadam3p_eu_p6q9_2013_1_008880266_0  34,656    154,259         4.4511
06 Aug 2014 19:33:14  1330426  16800323   hadam3p_eu_p6q9_2013_1_008880266_0  23,136    104,333         4.5096
23 Jul 2014 16:44:51  1330426  16800323   hadam3p_eu_p6q9_2013_1_008880266_0  11,616    53,359          4.5936
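
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below checks that reading against the newest and oldest trickle rows. The function name seconds_per_timestep and the trickles list are hypothetical names chosen for illustration; they are not part of the CPDN or BOINC software.

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Return cumulative CPU seconds per model timestep (assumed definition)."""
    if timestep <= 0:
        raise ValueError("timestep must be positive")
    return cpu_time_sec / timestep


# (timestep, cpu_time_sec, reported_average) copied from the trickle rows above:
# the newest row (09 Sep 2014) and the oldest row (23 Jul 2014).
trickles = [
    (115_296, 507_148, 4.3987),
    (11_616, 53_359, 4.5936),
]

averages = [seconds_per_timestep(cpu, ts) for ts, cpu, _ in trickles]

for (ts, cpu, reported), computed in zip(trickles, averages):
    # The reported value is shown to 4 decimal places, so compare at that precision.
    print(f"timestep={ts}: computed={computed:.4f}, reported={reported:.4f}")
```

Under this assumption both computed values agree with the reported column to four decimal places.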

