Task 18826647

Name hadam3p_anz_n88l_2008_1_010107441_1
Workunit 10087031
Created 15 Aug 2015, 21:02:25 UTC
Sent 16 Aug 2015, 16:16:53 UTC
Report deadline 28 Jul 2016, 21:36:53 UTC
Received 3 Sep 2015, 8:13:48 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1348327
Run time 3 days 11 hours 11 min 22 sec
CPU time 3 days 9 hours 15 min 38 sec
Validate state Invalid
Credit 3,490.64
Device peak FLOPS 3.26 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Australia New Zealand v6.10 (windows_intelx86)
Stderr
<core_client_version>7.4.27</core_client_version>
<![CDATA[
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6512, iMonCtr=2
Model crash detected, will try to restart...
17:34:58 (4396): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5600, selfPID=4176, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6476, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6008, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6036, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4840, selfPID=5744, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
C13:43:07 (5312): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:42:41 (492): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6936, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2484, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6936, selfPID=716, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2564, selfPID=4680, iMonCtr=1
Model crash detected, will try to restart...
12:24:34 (2948): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
C10:23:03 (5796): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CGlobal Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6576, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6644, selfPID=4832, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3852, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6884, iMonCtr=2
11:27:30 (5976): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2456, selfPID=5804, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
22:18:45 (5996): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7016, selfPID=5676, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3456, selfPID=5592, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_anz_n88l_2008_1_010107441_1_8.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n88l_2008_1_010107441_1_9.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n88l_2008_1_010107441_1_10.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n88l_2008_1_010107441_1_11.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n88l_2008_1_010107441_1_12.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                          Timestep  CPU Time (sec)  Average (sec/TS)
31 Aug 2015 19:03:37  1348327  18826647   hadam3p_anz_n88l_2008_1_010107441_1  80,939    269,110         3.3248
27 Aug 2015 09:54:18  1348327  18826647   hadam3p_anz_n88l_2008_1_010107441_1  69,419    230,738         3.3238
26 Aug 2015 08:30:04  1348327  18826647   hadam3p_anz_n88l_2008_1_010107441_1  57,899    192,470         3.3242
23 Aug 2015 16:24:02  1348327  18826647   hadam3p_anz_n88l_2008_1_010107441_1  46,379    155,719         3.3575
22 Aug 2015 16:22:14  1348327  18826647   hadam3p_anz_n88l_2008_1_010107441_1  34,859    116,670         3.3469
21 Aug 2015 11:30:51  1348327  18826647   hadam3p_anz_n88l_2008_1_010107441_1  23,339    76,587          3.2815
19 Aug 2015 08:06:17  1348327  18826647   hadam3p_anz_n88l_2008_1_010107441_1  11,819    37,940          3.2101
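
The Average (sec/TS) column in the trickle table above appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below checks that reading against the seven rows shown. The function name `average_sec_per_timestep` and the tuple layout are illustrative assumptions, not part of any CPDN or BOINC tool.

```python
# Rows copied from the trickle table above:
# (timestep, cpu_time_sec, reported_average_sec_per_ts)
trickles = [
    (80939, 269110, 3.3248),
    (69419, 230738, 3.3238),
    (57899, 192470, 3.3242),
    (46379, 155719, 3.3575),
    (34859, 116670, 3.3469),
    (23339, 76587, 3.2815),
    (11819, 37940, 3.2101),
]


def average_sec_per_timestep(cpu_time_sec, timestep):
    """Assumed definition: cumulative CPU seconds per model timestep."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in trickles:
    computed = round(average_sec_per_timestep(cpu_time_sec, timestep), 4)
    # Flag any row where the assumed formula disagrees with the table.
    status = "ok" if abs(computed - reported) < 1e-9 else "MISMATCH"
    print(f"{timestep:>6}  {cpu_time_sec:>7}  {computed:.4f}  {reported:.4f}  {status}")
```

Under this assumption every row reproduces the reported value to four decimal places, which suggests the model ran at a steady rate of roughly 3.2 to 3.4 CPU seconds per timestep until the last trickle.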

