Task 14576450

Name hadam3p_pnw_ysi9_1992_1_006882409_1
Workunit 7085725
Created 23 Apr 2012, 11:09:16 UTC
Sent 24 Apr 2012, 19:12:39 UTC
Report deadline 7 Apr 2013, 0:32:39 UTC
Received 2 Apr 2015, 19:19:29 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1029234
Run time 5 days 13 hours 44 min 26 sec
CPU time 3 days 10 hours 19 min 44 sec
Validate state Invalid
Credit 1,003.35
Device peak FLOPS 2.16 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Pacific North West v6.09 windows_intelx86
Stderr
<core_client_version>6.10.18</core_client_version>
<![CDATA[
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=984, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4328, selfPID=4328, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5676, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=876, iMonCtr=2
Model crash detected, will try to restart...
19:40:39 (3632): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:13:40 (3076): No heartbeat from core client for 30 sec - exiting
14:13:41 (3076): No heartbeat from core client for 30 sec - exiting
14:13:42 (3076): No heartbeat from core client for 30 sec - exiting
14:13:43 (3076): No heartbeat from core client for 30 sec - exiting
14:13:44 (3076): No heartbeat from core client for 30 sec - exiting
14:13:46 (3076): No heartbeat from core client for 30 sec - exiting
14:13:47 (3076): No heartbeat from core client for 30 sec - exiting
14:13:48 (3076): No heartbeat from core client for 30 sec - exiting
14:13:49 (3076): No heartbeat from core client for 30 sec - exiting
14:13:50 (3076): No heartbeat from core client for 30 sec - exiting
14:13:51 (3076): No heartbeat from core client for 30 sec - exiting
14:13:52 (3076): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:13:53 (3076): No heartbeat from core client for 30 sec - exiting
14:13:55 (3076): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4344, selfPID=2548, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 4
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4636, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=448, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 4
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3652, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4324, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
19:07:29 (2164): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5088, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2732, selfPID=392, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 4
Called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
  <file_name>hadam3p_pnw_ysi9_1992_1_006882409_1_5.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_ysi9_1992_1_006882409_1_6.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_ysi9_1992_1_006882409_1_7.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_ysi9_1992_1_006882409_1_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_ysi9_1992_1_006882409_1_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_ysi9_1992_1_006882409_1_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_ysi9_1992_1_006882409_1_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_ysi9_1992_1_006882409_1_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                          Timestep  CPU Time (sec)  Average (sec/TS)
30 May 2012 19:28:04  1029234  14576450   hadam3p_pnw_ysi9_1992_1_006882409_1  46,176    237,611         5.1458
26 May 2012 22:29:22  1029234  14576450   hadam3p_pnw_ysi9_1992_1_006882409_1  34,656    178,750         5.1578
25 May 2012 19:57:50  1029234  14576450   hadam3p_pnw_ysi9_1992_1_006882409_1  23,136    119,673         5.1726
24 May 2012 13:31:29  1029234  14576450   hadam3p_pnw_ysi9_1992_1_006882409_1  11,616    62,391          5.3711
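
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The following is a minimal sketch of that arithmetic, using values copied from the four trickle rows; the function name `average_sec_per_timestep` is hypothetical and is not part of the CPDN or BOINC software.

```python
# Trickle rows copied from the table: (timestep, cumulative CPU time in seconds).
trickles = [
    (46176, 237611),
    (34656, 178750),
    (23136, 119673),
    (11616, 62391),
]


def average_sec_per_timestep(timestep, cpu_seconds):
    """Cumulative CPU seconds divided by timesteps completed, rounded to 4 places.

    Assumption: this is how the page derives its Average column; the page
    itself does not state the formula.
    """
    return round(cpu_seconds / timestep, 4)


averages = [average_sec_per_timestep(ts, cpu) for ts, cpu in trickles]

for (ts, cpu), avg in zip(trickles, averages):
    print(f"{ts:>6} timesteps  {cpu:>7} s  {avg:.4f} sec/TS")
```

Under that assumption the computed values match the Average column for all four rows.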