Task 14861608

Name hadam3p_pnw_capz_1991_1_008027401_0
Workunit 8182515
Created 4 Jul 2012, 15:35:06 UTC
Sent 4 Jul 2012, 15:35:15 UTC
Report deadline 16 Jun 2013, 20:55:15 UTC
Received 11 Feb 2013, 19:20:34 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1026045
Run time 6 days 18 hours 44 min 43 sec
CPU time 3 days 20 hours 50 min 24 sec
Validate state Invalid
Credit 1,003.35
Device peak FLOPS 1.54 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Pacific North West v6.09 windows_intelx86
Stderr
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5212, selfPID=5292, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3136, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6048, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4140, selfPID=4720, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4080, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
13:52:18 (2708): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
13:52:19 (2708): No heartbeat from core client for 30 sec - exiting
13:52:20 (2708): No heartbeat from core client for 30 sec - exiting
13:52:21 (2708): No heartbeat from core client for 30 sec - exiting
13:52:22 (2708): No heartbeat from core client for 30 sec - exiting
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1428, selfPID=1428, iMonCtr=2
13:52:24 (2708): No heartbeat from core client for 30 sec - exiting
13:52:25 (2708): No heartbeat from core client for 30 sec - exiting
13:52:26 (2708): No heartbeat from core client for 30 sec - exiting
13:52:27 (2708): No heartbeat from core client for 30 sec - exiting
13:52:28 (2708): No heartbeat from core client for 30 sec - exiting
20:14:19 (2248): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:14:21 (2248): No heartbeat from core client for 30 sec - exiting
20:14:22 (2248): No heartbeat from core client for 30 sec - exiting
20:14:23 (2248): No heartbeat from core client for 30 sec - exiting
20:14:24 (2248): No heartbeat from core client for 30 sec - exiting
20:14:25 (2248): No heartbeat from core client for 30 sec - exiting
20:14:26 (2248): No heartbeat from core client for 30 sec - exiting
20:14:27 (2248): No heartbeat from core client for 30 sec - exiting
20:14:28 (2248): No heartbeat from core client for 30 sec - exiting
20:14:29 (2248): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=968, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2084, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3468, selfPID=3468, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3468, selfPID=2876, iMonCtr=1
Model crash detected, will try to restart...

</stderr_txt>
<message>
<file_xfer_error>
  <file_name>hadam3p_pnw_capz_1991_1_008027401_0_5.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_capz_1991_1_008027401_0_6.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_capz_1991_1_008027401_0_7.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_capz_1991_1_008027401_0_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_capz_1991_1_008027401_0_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_capz_1991_1_008027401_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_capz_1991_1_008027401_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_capz_1991_1_008027401_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                          Timestep  CPU Time (sec)  Average (sec/TS)
18 Sep 2012 20:05:10  1026045  14861608   hadam3p_pnw_capz_1991_1_008027401_0  46,176    301,821         6.5363
12 Sep 2012 21:39:02  1026045  14861608   hadam3p_pnw_capz_1991_1_008027401_0  34,656    225,020         6.4930
05 Sep 2012 23:16:00  1026045  14861608   hadam3p_pnw_capz_1991_1_008027401_0  23,136    149,518         6.4626
20 Aug 2012 17:47:50  1026045  14861608   hadam3p_pnw_capz_1991_1_008027401_0  11,616    76,324          6.5706
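
The Average (sec/TS) figures above appear to be the cumulative CPU time divided by the timestep count. That relationship is inferred from the four rows and is not stated on the page. A minimal Python sketch that checks it follows; the helper name seconds_per_timestep is hypothetical, and the numbers are copied from the trickle rows with thousands separators removed.

```python
# Trickle rows copied from the table above:
# (timestep, cumulative CPU seconds, reported average sec/TS)
trickles = [
    (46176, 301821, 6.5363),
    (34656, 225020, 6.4930),
    (23136, 149518, 6.4626),
    (11616, 76324, 6.5706),
]


def seconds_per_timestep(timestep, cpu_seconds):
    """Assumed formula: cumulative CPU seconds / timesteps, rounded to 4 places."""
    return round(cpu_seconds / timestep, 4)


for timestep, cpu_seconds, reported in trickles:
    computed = seconds_per_timestep(timestep, cpu_seconds)
    # Print the computed value next to the value reported by the server.
    print(f"{timestep:>6}  computed={computed:.4f}  reported={reported:.4f}")
```

Under that assumption, all four rows reproduce the reported averages to four decimal places, which suggests the host held a steady rate of roughly 6.5 CPU seconds per timestep up to the last trickle.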


©2024 climateprediction.net