Task 13720018

Name hadam3p_pnw_71hw_2004_1_007597180_0
Workunit 7775310
Created 5 Dec 2011, 12:01:58 UTC
Sent 18 Dec 2011, 23:32:37 UTC
Report deadline 30 Nov 2012, 4:52:37 UTC
Received 27 Feb 2012, 17:01:04 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1143523
Run time 2 days 3 hours 39 min 54 sec
CPU time 2 days 0 hours 53 min 56 sec
Validate state Invalid
Credit 1,003.35
Device peak FLOPS 2.56 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Pacific North West v6.09 windows_intelx86
Stderr
<core_client_version>6.10.60</core_client_version>
<![CDATA[
<stderr_txt>
13:55:13 (3228): No heartbeat from core client for 30 sec - exiting
13:55:14 (3228): No heartbeat from core client for 30 sec - exiting
13:55:15 (3228): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3904, selfPID=2828, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
20:23:53 (2776): No heartbeat from core client for 30 sec - exiting
20:23:54 (2776): No heartbeat from core client for 30 sec - exiting
20:23:56 (2776): No heartbeat from core client for 30 sec - exiting
20:23:57 (2776): No heartbeat from core client for 30 sec - exiting
20:23:58 (2776): No heartbeat from core client for 30 sec - exiting
20:23:59 (2776): No heartbeat from core client for 30 sec - exiting
20:24:00 (2776): No heartbeat from core client for 30 sec - exiting
20:24:01 (2776): No heartbeat from core client for 30 sec - exiting
20:24:02 (2776): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4172, selfPID=2624, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4856, selfPID=5084, iMonCtr=1
Model crash detected, will try to restart...
22:48:07 (3184): No heartbeat from core client for 30 sec - exiting
22:48:08 (3184): No heartbeat from core client for 30 sec - exiting
22:48:09 (3184): No heartbeat from core client for 30 sec - exiting
22:48:10 (3184): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
23:22:52 (2744): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1284, selfPID=2396, iMonCtr=1
Model crash detected, will try to restart...
13:56:42 (2720): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2996, selfPID=2996, iMonCtr=2
14:55:03 (2732): No heartbeat from core client for 30 sec - exiting
14:55:04 (2732): No heartbeat from core client for 30 sec - exiting
14:55:05 (2732): No heartbeat from core client for 30 sec - exiting
14:55:06 (2732): No heartbeat from core client for 30 sec - exiting
14:55:07 (2732): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:55:09 (2732): No heartbeat from core client for 30 sec - exiting
16:13:16 (2764): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:58:02 (4528): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:40:27 (2800): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:40:28 (2800): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1484, selfPID=2772, iMonCtr=1
Model crash detected, will try to restart...
11:22:55 (2800): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3836, selfPID=3836, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3384, selfPID=3568, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3340, selfPID=2500, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
22:01:43 (2748): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
09:52:33 (2888): No heartbeat from core client for 30 sec - exiting
09:52:34 (2888): No heartbeat from core client for 30 sec - exiting
09:52:35 (2888): No heartbeat from core client for 30 sec - exiting
09:52:37 (2888): No heartbeat from core client for 30 sec - exiting
09:52:38 (2888): No heartbeat from core client for 30 sec - exiting
09:52:39 (2888): No heartbeat from core client for 30 sec - exiting
09:52:40 (2888): No heartbeat from core client for 30 sec - exiting
09:52:41 (2888): No heartbeat from core client for 30 sec - exiting
09:52:42 (2888): No heartbeat from core client for 30 sec - exiting
09:52:43 (2888): No heartbeat from core client for 30 sec - exiting
09:52:44 (2888): No heartbeat from core client for 30 sec - exiting
09:52:45 (2888): No heartbeat from core client for 30 sec - exiting
09:52:46 (2888): No heartbeat from core client for 30 sec - exiting
09:52:47 (2888): No heartbeat from core client for 30 sec - exiting
09:52:49 (2888): No heartbeat from core client for 30 sec - exiting
09:52:50 (2888): No heartbeat from core client for 30 sec - exiting
09:52:51 (2888): No heartbeat from core client for 30 sec - exiting
09:52:52 (2888): No heartbeat from core client for 30 sec - exiting
09:52:53 (2888): No heartbeat from core client for 30 sec - exiting
09:52:54 (2888): No heartbeat from core client for 30 sec - exiting
09:52:55 (2888): No heartbeat from core client for 30 sec - exiting
09:52:56 (2888): No heartbeat from core client for 30 sec - exiting
09:52:57 (2888): No heartbeat from core client for 30 sec - exiting
09:52:58 (2888): No heartbeat from core client for 30 sec - exiting
09:52:59 (2888): No heartbeat from core client for 30 sec - exiting
09:53:01 (2888): No heartbeat from core client for 30 sec - exiting
09:53:02 (2888): No heartbeat from core client for 30 sec - exiting
09:53:03 (2888): No heartbeat from core client for 30 sec - exiting
09:53:04 (2888): No heartbeat from core client for 30 sec - exiting
09:53:05 (2888): No heartbeat from core client for 30 sec - exiting
09:53:06 (2888): No heartbeat from core client for 30 sec - exiting
09:53:07 (2888): No heartbeat from core client for 30 sec - exiting
09:53:08 (2888): No heartbeat from core client for 30 sec - exiting
09:53:09 (2888): No heartbeat from core client for 30 sec - exiting
09:53:10 (2888): No heartbeat from core client for 30 sec - exiting
09:53:11 (2888): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
15:07:29 (1748): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:56:25 (4136): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:18:35:16 (2244): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:38:42 (936): Can't acquire lockfile (32) - waiting 35s
19:14:51 (936): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
12:02:40 (5792): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...

Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO tmp/xaakm.pipe_dummy 2048
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 4
Called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
  <file_name>hadam3p_pnw_71hw_2004_1_007597180_0_5.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_71hw_2004_1_007597180_0_6.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_71hw_2004_1_007597180_0_7.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_71hw_2004_1_007597180_0_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_71hw_2004_1_007597180_0_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_71hw_2004_1_007597180_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_71hw_2004_1_007597180_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_71hw_2004_1_007597180_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                          Timestep  CPU Time (sec)  Average (sec/TS)
24 Feb 2012 00:13:41  1143523  13720018   hadam3p_pnw_71hw_2004_1_007597180_0  46,176    159,481         3.4538
13 Jan 2012 00:40:57  1143523  13720018   hadam3p_pnw_71hw_2004_1_007597180_0  34,657    120,651         3.4813
11 Jan 2012 23:05:22  1143523  13720018   hadam3p_pnw_71hw_2004_1_007597180_0  34,656    120,201         3.4684
01 Jan 2012 19:50:35  1143523  13720018   hadam3p_pnw_71hw_2004_1_007597180_0  23,136    80,905          3.4969
31 Dec 2011 01:54:38  1143523  13720018   hadam3p_pnw_71hw_2004_1_007597180_0  11,623    40,919          3.5205
31 Dec 2011 00:54:19  1143523  13720018   hadam3p_pnw_71hw_2004_1_007597180_0  11,622    40,419          3.4778
30 Dec 2011 20:01:52  1143523  13720018   hadam3p_pnw_71hw_2004_1_007597180_0  11,616    39,965          3.4405
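
The Average (sec/TS) column above appears to be the cumulative CPU time divided by the timestep count. The page does not state this, so treat it as an assumption. A minimal Python sketch that checks it against the seven trickle rows follows; the values are copied from the table, and the names TRICKLES, sec_per_timestep, and check are hypothetical helpers, not part of BOINC or CPDN.

```python
# Each tuple: (timestep, cpu_time_sec, reported_average_sec_per_ts),
# copied by hand from the "Latest Trickles Received" table.
TRICKLES = [
    (46176, 159481, 3.4538),
    (34657, 120651, 3.4813),
    (34656, 120201, 3.4684),
    (23136, 80905, 3.4969),
    (11623, 40919, 3.5205),
    (11622, 40419, 3.4778),
    (11616, 39965, 3.4405),
]


def sec_per_timestep(cpu_time_sec, timestep):
    """Assumed formula: cumulative CPU seconds divided by timesteps completed."""
    return round(cpu_time_sec / timestep, 4)


def check(trickles):
    """Return (timestep, recomputed_average, reported_average) for each row."""
    return [(ts, sec_per_timestep(cpu, ts), avg) for ts, cpu, avg in trickles]


if __name__ == "__main__":
    for ts, computed, reported in check(TRICKLES):
        print(f"timestep {ts}: computed {computed:.4f}, reported {reported:.4f}")
```

If the recomputed and reported values agree to four decimal places for every row, the assumed formula is consistent with the table.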


©2024 climateprediction.net