Task 15116828

Name hadam3p_pnw_34hn_1985_1_008143737_0
Workunit 8298851
Created 14 Aug 2012, 10:54:45 UTC
Sent 22 Aug 2012, 6:12:41 UTC
Report deadline 4 Aug 2013, 11:32:41 UTC
Received 12 Sep 2012, 21:26:02 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1229231
Run time 19 days 18 hours 54 min 51 sec
CPU time 13 days 10 hours 30 min 58 sec
Validate state Invalid
Credit 2,004.61
Device peak FLOPS 0.78 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Pacific North West v6.09 (windows_intelx86)
Stderr
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5584, selfPID=4608, iMonCtr=1
Model crash detected,Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4460, selfPID=6004, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5024, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4304, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5716, selfPID=4208, iMonCtr=1
Model crash detected, will try to restart...
02:08:51 (4440): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
02:08:52 (4440): No heartbeat from core client for 30 sec - exiting
02:08:53 (4440): No heartbeat from core client for 30 sec - exiting
02:08:54 (4440): No heartbeat from core client for 30 sec - exiting
02:08:55 (4440): No heartbeat from core client for 30 sec - exiting
02:08:56 (4440): No heartbeat from core client for 30 sec - exiting
02:08:57 (4440): No heartbeat from core client for 30 sec - exiting
02:47:06 (5300): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
02:47:07 (5300): No heartbeat from core client for 30 sec - exiting
02:53:46 (5128): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
02:53:47 (5128): No heartbeat from core client for 30 sec - exiting
02:53:48 (5128): No heartbeat from core client for 30 sec - exiting
02:53:49 (5128): No heartbeat from core client for 30 sec - exiting
02:53:50 (5128): No heartbeat from core client for 30 sec - exiting
02:53:51 (5128): No heartbeat from core client for 30 sec - exiting
02:59:03 (176): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
03:06:03 (5756): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
03:06:04 (5756): No heartbeat from core client for 30 sec - exiting
03:15:57 (2348): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
03:48:44 (1560): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
03:48:45 (1560): No heartbeat from core client for 30 sec - exiting
04:49:34 (1872): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
04:49:35 (1872): No heartbeat from core client for 30 sec - exiting
05:25:29 (5896): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5272, selfPID=5272, iMonCtr=2
05:25:30 (5896): No heartbeat from core client for 30 sec - exiting
05:28:30 (2636): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
05:28:31 (2636): No heartbeat from core client for 30 sec - exiting
05:28:33 (2636): No heartbeat from core client for 30 sec - exiting
05:28:34 (2636): No heartbeat from core client for 30 sec - exiting
05:28:35 (2636): No heartbeat from core client for 30 sec - exiting
05:28:36 (2636): No heartbeat from core client for 30 sec - exiting
05:30:28 (4232): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
05:31:48 (3396): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
05:41:21 (5904): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:01:10 (6056): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:01:11 (6056): No heartbeat from core client for 30 sec - exiting
06:01:12 (6056): No heartbeat from core client for 30 sec - exiting
06:01:13 (6056): No heartbeat from core client for 30 sec - exiting
06:01:14 (6056): No heartbeat from core client for 30 sec - exiting
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1396, selfPID=1396, iMonCtr=2
06:01:15 (6056): No heartbeat from core client for 30 sec - exiting
06:01:16 (6056): No heartbeat from core client for 30 sec - exiting
06:01:17 (6056): No heartbeat from core client for 30 sec - exiting
GSuspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4728, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 3
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1568, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4780, selfPID=4248, iMonCtr=1
Model crash detected, will try to restart...
05:48:29 (4648): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
05:48:30 (4648): No heartbeat from core client for 30 sec - exiting
05:48:31 (4648): No heartbeat from core client for 30 sec - exiting
05:48:32 (4648): No heartbeat from core client for 30 sec - exiting
05:48:33 (4648): No heartbeat from core client for 30 sec - exiting
05:48:34 (4648): No heartbeat from core client for 30 sec - exiting
06:31:28 (424): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:33:35 (1776): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:33:36 (1776): No heartbeat from core client for 30 sec - exiting
06:33:37 (1776): No heartbeat from core client for 30 sec - exiting
06:33:38 (1776): No heartbeat from core client for 30 sec - exiting
06:33:39 (1776): No heartbeat from core client for 30 sec - exiting
06:35:23 (5476): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:35:24 (5476): No heartbeat from core client for 30 sec - exiting
06:35:25 (5476): No heartbeat from core client for 30 sec - exiting
06:38:02 (4140): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:38:03 (4140): No heartbeat from core client for 30 sec - exiting
06:38:04 (4140): No heartbeat from core client for 30 sec - exiting
06:38:05 (4140): No heartbeat from core client for 30 sec - exiting
06:38:06 (4140): No heartbeat from core client for 30 sec - exiting
06:38:07 (4140): No heartbeat from core client for 30 sec - exiting
06:38:08 (4140): No heartbeat from core client for 30 sec - exiting
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5492, selfPID=5492, iMonCtr=2
06:38:16 (4140): No heartbeat from core client for 30 sec - exiting
06:38:17 (4140): No heartbeat from core client for 30 sec - exiting
06:40:28 (380): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:42:09 (6080): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:42:10 (6080): No heartbeat from core client for 30 sec - exiting
07:16:10 (4780): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:16:11 (4780): No heartbeat from core client for 30 sec - exiting
07:16:12 (4780): No heartbeat from core client for 30 sec - exiting
07:16:13 (4780): No heartbeat from core client for 30 sec - exiting
07:18:11 (3196): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3244, selfPID=4516, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2964, selfPID=5048, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 4
Contbal Worker:: CPDN process is not running, exiting, bRetVal = 1, rholler:: CPDN process is iMnot runn
ing, exiting, bRetVal = 1, checkPID=0, selfPID=4816, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 6
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5896, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4924, selfPID=3532, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4140, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4204, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2076, iMonCtr=2
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 7
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5676, selfPID=4700, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 8
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_pnw_34hn_1985_1_008143737_0_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_34hn_1985_1_008143737_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_34hn_1985_1_008143737_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_34hn_1985_1_008143737_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                          Timestep  CPU Time (sec)  Average (sec/TS)
11 Sep 2012 02:29:12  1229231  15116828   hadam3p_pnw_34hn_1985_1_008143737_0  92,256    1,055,436       11.4403
07 Sep 2012 22:18:52  1229231  15116828   hadam3p_pnw_34hn_1985_1_008143737_0  80,737    924,806         11.4545
07 Sep 2012 21:18:20  1229231  15116828   hadam3p_pnw_34hn_1985_1_008143737_0  80,736    923,352         11.4367
05 Sep 2012 09:55:44  1229231  15116828   hadam3p_pnw_34hn_1985_1_008143737_0  69,216    787,586         11.3787
03 Sep 2012 08:16:02  1229231  15116828   hadam3p_pnw_34hn_1985_1_008143737_0  57,696    654,776         11.3487
31 Aug 2012 05:09:32  1229231  15116828   hadam3p_pnw_34hn_1985_1_008143737_0  46,176    516,432         11.1840
28 Aug 2012 10:06:16  1229231  15116828   hadam3p_pnw_34hn_1985_1_008143737_0  34,656    380,375         10.9757
26 Aug 2012 05:54:40  1229231  15116828   hadam3p_pnw_34hn_1985_1_008143737_0  23,136    251,826         10.8846
24 Aug 2012 06:45:43  1229231  15116828   hadam3p_pnw_34hn_1985_1_008143737_0  11,616    127,193         10.9498
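
The "Average (sec/TS)" column in the trickle table above appears to be the cumulative CPU time divided by the timestep count. The page does not state this, so it is an assumption, checked here against three rows copied from the table. The function name and the `TRICKLES` list are hypothetical, used only for illustration.

```python
# Rows copied from the trickle table: (timestep, CPU time in sec, reported average).
TRICKLES = [
    (92_256, 1_055_436, 11.4403),
    (46_176, 516_432, 11.1840),
    (11_616, 127_193, 10.9498),
]


def seconds_per_timestep(cpu_seconds: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep (assumed definition of sec/TS)."""
    return cpu_seconds / timestep


if __name__ == "__main__":
    for timestep, cpu_seconds, reported in TRICKLES:
        computed = seconds_per_timestep(cpu_seconds, timestep)
        # Show the recomputed value next to the value listed on the page.
        print(f"{timestep:>7,}  computed {computed:.4f}  page {reported:.4f}")
```

For these three rows the recomputed values match the listed averages to four decimal places, which is consistent with the assumed definition.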

