climateprediction.net home page
Task 12844707

Name hadam3p_saf_2a65_1985_1_007234158_0
Workunit 7432398
Created 29 Apr 2011, 6:14:44 UTC
Sent 29 Apr 2011, 19:13:31 UTC
Report deadline 11 Apr 2012, 0:33:31 UTC
Received 30 May 2011, 3:46:01 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 981127
Run time 3 days 17 hours 22 min 22 sec
CPU time 3 days 11 hours 8 min 36 sec
Validate state Invalid
Credit 1,309.70
Device peak FLOPS 2.61 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Southern Africa v6.09 (windows_intelx86)
Stderr
<core_client_version>6.6.38</core_client_version>
<![CDATA[
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4984, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5516, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1620, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
12:01:47 (5616): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4920, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5172, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4604, selfPID=4604, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4168, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5576, iMonCtr=2
Model crash detected, will try to restart...
08:56:52 (3620): No heartbeat from core client for 30 sec - exiting
08:56:53 (3620): No heartbeat from core client for 30 sec - exiting
08:56:54 (3620): No heartbeat from core client for 30 sec - exiting
08:56:55 (3620): No heartbeat from core client for 30 sec - exiting
08:56:56 (3620): No heartbeat from core client for 30 sec - exiting
08:56:57 (3620): No heartbeat from core client for 30 sec - exiting
08:56:58 (3620): No heartbeat from core client for 30 sec - exiting
08:56:59 (3620): No heartbeat from core client for 30 sec - exiting
08:57:00 (3620): No heartbeat from core client for 30 sec - exiting
08:57:01 (3620): No heartbeat from core client for 30 sec - exiting
08:57:02 (3620): No heartbeat from core client for 30 sec - exiting
08:57:03 (3620): No heartbeat from core client for 30 sec - exiting
08:57:04 (3620): No heartbeat from core client for 30 sec - exiting
08:57:05 (3620): No heartbeat from core client for 30 sec - exiting
08:57:06 (3620): No heartbeat from core client for 30 sec - exiting
08:57:07 (3620): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4800, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
17:51:44 (5052): No heartbeat from core client for 30 sec - exiting
17:51:45 (5052): No heartbeat from core client for 30 sec - exiting
17:51:46 (5052): No heartbeat from core client for 30 sec - exiting
17:51:47 (5052): No heartbeat from core client for 30 sec - exiting
17:51:49 (5052): No heartbeat from core client for 30 sec - exiting
17:51:50 (5052): No heartbeat from core client for 30 sec - exiting
17:51:51 (5052): No heartbeat from core client for 30 sec - exiting
17:51:52 (5052): No heartbeat from core client for 30 sec - exiting
17:51:53 (5052): No heartbeat from core client for 30 sec - exiting
17:51:54 (5052): No heartbeat from core client for 30 sec - exiting
17:51:55 (5052): No heartbeat from core client for 30 sec - exiting
17:51:56 (5052): No heartbeat from core client for 30 sec - exiting
17:51:57 (5052): No heartbeat from core client for 30 sec - exiting
17:51:58 (5052): No heartbeat from core client for 30 sec - exiting
17:51:59 (5052): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4600, iMonCtr=2
Model crash detected, will try to restart...
09:31:38 (5356): No heartbeat from core client for 30 sec - exiting
09:31:39 (5356): No heartbeat from core client for 30 sec - exiting
09:31:40 (5356): No heartbeat from core client for 30 sec - exiting
09:31:42 (5356): No heartbeat from core client for 30 sec - exiting
09:31:43 (5356): No heartbeat from core client for 30 sec - exiting
09:31:44 (5356): No heartbeat from core client for 30 sec - exiting
09:31:45 (5356): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:31:46 (5356): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=912, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3528, selfPID=4800, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5544, selfPID=4364, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3336, selfPID=5996, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4208, selfPID=5272, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3528, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4368, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=216, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5444, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3796, selfPID=5736, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5684, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5692, selfPID=4936, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5532, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4548, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
  <file_name>hadam3p_saf_2a65_1985_1_007234158_0_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_2a65_1985_1_007234158_0_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_2a65_1985_1_007234158_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_2a65_1985_1_007234158_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_2a65_1985_1_007234158_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                          Timestep  CPU Time (sec)  Average (sec/TS)
29 May 2011 17:03:20  981127   12844707   hadam3p_saf_2a65_1985_1_007234158_0  80,736    292,979         3.6289
26 May 2011 20:19:19  981127   12844707   hadam3p_saf_2a65_1985_1_007234158_0  69,216    250,534         3.6196
24 May 2011 05:27:20  981127   12844707   hadam3p_saf_2a65_1985_1_007234158_0  57,717    208,792         3.6175
24 May 2011 00:45:31  981127   12844707   hadam3p_saf_2a65_1985_1_007234158_0  57,705    208,121         3.6066
23 May 2011 20:23:28  981127   12844707   hadam3p_saf_2a65_1985_1_007234158_0  57,696    207,327         3.5934
22 May 2011 05:07:59  981127   12844707   hadam3p_saf_2a65_1985_1_007234158_0  46,176    165,920         3.5932
21 May 2011 17:50:44  981127   12844707   hadam3p_saf_2a65_1985_1_007234158_0  34,656    125,851         3.6314
14 May 2011 03:25:52  981127   12844707   hadam3p_saf_2a65_1985_1_007234158_0  23,136    85,405          3.6914
05 May 2011 05:29:03  981127   12844707   hadam3p_saf_2a65_1985_1_007234158_0  11,616    43,029          3.7043
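
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that reading against the nine rows listed above. The names `TRICKLES`, `seconds_per_timestep`, and `mismatches` are hypothetical helpers introduced for illustration and are not part of BOINC or the CPDN software; the tolerance of 1e-4 is an assumption chosen to match the four decimal places the page displays.

```python
# (timestep, cpu_time_sec, reported_average) copied from the trickle table.
TRICKLES = [
    (80736, 292979, 3.6289),
    (69216, 250534, 3.6196),
    (57717, 208792, 3.6175),
    (57705, 208121, 3.6066),
    (57696, 207327, 3.5934),
    (46176, 165920, 3.5932),
    (34656, 125851, 3.6314),
    (23136, 85405, 3.6914),
    (11616, 43029, 3.7043),
]


def seconds_per_timestep(cpu_time_sec, timestep):
    """Cumulative CPU seconds spent per model timestep."""
    return cpu_time_sec / timestep


def mismatches(rows, tolerance=1e-4):
    """Return the rows whose reported average differs from CPU time / timestep."""
    return [
        row
        for row in rows
        if abs(seconds_per_timestep(row[1], row[0]) - row[2]) >= tolerance
    ]


if __name__ == "__main__":
    bad = mismatches(TRICKLES)
    print("rows inconsistent with CPU time / timestep:", bad)
```

An empty list from `mismatches(TRICKLES)` would indicate that every reported average agrees with this interpretation to within the displayed precision.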