Task 13489636

Name hadam3p_eu_6mzo_2004_1_007497065_0
Workunit 7694540
Created 14 Oct 2011, 19:36:47 UTC
Sent 17 Oct 2011, 17:55:56 UTC
Report deadline 28 Sep 2012, 23:15:56 UTC
Received 18 Jul 2012, 8:29:06 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1110046
Run time 8 days 18 hours 11 min 37 sec
CPU time 4 days 22 hours 46 min 6 sec
Validate state Invalid
Credit 1,790.21
Device peak FLOPS 1.70 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Europe v6.09 (windows_intelx86)
Stderr
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4888, selfPID=6188, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3064, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3472, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5308, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3676, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3496, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2452, selfPID=3636, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3640, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2972, selfPID=2676, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3316, selfPID=2652, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
13:38:04 (1976): Can't acquire lockfile (32) - waiting 35s
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1976, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3296, selfPID=2888, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=804, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2216, selfPID=2820, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5072, selfPID=5072, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5012, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3468, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3148, selfPID=3148, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
08:06:35 (4472): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5152, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2196, iMonCtr=2
Model crash detected, will try to restart...
18:56:11 (5296): No heartbeat from core client for 30 sec - exiting
18:56:12 (5296): No heartbeat from core client for 30 sec - exiting
18:56:13 (5296): No heartbeat from core client for 30 sec - exiting
18:56:14 (5296): No heartbeat from core client for 30 sec - exiting
18:56:15 (5296): No heartbeat from core client for 30 sec - exiting
18:56:16 (5296): No heartbeat from core client for 30 sec - exiting
18:56:17 (5296): No heartbeat from core client for 30 sec - exiting
18:56:18 (5296): No heartbeat from core client for 30 sec - exiting
18:56:19 (5296): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5616, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3744, selfPID=3744, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=944, selfPID=944, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5420, selfPID=5420, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3520, selfPID=3520, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3340, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5332, selfPID=5388, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
07:04:39 (5052): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:18:41 (5148): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:19:48 (4848): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
12:34:58 (2568): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3072, selfPID=2400, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3440, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3704, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4928, selfPID=3044, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2952, iMonCtr=2
Model crash detected, will try to restart...
 CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5272, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5932, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3688, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5460, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4648, selfPID=4028, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4736, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3156, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3592, selfPID=3592, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3552, selfPID=3552, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5328, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
07:50:43 (3244): No heartbeat from core client for 30 sec - exiting
07:50:44 (3244): No heartbeat from core client for 30 sec - exiting
07:50:45 (3244): No heartbeat from core client for 30 sec - exiting
07:50:46 (3244): No heartbeat from core client for 30 sec - exiting
07:50:47 (3244): No heartbeat from core client for 30 sec - exiting
07:50:48 (3244): No heartbeat from core client for 30 sec - exiting
07:50:49 (3244): No heartbeat from core client for 30 sec - exiting
07:50:50 (3244): No heartbeat from core client for 30 sec - exiting
07:50:51 (3244): No heartbeat from core client for 30 sec - exiting
07:50:52 (3244): No heartbeat from core client for 30 sec - exiting
07:50:53 (3244): No heartbeat from core client for 30 sec - exiting
07:50:54 (3244): No heartbeat from core client for 30 sec - exiting
07:50:55 (3244): No heartbeat from core client for 30 sec - exiting
07:50:56 (3244): No heartbeat from core client for 30 sec - exiting
07:50:57 (3244): No heartbeat from core client for 30 sec - exiting
07:50:58 (3244): No heartbeat from core client for 30 sec - exiting
07:50:59 (3244): No heartbeat from core client for 30 sec - exiting
07:51:00 (3244): No heartbeat from core client for 30 sec - exiting
07:51:01 (3244): No heartbeat from core client for 30 sec - exiting
07:51:02 (3244): No heartbeat from core client for 30 sec - exiting
07:51:03 (3244): No heartbeat from core client for 30 sec - exiting
07:51:04 (3244): No heartbeat from core client for 30 sec - exiting
07:51:06 (3244): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2708, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3252, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3164, selfPID=2920, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3040, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=284, selfPID=3248, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...

Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO tmp/xaakm.pipe_dummy 2048
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
  <file_name>hadam3p_eu_6mzo_2004_1_007497065_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_6mzo_2004_1_007497065_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_6mzo_2004_1_007497065_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
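The <message> block above reports three failed uploads as <file_xfer_error> entries, each with a file name and an error code. The following is a minimal sketch, assuming the block has been saved as a string, of extracting those pairs with Python's standard `re` module. The names `message`, `pattern`, and `parse_xfer_errors` are illustrative and not part of BOINC. BOINC's error numbering appears to list -161 as ERR_NOT_FOUND, but that mapping is an assumption to verify against the client's own error table.

```python
import re

# Sample text copied from the <message> block above (first two entries only).
message = """
<file_xfer_error>
  <file_name>hadam3p_eu_6mzo_2004_1_007497065_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_6mzo_2004_1_007497065_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
"""

# One match per <file_xfer_error> element: (file name, error code).
pattern = re.compile(
    r"<file_xfer_error>\s*<file_name>(.*?)</file_name>\s*"
    r"<error_code>(-?\d+)</error_code>\s*</file_xfer_error>"
)


def parse_xfer_errors(text):
    """Return a list of (file_name, error_code) tuples found in text."""
    return [(name, int(code)) for name, code in pattern.findall(text)]


errors = parse_xfer_errors(message)
for name, code in errors:
    print(name, code)
```

A regular expression is used here rather than an XML parser because the fragment has no single root element and is embedded in free-form log text.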
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
24 Jun 2012 04:50:38 | 1110046 | 13489636 | hadam3p_eu_6mzo_2004_1_007497065_0 | 103,776 | 404,771 | 3.9004
30 May 2012 11:13:17 | 1110046 | 13489636 | hadam3p_eu_6mzo_2004_1_007497065_0 | 92,256 | 351,077 | 3.8055
04 May 2012 16:08:57 | 1110046 | 13489636 | hadam3p_eu_6mzo_2004_1_007497065_0 | 80,744 | 299,894 | 3.7141
02 May 2012 17:27:12 | 1110046 | 13489636 | hadam3p_eu_6mzo_2004_1_007497065_0 | 80,741 | 299,350 | 3.7075
02 May 2012 15:24:04 | 1110046 | 13489636 | hadam3p_eu_6mzo_2004_1_007497065_0 | 80,739 | 298,854 | 3.7015
29 Apr 2012 07:04:20 | 1110046 | 13489636 | hadam3p_eu_6mzo_2004_1_007497065_0 | 80,736 | 298,279 | 3.6945
03 Apr 2012 12:43:37 | 1110046 | 13489636 | hadam3p_eu_6mzo_2004_1_007497065_0 | 69,216 | 249,875 | 3.6101
07 Mar 2012 12:38:07 | 1110046 | 13489636 | hadam3p_eu_6mzo_2004_1_007497065_0 | 57,705 | 194,738 | 3.3747
05 Mar 2012 18:56:08 | 1110046 | 13489636 | hadam3p_eu_6mzo_2004_1_007497065_0 | 57,701 | 193,540 | 3.3542
04 Mar 2012 07:22:02 | 1110046 | 13489636 | hadam3p_eu_6mzo_2004_1_007497065_0 | 57,699 | 192,372 | 3.3341
03 Mar 2012 17:40:53 | 1110046 | 13489636 | hadam3p_eu_6mzo_2004_1_007497065_0 | 57,697 | 191,914 | 3.3262
01 Mar 2012 16:15:38 | 1110046 | 13489636 | hadam3p_eu_6mzo_2004_1_007497065_0 | 57,696 | 191,453 | 3.3183
15 Feb 2012 12:21:23 | 1110046 | 13489636 | hadam3p_eu_6mzo_2004_1_007497065_0 | 46,176 | 152,837 | 3.3099
01 Feb 2012 14:00:49 | 1110046 | 13489636 | hadam3p_eu_6mzo_2004_1_007497065_0 | 34,656 | 114,700 | 3.3097
22 Jan 2012 12:06:51 | 1110046 | 13489636 | hadam3p_eu_6mzo_2004_1_007497065_0 | 23,136 | 76,935 | 3.3253
15 Jan 2012 06:19:55 | 1110046 | 13489636 | hadam3p_eu_6mzo_2004_1_007497065_0 | 11,619 | 39,222 | 3.3757
12 Jan 2012 19:29:05 | 1110046 | 13489636 | hadam3p_eu_6mzo_2004_1_007497065_0 | 11,616 | 38,717 | 3.3331
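The Average (sec/TS) column in the trickle table above appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that reading against three rows copied from the table. The function name `average_sec_per_ts` and the `rows` list are illustrative, and the formula is inferred from the numbers rather than taken from any project documentation.

```python
# (timestep, cpu_time_sec, reported_average) copied from three table rows.
rows = [
    (103776, 404771, 3.9004),
    (92256, 351077, 3.8055),
    (11616, 38717, 3.3331),
]


def average_sec_per_ts(timestep, cpu_time_sec):
    """Assumed definition: cumulative CPU seconds per model timestep."""
    return cpu_time_sec / timestep


# Difference between the recomputed and the reported average for each row.
diffs = []
for timestep, cpu_time_sec, reported in rows:
    computed = average_sec_per_ts(timestep, cpu_time_sec)
    diffs.append(abs(computed - reported))
    print(f"{timestep:>7} steps: computed {computed:.4f}, reported {reported:.4f}")
```

Under this reading, the rise in the average from about 3.33 to 3.90 sec/TS means that later timesteps cost more CPU time than earlier ones on this host.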

