Task 13732791

Name: hadcm3n_yh21_1980_40_007607623_0
Workunit: 7785753
Created: 6 Dec 2011, 0:34:04 UTC
Sent: 6 Dec 2011, 0:47:20 UTC
Report deadline: 6 Mar 2012, 8:14:31 UTC
Received: 5 Mar 2012, 21:27:54 UTC
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: 22 (0x00000016) Unknown error code
Computer ID: 1125448
Run time: 33 days 2 hours 3 min 39 sec
CPU time: 31 days 9 hours 19 min 3 sec
Validate state: Invalid
Credit: 12,130.56
Device peak FLOPS: 2.66 GFLOPS
Application version: UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4344, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4760, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5864, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1764, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7164, iMonCtr=1
Model crash detected, will try to restart...
11:03:15 (4600): No heartbeat from core client for 30 sec - exiting
11:03:16 (4600): No heartbeat from core client for 30 sec - exiting
11:03:17 (4600): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:23:16 (4920): No heartbeat from core client for 30 sec - exiting
18:23:17 (4920): No heartbeat from core client for 30 sec - exiting
18:23:18 (4920): No heartbeat from core client for 30 sec - exiting
18:23:19 (4920): No heartbeat from core client for 30 sec - exiting
18:23:20 (4920): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:35:49 (4844): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3684, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
08:20:19 (4656): No heartbeat from core client for 30 sec - exiting
08:20:20 (4656): No heartbeat from core client for 30 sec - exiting
08:20:21 (4656): No heartbeat from core client for 30 sec - exiting
08:20:22 (4656): No heartbeat from core client for 30 sec - exiting
08:20:23 (4656): No heartbeat from core client for 30 sec - exiting
08:20:24 (4656): No heartbeat from core client for 30 sec - exiting
08:20:25 (4656): No heartbeat from core client for 30 sec - exiting
08:20:26 (4656): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
22:06:10 (4772): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
11:36:36 (5224): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5508, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4292, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
18:43:37 (4660): No heartbeat from core client for 30 sec - exiting
18:43:38 (4660): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
00:51:39 (5232): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CCSuspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3192, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4084, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5372, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5372, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4532, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4532, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4532, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4532, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
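The stderr text above repeats a handful of message types. A minimal sketch for tallying them is given here; the label names (`model_crash`, `suspend`, `no_heartbeat`, `signal_exit`) and the `tally` helper are hypothetical, chosen for illustration, and the sample lines are copied from the log above rather than the full output.

```python
from collections import Counter

# Hypothetical labels mapped to substrings that appear in the stderr text above.
PATTERNS = {
    "model_crash": "Model crash detected",
    "suspend": "Suspend request from BOINC",
    "no_heartbeat": "No heartbeat from core client",
    "signal_exit": "Signal 22 received",
}


def tally(lines):
    """Count how many lines match each known message type."""
    counts = Counter()
    for line in lines:
        for label, needle in PATTERNS.items():
            if needle in line:
                counts[label] += 1
                break  # each line is counted under at most one label
    return counts


# A few lines copied verbatim from the log above (not the whole log).
SAMPLE = [
    "Model crash detected, will try to restart...",
    "Suspended CPDN Monitor - Suspend request from BOINC...",
    "Suspended CPDN Monitor - Suspend request from BOINC...",
    "11:03:15 (4600): No heartbeat from core client for 30 sec - exiting",
    "Signal 22 received, exiting...",
    "Called boinc_finish",
]

if __name__ == "__main__":
    for label, count in sorted(tally(SAMPLE).items()):
        print(label, count)
```

Feeding the complete `<stderr_txt>` contents to `tally` instead of `SAMPLE` would give per-type totals for the whole run; lines that match no pattern, such as `Called boinc_finish`, are simply left uncounted.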
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
04 Mar 2012 14:34:40 | 1125448 | 13732791 | hadcm3n_yh21_1980_40_007607623_0 | 1,010,880 | 2,613,293 | 2.5852
03 Mar 2012 07:01:59 | 1125448 | 13732791 | hadcm3n_yh21_1980_40_007607623_0 | 984,960 | 2,509,905 | 2.5482
29 Feb 2012 22:22:08 | 1125448 | 13732791 | hadcm3n_yh21_1980_40_007607623_0 | 959,040 | 2,409,070 | 2.5120
27 Feb 2012 21:21:42 | 1125448 | 13732791 | hadcm3n_yh21_1980_40_007607623_0 | 933,120 | 2,306,195 | 2.4715
24 Feb 2012 21:37:44 | 1125448 | 13732791 | hadcm3n_yh21_1980_40_007607623_0 | 907,200 | 2,205,383 | 2.4310
21 Feb 2012 21:28:07 | 1125448 | 13732791 | hadcm3n_yh21_1980_40_007607623_0 | 881,280 | 2,104,437 | 2.3879
19 Feb 2012 18:57:39 | 1125448 | 13732791 | hadcm3n_yh21_1980_40_007607623_0 | 855,360 | 2,002,965 | 2.3417
08 Feb 2012 14:29:27 | 1125448 | 13732791 | hadcm3n_yh21_1980_40_007607623_0 | 829,440 | 1,928,617 | 2.3252
06 Feb 2012 18:39:23 | 1125448 | 13732791 | hadcm3n_yh21_1980_40_007607623_0 | 803,520 | 1,863,486 | 2.3192
04 Feb 2012 13:56:17 | 1125448 | 13732791 | hadcm3n_yh21_1980_40_007607623_0 | 777,600 | 1,799,543 | 2.3142
02 Feb 2012 22:47:16 | 1125448 | 13732791 | hadcm3n_yh21_1980_40_007607623_0 | 751,680 | 1,739,189 | 2.3137
31 Jan 2012 23:09:17 | 1125448 | 13732791 | hadcm3n_yh21_1980_40_007607623_0 | 725,760 | 1,682,903 | 2.3188
29 Jan 2012 14:44:01 | 1125448 | 13732791 | hadcm3n_yh21_1980_40_007607623_0 | 699,840 | 1,624,053 | 2.3206
26 Jan 2012 22:42:37 | 1125448 | 13732791 | hadcm3n_yh21_1980_40_007607623_0 | 673,920 | 1,557,029 | 2.3104
24 Jan 2012 00:50:50 | 1125448 | 13732791 | hadcm3n_yh21_1980_40_007607623_0 | 648,000 | 1,482,094 | 2.2872
22 Jan 2012 06:55:49 | 1125448 | 13732791 | hadcm3n_yh21_1980_40_007607623_0 | 622,080 | 1,393,373 | 2.2399
20 Jan 2012 07:04:26 | 1125448 | 13732791 | hadcm3n_yh21_1980_40_007607623_0 | 596,160 | 1,326,482 | 2.2250
17 Jan 2012 21:41:27 | 1125448 | 13732791 | hadcm3n_yh21_1980_40_007607623_0 | 570,240 | 1,264,038 | 2.2167
16 Jan 2012 11:25:18 | 1125448 | 13732791 | hadcm3n_yh21_1980_40_007607623_0 | 544,320 | 1,206,421 | 2.2164
14 Jan 2012 21:17:50 | 1125448 | 13732791 | hadcm3n_yh21_1980_40_007607623_0 | 518,400 | 1,151,083 | 2.2205
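
The Average (sec/TS) column in the trickle listing above appears to be the cumulative CPU time divided by the timestep count. That reading is an inference from the numbers, not something the page states. A minimal sketch to check it against three of the rows follows; the helper name `average_sec_per_timestep` is hypothetical.

```python
# (timestep, cpu_seconds, reported_average) copied from the first, second,
# and last rows of the trickle listing above.
ROWS = [
    (1_010_880, 2_613_293, 2.5852),
    (984_960, 2_509_905, 2.5482),
    (518_400, 1_151_083, 2.2205),
]


def average_sec_per_timestep(cpu_seconds, timestep):
    """Cumulative CPU seconds per timestep, rounded to 4 decimals like the table."""
    return round(cpu_seconds / timestep, 4)


def matches_reported(rows):
    """Return True if every computed average agrees with the reported value."""
    return all(
        abs(average_sec_per_timestep(cpu, ts) - reported) < 1e-9
        for ts, cpu, reported in rows
    )


if __name__ == "__main__":
    for ts, cpu, reported in ROWS:
        print(ts, average_sec_per_timestep(cpu, ts), reported)
    print("all rows consistent:", matches_reported(ROWS))
```

Under this assumption, the rising average (from about 2.22 to about 2.59 sec/TS) indicates that later timesteps cost more CPU time each than earlier ones on this host.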

