Task 13670699

Name hadcm3n_0045_1940_40_007546702_0
Workunit 7743934
Created 29 Nov 2011, 8:18:36 UTC
Sent 30 Nov 2011, 12:30:34 UTC
Report deadline 29 Feb 2012, 19:57:45 UTC
Received 25 Nov 2012, 17:11:45 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1093455
Run time 13 days 0 hours 26 min 17 sec
CPU time 12 days 20 hours 36 min 24 sec
Validate state Invalid
Credit 7,153.92
Device peak FLOPS 2.80 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 (windows_intelx86)
Stderr
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
21:39:04 (4144): No heartbeat from core client for 30 sec - exiting
21:39:05 (4144): No heartbeat from core client for 30 sec - exiting
21:39:06 (4144): No heartbeat from core client for 30 sec - exiting
21:39:07 (4144): No heartbeat from core client for 30 sec - exiting
21:39:08 (4144): No heartbeat from core client for 30 sec - exiting
21:39:09 (4144): No heartbeat from core client for 30 sec - exiting
21:39:10 (4144): No heartbeat from core client for 30 sec - exiting
21:39:11 (4144): No heartbeat from core client for 30 sec - exiting
21:39:12 (4144): No heartbeat from core client for 30 sec - exiting
21:39:13 (4144): No heartbeat from core client for 30 sec - exiting
21:39:14 (4144): No heartbeat from core client for 30 sec - exiting
21:39:15 (4144): No heartbeat from core client for 30 sec - exiting
21:39:16 (4144): No heartbeat from core client for 30 sec - exiting
21:39:17 (4144): No heartbeat from core client for 30 sec - exiting
21:39:18 (4144): No heartbeat from core client for 30 sec - exiting
21:39:19 (4144): No heartbeat from core client for 30 sec - exiting
21:39:20 (4144): No heartbeat from core client for 30 sec - exiting
21:39:21 (4144): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:40:15 (4664): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:46:39 (3628): No heartbeat from core client for 30 sec - exiting
21:46:40 (3628): No heartbeat from core client for 30 sec - exiting
21:46:41 (3628): No heartbeat from core client for 30 sec - exiting
21:46:42 (3628): No heartbeat from core client for 30 sec - exiting
21:46:43 (3628): No heartbeat from core client for 30 sec - exiting
21:46:44 (3628): No heartbeat from core client for 30 sec - exiting
21:46:45 (3628): No heartbeat from core client for 30 sec - exiting
21:46:46 (3628): No heartbeat from core client for 30 sec - exiting
21:46:47 (3628): No heartbeat from core client for 30 sec - exiting
21:46:48 (3628): No heartbeat from core client for 30 sec - exiting
21:46:49 (3628): No heartbeat from core client for 30 sec - exiting
21:46:50 (3628): No heartbeat from core client for 30 sec - exiting
21:46:51 (3628): No heartbeat from core client for 30 sec - exiting
21:46:52 (3628): No heartbeat from core client for 30 sec - exiting
21:46:53 (3628): No heartbeat from core client for 30 sec - exiting
21:46:54 (3628): No heartbeat from core client for 30 sec - exiting
21:46:55 (3628): No heartbeat from core client for 30 sec - exiting
21:46:56 (3628): No heartbeat from core client for 30 sec - exiting
21:46:57 (3628): No heartbeat from core client for 30 sec - exiting
21:46:58 (3628): No heartbeat from core client for 30 sec - exiting
21:46:59 (3628): No heartbeat from core client for 30 sec - exiting
21:47:00 (3628): No heartbeat from core client for 30 sec - exiting
21:47:01 (3628): No heartbeat from core client for 30 sec - exiting
21:47:02 (3628): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:47:40 (4884): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
01:20:02 (5720): No heartbeat from core client for 30 sec - exiting
01:20:03 (5720): No heartbeat from core client for 30 sec - exiting
01:20:04 (5720): No heartbeat from core client for 30 sec - exiting
01:20:05 (5720): No heartbeat from core client for 30 sec - exiting
01:20:06 (5720): No heartbeat from core client for 30 sec - exiting
01:20:07 (5720): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5324, iMonCtr=1
Model crash detected, will try to restart...
22:04:19 (1716): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
00:58:16 (3976): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4860, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4860, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4860, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4860, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4860, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4860, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep  CPU Time (sec)  Average (sec/TS)
13 Jul 2012 19:36:30  1093455  13670699   hadcm3n_0045_1940_40_007546702_0  596,160   1,067,879       1.7913
13 Jul 2012 06:49:08  1093455  13670699   hadcm3n_0045_1940_40_007546702_0  570,240   1,022,129       1.7925
12 Jul 2012 17:52:57  1093455  13670699   hadcm3n_0045_1940_40_007546702_0  544,320   975,897         1.7929
12 Jul 2012 05:36:13  1093455  13670699   hadcm3n_0045_1940_40_007546702_0  518,400   929,938         1.7939
11 Jul 2012 16:05:23  1093455  13670699   hadcm3n_0045_1940_40_007546702_0  492,480   883,847         1.7947
11 Jul 2012 03:17:31  1093455  13670699   hadcm3n_0045_1940_40_007546702_0  466,560   838,114         1.7964
10 Jul 2012 14:31:39  1093455  13670699   hadcm3n_0045_1940_40_007546702_0  440,640   792,567         1.7987
10 Jul 2012 01:39:02  1093455  13670699   hadcm3n_0045_1940_40_007546702_0  414,720   746,615         1.8003
09 Jul 2012 12:55:38  1093455  13670699   hadcm3n_0045_1940_40_007546702_0  388,800   701,151         1.8034
09 Jul 2012 00:03:13  1093455  13670699   hadcm3n_0045_1940_40_007546702_0  362,880   655,142         1.8054
08 Jul 2012 11:20:26  1093455  13670699   hadcm3n_0045_1940_40_007546702_0  336,960   609,560         1.8090
07 Jul 2012 22:22:59  1093455  13670699   hadcm3n_0045_1940_40_007546702_0  311,040   563,693         1.8123
07 Jul 2012 09:35:27  1093455  13670699   hadcm3n_0045_1940_40_007546702_0  285,120   518,155         1.8173
06 Jul 2012 20:47:53  1093455  13670699   hadcm3n_0045_1940_40_007546702_0  259,200   472,448         1.8227
06 Jul 2012 07:59:03  1093455  13670699   hadcm3n_0045_1940_40_007546702_0  233,280   426,804         1.8296
05 Jul 2012 19:15:38  1093455  13670699   hadcm3n_0045_1940_40_007546702_0  207,360   381,116         1.8379
05 Jul 2012 06:18:03  1093455  13670699   hadcm3n_0045_1940_40_007546702_0  181,440   335,169         1.8473
04 Jul 2012 17:25:16  1093455  13670699   hadcm3n_0045_1940_40_007546702_0  155,520   289,112         1.8590
04 Jul 2012 04:10:59  1093455  13670699   hadcm3n_0045_1940_40_007546702_0  129,600   242,279         1.8694
03 Jul 2012 14:29:06  1093455  13670699   hadcm3n_0045_1940_40_007546702_0  103,680   193,249         1.8639
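
In the rows listed here, the Average (sec/TS) value appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The sketch below checks that reading against three of the rows. The function name and the row selection are illustrative assumptions; this is not the project's own code.

```python
def avg_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 places."""
    return round(cpu_time_sec / timestep, 4)


# (timestep, CPU time in seconds, reported average) copied from the table.
sample_rows = [
    (596_160, 1_067_879, 1.7913),
    (570_240, 1_022_129, 1.7925),
    (103_680, 193_249, 1.8639),
]

for timestep, cpu_time_sec, reported in sample_rows:
    computed = avg_sec_per_timestep(cpu_time_sec, timestep)
    # Compare with a small tolerance rather than exact float equality.
    matches = abs(computed - reported) < 1e-9
    print(f"{timestep:>7} timesteps: computed {computed:.4f}, "
          f"reported {reported:.4f}, match={matches}")
```

Because the average is cumulative, it drifts slowly as the run progresses: from 1.8639 at timestep 103,680 to 1.7913 at timestep 596,160.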

