Task 13026162

Name: hadcm3n_t5eh_1940_40_007316123_2
Workunit: 7513553
Created: 29 Jun 2011, 2:27:08 UTC
Sent: 29 Jun 2011, 2:30:37 UTC
Report deadline: 28 Sep 2011, 9:57:48 UTC
Received: 18 Jul 2011, 0:12:17 UTC
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: 22 (0x00000016) Unknown error code
Computer ID: 1155912
Run time: 9 days 15 hours 53 min 59 sec
CPU time: 9 days 15 hours 13 min 8 sec
Validate state: Invalid
Credit: 8,398.08
Device peak FLOPS: 2.98 GFLOPS
Application version: UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
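
The run time and CPU time above are printed as day/hour/minute/second strings, which makes them awkward to compare with the per-trickle CPU times given in seconds. A minimal sketch of the conversion follows; the helper name `duration_to_seconds` is hypothetical and not part of BOINC or CPDN.

```python
import re

# Seconds per unit, keyed by the unit stems used on the task page.
UNIT_SECONDS = {"day": 86400, "hour": 3600, "min": 60, "sec": 1}


def duration_to_seconds(text: str) -> int:
    """Convert a string such as '9 days 15 hours 53 min 59 sec' to seconds."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s*(day|hour|min|sec)", text):
        total += int(value) * UNIT_SECONDS[unit]
    return total


run_time = duration_to_seconds("9 days 15 hours 53 min 59 sec")  # 834839
cpu_time = duration_to_seconds("9 days 15 hours 13 min 8 sec")   # 832388

# Fraction of wall-clock run time spent on the CPU.
cpu_fraction = cpu_time / run_time

# The exit status is shown in both decimal and hex; they are the same value.
assert 22 == 0x16
```

On these figures the task was on the CPU for nearly all of its recorded run time, so the failure does not look like a case of the task sitting idle.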
Stderr
<core_client_version>6.12.26</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
21:00:52 (5776): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
21:00:56 (5776): No heartbeat from core client for 30 sec - exiting
21:00:57 (5776): No heartbeat from core client for 30 sec - exiting
21:00:58 (5776): No heartbeat from core client for 30 sec - exiting
21:01:00 (5776): No heartbeat from core client for 30 sec - exiting
21:01:01 (5776): No heartbeat from core client for 30 sec - exiting
21:01:02 (5776): No heartbeat from core client for 30 sec - exiting
21:01:03 (5776): No heartbeat from core client for 30 sec - exiting
21:01:04 (5776): No heartbeat from core client for 30 sec - exiting
02:34:40 (1120): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
02:34:41 (1120): No heartbeat from core client for 30 sec - exiting
02:34:42 (1120): No heartbeat from core client for 30 sec - exiting
02:34:43 (1120): No heartbeat from core client for 30 sec - exiting
02:34:45 (1120): No heartbeat from core client for 30 sec - exiting
02:34:46 (1120): No heartbeat from core client for 30 sec - exiting
02:34:47 (1120): No heartbeat from core client for 30 sec - exiting
02:34:48 (1120): No heartbeat from core client for 30 sec - exiting
02:34:49 (1120): No heartbeat from core client for 30 sec - exiting
02:34:50 (1120): No heartbeat from core client for 30 sec - exiting
02:34:51 (1120): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4668, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
15:11:36 (4924): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:16:42 (2220): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:21:47 (4432): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:12:34 (4860): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:32:19 (1756): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:50:51 (2196): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:10:35 (1276): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3856, iMonCtr=1
Model crash detected, will try to restart...
11:25:46 (3856): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5888, iMonCtr=1
Model crash detected, will try to restart...
CSignal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4184, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4184, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4184, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4184, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4184, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3164, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
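
The stderr text above repeats a small number of message types: suspend requests, lost heartbeats, model crash restarts, signal exits, and the final "too many model crashes" line. A rough sketch for tallying them is shown here. The category names and regular expressions are assumptions chosen for illustration, and `EXCERPT` is only a short sample of the log, not the full text.

```python
import re
from collections import Counter

# A short excerpt of the stderr text; paste the full log here to tally it.
EXCERPT = """\
Suspended CPDN Monitor - Suspend request from BOINC...
21:00:52 (5776): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4668, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Sorry, too many model crashes! :-(
"""

# Hypothetical category names mapped to patterns seen in the log.
PATTERNS = {
    "heartbeat_lost": re.compile(r"No heartbeat from core client"),
    "suspend_request": re.compile(r"Suspend request from BOINC"),
    "model_crash": re.compile(r"Model crash detected"),
    "signal_exit": re.compile(r"Signal \d+ received"),
    "too_many_crashes": re.compile(r"too many model crashes"),
}


def classify(log_text: str) -> Counter:
    """Count how many lines of the log match each category."""
    counts = Counter()
    for line in log_text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


counts = classify(EXCERPT)
for name, n in counts.items():
    print(f"{name}: {n}")
```

Run against the full log, a tally like this makes it easier to see that the heartbeat losses come in bursts and that the repeated crash-and-restart cycle precedes the final exit.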
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep  CPU Time (sec)  Average (sec/TS)
25 Jul 2011 16:38:43  1155912  13026162   hadcm3n_t5eh_1940_40_007316123_2  699,840   827,354         1.1822
25 Jul 2011 16:38:43  1155912  13026162   hadcm3n_t5eh_1940_40_007316123_2  673,920   797,072         1.1827
10 Jul 2011 15:19:23  1155912  13026162   hadcm3n_t5eh_1940_40_007316123_2  648,000   766,883         1.1835
10 Jul 2011 06:56:04  1155912  13026162   hadcm3n_t5eh_1940_40_007316123_2  622,080   736,642         1.1842
09 Jul 2011 21:22:18  1155912  13026162   hadcm3n_t5eh_1940_40_007316123_2  596,160   706,252         1.1847
09 Jul 2011 13:06:50  1155912  13026162   hadcm3n_t5eh_1940_40_007316123_2  570,240   676,443         1.1862
09 Jul 2011 04:48:25  1155912  13026162   hadcm3n_t5eh_1940_40_007316123_2  544,320   646,610         1.1879
08 Jul 2011 18:48:28  1155912  13026162   hadcm3n_t5eh_1940_40_007316123_2  518,400   616,171         1.1886
08 Jul 2011 11:04:24  1155912  13026162   hadcm3n_t5eh_1940_40_007316123_2  492,480   584,742         1.1873
08 Jul 2011 01:36:12  1155912  13026162   hadcm3n_t5eh_1940_40_007316123_2  466,560   554,437         1.1884
07 Jul 2011 17:56:39  1155912  13026162   hadcm3n_t5eh_1940_40_007316123_2  440,640   524,440         1.1902
07 Jul 2011 17:56:39  1155912  13026162   hadcm3n_t5eh_1940_40_007316123_2  414,720   494,258         1.1918
07 Jul 2011 17:56:39  1155912  13026162   hadcm3n_t5eh_1940_40_007316123_2  388,800   463,250         1.1915
07 Jul 2011 15:36:41  1155912  13026162   hadcm3n_t5eh_1940_40_007316123_2  362,880   432,100         1.1908
07 Jul 2011 15:36:41  1155912  13026162   hadcm3n_t5eh_1940_40_007316123_2  336,960   401,012         1.1901
04 Jul 2011 12:06:13  1155912  13026162   hadcm3n_t5eh_1940_40_007316123_2  311,040   370,013         1.1896
04 Jul 2011 03:34:03  1155912  13026162   hadcm3n_t5eh_1940_40_007316123_2  285,120   339,116         1.1894
03 Jul 2011 21:51:00  1155912  13026162   hadcm3n_t5eh_1940_40_007316123_2  259,200   308,212         1.1891
03 Jul 2011 21:51:00  1155912  13026162   hadcm3n_t5eh_1940_40_007316123_2  233,280   277,664         1.1903
03 Jul 2011 21:51:00  1155912  13026162   hadcm3n_t5eh_1940_40_007316123_2  207,360   247,094         1.1916
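
The Average (sec/TS) column in the trickle table appears to be cumulative CPU time divided by the timestep reached. This is an inference from the numbers, not something the page states. The sketch below checks it on three of the rows and also shows the timestep gap between the two most recent trickles; the function name `avg_sec_per_ts` is hypothetical.

```python
# (timestep, cpu_time_sec, reported_average) copied from three table rows.
ROWS = [
    (699_840, 827_354, 1.1822),
    (673_920, 797_072, 1.1827),
    (207_360, 247_094, 1.1916),
]


def avg_sec_per_ts(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in ROWS:
    computed = avg_sec_per_ts(cpu_time_sec, timestep)
    # The page reports four decimal places, so allow for rounding.
    assert abs(computed - reported) < 5e-5
    print(f"{timestep:>7}  computed={computed:.4f}  reported={reported:.4f}")

# Timesteps between the two most recent trickles.
trickle_interval = ROWS[0][0] - ROWS[1][0]  # 25920
```

The computed values agree with the reported ones to four decimal places for these rows, which supports reading the column as a running average and not a per-interval rate.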

