climateprediction.net home page
Task 13143905

Name: hadcm3n_ycqf_1900_40_007349361_2
Workunit: 7546791
Created: 17 Jul 2011, 19:05:34 UTC
Sent: 17 Jul 2011, 19:28:47 UTC
Report deadline: 17 Oct 2011, 2:55:58 UTC
Received: 6 Nov 2011, 10:06:38 UTC
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: 22 (0x00000016) Unknown error code
Computer ID: 936524
Run time: 15 days 11 hours 3 min 11 sec
CPU time: 15 days 3 hours 17 min 34 sec
Validate state: Invalid
Credit: 10,575.36
Device peak FLOPS: 3.03 GFLOPS
Application version: UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>6.6.36</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
14:36:38 (6132): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
14:24:12 (6032): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
09:04:15 (6216): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:04:16 (6216): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5016, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
21:26:36 (6052): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
21:20:04 (9984): No heartbeat from core client for 30 sec - exiting
21:20:05 (9984): No heartbeat from core client for 30 sec - exiting
21:20:06 (9984): No heartbeat from core client for 30 sec - exiting
21:20:07 (9984): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:33:22 (6528): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:35:28 (6520): No heartbeat from core client for 30 sec - exiting
14:35:29 (6520): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
16:32:54 (9296): No heartbeat from core client for 30 sec - exiting
16:32:55 (9296): No heartbeat from core client for 30 sec - exiting
16:32:56 (9296): No heartbeat from core client for 30 sec - exiting
16:32:57 (9296): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:32:58 (9296): No heartbeat from core client for 30 sec - exiting
14:08:33 (3140): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
21:42:40 (9072): No heartbeat from core client for 30 sec - exiting
21:42:41 (9072): No heartbeat from core client for 30 sec - exiting
21:42:42 (9072): No heartbeat from core client for 30 sec - exiting
21:42:43 (9072): No heartbeat from core client for 30 sec - exiting
21:42:44 (9072): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
BUFFOUT: C I/O Error - Return code = 32

Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048
21:49:05 (8012): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:49:06 (8012): No heartbeat from core client for 30 sec - exiting
21:49:07 (8012): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
12:07:18 (9772): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
10:54:41 (5016): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6200, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6200, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6200, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6200, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6200, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6200, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep  CPU Time (sec)  Average (sec/TS)
05 Nov 2011 22:27:47  936524   13143905   hadcm3n_ycqf_1900_40_007349361_2  881,280   1,303,564       1.4792
04 Nov 2011 08:09:50  936524   13143905   hadcm3n_ycqf_1900_40_007349361_2  855,360   1,265,143       1.4791
01 Nov 2011 20:26:10  936524   13143905   hadcm3n_ycqf_1900_40_007349361_2  829,440   1,226,754       1.4790
31 Oct 2011 17:41:46  936524   13143905   hadcm3n_ycqf_1900_40_007349361_2  803,520   1,182,970       1.4722
31 Oct 2011 17:12:12  936524   13143905   hadcm3n_ycqf_1900_40_007349361_2  777,600   1,142,682       1.4695
31 Oct 2011 15:11:39  936524   13143905   hadcm3n_ycqf_1900_40_007349361_2  751,680   1,105,415       1.4706
31 Oct 2011 12:51:24  936524   13143905   hadcm3n_ycqf_1900_40_007349361_2  725,760   1,068,588       1.4724
31 Oct 2011 12:51:23  936524   13143905   hadcm3n_ycqf_1900_40_007349361_2  699,840   1,032,291       1.4750
31 Oct 2011 12:51:20  936524   13143905   hadcm3n_ycqf_1900_40_007349361_2  673,920   995,826         1.4777
16 Oct 2011 15:22:22  936524   13143905   hadcm3n_ycqf_1900_40_007349361_2  648,000   958,566         1.4793
15 Oct 2011 17:52:22  936524   13143905   hadcm3n_ycqf_1900_40_007349361_2  622,080   921,640         1.4815
14 Oct 2011 19:08:02  936524   13143905   hadcm3n_ycqf_1900_40_007349361_2  596,160   884,949         1.4844
12 Oct 2011 11:43:03  936524   13143905   hadcm3n_ycqf_1900_40_007349361_2  570,240   846,741         1.4849
11 Oct 2011 15:27:16  936524   13143905   hadcm3n_ycqf_1900_40_007349361_2  544,320   809,562         1.4873
10 Oct 2011 14:31:24  936524   13143905   hadcm3n_ycqf_1900_40_007349361_2  518,400   772,210         1.4896
05 Oct 2011 15:32:12  936524   13143905   hadcm3n_ycqf_1900_40_007349361_2  492,480   734,079         1.4906
03 Oct 2011 13:54:16  936524   13143905   hadcm3n_ycqf_1900_40_007349361_2  466,560   696,470         1.4928
22 Sep 2011 18:04:07  936524   13143905   hadcm3n_ycqf_1900_40_007349361_2  440,640   656,144         1.4891
07 Sep 2011 16:02:19  936524   13143905   hadcm3n_ycqf_1900_40_007349361_2  414,720   617,283         1.4884
06 Sep 2011 19:13:12  936524   13143905   hadcm3n_ycqf_1900_40_007349361_2  388,800   578,622         1.4882
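
The Average (sec/TS) column in the trickle rows above appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that reading against rows copied from the table. The helper names `parse_trickle_row` and `sec_per_timestep` are hypothetical and are not part of BOINC or CPDN tooling, and the whitespace-separated field layout is an assumption taken from the table header.

```python
# Hypothetical helpers for reading one trickle row from the table above.
# Assumed layout: 4 date/time tokens, host ID, result ID, result name,
# timestep, CPU time (sec), average (sec/TS), all whitespace-separated.

def parse_trickle_row(row: str) -> dict:
    """Split one trickle row into named fields."""
    parts = row.split()
    if len(parts) != 10:
        raise ValueError(f"expected 10 fields, got {len(parts)}")
    return {
        "sent_utc": " ".join(parts[0:4]),
        "host_id": int(parts[4]),
        "result_id": int(parts[5]),
        "result_name": parts[6],
        # Thousands separators are stripped before converting to int.
        "timestep": int(parts[7].replace(",", "")),
        "cpu_time_sec": int(parts[8].replace(",", "")),
        "reported_avg": float(parts[9]),
    }


def sec_per_timestep(fields: dict) -> float:
    """Recompute the average as CPU seconds per model timestep."""
    return fields["cpu_time_sec"] / fields["timestep"]


if __name__ == "__main__":
    rows = [
        "05 Nov 2011 22:27:47 936524 13143905 "
        "hadcm3n_ycqf_1900_40_007349361_2 881,280 1,303,564 1.4792",
        "06 Sep 2011 19:13:12 936524 13143905 "
        "hadcm3n_ycqf_1900_40_007349361_2 388,800 578,622 1.4882",
    ]
    for row in rows:
        fields = parse_trickle_row(row)
        print(fields["sent_utc"], fields["reported_avg"],
              round(sec_per_timestep(fields), 4))
```

If the recomputed value matches the reported one to four decimal places for each row, the column is consistent with that interpretation; a mismatch would mean the assumption about the column is wrong rather than that the page is in error.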

