Task 15288830

Name hadcm3n_z8kq_1880_40_008200461_2
Workunit 8355585
Created 16 Sep 2012, 14:46:30 UTC
Sent 16 Sep 2012, 14:47:17 UTC
Report deadline 16 Dec 2012, 22:14:28 UTC
Received 4 Nov 2012, 1:47:43 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 775427
Run time 23 days 7 hours 37 min 22 sec
CPU time 21 days 15 hours 4 min 57 sec
Validate state Invalid
Credit 11,819.52
Device peak FLOPS 2.30 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 (windows_intelx86)
Stderr
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
14:50:18 (1564): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
18:50:56 (7692): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9428, iMonCtr=1
Model crash detected, will try to restart...
16:22:51 (5216): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:24:29 (7400): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:55:48 (4092): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:01:46 (4016): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:26:21 (7952): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
15:27:56 (7776): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:21:38 (8632): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
11:22:50 (4932): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:25:31 (4592): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
21:34:11 (5192): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:34:12 (5192): No heartbeat from core client for 30 sec - exiting
11:20:21 (4232): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:11:22 (4812): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:34:47 (8788): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8580, iMonCtr=1
Model crash detected, will try to restart...
19:44:45 (3860): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:44:46 (3860): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
20:46:05 (8268): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7240, iMonCtr=1
Model crash detected, will try to restart...
22:47:51 (3988): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
22:47:52 (3988): No heartbeat from core client for 30 sec - exiting
23:49:43 (6288): No heartbeat from core client for 30 sec - exiting
23:49:44 (6288): No heartbeat from core client for 30 sec - exiting
23:49:45 (6288): No heartbeat from core client for 30 sec - exiting
23:49:46 (6288): No heartbeat from core client for 30 sec - exiting
23:49:47 (6288): No heartbeat from core client for 30 sec - exiting
23:49:48 (6288): No heartbeat from core client for 30 sec - exiting
23:49:49 (6288): No heartbeat from core client for 30 sec - exiting
23:49:50 (6288): No heartbeat from core client for 30 sec - exiting
23:49:51 (6288): No heartbeat from core client for 30 sec - exiting
23:49:52 (6288): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/z8kqko.pj89c10
Error converting file to netcdf: dataout/z8kqko.pi89c10
Error converting file to netcdf: dataout/z8kqko.pf89c10
Error converting file to netcdf: dataout/z8kqka.ph89c10
Error converting file to netcdf: dataout/z8kqka.pg89c10
Error converting file to netcdf: dataout/z8kqka.pe89c10
Error converting file to netcdf: dataout/z8kqka.pd89c10
12:32:04 (5744): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:32:05 (5744): No heartbeat from core client for 30 sec - exiting
20:03:39 (4204): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:12:30 (7776): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7156, iMonCtr=1
Model crash detected, will try to restart...
08:14:33 (3612): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:12:30 (2432): No heartbeat from core client for 30 sec - exiting
12:12:32 (2432): No heartbeat from core client for 30 sec - exiting
12:12:33 (2432): No heartbeat from core client for 30 sec - exiting
12:12:34 (2432): No heartbeat from core client for 30 sec - exiting
12:12:35 (2432): No heartbeat from core client for 30 sec - exiting
12:12:36 (2432): No heartbeat from core client for 30 sec - exiting
12:12:37 (2432): No heartbeat from core client for 30 sec - exiting
12:12:38 (2432): No heartbeat from core client for 30 sec - exiting
12:12:39 (2432): No heartbeat from core client for 30 sec - exiting
12:12:40 (2432): No heartbeat from core client for 30 sec - exiting
12:12:41 (2432): No heartbeat from core client for 30 sec - exiting
12:12:42 (2432): No heartbeat from core client for 30 sec - exiting
12:12:43 (2432): No heartbeat from core client for 30 sec - exiting
12:12:44 (2432): No heartbeat from core client for 30 sec - exiting
12:12:45 (2432): No heartbeat from core client for 30 sec - exiting
12:12:46 (2432): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:56:17 (4488): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:34:40 (8156): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
13:23:06 (1504): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:24:35 (4352): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
15:47:52 (7656): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
22:27:08 (4512): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1156, iMonCtr=1
Model crash detected, will try to restart...
08:07:09 (4948): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
08:08:48 (3744): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:13:32 (5780): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:19:39 (4492): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
16:45:54 (4828): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
18:37:34 (5404): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
22:27:07 (5712): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4228, iMonCtr=1
Model crash detected, will try to restart...
22:28:52 (4660): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
22:28:53 (4660): No heartbeat from core client for 30 sec - exiting
15:57:37 (6272): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:57:38 (6272): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
03:40:27 (5604): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
03:40:28 (5604): No heartbeat from core client for 30 sec - exiting
05:57:57 (7348): No heartbeat from core client for 30 sec - exiting
05:57:58 (7348): No heartbeat from core client for 30 sec - exiting
05:57:59 (7348): No heartbeat from core client for 30 sec - exiting
05:58:00 (7348): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...

zip error: Could not create output file (was replacing the original zip file)
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
19:48:15 (5656): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:59:32 (2624): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2460, iMonCtr=1
Model crash detected, will try to restart...
18:09:02 (1600): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:16:08 (6072): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
17:58:00 (7856): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:58:01 (7856): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
22:42:27 (4756): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
11:48:42 (5404): No heartbeat from core client for 30 sec - exiting
11:48:43 (5404): No heartbeat from core client for 30 sec - exiting
11:48:44 (5404): No heartbeat from core client for 30 sec - exiting
11:48:45 (5404): No heartbeat from core client for 30 sec - exiting
11:48:46 (5404): No heartbeat from core client for 30 sec - exiting
11:48:47 (5404): No heartbeat from core client for 30 sec - exiting
11:48:48 (5404): No heartbeat from core client for 30 sec - exiting
11:48:49 (5404): No heartbeat from core client for 30 sec - exiting
11:48:50 (5404): No heartbeat from core client for 30 sec - exiting
11:48:51 (5404): No heartbeat from core client for 30 sec - exiting
11:48:52 (5404): No heartbeat from core client for 30 sec - exiting
11:48:53 (5404): No heartbeat from core client for 30 sec - exiting
11:48:54 (5404): No heartbeat from core client for 30 sec - exiting
11:48:55 (5404): No heartbeat from core client for 30 sec - exiting
11:48:56 (5404): No heartbeat from core client for 30 sec - exiting
11:48:57 (5404): No heartbeat from core client for 30 sec - exiting
11:48:58 (5404): No heartbeat from core client for 30 sec - exiting
11:48:59 (5404): No heartbeat from core client for 30 sec - exiting
11:49:00 (5404): No heartbeat from core client for 30 sec - exiting
11:49:01 (5404): No heartbeat from core client for 30 sec - exiting
11:49:02 (5404): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
MainError:	05:49:09 PM	No files match the supplied pattern.
MainError:	05:49:09 PM	No files match the supplied pattern.
19:12:12 (6640): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:40:03 (5044): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:06:23 (6476): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
17:07:56 (7112): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
14:15:06 (5864): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:23:05 (7548): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
17:18:41 (7784): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:20:08 (4912): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:34:00 (4632): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:21:27 (4992): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
23:55:22 (6264): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:56:40 (6412): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:07:22 (7564): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
04:03:49 (5312): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:24:03 (6428): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:27:42 (7832): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7476, iMonCtr=1
Model crash detected, will try to restart...
13:40:17 (5268): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:22:55 (7400): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
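The stderr output above is long and repetitive, so a per-message tally can make it easier to read. The following is a minimal sketch, not part of the CPDN or BOINC tooling: the substrings are copied from the log above, while the category labels, the `tally` function and the short `sample` excerpt are hypothetical names chosen for illustration.

```python
from collections import Counter

# Substrings copied from the stderr output above.
# The category labels on the left are our own invention.
PATTERNS = {
    "heartbeat_lost": "No heartbeat from core client",
    "model_restart": "Model crash detected, will try to restart",
    "model_crashed": "Model crashed:",
    "buffin_io_error": "BUFFIN: C I/O Error",
    "netcdf_error": "Error converting file to netcdf",
}


def tally(stderr_text):
    """Count log lines per category; lines matching no pattern are ignored."""
    counts = Counter()
    for line in stderr_text.splitlines():
        for label, needle in PATTERNS.items():
            if needle in line:
                counts[label] += 1
                break  # each line is counted at most once
    return counts


# A short excerpt of lines taken from the log above.
sample = """\
14:50:18 (1564): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Model crash detected, will try to restart...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
Model crashed: STWORK  : I/O error - PP fixed length header
"""

for label, count in sorted(tally(sample).items()):
    print(label, count)
```

Running `tally` over the full contents of the `<stderr_txt>` element would give the totals for this task. The counts only summarise the log; they do not by themselves identify the cause of the failure.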
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
04 Nov 2012 00:24:24 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 984,960 | 1,868,671 | 1.8972
02 Nov 2012 23:28:28 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 959,040 | 1,818,946 | 1.8966
02 Nov 2012 08:14:51 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 933,120 | 1,770,493 | 1.8974
01 Nov 2012 16:32:46 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 907,200 | 1,719,921 | 1.8959
01 Nov 2012 01:25:16 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 881,280 | 1,666,972 | 1.8915
01 Nov 2012 01:25:16 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 855,360 | 1,614,461 | 1.8875
30 Oct 2012 10:20:10 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 829,440 | 1,563,111 | 1.8845
29 Oct 2012 18:43:06 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 803,520 | 1,510,090 | 1.8793
28 Oct 2012 17:49:55 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 777,600 | 1,459,024 | 1.8763
27 Oct 2012 18:52:26 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 751,680 | 1,409,701 | 1.8754
26 Oct 2012 18:14:34 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 725,760 | 1,360,469 | 1.8745
25 Oct 2012 18:56:25 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 699,840 | 1,313,313 | 1.8766
24 Oct 2012 18:43:05 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 673,920 | 1,266,627 | 1.8795
23 Oct 2012 19:48:38 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 648,000 | 1,220,951 | 1.8842
22 Oct 2012 20:55:09 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 622,080 | 1,175,311 | 1.8893
21 Oct 2012 23:31:11 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 596,160 | 1,129,181 | 1.8941
21 Oct 2012 00:18:45 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 570,240 | 1,083,523 | 1.9001
20 Oct 2012 01:05:30 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 544,320 | 1,038,093 | 1.9071
19 Oct 2012 12:01:56 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 518,400 | 993,170 | 1.9158
18 Oct 2012 22:14:39 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 492,480 | 947,620 | 1.9242
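
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. That reading is inferred from the numbers themselves, not from any CPDN documentation. The sketch below checks it against three rows copied from the table; the names `trickles` and `sec_per_timestep` are hypothetical.

```python
# Rows copied from the trickle table:
# (timestep, cumulative CPU time in seconds, reported average in sec/TS)
trickles = [
    (984_960, 1_868_671, 1.8972),
    (959_040, 1_818_946, 1.8966),
    (492_480, 947_620, 1.9242),
]


def sec_per_timestep(timestep, cpu_seconds):
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals."""
    return round(cpu_seconds / timestep, 4)


for timestep, cpu_seconds, reported in trickles:
    computed = sec_per_timestep(timestep, cpu_seconds)
    # Allow a small tolerance in case the site truncates instead of rounding.
    matches = abs(computed - reported) < 1e-4
    print(timestep, computed, reported, "match" if matches else "differs")

# Consecutive trickles in the table are 25,920 timesteps apart.
print(trickles[0][0] - trickles[1][0])
```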

