Task 15517161

Name hadcm3n_o7cq_2140_40_008269308_2
Workunit 8424432
Created 29 Dec 2012, 20:01:53 UTC
Sent 29 Dec 2012, 20:51:55 UTC
Report deadline 31 Mar 2013, 4:19:06 UTC
Received 26 Feb 2013, 5:35:31 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1233715
Run time 17 days 21 hours 40 min 57 sec
CPU time 14 days 16 hours 33 min 14 sec
Validate state Invalid
Credit 9,331.20
Device peak FLOPS 2.77 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5280, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
18:04:22 (348): No heartbeat from core client for 30 sec - exiting
18:04:23 (348): No heartbeat from core client for 30 sec - exiting
18:04:25 (348): No heartbeat from core client for 30 sec - exiting
18:04:26 (348): No heartbeat from core client for 30 sec - exiting
18:04:27 (348): No heartbeat from core client for 30 sec - exiting
18:04:28 (348): No heartbeat from core client for 30 sec - exiting
18:04:29 (348): No heartbeat from core client for 30 sec - exiting
18:04:30 (348): No heartbeat from core client for 30 sec - exiting
18:04:31 (348): No heartbeat from core client for 30 sec - exiting
18:04:32 (348): No heartbeat from core client for 30 sec - exiting
18:04:33 (348): No heartbeat from core client for 30 sec - exiting
18:04:34 (348): No heartbeat from core client for 30 sec - exiting
18:04:35 (348): No heartbeat from core client for 30 sec - exiting
18:04:37 (348): No heartbeat from core client for 30 sec - exiting
18:04:38 (348): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
MainError:	07:08:21 AM	No files match the supplied pattern.
MainError:	07:08:21 AM	No files match the supplied pattern.
Suspended CPDN Monitor - Suspend request from BOINC...
MainError:	06:43:27 AM	No files match the supplied pattern.
MainError:	06:43:27 AM	No files match the supplied pattern.
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
MainError:	09:27:46 AM	No files match the supplied pattern.
MainError:	09:27:46 AM	No files match the supplied pattern.
MainError:	12:19:33 AM	No files match the supplied pattern.
MainError:	12:19:33 AM	No files match the supplied pattern.
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
MainError:	01:04:18 AM	No files match the supplied pattern.
MainError:	01:04:18 AM	No files match the supplied pattern.
MainError:	03:45:52 PM	No files match the supplied pattern.
MainError:	03:45:52 PM	No files match the supplied pattern.
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
MainError:	11:30:28 AM	No files match the supplied pattern.
MainError:	11:30:28 AM	No files match the supplied pattern.
MainError:	02:09:03 AM	No files match the supplied pattern.
MainError:	02:09:03 AM	No files match the supplied pattern.
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7240, iMonCtr=1
Model crash detected, will try to restart...
03:37:41 (3700): No heartbeat from core client for 30 sec - exiting
03:37:42 (3700): No heartbeat from core client for 30 sec - exiting
03:37:43 (3700): No heartbeat from core client for 30 sec - exiting
03:37:44 (3700): No heartbeat from core client for 30 sec - exiting
03:37:45 (3700): No heartbeat from core client for 30 sec - exiting
03:37:46 (3700): No heartbeat from core client for 30 sec - exiting
03:37:47 (3700): No heartbeat from core client for 30 sec - exiting
03:37:48 (3700): No heartbeat from core client for 30 sec - exiting
03:37:49 (3700): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
MainError:	05:21:05 PM	No files match the supplied pattern.
MainError:	05:21:05 PM	No files match the supplied pattern.
MainError:	08:12:18 AM	No files match the supplied pattern.
MainError:	08:12:18 AM	No files match the supplied pattern.
Error converting file to netcdf: dataout/o7cqka.ph11c10
Error converting file to netcdf: dataout/o7cqka.pg11c10
Error converting file to netcdf: dataout/o7cqka.pe11c10
MainError:	10:43:23 PM	No files match the supplied pattern.
MainError:	10:43:23 PM	No files match the supplied pattern.
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
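The stderr text above is long and repetitive, so a tally by message type can make it easier to read. The following is a minimal sketch, not part of the CPDN or BOINC tooling: the prefixes are copied from the log, while the category labels, the function names and the short sample are hypothetical choices made for illustration.

```python
from collections import Counter

# Substrings copied from the stderr text; the labels on the right are
# hypothetical names chosen for this sketch.
PATTERNS = [
    ("Suspended CPDN Monitor", "suspend_request"),
    ("No heartbeat from core client", "heartbeat_lost"),
    ("MainError:", "main_error"),
    ("Error converting file to netcdf", "netcdf_error"),
    ("BUFFIN:", "buffin_io_error"),
    ("Model crashed:", "model_crash"),
]

def classify(line):
    """Return the label of the first pattern found in the line, else 'other'."""
    for needle, label in PATTERNS:
        if needle in line:
            return label
    return "other"

def tally(lines):
    """Count non-blank log lines per category."""
    return Counter(classify(line) for line in lines if line.strip())

# A few lines taken from the log, used here only as sample input.
sample = [
    "Suspended CPDN Monitor - Suspend request from BOINC...",
    "18:04:22 (348): No heartbeat from core client for 30 sec - exiting",
    "MainError:\t07:08:21 AM\tNo files match the supplied pattern.",
    "Error converting file to netcdf: dataout/o7cqka.ph11c10",
    "BUFFIN: C I/O Error feof - Unit 67 - Return code = 16",
    "",
    "Model crashed: STWORK  : I/O error - PP fixed length header",
    "Sorry, too many model crashes! :-(",
]
counts = tally(sample)
for label, n in sorted(counts.items()):
    print(label, n)
```

Feeding the full stderr text to tally(), one line per list element, would give the per-category totals for this task.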
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
25 Feb 2013 22:46:42 1233715 15517161 hadcm3n_o7cq_2140_40_008269308_2 777,600 1,392,027 1.7902
25 Feb 2013 08:14:34 1233715 15517161 hadcm3n_o7cq_2140_40_008269308_2 751,680 1,345,306 1.7897
24 Feb 2013 17:22:15 1233715 15517161 hadcm3n_o7cq_2140_40_008269308_2 725,760 1,298,816 1.7896
13 Feb 2013 02:09:59 1233715 15517161 hadcm3n_o7cq_2140_40_008269308_2 699,840 1,252,193 1.7893
12 Feb 2013 11:35:15 1233715 15517161 hadcm3n_o7cq_2140_40_008269308_2 673,920 1,205,623 1.7890
10 Feb 2013 15:50:54 1233715 15517161 hadcm3n_o7cq_2140_40_008269308_2 648,000 1,159,099 1.7887
10 Feb 2013 01:05:08 1233715 15517161 hadcm3n_o7cq_2140_40_008269308_2 622,080 1,112,499 1.7884
09 Feb 2013 00:24:42 1233715 15517161 hadcm3n_o7cq_2140_40_008269308_2 596,160 1,066,279 1.7886
08 Feb 2013 09:42:24 1233715 15517161 hadcm3n_o7cq_2140_40_008269308_2 570,240 1,019,412 1.7877
06 Feb 2013 06:57:49 1233715 15517161 hadcm3n_o7cq_2140_40_008269308_2 544,320 973,448 1.7884
05 Feb 2013 07:11:34 1233715 15517161 hadcm3n_o7cq_2140_40_008269308_2 518,400 927,158 1.7885
04 Feb 2013 17:16:17 1233715 15517161 hadcm3n_o7cq_2140_40_008269308_2 492,480 880,877 1.7887
04 Feb 2013 02:46:24 1233715 15517161 hadcm3n_o7cq_2140_40_008269308_2 466,560 834,122 1.7878
03 Feb 2013 12:50:11 1233715 15517161 hadcm3n_o7cq_2140_40_008269308_2 440,640 786,984 1.7860
21 Jan 2013 22:38:30 1233715 15517161 hadcm3n_o7cq_2140_40_008269308_2 414,720 740,857 1.7864
21 Jan 2013 09:11:18 1233715 15517161 hadcm3n_o7cq_2140_40_008269308_2 388,800 693,709 1.7842
20 Jan 2013 19:46:13 1233715 15517161 hadcm3n_o7cq_2140_40_008269308_2 362,880 646,745 1.7823
20 Jan 2013 06:14:12 1233715 15517161 hadcm3n_o7cq_2140_40_008269308_2 336,960 599,428 1.7789
19 Jan 2013 16:10:49 1233715 15517161 hadcm3n_o7cq_2140_40_008269308_2 311,040 552,491 1.7763
19 Jan 2013 01:43:38 1233715 15517161 hadcm3n_o7cq_2140_40_008269308_2 285,120 506,282 1.7757
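
The Average (sec/TS) column in the rows above appears to be CPU Time (sec) divided by Timestep. The sketch below parses two of the rows and recomputes that ratio as a check. It assumes the column order given in the header row; the function name, the field names and the month table are hypothetical and are not part of any CPDN tool.

```python
from datetime import datetime

# Month abbreviations as they appear in the table (avoids locale-dependent parsing).
MONTHS = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
          "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}

def parse_trickle(line):
    """Split one whitespace-separated trickle row into named fields.

    Assumed layout: day, month, year, time, host ID, result ID,
    result name, timestep, CPU time (sec), average (sec/TS).
    """
    parts = line.split()
    hour, minute, second = (int(x) for x in parts[3].split(":"))
    sent = datetime(int(parts[2]), MONTHS[parts[1]], int(parts[0]),
                    hour, minute, second)
    return {
        "sent": sent,
        "host_id": int(parts[4]),
        "result_id": int(parts[5]),
        "result_name": parts[6],
        "timestep": int(parts[7].replace(",", "")),
        "cpu_sec": int(parts[8].replace(",", "")),
        "avg_sec_per_ts": float(parts[9]),
    }

# The newest and oldest rows of the table, copied verbatim.
rows = [
    "25 Feb 2013 22:46:42 1233715 15517161 hadcm3n_o7cq_2140_40_008269308_2 777,600 1,392,027 1.7902",
    "19 Jan 2013 01:43:38 1233715 15517161 hadcm3n_o7cq_2140_40_008269308_2 285,120 506,282 1.7757",
]
trickles = [parse_trickle(row) for row in rows]
for t in trickles:
    recomputed = t["cpu_sec"] / t["timestep"]
    print(t["sent"].isoformat(), t["timestep"],
          round(recomputed, 4), t["avg_sec_per_ts"])
```

For both rows the recomputed ratio agrees with the reported value to four decimal places.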


©2024 climateprediction.net