Task 15576541

Name hadcm3n_o6qj_2140_40_008270147_4
Workunit 8425271
Created 31 Jan 2013, 22:56:53 UTC
Sent 31 Jan 2013, 22:57:59 UTC
Report deadline 3 May 2013, 6:25:10 UTC
Received 18 Mar 2013, 18:43:21 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1265083
Run time 11 days 23 hours 58 min 10 sec
CPU time 11 days 11 hours 46 min 2 sec
Validate state Invalid
Credit 9,331.20
Device peak FLOPS 3.33 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
09:25:56 (2400): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1524, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3648, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3416, iMonCtr=1
Model crash detected, will try to restart...
09:48:10 (1976): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3384, iMonCtr=1
Model crash detected, will try to restart...
MainError:	09:02:14 PM	No files match the supplied pattern.
MainError:	09:02:14 PM	No files match the supplied pattern.
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4556, iMonCtr=1
Model crash detected, will try to restart...
MainError:	11:17:14 PM	No files match the supplied pattern.
MainError:	11:17:14 PM	No files match the supplied pattern.
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4320, iMonCtr=1
Model crash detected, will try to restart...
MainError:	11:35:56 PM	No files match the supplied pattern.
MainError:	11:35:56 PM	No files match the supplied pattern.
MainError:	12:43:59 AM	No files match the supplied pattern.
MainError:	12:43:59 AM	No files match the supplied pattern.
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3272, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3116, iMonCtr=1
Model crash detected, will try to restart...
MainError:	06:29:19 PM	No files match the supplied pattern.
MainError:	06:29:19 PM	No files match the supplied pattern.
MainError:	09:13:35 PM	No files match the supplied pattern.
MainError:	09:13:35 PM	No files match the supplied pattern.
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3332, iMonCtr=1
Model crash detected, will try to restart...
MainError:	07:24:18 PM	No files match the supplied pattern.
MainError:	07:24:18 PM	No files match the supplied pattern.
Suspended CPDN Monitor - Suspend request from BOINC...
20:35:09 (3400): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
MainError:	06:10:58 AM	No files match the supplied pattern.
MainError:	06:10:58 AM	No files match the supplied pattern.
MainError:	03:11:28 PM	No files match the supplied pattern.
MainError:	03:11:28 PM	No files match the supplied pattern.
MainError:	12:12:31 AM	No files match the supplied pattern.
MainError:	12:12:31 AM	No files match the supplied pattern.
Error converting file to netcdf: dataout/o6qjka.ph11c10
Error converting file to netcdf: dataout/o6qjka.pg11c10
Error converting file to netcdf: dataout/o6qjka.pe11c10
MainError:	04:52:14 PM	No files match the supplied pattern.
MainError:	04:52:14 PM	No files match the supplied pattern.
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
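The stderr output above repeats a small set of message types, so a tally makes the failure sequence easier to read. The following is a minimal sketch, assuming the log has been saved as plain text. The category names and the `tally_events` helper are hypothetical labels chosen for this example and are not part of BOINC or the CPDN client. The sample lines are copied from the log, with the last one shortened.

```python
import re
from collections import Counter

# Hypothetical category labels mapped to patterns that match message
# prefixes seen in the stderr text above.
PATTERNS = {
    "model_crash": re.compile(r"^Model crashed:"),
    "restart_attempt": re.compile(r"^Model crash detected, will try to restart"),
    "no_heartbeat": re.compile(r"No heartbeat from core client"),
    "netcdf_error": re.compile(r"^Error converting file to netcdf:"),
    "buffin_error": re.compile(r"^BUFFIN: C I/O Error"),
}


def tally_events(stderr_text):
    """Count how many log lines fall into each category."""
    counts = Counter()
    for line in stderr_text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                break  # one category per line is enough
    return counts


# A few lines taken from the log above (the last one is shortened).
sample = """\
09:25:56 (2400): No heartbeat from core client for 30 sec - exiting
Model crash detected, will try to restart...
Error converting file to netcdf: dataout/o6qjka.ph11c10
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
Model crashed: STWORK  : I/O error - PP fixed length header
"""

counts = tally_events(sample)
for name, n in sorted(counts.items()):
    print(name, n)
```

Run against the full stderr text, the `model_crash` category would count the six "Model crashed: STWORK" lines that precede "Sorry, too many model crashes! :-(".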
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep  CPU Time (sec)  Average (sec/TS)
18 Mar 2013 17:42:48  1265083  15576541   hadcm3n_o6qj_2140_40_008270147_4  777,600   993,302         1.2774
15 Mar 2013 17:11:09  1265083  15576541   hadcm3n_o6qj_2140_40_008270147_4  751,680   961,516         1.2792
14 Mar 2013 15:55:42  1265083  15576541   hadcm3n_o6qj_2140_40_008270147_4  725,760   930,504         1.2821
14 Mar 2013 06:19:08  1265083  15576541   hadcm3n_o6qj_2140_40_008270147_4  699,840   898,431         1.2838
13 Mar 2013 19:42:03  1265083  15576541   hadcm3n_o6qj_2140_40_008270147_4  673,920   864,766         1.2832
12 Mar 2013 22:07:01  1265083  15576541   hadcm3n_o6qj_2140_40_008270147_4  648,000   828,164         1.2780
11 Mar 2013 18:33:20  1265083  15576541   hadcm3n_o6qj_2140_40_008270147_4  622,080   791,713         1.2727
08 Mar 2013 01:29:11  1265083  15576541   hadcm3n_o6qj_2140_40_008270147_4  596,160   754,716         1.2660
06 Mar 2013 23:51:58  1265083  15576541   hadcm3n_o6qj_2140_40_008270147_4  570,240   720,819         1.2641
05 Mar 2013 23:48:40  1265083  15576541   hadcm3n_o6qj_2140_40_008270147_4  544,320   691,114         1.2697
04 Mar 2013 21:26:52  1265083  15576541   hadcm3n_o6qj_2140_40_008270147_4  518,400   655,358         1.2642
01 Mar 2013 19:15:24  1265083  15576541   hadcm3n_o6qj_2140_40_008270147_4  492,480   620,031         1.2590
27 Feb 2013 21:18:57  1265083  15576541   hadcm3n_o6qj_2140_40_008270147_4  466,560   585,040         1.2539
26 Feb 2013 19:12:05  1265083  15576541   hadcm3n_o6qj_2140_40_008270147_4  440,640   551,861         1.2524
26 Feb 2013 00:02:16  1265083  15576541   hadcm3n_o6qj_2140_40_008270147_4  414,720   525,953         1.2682
22 Feb 2013 22:59:54  1265083  15576541   hadcm3n_o6qj_2140_40_008270147_4  388,800   493,338         1.2689
21 Feb 2013 22:34:41  1265083  15576541   hadcm3n_o6qj_2140_40_008270147_4  362,880   464,143         1.2791
20 Feb 2013 22:02:01  1265083  15576541   hadcm3n_o6qj_2140_40_008270147_4  336,960   433,327         1.2860
20 Feb 2013 00:56:29  1265083  15576541   hadcm3n_o6qj_2140_40_008270147_4  311,040   411,722         1.3237
19 Feb 2013 01:01:32  1265083  15576541   hadcm3n_o6qj_2140_40_008270147_4  285,120   381,177         1.2369
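
The Average (sec/TS) column in the trickle table appears to be CPU Time (sec) divided by Timestep. The sketch below recomputes it for the newest and oldest rows as a check. It assumes the last three whitespace-separated tokens of a row are Timestep, CPU Time, and Average, in that order. The `parse_trickle` helper is a hypothetical name used only for this example.

```python
def parse_trickle(row):
    """Extract (timestep, cpu_sec, average) from one trickle row.

    Assumes the last three whitespace-separated tokens are Timestep,
    CPU Time (sec) and Average (sec/TS), as in the table header.
    """
    tokens = row.split()
    timestep = int(tokens[-3].replace(",", ""))  # drop thousands separators
    cpu_sec = int(tokens[-2].replace(",", ""))
    average = float(tokens[-1])
    return timestep, cpu_sec, average


# Newest and oldest rows from the table above.
rows = [
    "18 Mar 2013 17:42:48 1265083 15576541 hadcm3n_o6qj_2140_40_008270147_4 777,600 993,302 1.2774",
    "19 Feb 2013 01:01:32 1265083 15576541 hadcm3n_o6qj_2140_40_008270147_4 285,120 381,177 1.3369",
]

for row in rows:
    timestep, cpu_sec, average = parse_trickle(row)
    recomputed = cpu_sec / timestep  # seconds of CPU time per timestep
    print(f"timestep={timestep} reported={average:.4f} recomputed={recomputed:.4f}")
```

For these two rows the recomputed value agrees with the reported average to four decimal places (993,302 / 777,600 and 381,177 / 285,120).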

