Task 15557322

Name hadcm3n_o4iw_2140_40_008269704_2
Workunit 8424828
Created 25 Jan 2013, 14:21:17 UTC
Sent 25 Jan 2013, 14:21:29 UTC
Report deadline 26 Apr 2013, 21:48:40 UTC
Received 10 Feb 2013, 22:38:50 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1242385
Run time 14 days 9 hours 43 min 53 sec
CPU time 12 days 3 hours 57 min 23 sec
Validate state Invalid
Credit 9,331.20
Device peak FLOPS 2.97 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>7.0.42</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6956, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6956, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7404, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5532, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5532, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6728, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6728, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1508, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6204, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5520, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5520, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
MainError:	10:02:31 AM	No files match the supplied pattern.
MainError:	10:02:31 AM	No files match the supplied pattern.
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7116, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7116, iMonCtr=1
Model crash detected, will try to restart...
MainError:	12:26:08 AM	No files match the supplied pattern.
MainError:	12:26:08 AM	No files match the supplied pattern.
MainError:	11:32:56 AM	No files match the supplied pattern.
MainError:	11:32:56 AM	No files match the supplied pattern.
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
13:40:55 (4212): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
MainError:	04:13:17 AM	No files match the supplied pattern.
MainError:	04:13:17 AM	No files match the supplied pattern.
10:00:43 (5164): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
MainError:	05:26:07 PM	No files match the supplied pattern.
MainError:	05:26:07 PM	No files match the supplied pattern.
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
MainError:	08:15:49 AM	No files match the supplied pattern.
MainError:	08:15:49 AM	No files match the supplied pattern.
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
07:59:56 (5916): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
10:43:28 (7224): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:43:29 (7224): No heartbeat from core client for 30 sec - exiting
10:43:30 (7224): No heartbeat from core client for 30 sec - exiting
10:43:31 (7224): No heartbeat from core client for 30 sec - exiting
10:43:32 (7224): No heartbeat from core client for 30 sec - exiting
10:43:33 (7224): No heartbeat from core client for 30 sec - exiting
10:43:34 (7224): No heartbeat from core client for 30 sec - exiting
10:43:35 (7224): No heartbeat from core client for 30 sec - exiting
10:43:36 (7224): No heartbeat from core client for 30 sec - exiting
10:43:37 (7224): No heartbeat from core client for 30 sec - exiting
10:43:38 (7224): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
MainError:	11:15:05 PM	No files match the supplied pattern.
MainError:	11:15:05 PM	No files match the supplied pattern.
Suspended CPDN Monitor - Suspend request from BOINC...
MainError:	09:56:03 AM	No files match the supplied pattern.
MainError:	09:56:03 AM	No files match the supplied pattern.
MainError:	08:46:55 PM	No files match the supplied pattern.
MainError:	08:46:55 PM	No files match the supplied pattern.
MainError:	07:41:03 AM	No files match the supplied pattern.
MainError:	07:41:03 AM	No files match the supplied pattern.
Error converting file to netcdf: dataout/o4iwka.ph11c10
Error converting file to netcdf: dataout/o4iwka.pg11c10
Error converting file to netcdf: dataout/o4iwka.pe11c10
MainError:	06:36:12 PM	No files match the supplied pattern.
MainError:	06:36:12 PM	No files match the supplied pattern.
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
10 Feb 2013 18:41:37 1242385 15557322 hadcm3n_o4iw_2140_40_008269704_2 777,600 1,211,412 1.5579
10 Feb 2013 07:41:41 1242385 15557322 hadcm3n_o4iw_2140_40_008269704_2 751,680 1,172,735 1.5602
09 Feb 2013 20:47:30 1242385 15557322 hadcm3n_o4iw_2140_40_008269704_2 725,760 1,134,193 1.5628
09 Feb 2013 09:57:05 1242385 15557322 hadcm3n_o4iw_2140_40_008269704_2 699,840 1,095,797 1.5658
08 Feb 2013 23:19:20 1242385 15557322 hadcm3n_o4iw_2140_40_008269704_2 673,920 1,058,176 1.5702
08 Feb 2013 08:16:53 1242385 15557322 hadcm3n_o4iw_2140_40_008269704_2 648,000 1,010,426 1.5593
07 Feb 2013 17:42:41 1242385 15557322 hadcm3n_o4iw_2140_40_008269704_2 622,080 969,156 1.5579
07 Feb 2013 04:16:26 1242385 15557322 hadcm3n_o4iw_2140_40_008269704_2 596,160 925,944 1.5532
06 Feb 2013 11:33:40 1242385 15557322 hadcm3n_o4iw_2140_40_008269704_2 570,240 874,951 1.5344
06 Feb 2013 00:26:17 1242385 15557322 hadcm3n_o4iw_2140_40_008269704_2 544,320 835,996 1.5359
05 Feb 2013 10:07:10 1242385 15557322 hadcm3n_o4iw_2140_40_008269704_2 518,400 797,671 1.5387
04 Feb 2013 23:09:44 1242385 15557322 hadcm3n_o4iw_2140_40_008269704_2 492,480 759,453 1.5421
04 Feb 2013 06:57:21 1242385 15557322 hadcm3n_o4iw_2140_40_008269704_2 466,560 711,706 1.5254
03 Feb 2013 19:47:04 1242385 15557322 hadcm3n_o4iw_2140_40_008269704_2 440,640 672,189 1.5255
03 Feb 2013 08:57:51 1242385 15557322 hadcm3n_o4iw_2140_40_008269704_2 414,720 634,125 1.5290
02 Feb 2013 23:00:12 1242385 15557322 hadcm3n_o4iw_2140_40_008269704_2 388,800 595,888 1.5326
01 Feb 2013 21:27:20 1242385 15557322 hadcm3n_o4iw_2140_40_008269704_2 362,880 551,567 1.5200
01 Feb 2013 10:17:07 1242385 15557322 hadcm3n_o4iw_2140_40_008269704_2 336,960 513,627 1.5243
31 Jan 2013 23:34:46 1242385 15557322 hadcm3n_o4iw_2140_40_008269704_2 311,040 476,127 1.5308
31 Jan 2013 08:00:32 1242385 15557322 hadcm3n_o4iw_2140_40_008269704_2 285,120 432,265 1.5161
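
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that relationship against two rows copied from the table. It is only an illustration: the function names `parse_trickle` and `average_sec_per_timestep` are hypothetical, and the field layout (the last three whitespace-separated tokens being timestep, CPU time and average) is an assumption read off the table header, not a documented format.

```python
def parse_trickle(line):
    """Return (timestep, cpu_seconds, reported_average) from one trickle row.

    Assumes the last three whitespace-separated tokens are the timestep,
    the CPU time in seconds, and the reported average, with commas used
    as thousands separators.
    """
    parts = line.split()
    timestep = int(parts[-3].replace(",", ""))
    cpu_seconds = int(parts[-2].replace(",", ""))
    reported_average = float(parts[-1])
    return timestep, cpu_seconds, reported_average


def average_sec_per_timestep(timestep, cpu_seconds):
    """CPU seconds per model timestep, rounded to four decimals like the table."""
    return round(cpu_seconds / timestep, 4)


# Newest and oldest rows from the table above.
rows = [
    "10 Feb 2013 18:41:37 1242385 15557322 hadcm3n_o4iw_2140_40_008269704_2 777,600 1,211,412 1.5579",
    "31 Jan 2013 08:00:32 1242385 15557322 hadcm3n_o4iw_2140_40_008269704_2 285,120 432,265 1.5161",
]

for row in rows:
    timestep, cpu_seconds, reported = parse_trickle(row)
    computed = average_sec_per_timestep(timestep, cpu_seconds)
    print(timestep, cpu_seconds, computed, reported)
```

Under these assumptions the computed value can be compared with the reported one row by row, which makes the gradual rise in seconds per timestep over the run easy to track.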


©2024 climateprediction.net