climateprediction.net home page
Task 15501912

Task 15501912

Name hadcm3n_o68t_2140_40_008269131_0
Workunit 8424255
Created 23 Dec 2012, 22:02:42 UTC
Sent 24 Dec 2012, 10:50:57 UTC
Report deadline 25 Mar 2013, 18:18:08 UTC
Received 23 Feb 2013, 22:41:54 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1238367
Run time 13 days 5 hours 10 min 7 sec
CPU time 11 days 20 hours 34 min 23 sec
Validate state Invalid
Credit 9,331.20
Device peak FLOPS 2.98 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07
windows_intelx86
Stderr
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
23:35:57 (9856): Can't acquire lockfile (32) - waiting 35s
23:36:23 (5888): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7136, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7056, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6128, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6428, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6428, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6428, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9196, iMonCtr=1
Model crash detected, will try to restart...
08:38:16 (6732): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1608, iMonCtr=1
Model crash detected, will try to restart...
08:31:20 (7312): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
00:12:58 (7016): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10156, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6784, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6664, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7200, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
MainError:	05:33:31 PM	No files match the supplied pattern.
MainError:	05:33:31 PM	No files match the supplied pattern.
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
MainError:	10:09:26 PM	No files match the supplied pattern.
MainError:	10:09:26 PM	No files match the supplied pattern.
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1172, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7244, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
MainError:	04:45:05 PM	No files match the supplied pattern.
MainError:	04:45:05 PM	No files match the supplied pattern.
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7028, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
MainError:	11:50:01 PM	No files match the supplied pattern.
MainError:	11:50:01 PM	No files match the supplied pattern.
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7700, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6840, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7736, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7736, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7736, iMonCtr=1
Model crash detected, will try to restart...
MainError:	01:28:35 PM	No files match the supplied pattern.
MainError:	01:28:35 PM	No files match the supplied pattern.
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7208, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
MainError:	06:21:33 PM	No files match the supplied pattern.
MainError:	06:21:33 PM	No files match the supplied pattern.
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8276, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
MainError:	09:53:50 PM	No files match the supplied pattern.
MainError:	09:53:50 PM	No files match the supplied pattern.
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9952, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7456, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
MainError:	05:26:46 PM	No files match the supplied pattern.
MainError:	05:26:46 PM	No files match the supplied pattern.
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7108, iMonCtr=1
Model crash detected, will try to restart...
MainError:	02:21:28 PM	No files match the supplied pattern.
MainError:	02:21:28 PM	No files match the supplied pattern.
Suspended CPDN Monitor - Suspend request from BOINC...
MainError:	12:28:30 AM	No files match the supplied pattern.
MainError:	12:28:30 AM	No files match the supplied pattern.
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Error converting file to netcdf: dataout/o68tka.ph11c10
Error converting file to netcdf: dataout/o68tka.pg11c10
Error converting file to netcdf: dataout/o68tka.pe11c10
MainError:	03:15:37 PM	No files match the supplied pattern.
MainError:	03:15:37 PM	No files match the supplied pattern.
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
23 Feb 2013 15:20:39 1238367 15501912 hadcm3n_o68t_2140_40_008269131_0 777,600 1,024,023 1.3169
22 Feb 2013 12:30:00 1238367 15501912 hadcm3n_o68t_2140_40_008269131_0 751,680 984,594 1.3099
21 Feb 2013 14:25:31 1238367 15501912 hadcm3n_o68t_2140_40_008269131_0 725,760 952,587 1.3125
20 Feb 2013 17:31:13 1238367 15501912 hadcm3n_o68t_2140_40_008269131_0 699,840 919,010 1.3132
18 Feb 2013 21:55:26 1238367 15501912 hadcm3n_o68t_2140_40_008269131_0 673,920 886,644 1.3157
15 Feb 2013 18:33:50 1238367 15501912 hadcm3n_o68t_2140_40_008269131_0 648,000 852,420 1.3155
13 Feb 2013 13:40:16 1238367 15501912 hadcm3n_o68t_2140_40_008269131_0 622,080 819,015 1.3166
11 Feb 2013 23:52:26 1238367 15501912 hadcm3n_o68t_2140_40_008269131_0 596,160 786,318 1.3190
10 Feb 2013 16:46:06 1238367 15501912 hadcm3n_o68t_2140_40_008269131_0 570,240 753,433 1.3213
09 Feb 2013 22:49:27 1238367 15501912 hadcm3n_o68t_2140_40_008269131_0 544,320 716,805 1.3169
07 Feb 2013 17:37:37 1238367 15501912 hadcm3n_o68t_2140_40_008269131_0 518,400 683,918 1.3193
06 Feb 2013 19:05:56 1238367 15501912 hadcm3n_o68t_2140_40_008269131_0 492,480 651,654 1.3232
06 Feb 2013 07:27:55 1238367 15501912 hadcm3n_o68t_2140_40_008269131_0 466,560 619,884 1.3286
05 Feb 2013 16:51:04 1238367 15501912 hadcm3n_o68t_2140_40_008269131_0 440,640 587,538 1.3334
04 Feb 2013 10:18:07 1238367 15501912 hadcm3n_o68t_2140_40_008269131_0 414,720 555,571 1.3396
03 Feb 2013 08:27:44 1238367 15501912 hadcm3n_o68t_2140_40_008269131_0 388,800 522,702 1.3444
27 Jan 2013 22:17:23 1238367 15501912 hadcm3n_o68t_2140_40_008269131_0 362,880 491,261 1.3538
24 Jan 2013 11:31:34 1238367 15501912 hadcm3n_o68t_2140_40_008269131_0 336,960 458,122 1.3596
24 Jan 2013 10:31:24 1238367 15501912 hadcm3n_o68t_2140_40_008269131_0 311,040 431,981 1.3888
23 Jan 2013 13:50:19 1238367 15501912 hadcm3n_o68t_2140_40_008269131_0 285,120 399,406 1.4008


©2024 cpdn.org