Task 12823761

Name	hadcm3n_p3cf_1900_40_007221919_0
Workunit	7420159
Created	26 Apr 2011, 15:24:12 UTC
Sent	1 May 2011, 5:01:46 UTC
Report deadline	31 Jul 2011, 12:28:57 UTC
Received	1 Jun 2011, 16:55:39 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1143693
Run time	8 days 8 hours 10 min 8 sec
CPU time	8 days 7 hours 43 min 45 sec
Validate state	Invalid
Credit	5,287.68
Device peak FLOPS	2.79 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.60</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5364, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1268, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4072, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5060, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3828, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3632, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1056, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
No Process Handle
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=372, selfPID=372, iMonCtr=1
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/p3cfko.pjb5c10
Error converting file to netcdf: dataout/p3cfko.pib5c10
Error converting file to netcdf: dataout/p3cfko.pfb5c10
Error converting file to netcdf: dataout/p3cfka.phb5c10
Error converting file to netcdf: dataout/p3cfka.pgb5c10
Error converting file to netcdf: dataout/p3cfka.peb5c10
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3760, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4612, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4612, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4612, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4612, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4612, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4612, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
01 Jun 2011 00:26:25	1143693	12823761	hadcm3n_p3cf_1900_40_007221919_0	440,640	704,649	1.5991
30 May 2011 23:15:36	1143693	12823761	hadcm3n_p3cf_1900_40_007221919_0	414,720	663,705	1.6004
29 May 2011 21:55:24	1143693	12823761	hadcm3n_p3cf_1900_40_007221919_0	388,800	622,385	1.6008
28 May 2011 02:12:41	1143693	12823761	hadcm3n_p3cf_1900_40_007221919_0	362,880	580,733	1.6003
26 May 2011 23:06:02	1143693	12823761	hadcm3n_p3cf_1900_40_007221919_0	336,960	539,777	1.6019
23 May 2011 17:36:22	1143693	12823761	hadcm3n_p3cf_1900_40_007221919_0	311,040	498,718	1.6034
22 May 2011 21:42:50	1143693	12823761	hadcm3n_p3cf_1900_40_007221919_0	285,120	456,741	1.6019
19 May 2011 19:20:58	1143693	12823761	hadcm3n_p3cf_1900_40_007221919_0	259,200	415,086	1.6014
16 May 2011 23:50:26	1143693	12823761	hadcm3n_p3cf_1900_40_007221919_0	233,280	373,172	1.5997
13 May 2011 22:16:19	1143693	12823761	hadcm3n_p3cf_1900_40_007221919_0	207,360	332,161	1.6019
11 May 2011 05:27:25	1143693	12823761	hadcm3n_p3cf_1900_40_007221919_0	181,440	290,346	1.6002
10 May 2011 04:43:23	1143693	12823761	hadcm3n_p3cf_1900_40_007221919_0	155,520	249,088	1.6016
09 May 2011 17:14:39	1143693	12823761	hadcm3n_p3cf_1900_40_007221919_0	129,600	207,753	1.6030
04 May 2011 19:50:04	1143693	12823761	hadcm3n_p3cf_1900_40_007221919_0	103,680	165,646	1.5977
03 May 2011 17:24:00	1143693	12823761	hadcm3n_p3cf_1900_40_007221919_0	77,760	124,231	1.5976
02 May 2011 22:08:24	1143693	12823761	hadcm3n_p3cf_1900_40_007221919_0	51,840	82,974	1.6006
02 May 2011 02:06:45	1143693	12823761	hadcm3n_p3cf_1900_40_007221919_0	25,920	41,221	1.5903
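
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimals. That relationship is inferred from the figures in the table, not stated by the page, and the helper names below are hypothetical. A minimal Python sketch that recomputes the column for the first and last trickles:

```python
def parse_int(field: str) -> int:
    """Convert a comma-grouped field such as '440,640' to an int."""
    return int(field.replace(",", ""))


def average_sec_per_timestep(timestep: str, cpu_time_sec: str) -> float:
    """Assumed definition: cumulative CPU seconds / cumulative timesteps."""
    return round(parse_int(cpu_time_sec) / parse_int(timestep), 4)


# (Timestep, CPU Time (sec), reported Average) copied from the table above.
rows = [
    ("25,920", "41,221", 1.5903),    # first trickle, 02 May 2011
    ("440,640", "704,649", 1.5991),  # last trickle, 01 Jun 2011
]

for timestep, cpu_time, reported in rows:
    computed = average_sec_per_timestep(timestep, cpu_time)
    print(f"timestep={timestep} computed={computed} reported={reported}")
```

Both recomputed values agree with the reported column, which supports reading it as a running average rather than a per-trickle rate.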