Task 15903958

Name	hadcm3n_n1lj_1880_40_008404035_0
Workunit	8554891
Created	23 Jul 2013, 17:24:17 UTC
Sent	23 Jul 2013, 20:55:12 UTC
Report deadline	23 Oct 2013, 4:22:23 UTC
Received	16 Aug 2013, 1:26:41 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1026538
Run time	14 days 19 hours 20 min 38 sec
CPU time	13 days 5 hours 26 min 1 sec
Validate state	Invalid
Credit	7,153.92
Device peak FLOPS	2.52 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/n1ljko.pj96c10
Error converting file to netcdf: dataout/n1ljko.pi96c10
Error converting file to netcdf: dataout/n1ljko.pf96c10
Error converting file to netcdf: dataout/n1ljka.ph96c10
Error converting file to netcdf: dataout/n1ljka.pg96c10
Error converting file to netcdf: dataout/n1ljka.pe96c10
Error converting file to netcdf: dataout/n1ljka.pd96c10
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4240, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4240, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4240, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4240, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2900, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2900, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
15 Aug 2013 01:33:03	1026538	15903958	hadcm3n_n1lj_1880_40_008404035_0	596,160	1,117,695	1.8748
15 Aug 2013 01:33:03	1026538	15903958	hadcm3n_n1lj_1880_40_008404035_0	570,240	1,069,457	1.8755
15 Aug 2013 01:33:03	1026538	15903958	hadcm3n_n1lj_1880_40_008404035_0	544,320	1,028,273	1.8891
15 Aug 2013 01:33:03	1026538	15903958	hadcm3n_n1lj_1880_40_008404035_0	518,400	980,159	1.8907
15 Aug 2013 01:33:03	1026538	15903958	hadcm3n_n1lj_1880_40_008404035_0	492,480	929,081	1.8865
15 Aug 2013 01:33:03	1026538	15903958	hadcm3n_n1lj_1880_40_008404035_0	466,560	880,550	1.8873
15 Aug 2013 01:33:03	1026538	15903958	hadcm3n_n1lj_1880_40_008404035_0	440,640	833,949	1.8926
15 Aug 2013 01:33:03	1026538	15903958	hadcm3n_n1lj_1880_40_008404035_0	414,720	781,187	1.8836
15 Aug 2013 01:33:03	1026538	15903958	hadcm3n_n1lj_1880_40_008404035_0	388,800	730,051	1.8777
15 Aug 2013 01:33:03	1026538	15903958	hadcm3n_n1lj_1880_40_008404035_0	362,880	681,967	1.8793
15 Aug 2013 01:33:03	1026538	15903958	hadcm3n_n1lj_1880_40_008404035_0	336,960	633,057	1.8787
15 Aug 2013 01:33:03	1026538	15903958	hadcm3n_n1lj_1880_40_008404035_0	311,040	583,904	1.8773
15 Aug 2013 01:33:03	1026538	15903958	hadcm3n_n1lj_1880_40_008404035_0	285,120	534,842	1.8758
15 Aug 2013 01:33:03	1026538	15903958	hadcm3n_n1lj_1880_40_008404035_0	259,200	487,279	1.8799
15 Aug 2013 01:33:03	1026538	15903958	hadcm3n_n1lj_1880_40_008404035_0	233,280	438,838	1.8812
15 Aug 2013 01:33:03	1026538	15903958	hadcm3n_n1lj_1880_40_008404035_0	207,360	388,689	1.8745
15 Aug 2013 01:33:03	1026538	15903958	hadcm3n_n1lj_1880_40_008404035_0	181,440	341,278	1.8809
29 Jul 2013 14:23:59	1026538	15903958	hadcm3n_n1lj_1880_40_008404035_0	155,520	293,041	1.8843
29 Jul 2013 14:23:59	1026538	15903958	hadcm3n_n1lj_1880_40_008404035_0	129,600	245,992	1.8981
26 Jul 2013 20:47:08	1026538	15903958	hadcm3n_n1lj_1880_40_008404035_0	103,680	197,190	1.9019
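
The "Average (sec/TS)" column in the table above appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The following minimal sketch checks that reading against three rows copied from the table. The function name `seconds_per_timestep` and the tuple layout are illustrative assumptions and are not part of BOINC or the CPDN software.

```python
def seconds_per_timestep(cpu_seconds: int, timestep: int) -> float:
    """Return cumulative CPU seconds per model timestep, rounded to 4 places."""
    return round(cpu_seconds / timestep, 4)


# (timestep, cumulative CPU time in seconds) copied from three trickle rows above.
trickles = [
    (596_160, 1_117_695),  # latest trickle
    (570_240, 1_069_457),
    (103_680, 197_190),    # earliest trickle listed
]

averages = [seconds_per_timestep(cpu, ts) for ts, cpu in trickles]

for (ts, cpu), avg in zip(trickles, averages):
    print(f"timestep {ts:>7,}  cpu {cpu:>9,} s  average {avg:.4f} sec/TS")
```

For these three rows the computed values agree with the Average column as listed (1.8748, 1.8755 and 1.9019).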