Task 15800865

Name	hadcm3n_n4wp_1920_40_008365362_1
Workunit	8516221
Created	29 May 2013, 17:01:04 UTC
Sent	29 May 2013, 17:12:33 UTC
Report deadline	29 Aug 2013, 0:39:44 UTC
Received	18 Jul 2013, 13:22:38 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	25 (0x00000019) Unknown error code
Computer ID	1236068
Run time	30 days 18 hours 31 min 31 sec
CPU time	29 days 23 hours 37 min 51 sec
Validate state	Invalid
Credit	11,197.44
Device peak FLOPS	2.18 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The drive cannot locate a specific area or track on the disk. (0x19) - exit code 25 (0x19)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
18:02:38 (3504): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
09:25:54 (3832): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=960, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=960, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=960, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=960, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=960, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=960, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=960, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=960, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=960, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=960, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=960, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=960, iMonCtr=1
Model crash detected, will try to restart...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/n4wpko.pje7c10
Error converting file to netcdf: dataout/n4wpko.pie7c10
Error converting file to netcdf: dataout/n4wpko.pfe7c10
Error converting file to netcdf: dataout/n4wpka.phe7c10
Error converting file to netcdf: dataout/n4wpka.pge7c10
Error converting file to netcdf: dataout/n4wpka.pee7c10
Error converting file to netcdf: dataout/n4wpka.pde7c10
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
09:40:04 (6544): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
23 Jul 2013 16:46:20	1236068	15800865	hadcm3n_n4wp_1920_40_008365362_1	933,120	2,586,465	2.7718
23 Jul 2013 16:46:19	1236068	15800865	hadcm3n_n4wp_1920_40_008365362_1	907,200	2,517,170	2.7747
10 Jul 2013 00:28:35	1236068	15800865	hadcm3n_n4wp_1920_40_008365362_1	881,280	2,446,602	2.7762
09 Jul 2013 00:13:18	1236068	15800865	hadcm3n_n4wp_1920_40_008365362_1	855,360	2,374,265	2.7757
08 Jul 2013 02:51:04	1236068	15800865	hadcm3n_n4wp_1920_40_008365362_1	829,440	2,302,442	2.7759
07 Jul 2013 07:20:46	1236068	15800865	hadcm3n_n4wp_1920_40_008365362_1	803,520	2,230,893	2.7764
06 Jul 2013 10:19:23	1236068	15800865	hadcm3n_n4wp_1920_40_008365362_1	777,600	2,159,734	2.7774
06 Jul 2013 04:51:07	1236068	15800865	hadcm3n_n4wp_1920_40_008365362_1	751,680	2,087,238	2.7768
04 Jul 2013 14:21:39	1236068	15800865	hadcm3n_n4wp_1920_40_008365362_1	725,760	2,014,280	2.7754
03 Jul 2013 08:05:10	1236068	15800865	hadcm3n_n4wp_1920_40_008365362_1	699,840	1,940,522	2.7728
02 Jul 2013 09:53:07	1236068	15800865	hadcm3n_n4wp_1920_40_008365362_1	673,920	1,866,525	2.7697
28 Jun 2013 03:46:16	1236068	15800865	hadcm3n_n4wp_1920_40_008365362_1	648,000	1,801,699	2.7804
27 Jun 2013 07:53:35	1236068	15800865	hadcm3n_n4wp_1920_40_008365362_1	622,080	1,732,076	2.7843
26 Jun 2013 12:01:06	1236068	15800865	hadcm3n_n4wp_1920_40_008365362_1	596,160	1,662,005	2.7879
25 Jun 2013 16:51:22	1236068	15800865	hadcm3n_n4wp_1920_40_008365362_1	570,240	1,591,993	2.7918
24 Jun 2013 14:17:25	1236068	15800865	hadcm3n_n4wp_1920_40_008365362_1	544,320	1,521,670	2.7955
23 Jun 2013 13:28:41	1236068	15800865	hadcm3n_n4wp_1920_40_008365362_1	518,400	1,448,355	2.7939
22 Jun 2013 16:14:57	1236068	15800865	hadcm3n_n4wp_1920_40_008365362_1	492,480	1,374,575	2.7911
21 Jun 2013 19:02:18	1236068	15800865	hadcm3n_n4wp_1920_40_008365362_1	466,560	1,300,634	2.7877
20 Jun 2013 22:17:24	1236068	15800865	hadcm3n_n4wp_1920_40_008365362_1	440,640	1,226,650	2.7838
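
The "Average (sec/TS)" column in the trickle table appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The sketch below checks that reading against three rows copied from the table. The function name and the sample list are illustrative only and are not part of BOINC or the CPDN software.

```python
def seconds_per_timestep(cpu_seconds: int, timesteps: int) -> float:
    """Cumulative CPU seconds divided by cumulative model timesteps."""
    return round(cpu_seconds / timesteps, 4)


# (timestep, CPU time in seconds, reported average) copied from the table above
sample_trickles = [
    (933_120, 2_586_465, 2.7718),
    (907_200, 2_517_170, 2.7747),
    (440_640, 1_226_650, 2.7838),
]

for timesteps, cpu_seconds, reported in sample_trickles:
    computed = seconds_per_timestep(cpu_seconds, timesteps)
    # Allow for rounding in the last printed digit
    matches = abs(computed - reported) < 5e-5
    print(f"{timesteps:>9,}  computed={computed:.4f}  reported={reported:.4f}  match={matches}")
```

If the assumption holds, every row prints `match=True`. A mismatch would suggest the column is computed some other way, for example per trickle interval rather than cumulatively.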