Task 15463843

Name	hadcm3n_zaxo_1880_40_008252482_3
Workunit	8407606
Created	26 Nov 2012, 21:24:37 UTC
Sent	26 Nov 2012, 21:24:44 UTC
Report deadline	26 Feb 2013, 4:51:55 UTC
Received	20 Jan 2013, 10:40:44 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	25 (0x00000019) Unknown error code
Computer ID	1255354
Run time	24 days 19 hours 57 min
CPU time	20 days 15 hours 48 min 7 sec
Validate state	Invalid
Credit	11,197.44
Device peak FLOPS	2.52 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The drive cannot locate a specific area or track on the disk. (0x19) - exit code 25 (0x19)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5884, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6012, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5544, iMonCtr=1
Model crash detected, will try to restart...
20:38:51 (4200): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5808, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4544, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4636, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5316, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4344, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5540, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5592, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5224, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5148, iMonCtr=1
Model crash detected, will try to restart...
17:52:38 (5428): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5608, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4680, iMonCtr=1
Model crash detected, will try to restart...
00:07:04 (5076): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:46:36 (3804): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:54:04 (1552): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:56:20 (3080): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4892, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4740, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5728, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3520, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
20:03:56 (768): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:07:14 (5428): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3132, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4284, iMonCtr=1
Model crash detected, will try to restart...
15:14:52 (4764): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5264, iMonCtr=1
Model crash detected, will try to restart...
19:37:16 (5172): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:25:27 (4924): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4112, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
21:39:24 (3516): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:03:16 (4784): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:57:33 (4044): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4972, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2476, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1852, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4820, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5060, iMonCtr=1
Model crash detected, will try to restart...
15:54:31 (5588): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5116, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4604, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5104, iMonCtr=1
Model crash detected, will try to restart...
16:05:11 (4128): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Called boinc_finish
</stderr_txt>
]]>
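
The stderr output above repeats a small set of messages (model crash restarts, missed heartbeats, suspend and quit requests). One way to summarise such a log is to count each message type. The following is a minimal sketch, assuming the stderr text has been saved as a string; the labels in `PATTERNS` and the function name `tally_events` are illustrative and not part of BOINC or CPDN, and `SAMPLE` is a three-message excerpt copied from the log rather than the full output.

```python
import re
from collections import Counter

# Illustrative labels mapped to phrases that appear verbatim in the stderr text.
PATTERNS = {
    "model_crash_restart": r"Model crash detected, will try to restart",
    "no_heartbeat": r"No heartbeat from core client",
    "suspend_request": r"Suspend request from BOINC",
    "quit_request": r"Quit request from BOINC",
}


def tally_events(stderr_text: str) -> Counter:
    """Count how often each known message phrase occurs in the text."""
    counts = Counter()
    for label, pattern in PATTERNS.items():
        counts[label] = len(re.findall(pattern, stderr_text))
    return counts


# Short excerpt copied from the log above (not the full stderr).
SAMPLE = (
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=5884, iMonCtr=1 "
    "Model crash detected, will try to restart... "
    "20:38:51 (4200): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "Suspended CPDN Monitor - Suspend request from BOINC... "
)

if __name__ == "__main__":
    for label, count in tally_events(SAMPLE).items():
        print(f"{label}: {count}")
```

Because the patterns are matched against the whole string, the sketch works whether the log is stored one message per line or as a single flattened run of text.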

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
19 Jan 2013 19:40:05	1255354	15463843	hadcm3n_zaxo_1880_40_008252482_3	933,120	1,770,528	1.8974
19 Jan 2013 02:38:51	1255354	15463843	hadcm3n_zaxo_1880_40_008252482_3	907,200	1,717,364	1.8930
17 Jan 2013 14:21:03	1255354	15463843	hadcm3n_zaxo_1880_40_008252482_3	881,280	1,666,040	1.8905
16 Jan 2013 22:47:24	1255354	15463843	hadcm3n_zaxo_1880_40_008252482_3	855,360	1,615,473	1.8886
14 Jan 2013 15:35:58	1255354	15463843	hadcm3n_zaxo_1880_40_008252482_3	829,440	1,564,783	1.8866
13 Jan 2013 18:26:13	1255354	15463843	hadcm3n_zaxo_1880_40_008252482_3	803,520	1,516,624	1.8875
13 Jan 2013 04:33:48	1255354	15463843	hadcm3n_zaxo_1880_40_008252482_3	777,600	1,469,396	1.8897
12 Jan 2013 12:25:04	1255354	15463843	hadcm3n_zaxo_1880_40_008252482_3	751,680	1,420,248	1.8894
11 Jan 2013 07:09:00	1255354	15463843	hadcm3n_zaxo_1880_40_008252482_3	725,760	1,370,907	1.8889
10 Jan 2013 01:12:22	1255354	15463843	hadcm3n_zaxo_1880_40_008252482_3	699,840	1,322,343	1.8895
09 Jan 2013 09:00:15	1255354	15463843	hadcm3n_zaxo_1880_40_008252482_3	673,920	1,272,846	1.8887
05 Jan 2013 20:26:34	1255354	15463843	hadcm3n_zaxo_1880_40_008252482_3	648,000	1,224,220	1.8892
05 Jan 2013 02:49:27	1255354	15463843	hadcm3n_zaxo_1880_40_008252482_3	622,080	1,172,806	1.8853
03 Jan 2013 16:21:04	1255354	15463843	hadcm3n_zaxo_1880_40_008252482_3	596,160	1,122,051	1.8821
03 Jan 2013 00:34:07	1255354	15463843	hadcm3n_zaxo_1880_40_008252482_3	570,240	1,071,631	1.8793
01 Jan 2013 11:26:51	1255354	15463843	hadcm3n_zaxo_1880_40_008252482_3	544,320	1,021,162	1.8760
31 Dec 2012 18:57:00	1255354	15463843	hadcm3n_zaxo_1880_40_008252482_3	518,400	970,458	1.8720
29 Dec 2012 23:57:10	1255354	15463843	hadcm3n_zaxo_1880_40_008252482_3	492,480	919,681	1.8674
28 Dec 2012 17:56:49	1255354	15463843	hadcm3n_zaxo_1880_40_008252482_3	466,560	868,699	1.8619
24 Dec 2012 22:45:09	1255354	15463843	hadcm3n_zaxo_1880_40_008252482_3	440,640	818,084	1.8566
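
The Average (sec/TS) column in the table above appears to be the cumulative CPU time divided by the timestep count; for example, 1,770,528 / 933,120 ≈ 1.8974. A minimal Python sketch of that check follows, using three rows copied from the table. The function name `sec_per_timestep` is illustrative and not part of BOINC or CPDN, and rounding to four decimal places is an assumption based on how the column is displayed.

```python
def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Average CPU seconds per model timestep, rounded to 4 decimal places."""
    return round(cpu_time_sec / timestep, 4)


# (timestep, cpu_time_sec, reported_average) copied from the trickle table.
ROWS = [
    (933_120, 1_770_528, 1.8974),
    (907_200, 1_717_364, 1.8930),
    (440_640, 818_084, 1.8566),
]

if __name__ == "__main__":
    for timestep, cpu_time, reported in ROWS:
        computed = sec_per_timestep(cpu_time, timestep)
        print(f"{timestep:>9,}  computed={computed:.4f}  reported={reported:.4f}")
```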