Task 11836175

Name	famous_vmqn_1799_200_006715237_2
Workunit	6918490
Created	26 Aug 2010, 17:52:47 UTC
Sent	4 Oct 2010, 16:23:12 UTC
Report deadline	3 Jan 2011, 23:50:23 UTC
Received	15 Oct 2010, 22:30:07 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	25 (0x00000019) Unknown error code
Computer ID	1087636
Run time	3 days 17 hours 51 min 19 sec
CPU time	2 days 8 hours 57 min 42 sec
Validate state	Invalid
Credit	957.42
Device peak FLOPS	1.77 GFLOPS
Application version	UK Met Office FAMOUS v6.11 windows_intelx86
Stderr	<core_client_version>6.10.58</core_client_version> <![CDATA[ <message> The drive cannot locate a specific area or track on the disk. (0x19) - exit code 25 (0x19) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 20:16:37 (344): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:16:39 (344): No heartbeat from core client for 30 sec - exiting 20:16:40 (344): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4200, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1956, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4656, iMonCtr=1 Model crash detected, will try to restart... 19:29:57 (4008): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CoController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3628, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5092, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=480, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1672, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 14:19:10 (5044): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3820, iMonCtr=1 Model crash detected, will try to restart... 11:17:36 (2672): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7144, iMonCtr=1 Model crash detected, will try to restart... 19:28:55 (3664): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 21:53:34 (4932): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:16:59 (4532): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 12:36:33 (4372): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4640, iMonCtr=1 Model crash detected, will try to restart... 17:49:24 (1032): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:49:26 (1032): No heartbeat from core client for 30 sec - exiting 17:49:27 (1032): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3428, iMonCtr=1 Model crash detected, will try to restart... 18:07:40 (4104): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 60 - Return code = 16 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 61 - Return code = 16 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 13:41:07 (4832): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3140, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4504, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5880, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6128, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 14:25:30 (4508): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6100, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3836, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5860, iMonCtr=1 Model crash detected, will try to restart... 23:29:05 (3232): called boinc_finish </stderr_txt> ]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
15 Oct 2010 17:54:03	1087636	11836175	famous_vmqn_1799_200_006715237_2	290,186	203,476	0.7012
15 Oct 2010 14:28:42	1087636	11836175	famous_vmqn_1799_200_006715237_2	280,826	196,562	0.6999
14 Oct 2010 20:51:55	1087636	11836175	famous_vmqn_1799_200_006715237_2	271,466	190,247	0.7008
14 Oct 2010 16:23:46	1087636	11836175	famous_vmqn_1799_200_006715237_2	262,106	184,015	0.7021
14 Oct 2010 13:36:34	1087636	11836175	famous_vmqn_1799_200_006715237_2	252,746	177,455	0.7021
13 Oct 2010 21:42:59	1087636	11836175	famous_vmqn_1799_200_006715237_2	243,386	171,229	0.7035
13 Oct 2010 17:58:25	1087636	11836175	famous_vmqn_1799_200_006715237_2	234,026	164,672	0.7036
13 Oct 2010 15:12:30	1087636	11836175	famous_vmqn_1799_200_006715237_2	224,666	158,012	0.7033
13 Oct 2010 10:35:35	1087636	11836175	famous_vmqn_1799_200_006715237_2	215,306	151,453	0.7034
12 Oct 2010 13:55:37	1087636	11836175	famous_vmqn_1799_200_006715237_2	205,946	145,075	0.7044
11 Oct 2010 20:46:55	1087636	11836175	famous_vmqn_1799_200_006715237_2	196,586	138,597	0.7050
11 Oct 2010 12:10:06	1087636	11836175	famous_vmqn_1799_200_006715237_2	187,226	132,120	0.7057
10 Oct 2010 14:24:44	1087636	11836175	famous_vmqn_1799_200_006715237_2	177,866	125,660	0.7065
10 Oct 2010 12:06:03	1087636	11836175	famous_vmqn_1799_200_006715237_2	168,506	118,853	0.7053
09 Oct 2010 22:30:38	1087636	11836175	famous_vmqn_1799_200_006715237_2	159,146	112,401	0.7063
09 Oct 2010 22:30:38	1087636	11836175	famous_vmqn_1799_200_006715237_2	149,786	105,643	0.7053
09 Oct 2010 22:30:38	1087636	11836175	famous_vmqn_1799_200_006715237_2	140,426	98,613	0.7022
09 Oct 2010 22:30:38	1087636	11836175	famous_vmqn_1799_200_006715237_2	131,066	92,158	0.7031
09 Oct 2010 22:30:38	1087636	11836175	famous_vmqn_1799_200_006715237_2	121,706	85,485	0.7024
09 Oct 2010 22:30:38	1087636	11836175	famous_vmqn_1799_200_006715237_2	112,346	78,698	0.7005