Task 11525374

Name	famous_uh0v_1999_200_006654906_3
Workunit	6858278
Created	10 Jun 2010, 14:12:57 UTC
Sent	23 Aug 2010, 21:54:09 UTC
Report deadline	23 Nov 2010, 5:21:20 UTC
Received	22 Oct 2010, 13:13:33 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1026045
Run time	9 days 16 hours 55 min 32 sec
CPU time	8 days 6 hours 31 min 42 sec
Validate state	Invalid
Credit	2,686.79
Device peak FLOPS	1.53 GFLOPS
Application version	UK Met Office FAMOUS v6.11 windows_intelx86
Stderr	<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2704, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
22:31:10 (1484): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1236, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2956, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5628, iMonCtr=1
Model crash detected, will try to restart...
16:45:18 (2860): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:45:19 (2860): No heartbeat from core client for 30 sec - exiting
16:45:20 (2860): No heartbeat from core client for 30 sec - exiting
16:45:21 (2860): No heartbeat from core client for 30 sec - exiting
16:45:22 (2860): No heartbeat from core client for 30 sec - exiting
16:48:48 (5500): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3772, iMonCtr=1
Model crash detected, will try to restart...
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4380, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2544, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=920, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1416, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2264, iMonCtr=1
Model crash detected, will try to restart...
19:16:27 (2240): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:16:28 (2240): No heartbeat from core client for 30 sec - exiting
19:16:29 (2240): No heartbeat from core client for 30 sec - exiting
19:16:30 (2240): No heartbeat from core client for 30 sec - exiting
19:16:31 (2240): No heartbeat from core client for 30 sec - exiting
19:16:32 (2240): No heartbeat from core client for 30 sec - exiting
19:16:33 (2240): No heartbeat from core client for 30 sec - exiting
19:16:34 (2240): No heartbeat from core client for 30 sec - exiting
19:16:35 (2240): No heartbeat from core client for 30 sec - exiting
19:16:36 (2240): No heartbeat from core client for 30 sec - exiting
19:16:37 (2240): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=340, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=340, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=340, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=340, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=340, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=340, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
00:47:17 (340): called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
21 Oct 2010 21:53:18	1026045	11525374	famous_uh0v_1999_200_006654906_3	814,346	708,368	0.8699
21 Oct 2010 19:22:16	1026045	11525374	famous_uh0v_1999_200_006654906_3	804,986	700,100	0.8697
21 Oct 2010 16:49:46	1026045	11525374	famous_uh0v_1999_200_006654906_3	795,626	692,042	0.8698
20 Oct 2010 19:29:33	1026045	11525374	famous_uh0v_1999_200_006654906_3	786,266	684,048	0.8700
20 Oct 2010 17:01:31	1026045	11525374	famous_uh0v_1999_200_006654906_3	776,906	676,102	0.8702
19 Oct 2010 21:42:37	1026045	11525374	famous_uh0v_1999_200_006654906_3	767,546	668,042	0.8704
19 Oct 2010 15:41:41	1026045	11525374	famous_uh0v_1999_200_006654906_3	758,186	659,842	0.8703
18 Oct 2010 19:23:30	1026045	11525374	famous_uh0v_1999_200_006654906_3	748,826	651,243	0.8697
18 Oct 2010 16:41:44	1026045	11525374	famous_uh0v_1999_200_006654906_3	739,466	643,231	0.8699
17 Oct 2010 22:01:07	1026045	11525374	famous_uh0v_1999_200_006654906_3	730,106	635,132	0.8699
17 Oct 2010 19:49:10	1026045	11525374	famous_uh0v_1999_200_006654906_3	720,746	627,293	0.8703
17 Oct 2010 17:37:07	1026045	11525374	famous_uh0v_1999_200_006654906_3	711,386	619,454	0.8708
17 Oct 2010 14:53:36	1026045	11525374	famous_uh0v_1999_200_006654906_3	702,026	610,960	0.8703
17 Oct 2010 12:40:26	1026045	11525374	famous_uh0v_1999_200_006654906_3	692,666	603,097	0.8707
16 Oct 2010 22:35:12	1026045	11525374	famous_uh0v_1999_200_006654906_3	683,306	594,838	0.8705
15 Oct 2010 22:48:32	1026045	11525374	famous_uh0v_1999_200_006654906_3	673,946	585,906	0.8694
15 Oct 2010 20:22:45	1026045	11525374	famous_uh0v_1999_200_006654906_3	664,586	577,650	0.8692
15 Oct 2010 14:59:19	1026045	11525374	famous_uh0v_1999_200_006654906_3	655,226	569,462	0.8691
14 Oct 2010 20:41:44	1026045	11525374	famous_uh0v_1999_200_006654906_3	645,866	561,348	0.8691
14 Oct 2010 16:55:47	1026045	11525374	famous_uh0v_1999_200_006654906_3	636,506	553,033	0.8689
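
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. That reading is inferred from the figures in the table, not from any project documentation. A minimal Python sketch that checks it against four rows copied from the table (the function name sec_per_timestep is illustrative only):

```python
# Rows copied from the trickle table above:
# (timestep, cumulative CPU time in seconds, reported average in sec/TS).
rows = [
    (814_346, 708_368, 0.8699),
    (804_986, 700_100, 0.8697),
    (795_626, 692_042, 0.8698),
    (636_506, 553_033, 0.8689),
]


def sec_per_timestep(timestep, cpu_seconds):
    """Assumed definition: cumulative CPU seconds per timestep, to 4 places."""
    return round(cpu_seconds / timestep, 4)


computed = [sec_per_timestep(ts, cpu) for ts, cpu, _ in rows]

for (ts, cpu, reported), value in zip(rows, computed):
    print(f"{ts:>9,}  {cpu:>9,}  reported={reported:.4f}  computed={value:.4f}")
```

For these four rows the computed values agree with the reported column. Successive trickles in the table are also spaced a constant 9,360 timesteps apart.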