Task 11494410

Name	famous_uc8w_1799_200_006648715_3
Workunit	6852087
Created	10 Jun 2010, 13:19:02 UTC
Sent	13 Aug 2010, 6:26:07 UTC
Report deadline	12 Nov 2010, 13:53:18 UTC
Received	22 Aug 2010, 18:58:47 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1057648
Run time	4 days 17 hours 19 min 25 sec
CPU time	4 days 15 hours 45 min 7 sec
Validate state	Invalid
Credit	4,292.63
Device peak FLOPS	2.86 GFLOPS
Application version	UK Met Office FAMOUS v6.11 windows_intelx86
Stderr	<core_client_version>6.10.56</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
17:35:08 (2252): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:41:14 (4532): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
10:04:52 (4604): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:05:05 (4604): No heartbeat from core client for 30 sec - exiting
10:06:44 (5000): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
21:14:03 (4940): No heartbeat from core client for 30 sec - exiting
21:14:04 (4940): No heartbeat from core client for 30 sec - exiting
21:14:05 (4940): No heartbeat from core client for 30 sec - exiting
21:14:06 (4940): No heartbeat from core client for 30 sec - exiting
21:14:07 (4940): No heartbeat from core client for 30 sec - exiting
21:14:38 (4940): No heartbeat from core client for 30 sec - exiting
21:14:39 (4940): No heartbeat from core client for 30 sec - exiting
21:14:40 (4940): No heartbeat from core client for 30 sec - exiting
21:14:41 (4940): No heartbeat from core client for 30 sec - exiting
21:14:42 (4940): No heartbeat from core client for 30 sec - exiting
21:14:43 (4940): No heartbeat from core client for 30 sec - exiting
21:14:44 (4940): No heartbeat from core client for 30 sec - exiting
21:14:45 (4940): No heartbeat from core client for 30 sec - exiting
21:14:46 (4940): No heartbeat from core client for 30 sec - exiting
21:14:47 (4940): No heartbeat from core client for 30 sec - exiting
21:14:48 (4940): No heartbeat from core client for 30 sec - exiting
21:14:49 (4940): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
23:49:49 (5032): No heartbeat from core client for 30 sec - exiting
23:49:50 (5032): No heartbeat from core client for 30 sec - exiting
23:49:51 (5032): No heartbeat from core client for 30 sec - exiting
23:49:52 (5032): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
04:49:32 (3668): No heartbeat from core client for 30 sec - exiting
04:49:33 (3668): No heartbeat from core client for 30 sec - exiting
04:49:34 (3668): No heartbeat from core client for 30 sec - exiting
04:49:35 (3668): No heartbeat from core client for 30 sec - exiting
04:49:36 (3668): No heartbeat from core client for 30 sec - exiting
04:49:37 (3668): No heartbeat from core client for 30 sec - exiting
04:49:38 (3668): No heartbeat from core client for 30 sec - exiting
04:49:39 (3668): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
10:05:40 (3272): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:05:41 (3272): No heartbeat from core client for 30 sec - exiting
10:05:42 (3272): No heartbeat from core client for 30 sec - exiting
10:05:43 (3272): No heartbeat from core client for 30 sec - exiting
10:05:44 (3272): No heartbeat from core client for 30 sec - exiting
10:05:45 (3272): No heartbeat from core client for 30 sec - exiting
10:05:46 (3272): No heartbeat from core client for 30 sec - exiting
10:05:47 (3272): No heartbeat from core client for 30 sec - exiting
10:05:48 (3272): No heartbeat from core client for 30 sec - exiting
10:05:49 (3272): No heartbeat from core client for 30 sec - exiting
10:05:50 (3272): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
11:32:01 (4324): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:21:46 (628): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:08:32 (688): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:10:32 (4792): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:03:00 (3000): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:03:01 (3000): No heartbeat from core client for 30 sec - exiting
10:03:09 (3000): No heartbeat from core client for 30 sec - exiting
10:04:32 (1284): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:04:33 (1284): No heartbeat from core client for 30 sec - exiting
10:04:34 (1284): No heartbeat from core client for 30 sec - exiting
10:04:35 (1284): No heartbeat from core client for 30 sec - exiting
10:04:36 (1284): No heartbeat from core client for 30 sec - exiting
10:04:37 (1284): No heartbeat from core client for 30 sec - exiting
10:05:10 (3796): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:05:11 (3796): No heartbeat from core client for 30 sec - exiting
10:05:12 (3796): No heartbeat from core client for 30 sec - exiting
10:05:13 (3796): No heartbeat from core client for 30 sec - exiting
10:05:14 (3796): No heartbeat from core client for 30 sec - exiting
10:47:10 (4408): No heartbeat from core client for 30 sec - exiting
10:47:11 (4408): No heartbeat from core client for 30 sec - exiting
10:47:12 (4408): No heartbeat from core client for 30 sec - exiting
10:47:13 (4408): No heartbeat from core client for 30 sec - exiting
10:47:14 (4408): No heartbeat from core client for 30 sec - exiting
10:47:15 (4408): No heartbeat from core client for 30 sec - exiting
10:47:16 (4408): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
16:42:18 (4660): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
19:11:37 (4728): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:11:38 (4728): No heartbeat from core client for 30 sec - exiting
19:11:39 (4728): No heartbeat from core client for 30 sec - exiting
19:11:40 (4728): No heartbeat from core client for 30 sec - exiting
19:11:41 (4728): No heartbeat from core client for 30 sec - exiting
19:11:42 (4728): No heartbeat from core client for 30 sec - exiting
19:11:43 (4728): No heartbeat from core client for 30 sec - exiting
19:11:44 (4728): No heartbeat from core client for 30 sec - exiting
19:11:45 (4728): No heartbeat from core client for 30 sec - exiting
19:11:46 (4728): No heartbeat from core client for 30 sec - exiting
19:11:47 (4728): No heartbeat from core client for 30 sec - exiting
cpdnmonitor: cannot open input file D:\BOINC\Data/projects/climateprediction.net/famous_uc8w_1799_200_006648715/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy
cpdnmonitor: cannot open input file D:\BOINC\Data/projects/climateprediction.net/famous_uc8w_1799_200_006648715/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy
cpdnmonitor: cannot open input file D:\BOINC\Data/projects/climateprediction.net/famous_uc8w_1799_200_006648715/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy
cpdnmonitor: cannot open input file D:\BOINC\Data/projects/climateprediction.net/famous_uc8w_1799_200_006648715/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy
cpdnmonitor: cannot open input file D:\BOINC\Data/projects/climateprediction.net/famous_uc8w_1799_200_006648715/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy
cpdnmonitor: cannot open input file D:\BOINC\Data/projects/climateprediction.net/famous_uc8w_1799_200_006648715/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy
Sorry, too many model crashes! :-(
19:14:42 (3840): called boinc_finish
</stderr_txt>
]]>
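
The stderr output above repeats a handful of message types many times, so a tally can make the sequence of events easier to read. The following is a minimal sketch, not part of the BOINC or CPDN tooling: the function name `tally_events`, the labels, and the patterns are assumptions chosen for illustration, and `sample` is a short excerpt of messages that appear in this log.

```python
import re
from collections import Counter

# Hypothetical labels mapped to substrings seen in the stderr output above.
# These patterns are assumptions for this sketch, not an official message list.
PATTERNS = {
    "no_heartbeat": re.compile(r"No heartbeat from core client"),
    "quit_request": re.compile(r"Quit request from BOINC"),
    "suspend_request": re.compile(r"Suspend request from BOINC"),
    "restart_attempt": re.compile(r"Model crash detected, will try to restart"),
    "model_crashed": re.compile(r"Model crashed:"),
    "buffin_read_failed": re.compile(r"BUFFIN: Read Failed"),
}


def tally_events(stderr_text: str) -> Counter:
    """Count occurrences of each known message type in the stderr text.

    Matching is done on the whole text, so it works whether the log has
    one message per line or has been flattened into a single line.
    """
    counts = Counter()
    for label, pattern in PATTERNS.items():
        counts[label] = len(pattern.findall(stderr_text))
    return counts


# Short excerpt built from messages that occur in the log above.
sample = (
    "Suspended CPDN Monitor - Suspend request from BOINC...\n"
    "CPDN Monitor - Quit request from BOINC...\n"
    "17:35:08 (2252): No heartbeat from core client for 30 sec - exiting\n"
    "CPDN Monitor - No 'heartbeat' from BOINC...\n"
    "Model crash detected, will try to restart...\n"
    "Model crashed: READ_FLH: I/O error tmp/pipe_dummy\n"
)

if __name__ == "__main__":
    for label, count in tally_events(sample).items():
        print(f"{label}: {count}")
```

To apply it to the full log, pass the text between `<stderr_txt>` and `</stderr_txt>` to `tally_events`.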

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
22 Aug 2010 17:30:08	1057648	11494410	famous_uc8w_1799_200_006648715_3	1,301,066	400,307	0.3077
22 Aug 2010 16:47:15	1057648	11494410	famous_uc8w_1799_200_006648715_3	1,291,706	397,602	0.3078
22 Aug 2010 15:58:50	1057648	11494410	famous_uc8w_1799_200_006648715_3	1,282,346	394,891	0.3079
22 Aug 2010 15:14:22	1057648	11494410	famous_uc8w_1799_200_006648715_3	1,272,986	392,180	0.3081
22 Aug 2010 14:26:09	1057648	11494410	famous_uc8w_1799_200_006648715_3	1,263,626	389,440	0.3082
22 Aug 2010 13:46:59	1057648	11494410	famous_uc8w_1799_200_006648715_3	1,254,266	386,872	0.3084
22 Aug 2010 12:13:52	1057648	11494410	famous_uc8w_1799_200_006648715_3	1,244,906	383,877	0.3084
22 Aug 2010 11:26:08	1057648	11494410	famous_uc8w_1799_200_006648715_3	1,235,546	381,046	0.3084
22 Aug 2010 10:38:44	1057648	11494410	famous_uc8w_1799_200_006648715_3	1,226,186	378,194	0.3084
22 Aug 2010 09:56:41	1057648	11494410	famous_uc8w_1799_200_006648715_3	1,216,826	375,701	0.3088
22 Aug 2010 08:58:23	1057648	11494410	famous_uc8w_1799_200_006648715_3	1,207,466	372,518	0.3085
22 Aug 2010 08:16:01	1057648	11494410	famous_uc8w_1799_200_006648715_3	1,198,106	369,707	0.3086
22 Aug 2010 07:23:36	1057648	11494410	famous_uc8w_1799_200_006648715_3	1,188,746	366,904	0.3086
22 Aug 2010 06:36:04	1057648	11494410	famous_uc8w_1799_200_006648715_3	1,179,386	364,091	0.3087
22 Aug 2010 05:52:46	1057648	11494410	famous_uc8w_1799_200_006648715_3	1,170,026	361,279	0.3088
22 Aug 2010 05:08:33	1057648	11494410	famous_uc8w_1799_200_006648715_3	1,160,666	358,526	0.3089
22 Aug 2010 04:20:14	1057648	11494410	famous_uc8w_1799_200_006648715_3	1,151,306	355,652	0.3089
22 Aug 2010 03:28:51	1057648	11494410	famous_uc8w_1799_200_006648715_3	1,141,946	352,852	0.3090
22 Aug 2010 02:41:14	1057648	11494410	famous_uc8w_1799_200_006648715_3	1,132,586	350,034	0.3091
22 Aug 2010 01:56:39	1057648	11494410	famous_uc8w_1799_200_006648715_3	1,123,226	347,310	0.3092