Task 11645020

Name	hadsm3dhet2_u5dm_006670131_3
Workunit	6873385
Created	9 Aug 2010, 15:31:14 UTC
Sent	25 Sep 2010, 3:27:43 UTC
Report deadline	7 Sep 2011, 8:47:43 UTC
Received	3 Oct 2010, 23:36:28 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	977826
Run time	2 days 8 hours 35 min 48 sec
CPU time	1 day 15 hours 14 min 24 sec
Validate state	Invalid
Credit	1,488.65
Device peak FLOPS	2.89 GFLOPS
Application version	UK Met Office HadSM3 Slab Model v6.07 windows_intelx86
Stderr	<core_client_version>6.10.18</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=428, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=428, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2212, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2212, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2212, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2212, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2212, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2212, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( called boinc_finish </stderr_txt> ]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
03 Oct 2010 23:07:02	977826	11645020	hadsm3dhet2_u5dm_006670131_3	162,030	136,254	0.8409
02 Oct 2010 23:02:15	977826	11645020	hadsm3dhet2_u5dm_006670131_3	151,228	127,341	0.8420
02 Oct 2010 03:07:03	977826	11645020	hadsm3dhet2_u5dm_006670131_3	140,426	118,216	0.8418
01 Oct 2010 23:20:35	977826	11645020	hadsm3dhet2_u5dm_006670131_3	129,624	108,803	0.8394
29 Sep 2010 23:12:06	977826	11645020	hadsm3dhet2_u5dm_006670131_3	118,822	99,760	0.8396
29 Sep 2010 04:10:24	977826	11645020	hadsm3dhet2_u5dm_006670131_3	108,020	90,559	0.8384
29 Sep 2010 02:58:50	977826	11645020	hadsm3dhet2_u5dm_006670131_3	97,218	81,225	0.8355
28 Sep 2010 05:29:36	977826	11645020	hadsm3dhet2_u5dm_006670131_3	86,416	71,718	0.8299
28 Sep 2010 03:39:04	977826	11645020	hadsm3dhet2_u5dm_006670131_3	75,614	62,594	0.8278
27 Sep 2010 23:03:52	977826	11645020	hadsm3dhet2_u5dm_006670131_3	64,812	53,607	0.8271
27 Sep 2010 03:09:09	977826	11645020	hadsm3dhet2_u5dm_006670131_3	54,010	44,590	0.8256
26 Sep 2010 23:03:02	977826	11645020	hadsm3dhet2_u5dm_006670131_3	43,208	35,601	0.8239
26 Sep 2010 23:03:02	977826	11645020	hadsm3dhet2_u5dm_006670131_3	32,406	26,636	0.8219
26 Sep 2010 03:40:43	977826	11645020	hadsm3dhet2_u5dm_006670131_3	21,604	17,857	0.8266
25 Sep 2010 23:00:20	977826	11645020	hadsm3dhet2_u5dm_006670131_3	10,802	9,032	0.8361
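
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the timestep count reached so far. The short sketch below checks that reading against three rows copied from the table above. The function name `sec_per_timestep` and the list `trickles` are illustrative names chosen here, not part of BOINC or the CPDN software, and the formula itself is an inference from the figures rather than a documented definition.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_avg).
trickles = [
    (162030, 136254, 0.8409),  # 03 Oct 2010 23:07:02
    (86416, 71718, 0.8299),    # 28 Sep 2010 05:29:36
    (10802, 9032, 0.8361),     # 25 Sep 2010 23:00:20
]


def sec_per_timestep(cpu_time_sec, timestep):
    """Cumulative CPU seconds per model timestep (assumed definition)."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in trickles:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    # The table shows four decimal places, so allow for rounding.
    matches = abs(computed - reported) < 1e-4
    print(f"timestep={timestep} computed={computed:.4f} "
          f"reported={reported:.4f} match={matches}")
```

Under that assumption the three sampled rows agree with the reported averages to four decimal places. Each trickle in the table also advances the timestep by 10,802, so the last one received (162,030) is the fifteenth.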