Task 13337162

Name	hadcm3n_o4ws_1900_40_007440160_1
Workunit	7637663
Created	5 Sep 2011, 18:19:10 UTC
Sent	5 Sep 2011, 22:52:26 UTC
Report deadline	6 Dec 2011, 6:19:37 UTC
Received	27 Oct 2011, 2:22:19 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	904548
Run time	35 days 5 hours 13 min 30 sec
CPU time	35 days 5 hours 13 min 30 sec
Validate state	Invalid
Credit	10,575.36
Device peak FLOPS	2.02 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.2.18</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
03:11:56 (5208): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2444, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
07:34:40 (2444): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4056, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4056, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4056, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4056, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4056, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
17 Oct 2011 17:01:10	904548	13337162	hadcm3n_o4ws_1900_40_007440160_1	881,280	3,012,253	3.4180
16 Oct 2011 19:02:27	904548	13337162	hadcm3n_o4ws_1900_40_007440160_1	855,360	2,934,666	3.4309
15 Oct 2011 21:16:41	904548	13337162	hadcm3n_o4ws_1900_40_007440160_1	829,440	2,856,882	3.4444
14 Oct 2011 19:58:57	904548	13337162	hadcm3n_o4ws_1900_40_007440160_1	803,520	2,773,183	3.4513
11 Oct 2011 12:33:59	904548	13337162	hadcm3n_o4ws_1900_40_007440160_1	777,600	2,683,555	3.4511
09 Oct 2011 23:15:13	904548	13337162	hadcm3n_o4ws_1900_40_007440160_1	751,680	2,602,277	3.4619
08 Oct 2011 13:09:17	904548	13337162	hadcm3n_o4ws_1900_40_007440160_1	725,760	2,513,826	3.4637
07 Oct 2011 03:45:03	904548	13337162	hadcm3n_o4ws_1900_40_007440160_1	699,840	2,421,556	3.4602
05 Oct 2011 18:02:47	904548	13337162	hadcm3n_o4ws_1900_40_007440160_1	673,920	2,329,194	3.4562
04 Oct 2011 11:29:18	904548	13337162	hadcm3n_o4ws_1900_40_007440160_1	648,000	2,234,987	3.4491
03 Oct 2011 07:21:33	904548	13337162	hadcm3n_o4ws_1900_40_007440160_1	622,080	2,142,743	3.4445
02 Oct 2011 03:46:56	904548	13337162	hadcm3n_o4ws_1900_40_007440160_1	596,160	2,052,463	3.4428
01 Oct 2011 01:26:34	904548	13337162	hadcm3n_o4ws_1900_40_007440160_1	570,240	1,963,672	3.4436
29 Sep 2011 22:31:31	904548	13337162	hadcm3n_o4ws_1900_40_007440160_1	544,320	1,874,727	3.4442
28 Sep 2011 19:36:29	904548	13337162	hadcm3n_o4ws_1900_40_007440160_1	518,400	1,785,605	3.4445
27 Sep 2011 16:29:04	904548	13337162	hadcm3n_o4ws_1900_40_007440160_1	492,480	1,695,591	3.4430
26 Sep 2011 13:22:52	904548	13337162	hadcm3n_o4ws_1900_40_007440160_1	466,560	1,604,657	3.4393
25 Sep 2011 13:33:33	904548	13337162	hadcm3n_o4ws_1900_40_007440160_1	440,640	1,516,358	3.4413
25 Sep 2011 13:33:33	904548	13337162	hadcm3n_o4ws_1900_40_007440160_1	414,720	1,426,162	3.4389
25 Sep 2011 13:33:33	904548	13337162	hadcm3n_o4ws_1900_40_007440160_1	388,800	1,334,365	3.4320