Task 13333983

Name	hadcm3n_o0ed_1900_40_007438573_0
Workunit	7636076
Created	5 Sep 2011, 18:13:28 UTC
Sent	5 Sep 2011, 18:17:19 UTC
Report deadline	6 Dec 2011, 1:44:30 UTC
Received	26 Sep 2011, 15:35:59 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1140451
Run time	8 days 1 hour 42 min 20 sec
CPU time	7 days 22 hours 52 min 13 sec
Validate state	Invalid
Credit	11,197.44
Device peak FLOPS	4.44 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.58</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 22:23:13 (3664): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 21:01:48 (3872): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 18:50:24 (4080): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 13:12:20 (3252): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=916, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=916, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=916, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=916, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=916, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... 
Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=916, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
25 Sep 2011 19:50:10	1140451	13333983	hadcm3n_o0ed_1900_40_007438573_0	933,120	674,647	0.7230
25 Sep 2011 14:34:11	1140451	13333983	hadcm3n_o0ed_1900_40_007438573_0	907,200	656,114	0.7232
25 Sep 2011 09:15:56	1140451	13333983	hadcm3n_o0ed_1900_40_007438573_0	881,280	637,438	0.7233
25 Sep 2011 04:02:29	1140451	13333983	hadcm3n_o0ed_1900_40_007438573_0	855,360	618,643	0.7233
24 Sep 2011 22:47:30	1140451	13333983	hadcm3n_o0ed_1900_40_007438573_0	829,440	600,099	0.7235
24 Sep 2011 17:36:26	1140451	13333983	hadcm3n_o0ed_1900_40_007438573_0	803,520	581,520	0.7237
24 Sep 2011 12:23:46	1140451	13333983	hadcm3n_o0ed_1900_40_007438573_0	777,600	563,009	0.7240
24 Sep 2011 07:10:36	1140451	13333983	hadcm3n_o0ed_1900_40_007438573_0	751,680	544,352	0.7242
24 Sep 2011 01:57:07	1140451	13333983	hadcm3n_o0ed_1900_40_007438573_0	725,760	525,783	0.7245
23 Sep 2011 20:01:38	1140451	13333983	hadcm3n_o0ed_1900_40_007438573_0	699,840	507,064	0.7245
23 Sep 2011 07:06:02	1140451	13333983	hadcm3n_o0ed_1900_40_007438573_0	673,920	488,044	0.7242
23 Sep 2011 01:55:17	1140451	13333983	hadcm3n_o0ed_1900_40_007438573_0	648,000	469,392	0.7244
22 Sep 2011 17:41:03	1140451	13333983	hadcm3n_o0ed_1900_40_007438573_0	622,080	450,533	0.7242
22 Sep 2011 11:35:16	1140451	13333983	hadcm3n_o0ed_1900_40_007438573_0	596,160	431,502	0.7238
22 Sep 2011 02:13:30	1140451	13333983	hadcm3n_o0ed_1900_40_007438573_0	570,240	412,124	0.7227
21 Sep 2011 20:34:24	1140451	13333983	hadcm3n_o0ed_1900_40_007438573_0	544,320	393,143	0.7223
21 Sep 2011 15:32:03	1140451	13333983	hadcm3n_o0ed_1900_40_007438573_0	518,400	374,222	0.7219
21 Sep 2011 10:17:33	1140451	13333983	hadcm3n_o0ed_1900_40_007438573_0	492,480	355,691	0.7222
21 Sep 2011 05:04:15	1140451	13333983	hadcm3n_o0ed_1900_40_007438573_0	466,560	337,140	0.7226
20 Sep 2011 23:20:18	1140451	13333983	hadcm3n_o0ed_1900_40_007438573_0	440,640	318,452	0.7227
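
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. This is an inference from the figures in the table, not something the page states. A minimal sketch that checks this against three rows copied from the table; the names `trickles` and `average_sec_per_timestep` are hypothetical and chosen only for illustration:

```python
# Rows copied from the trickle table above:
# (timestep, cumulative CPU time in seconds, reported average sec/TS)
trickles = [
    (933_120, 674_647, 0.7230),
    (907_200, 656_114, 0.7232),
    (440_640, 318_452, 0.7227),
]


def average_sec_per_timestep(timestep, cpu_seconds):
    """Assumed definition: cumulative CPU seconds per model timestep."""
    return cpu_seconds / timestep


for timestep, cpu_seconds, reported in trickles:
    computed = average_sec_per_timestep(timestep, cpu_seconds)
    # Show the recomputed value next to the one reported in the table.
    print(f"{timestep:>9,}  computed {computed:.4f}  reported {reported:.4f}")
```

If the assumption holds, each computed value should agree with the reported one to the four decimal places shown in the table.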