Task 14564183

Name	hadcm3n_u6x0_1980_40_007684002_3
Workunit	7839089
Created	23 Apr 2012, 0:21:29 UTC
Sent	23 Apr 2012, 0:22:04 UTC
Report deadline	23 Jul 2012, 7:49:15 UTC
Received	15 Jun 2012, 0:34:29 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1166383
Run time	19 days 22 hours 49 min 51 sec
CPU time	18 days 23 hours 58 min 16 sec
Validate state	Invalid
Credit	9,953.28
Device peak FLOPS	2.50 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.25</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
08:08:11 (2444): No heartbeat from core client for 30 sec - exiting
08:08:12 (2444): No heartbeat from core client for 30 sec - exiting
08:08:13 (2444): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4772, iMonCtr=1
Model crash detected, will try to restart...
06:05:42 (2044): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:05:43 (2044): No heartbeat from core client for 30 sec - exiting
06:05:44 (2044): No heartbeat from core client for 30 sec - exiting
06:05:45 (2044): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
03:52:44 (2100): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2752, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2752, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2752, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2752, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2752, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2752, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
14 Jun 2012 15:47:13	1166383	14564183	hadcm3n_u6x0_1980_40_007684002_3	829,440	1,613,925	1.9458
14 Jun 2012 07:25:43	1166383	14564183	hadcm3n_u6x0_1980_40_007684002_3	803,520	1,582,889	1.9699
13 Jun 2012 22:05:46	1166383	14564183	hadcm3n_u6x0_1980_40_007684002_3	777,600	1,551,985	1.9959
13 Jun 2012 13:35:08	1166383	14564183	hadcm3n_u6x0_1980_40_007684002_3	751,680	1,521,023	2.0235
13 Jun 2012 04:42:16	1166383	14564183	hadcm3n_u6x0_1980_40_007684002_3	725,760	1,489,895	2.0529
12 Jun 2012 20:02:02	1166383	14564183	hadcm3n_u6x0_1980_40_007684002_3	699,840	1,458,867	2.0846
12 Jun 2012 11:24:29	1166383	14564183	hadcm3n_u6x0_1980_40_007684002_3	673,920	1,427,898	2.1188
12 Jun 2012 03:26:17	1166383	14564183	hadcm3n_u6x0_1980_40_007684002_3	648,000	1,396,780	2.1555
11 Jun 2012 18:23:03	1166383	14564183	hadcm3n_u6x0_1980_40_007684002_3	622,080	1,365,920	2.1957
11 Jun 2012 09:41:14	1166383	14564183	hadcm3n_u6x0_1980_40_007684002_3	596,160	1,334,909	2.2392
10 Jun 2012 22:14:31	1166383	14564183	hadcm3n_u6x0_1980_40_007684002_3	570,240	1,293,023	2.2675
10 Jun 2012 04:43:38	1166383	14564183	hadcm3n_u6x0_1980_40_007684002_3	544,320	1,236,472	2.2716
09 Jun 2012 12:10:51	1166383	14564183	hadcm3n_u6x0_1980_40_007684002_3	518,400	1,176,896	2.2702
08 Jun 2012 18:25:37	1166383	14564183	hadcm3n_u6x0_1980_40_007684002_3	492,480	1,117,519	2.2692
08 Jun 2012 01:05:10	1166383	14564183	hadcm3n_u6x0_1980_40_007684002_3	466,560	1,058,094	2.2679
07 Jun 2012 07:58:32	1166383	14564183	hadcm3n_u6x0_1980_40_007684002_3	440,640	998,673	2.2664
06 Jun 2012 14:44:07	1166383	14564183	hadcm3n_u6x0_1980_40_007684002_3	414,720	939,231	2.2647
05 Jun 2012 21:37:18	1166383	14564183	hadcm3n_u6x0_1980_40_007684002_3	388,800	880,020	2.2634
05 Jun 2012 04:00:14	1166383	14564183	hadcm3n_u6x0_1980_40_007684002_3	362,880	820,731	2.2617
04 Jun 2012 09:09:46	1166383	14564183	hadcm3n_u6x0_1980_40_007684002_3	336,960	762,238	2.2621
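
The page does not say how the Average (sec/TS) column is derived. The rows above are consistent with it being cumulative CPU time divided by the timestep count, and the sketch below checks that reading against three of the rows. The names `Trickle`, `parse_trickle`, `cumulative_average` and `marginal_rate` are hypothetical helpers introduced for illustration; they are not part of BOINC or the CPDN software.

```python
from typing import NamedTuple


class Trickle(NamedTuple):
    timestep: int            # model timesteps completed so far
    cpu_seconds: int         # cumulative CPU time in seconds
    reported_average: float  # "Average (sec/TS)" as shown on the page


def parse_trickle(row: str) -> Trickle:
    """Parse one tab-separated trickle row copied from the table above."""
    fields = row.split("\t")
    return Trickle(
        timestep=int(fields[4].replace(",", "")),
        cpu_seconds=int(fields[5].replace(",", "")),
        reported_average=float(fields[6]),
    )


def cumulative_average(t: Trickle) -> float:
    """Assumed definition of the Average column: CPU seconds per timestep."""
    return t.cpu_seconds / t.timestep


def marginal_rate(newer: Trickle, older: Trickle) -> float:
    """CPU seconds per timestep over the interval between two trickles."""
    return (newer.cpu_seconds - older.cpu_seconds) / (newer.timestep - older.timestep)


# Three rows taken verbatim from the table (newest first, as on the page).
ROWS = [
    "14 Jun 2012 15:47:13\t1166383\t14564183\thadcm3n_u6x0_1980_40_007684002_3\t829,440\t1,613,925\t1.9458",
    "14 Jun 2012 07:25:43\t1166383\t14564183\thadcm3n_u6x0_1980_40_007684002_3\t803,520\t1,582,889\t1.9699",
    "04 Jun 2012 09:09:46\t1166383\t14564183\thadcm3n_u6x0_1980_40_007684002_3\t336,960\t762,238\t2.2621",
]

trickles = [parse_trickle(row) for row in ROWS]

for t in trickles:
    # Print the recomputed value next to the one reported on the page.
    print(t.timestep, round(cumulative_average(t), 4), t.reported_average)

# Rate over the most recent trickle interval only.
print(round(marginal_rate(trickles[0], trickles[1]), 4))
```

Because the column is a running average, it lags changes in speed: over the last interval in the table the marginal rate works out to roughly 1.2 sec/TS, well below the cumulative figure of 1.9458.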