Task 15092305

Name	hadcm3n_203l_1980_40_008085740_3
Workunit	8240854
Created	11 Aug 2012, 1:31:33 UTC
Sent	11 Aug 2012, 1:32:09 UTC
Report deadline	10 Nov 2012, 8:59:20 UTC
Received	28 Aug 2012, 6:04:20 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1241393
Run time	9 days 7 hours 23 min 46 sec
CPU time	9 days 3 hours 4 min 14 sec
Validate state	Invalid
Credit	8,709.12
Device peak FLOPS	4.00 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
03:34:59 (3160): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3808, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3924, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3940, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3940, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3940, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3940, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3940, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3940, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
23 Aug 2012 01:14:41	1150245	15092305	hadcm3n_203l_1980_40_008085740_3	725,760	764,874	1.0539
22 Aug 2012 17:29:59	1150245	15092305	hadcm3n_203l_1980_40_008085740_3	699,840	737,278	1.0535
22 Aug 2012 09:36:28	1150245	15092305	hadcm3n_203l_1980_40_008085740_3	673,920	709,518	1.0528
22 Aug 2012 01:54:41	1150245	15092305	hadcm3n_203l_1980_40_008085740_3	648,000	682,064	1.0526
21 Aug 2012 18:38:49	1150245	15092305	hadcm3n_203l_1980_40_008085740_3	622,080	654,380	1.0519
21 Aug 2012 10:25:23	1150245	15092305	hadcm3n_203l_1980_40_008085740_3	596,160	626,748	1.0513
21 Aug 2012 02:43:28	1150245	15092305	hadcm3n_203l_1980_40_008085740_3	570,240	599,207	1.0508
20 Aug 2012 18:09:52	1150245	15092305	hadcm3n_203l_1980_40_008085740_3	544,320	571,498	1.0499
20 Aug 2012 10:14:44	1150245	15092305	hadcm3n_203l_1980_40_008085740_3	518,400	543,559	1.0485
20 Aug 2012 01:07:55	1150245	15092305	hadcm3n_203l_1980_40_008085740_3	492,480	516,198	1.0482
19 Aug 2012 17:17:36	1150245	15092305	hadcm3n_203l_1980_40_008085740_3	466,560	488,486	1.0470
19 Aug 2012 09:04:11	1150245	15092305	hadcm3n_203l_1980_40_008085740_3	440,640	460,344	1.0447
19 Aug 2012 01:17:45	1150245	15092305	hadcm3n_203l_1980_40_008085740_3	414,720	432,288	1.0424
18 Aug 2012 17:26:32	1150245	15092305	hadcm3n_203l_1980_40_008085740_3	388,800	404,463	1.0403
18 Aug 2012 09:59:20	1150245	15092305	hadcm3n_203l_1980_40_008085740_3	362,880	376,549	1.0377
18 Aug 2012 01:37:49	1150245	15092305	hadcm3n_203l_1980_40_008085740_3	336,960	348,793	1.0351
17 Aug 2012 17:52:33	1150245	15092305	hadcm3n_203l_1980_40_008085740_3	311,040	320,978	1.0320
17 Aug 2012 09:55:07	1150245	15092305	hadcm3n_203l_1980_40_008085740_3	285,120	293,145	1.0281
17 Aug 2012 02:48:45	1150245	15092305	hadcm3n_203l_1980_40_008085740_3	259,200	267,437	1.0318
16 Aug 2012 19:51:02	1150245	15092305	hadcm3n_203l_1980_40_008085740_3	233,280	242,688	1.0403
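
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep reached. That is an inference from the listed values, not something the page states. A minimal Python sketch, using three rows copied from the table above and a hypothetical helper name `sec_per_timestep`, checks the relationship:

```python
# Rows copied from the trickle table:
# (timestep, cumulative CPU time in seconds, listed average in sec/TS)
rows = [
    (725_760, 764_874, 1.0539),
    (285_120, 293_145, 1.0281),
    (233_280, 242_688, 1.0403),
]


def sec_per_timestep(cpu_seconds: int, timestep: int) -> float:
    """Assumed definition: cumulative CPU seconds per completed timestep."""
    return cpu_seconds / timestep


for timestep, cpu_seconds, listed in rows:
    computed = sec_per_timestep(cpu_seconds, timestep)
    # Print the computed and listed values side by side for comparison.
    print(f"{timestep:>9,}  computed={computed:.4f}  listed={listed:.4f}")
```

If the assumption holds, the computed and listed values agree to four decimal places for each row.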