Task 15279132

Name	hadcm3n_zbuk_1880_40_008200305_0
Workunit	8355429
Created	13 Sep 2012, 6:11:24 UTC
Sent	14 Sep 2012, 4:51:47 UTC
Report deadline	14 Dec 2012, 12:18:58 UTC
Received	30 Oct 2012, 11:50:02 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1286177
Run time	19 days 0 hours 49 min 29 sec
CPU time	18 days 7 hours 43 min 27 sec
Validate state	Invalid
Credit	9,953.28
Device peak FLOPS	2.62 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
01:21:41 (3912): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
06:10:36 (4744): No heartbeat from core client for 30 sec - exiting
06:10:37 (4744): No heartbeat from core client for 30 sec - exiting
06:10:38 (4744): No heartbeat from core client for 30 sec - exiting
06:10:39 (4744): No heartbeat from core client for 30 sec - exiting
06:10:40 (4744): No heartbeat from core client for 30 sec - exiting
06:10:41 (4744): No heartbeat from core client for 30 sec - exiting
06:10:42 (4744): No heartbeat from core client for 30 sec - exiting
06:10:44 (4744): No heartbeat from core client for 30 sec - exiting
06:10:45 (4744): No heartbeat from core client for 30 sec - exiting
06:10:46 (4744): No heartbeat from core client for 30 sec - exiting
06:10:47 (4744): No heartbeat from core client for 30 sec - exiting
06:10:48 (4744): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:10:49 (4744): No heartbeat from core client for 30 sec - exiting
06:10:50 (4744): No heartbeat from core client for 30 sec - exiting
06:10:51 (4744): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4672, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4380, iMonCtr=1
Model crash detected, will try to restart...
CSignal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5016, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5016, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5016, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4072, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4072, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4072, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
28 Oct 2012 21:30:23	1105888	15279132	hadcm3n_zbuk_1880_40_008200305_0	829,440	1,542,570	1.8598
28 Oct 2012 02:28:17	1105888	15279132	hadcm3n_zbuk_1880_40_008200305_0	803,520	1,502,495	1.8699
27 Oct 2012 11:09:35	1105888	15279132	hadcm3n_zbuk_1880_40_008200305_0	777,600	1,448,441	1.8627
26 Oct 2012 16:03:59	1105888	15279132	hadcm3n_zbuk_1880_40_008200305_0	751,680	1,404,208	1.8681
25 Oct 2012 21:26:53	1105888	15279132	hadcm3n_zbuk_1880_40_008200305_0	725,760	1,359,905	1.8738
24 Oct 2012 10:36:57	1105888	15279132	hadcm3n_zbuk_1880_40_008200305_0	699,840	1,313,292	1.8766
22 Oct 2012 19:39:11	1105888	15279132	hadcm3n_zbuk_1880_40_008200305_0	673,920	1,262,529	1.8734
22 Oct 2012 00:16:18	1105888	15279132	hadcm3n_zbuk_1880_40_008200305_0	648,000	1,212,313	1.8709
18 Oct 2012 03:26:10	1105888	15279132	hadcm3n_zbuk_1880_40_008200305_0	622,080	1,162,085	1.8681
16 Oct 2012 05:28:54	1105888	15279132	hadcm3n_zbuk_1880_40_008200305_0	596,160	1,111,460	1.8644
15 Oct 2012 15:21:35	1105888	15279132	hadcm3n_zbuk_1880_40_008200305_0	570,240	1,060,825	1.8603
14 Oct 2012 01:30:20	1105888	15279132	hadcm3n_zbuk_1880_40_008200305_0	544,320	1,005,245	1.8468
13 Oct 2012 01:46:20	1105888	15279132	hadcm3n_zbuk_1880_40_008200305_0	518,400	954,124	1.8405
11 Oct 2012 14:01:39	1105888	15279132	hadcm3n_zbuk_1880_40_008200305_0	492,480	904,804	1.8372
09 Oct 2012 20:38:10	1105888	15279132	hadcm3n_zbuk_1880_40_008200305_0	466,560	856,113	1.8349
09 Oct 2012 00:43:50	1105888	15279132	hadcm3n_zbuk_1880_40_008200305_0	440,640	805,556	1.8281
05 Oct 2012 09:52:45	1105888	15279132	hadcm3n_zbuk_1880_40_008200305_0	414,720	754,672	1.8197
04 Oct 2012 12:08:54	1105888	15279132	hadcm3n_zbuk_1880_40_008200305_0	388,800	703,228	1.8087
03 Oct 2012 05:30:54	1105888	15279132	hadcm3n_zbuk_1880_40_008200305_0	362,880	651,647	1.7958
01 Oct 2012 21:38:43	1105888	15279132	hadcm3n_zbuk_1880_40_008200305_0	336,960	607,650	1.8033