Task 15904605

Name	hadcm3n_o0ut_1940_40_008404574_0
Workunit	8555430
Created	23 Jul 2013, 19:14:40 UTC
Sent	24 Jul 2013, 12:31:54 UTC
Report deadline	23 Oct 2013, 19:59:05 UTC
Received	13 Sep 2013, 5:14:41 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1285956
Run time	20 days 11 hours 41 min 28 sec
CPU time	18 days 17 hours 35 min 49 sec
Validate state	Invalid
Credit	8,087.04
Device peak FLOPS	1.95 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
10:37:16 (3000): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2616, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3024, iMonCtr=1
Model crash detected, will try to restart...
01:10:34 (3588): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:56:46 (3652): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1856, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1856, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1856, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1856, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1856, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1856, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
13 Sep 2013 05:15:14	1285956	15904605	hadcm3n_o0ut_1940_40_008404574_0	673,920	1,584,106	2.3506
13 Sep 2013 05:15:14	1285956	15904605	hadcm3n_o0ut_1940_40_008404574_0	648,000	1,532,176	2.3645
13 Sep 2013 05:15:14	1285956	15904605	hadcm3n_o0ut_1940_40_008404574_0	622,080	1,480,196	2.3794
13 Sep 2013 05:15:14	1285956	15904605	hadcm3n_o0ut_1940_40_008404574_0	596,160	1,427,988	2.3953
13 Sep 2013 05:15:14	1285956	15904605	hadcm3n_o0ut_1940_40_008404574_0	570,240	1,375,737	2.4126
13 Sep 2013 05:15:14	1285956	15904605	hadcm3n_o0ut_1940_40_008404574_0	544,320	1,323,306	2.4311
13 Sep 2013 05:15:14	1285956	15904605	hadcm3n_o0ut_1940_40_008404574_0	518,400	1,270,896	2.4516
13 Sep 2013 05:15:14	1285956	15904605	hadcm3n_o0ut_1940_40_008404574_0	492,480	1,219,345	2.4759
13 Sep 2013 05:15:14	1285956	15904605	hadcm3n_o0ut_1940_40_008404574_0	466,560	1,169,999	2.5077
13 Sep 2013 05:15:14	1285956	15904605	hadcm3n_o0ut_1940_40_008404574_0	440,640	1,120,371	2.5426
13 Sep 2013 05:15:14	1285956	15904605	hadcm3n_o0ut_1940_40_008404574_0	414,720	1,070,618	2.5815
13 Sep 2013 05:15:14	1285956	15904605	hadcm3n_o0ut_1940_40_008404574_0	388,800	1,021,061	2.6262
13 Sep 2013 05:15:14	1285956	15904605	hadcm3n_o0ut_1940_40_008404574_0	362,880	968,757	2.6696
31 Aug 2013 00:11:06	1285956	15904605	hadcm3n_o0ut_1940_40_008404574_0	336,960	914,512	2.7140
30 Aug 2013 12:11:43	1285956	15904605	hadcm3n_o0ut_1940_40_008404574_0	311,040	848,093	2.7266
30 Aug 2013 12:11:43	1285956	15904605	hadcm3n_o0ut_1940_40_008404574_0	285,120	779,548	2.7341
30 Aug 2013 12:11:43	1285956	15904605	hadcm3n_o0ut_1940_40_008404574_0	259,200	707,991	2.7314
30 Aug 2013 12:11:43	1285956	15904605	hadcm3n_o0ut_1940_40_008404574_0	233,280	636,136	2.7269
30 Aug 2013 12:11:43	1285956	15904605	hadcm3n_o0ut_1940_40_008404574_0	207,360	566,809	2.7335
30 Aug 2013 12:11:43	1285956	15904605	hadcm3n_o0ut_1940_40_008404574_0	181,440	495,036	2.7284
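
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count, which matches the rows above to four decimal places. A minimal sketch of that check follows, assuming the rows are tab-separated in the column order of the header line. The function names (`parse_trickle_row`, `average_sec_per_ts`, `marginal_sec_per_ts`) are hypothetical and not part of BOINC or CPDN.

```python
def parse_trickle_row(line):
    """Split one tab-separated trickle row into (timestep, cpu_seconds, reported_avg).

    Assumes the column order shown in the table header:
    time sent, host ID, result ID, result name, timestep, CPU time, average.
    """
    fields = line.rstrip("\n").split("\t")
    timestep = int(fields[4].replace(",", ""))      # e.g. "673,920" -> 673920
    cpu_seconds = int(fields[5].replace(",", ""))   # e.g. "1,584,106" -> 1584106
    reported_avg = float(fields[6])
    return timestep, cpu_seconds, reported_avg


def average_sec_per_ts(timestep, cpu_seconds):
    """Cumulative average: total CPU seconds per timestep so far."""
    return cpu_seconds / timestep


def marginal_sec_per_ts(newer, older):
    """CPU seconds per timestep spent between two consecutive trickles."""
    return (newer[1] - older[1]) / (newer[0] - older[0])


# The two most recent rows from the table above.
ROWS = [
    "\t".join(["13 Sep 2013 05:15:14", "1285956", "15904605",
               "hadcm3n_o0ut_1940_40_008404574_0",
               "673,920", "1,584,106", "2.3506"]),
    "\t".join(["13 Sep 2013 05:15:14", "1285956", "15904605",
               "hadcm3n_o0ut_1940_40_008404574_0",
               "648,000", "1,532,176", "2.3645"]),
]

newest, previous = (parse_trickle_row(row) for row in ROWS)

for timestep, cpu_seconds, reported_avg in (newest, previous):
    computed = average_sec_per_ts(timestep, cpu_seconds)
    print(f"{timestep:>9,} TS  computed {computed:.4f}  reported {reported_avg:.4f}")

print(f"marginal sec/TS over last interval: {marginal_sec_per_ts(newest, previous):.4f}")
```

The cumulative average lags the current speed because it includes the slower early timesteps. Over the last 25,920-timestep interval in the table the marginal cost works out to about 2.00 sec/TS, against a cumulative 2.3506.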