Task 16661554

Name	hadcm3n_8c2d_1980_40_008725056_1
Workunit	8871034
Created	16 Jun 2014, 14:49:20 UTC
Sent	16 Jun 2014, 14:49:48 UTC
Report deadline	15 Sep 2014, 22:16:59 UTC
Received	25 Jun 2014, 21:53:07 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1314929
Run time	4 days 22 hours 49 min
CPU time	4 days 17 hours 12 min 42 sec
Validate state	Invalid
Credit	4,976.64
Device peak FLOPS	4.09 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
18:34:17 (5932): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:34:18 (5932): No heartbeat from core client for 30 sec - exiting
18:34:19 (5932): No heartbeat from core client for 30 sec - exiting
18:34:20 (5932): No heartbeat from core client for 30 sec - exiting
18:34:21 (5932): No heartbeat from core client for 30 sec - exiting
18:34:22 (5932): No heartbeat from core client for 30 sec - exiting
18:34:23 (5932): No heartbeat from core client for 30 sec - exiting
18:34:24 (5932): No heartbeat from core client for 30 sec - exiting
18:34:25 (5932): No heartbeat from core client for 30 sec - exiting
18:34:26 (5932): No heartbeat from core client for 30 sec - exiting
18:34:27 (5932): No heartbeat from core client for 30 sec - exiting
07:28:45 (5224): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
08:11:04 (5316): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:55:07 (5500): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:25:08 (192): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:25:09 (192): No heartbeat from core client for 30 sec - exiting
08:25:10 (192): No heartbeat from core client for 30 sec - exiting
08:25:11 (192): No heartbeat from core client for 30 sec - exiting
08:25:12 (192): No heartbeat from core client for 30 sec - exiting
08:25:13 (192): No heartbeat from core client for 30 sec - exiting
08:25:14 (192): No heartbeat from core client for 30 sec - exiting
08:25:15 (192): No heartbeat from core client for 30 sec - exiting
08:25:58 (4368): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:25:59 (4368): No heartbeat from core client for 30 sec - exiting
08:26:00 (4368): No heartbeat from core client for 30 sec - exiting
08:26:01 (4368): No heartbeat from core client for 30 sec - exiting
08:26:02 (4368): No heartbeat from core client for 30 sec - exiting
08:26:03 (4368): No heartbeat from core client for 30 sec - exiting
08:26:04 (4368): No heartbeat from core client for 30 sec - exiting
08:26:05 (4368): No heartbeat from core client for 30 sec - exiting
07:48:29 (5644): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
06:50:41 (1428): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
11:41:25 (4792): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5820, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5748, iMonCtr=1
Model crash detected, will try to restart...
18:38:36 (5748): No heartbeat from core client for 30 sec - exiting
18:38:37 (5748): No heartbeat from core client for 30 sec - exiting
18:38:38 (5748): No heartbeat from core client for 30 sec - exiting
18:38:39 (5748): No heartbeat from core client for 30 sec - exiting
18:38:40 (5748): No heartbeat from core client for 30 sec - exiting
18:38:41 (5748): No heartbeat from core client for 30 sec - exiting
18:38:42 (5748): No heartbeat from core client for 30 sec - exiting
18:38:43 (5748): No heartbeat from core client for 30 sec - exiting
18:38:44 (5748): No heartbeat from core client for 30 sec - exiting
18:38:45 (5748): No heartbeat from core client for 30 sec - exiting
18:38:46 (5748): No heartbeat from core client for 30 sec - exiting
18:38:47 (5748): No heartbeat from core client for 30 sec - exiting
18:38:48 (5748): No heartbeat from core client for 30 sec - exiting
18:38:49 (5748): No heartbeat from core client for 30 sec - exiting
18:38:50 (5748): No heartbeat from core client for 30 sec - exiting
18:38:51 (5748): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6080, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6080, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6080, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7940, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
25 Jun 2014 11:20:40	1314929	16661554	hadcm3n_8c2d_1980_40_008725056_1	414,720	396,003	0.9549
24 Jun 2014 18:54:12	1314929	16661554	hadcm3n_8c2d_1980_40_008725056_1	388,800	372,524	0.9581
24 Jun 2014 01:06:11	1314929	16661554	hadcm3n_8c2d_1980_40_008725056_1	362,880	348,955	0.9616
23 Jun 2014 18:14:52	1314929	16661554	hadcm3n_8c2d_1980_40_008725056_1	336,960	325,172	0.9650
23 Jun 2014 13:01:57	1314929	16661554	hadcm3n_8c2d_1980_40_008725056_1	311,040	301,994	0.9709
23 Jun 2014 13:01:57	1314929	16661554	hadcm3n_8c2d_1980_40_008725056_1	285,120	278,336	0.9762
23 Jun 2014 13:01:57	1314929	16661554	hadcm3n_8c2d_1980_40_008725056_1	259,200	254,692	0.9826
21 Jun 2014 16:16:41	1314929	16661554	hadcm3n_8c2d_1980_40_008725056_1	233,280	228,961	0.9815
20 Jun 2014 23:09:02	1314929	16661554	hadcm3n_8c2d_1980_40_008725056_1	207,360	204,429	0.9859
20 Jun 2014 14:52:35	1314929	16661554	hadcm3n_8c2d_1980_40_008725056_1	181,440	179,154	0.9874
19 Jun 2014 22:19:14	1314929	16661554	hadcm3n_8c2d_1980_40_008725056_1	155,520	153,619	0.9878
19 Jun 2014 14:02:34	1314929	16661554	hadcm3n_8c2d_1980_40_008725056_1	129,600	128,128	0.9886
18 Jun 2014 20:44:09	1314929	16661554	hadcm3n_8c2d_1980_40_008725056_1	103,680	101,299	0.9770
18 Jun 2014 11:56:05	1314929	16661554	hadcm3n_8c2d_1980_40_008725056_1	77,760	74,678	0.9604
17 Jun 2014 19:12:23	1314929	16661554	hadcm3n_8c2d_1980_40_008725056_1	51,840	48,198	0.9297
17 Jun 2014 11:16:18	1314929	16661554	hadcm3n_8c2d_1980_40_008725056_1	25,920	24,076	0.9289
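
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimals. That reading is an assumption inferred from the numbers in the table, not something the page states. A minimal Python sketch that checks it against a few rows (the function name `sec_per_timestep` is hypothetical, chosen for illustration):

```python
# (timestep, cpu_time_sec, reported_average) copied from the trickle table above.
TRICKLES = [
    (414720, 396003, 0.9549),
    (388800, 372524, 0.9581),
    (51840, 48198, 0.9297),
    (25920, 24076, 0.9289),
]


def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals.

    Assumption: this mirrors how the 'Average (sec/TS)' column is derived.
    """
    return round(cpu_time_sec / timestep, 4)


if __name__ == "__main__":
    for timestep, cpu_sec, reported in TRICKLES:
        computed = sec_per_timestep(cpu_sec, timestep)
        status = "matches" if computed == reported else "differs"
        print(f"TS {timestep:>7}: computed {computed:.4f}, reported {reported:.4f} ({status})")
```

Under this assumption, each trickle covers 25,920 further timesteps, so the difference in CPU time between consecutive rows gives the cost of that interval alone rather than the running average.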