Task 15826528

Name	hadcm3n_4a0s_2020_40_008389478_0
Workunit	8540337
Created	3 Jun 2013, 23:58:35 UTC
Sent	4 Jun 2013, 0:07:21 UTC
Report deadline	3 Sep 2013, 7:34:32 UTC
Received	16 Jun 2013, 19:13:08 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1251428
Run time	11 days 10 hours 29 min 58 sec
CPU time	9 days 12 hours 25 min 45 sec
Validate state	Invalid
Credit	6,531.84
Device peak FLOPS	2.82 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
15:12:12 (2924): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:18:11 (5848): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:20:59 (3284): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
13:33:00 (4432): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:50:34 (4924): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:51:52 (4924): No heartbeat from core client for 30 sec - exiting
17:51:53 (4924): No heartbeat from core client for 30 sec - exiting
17:51:54 (4924): No heartbeat from core client for 30 sec - exiting
18:39:39 (5948): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Ocean Restart file copy failed on 4a0sko.dam77j0
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
22:31:56 (1100): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
22:34:16 (3332): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
22:35:19 (3332): No heartbeat from core client for 30 sec - exiting
22:35:20 (3332): No heartbeat from core client for 30 sec - exiting
22:35:21 (3332): No heartbeat from core client for 30 sec - exiting
22:35:22 (3332): No heartbeat from core client for 30 sec - exiting
22:35:23 (3332): No heartbeat from core client for 30 sec - exiting
20:38:50 (5828): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:09:11 (4808): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:12:17 (6592): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:12:18 (6592): No heartbeat from core client for 30 sec - exiting
23:12:19 (6592): No heartbeat from core client for 30 sec - exiting
23:12:20 (6592): No heartbeat from core client for 30 sec - exiting
23:12:21 (6592): No heartbeat from core client for 30 sec - exiting
23:12:22 (6592): No heartbeat from core client for 30 sec - exiting
23:12:23 (6592): No heartbeat from core client for 30 sec - exiting
23:12:24 (6592): No heartbeat from core client for 30 sec - exiting
23:12:25 (6592): No heartbeat from core client for 30 sec - exiting
23:12:26 (6592): No heartbeat from core client for 30 sec - exiting
23:12:27 (6592): No heartbeat from core client for 30 sec - exiting
00:02:58 (5868): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:02:59 (5868): No heartbeat from core client for 30 sec - exiting
00:03:00 (5868): No heartbeat from core client for 30 sec - exiting
00:03:01 (5868): No heartbeat from core client for 30 sec - exiting
00:03:02 (5868): No heartbeat from core client for 30 sec - exiting
00:03:03 (5868): No heartbeat from core client for 30 sec - exiting
00:03:04 (5868): No heartbeat from core client for 30 sec - exiting
00:03:05 (5868): No heartbeat from core client for 30 sec - exiting
00:03:06 (5868): No heartbeat from core client for 30 sec - exiting
00:03:07 (5868): No heartbeat from core client for 30 sec - exiting
00:03:08 (5868): No heartbeat from core client for 30 sec - exiting
forrtl: Access is denied.
00:13:46 (7116): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:31:59 (4964): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:32:00 (4964): No heartbeat from core client for 30 sec - exiting
00:32:01 (4964): No heartbeat from core client for 30 sec - exiting
00:32:02 (4964): No heartbeat from core client for 30 sec - exiting
00:32:03 (4964): No heartbeat from core client for 30 sec - exiting
00:32:04 (4964): No heartbeat from core client for 30 sec - exiting
00:32:05 (4964): No heartbeat from core client for 30 sec - exiting
00:32:06 (4964): No heartbeat from core client for 30 sec - exiting
00:32:07 (4964): No heartbeat from core client for 30 sec - exiting
00:32:08 (4964): No heartbeat from core client for 30 sec - exiting
00:36:46 (6256): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:36:47 (6256): No heartbeat from core client for 30 sec - exiting
00:36:48 (6256): No heartbeat from core client for 30 sec - exiting
00:36:49 (6256): No heartbeat from core client for 30 sec - exiting
00:36:50 (6256): No heartbeat from core client for 30 sec - exiting
00:36:51 (6256): No heartbeat from core client for 30 sec - exiting
00:36:52 (6256): No heartbeat from core client for 30 sec - exiting
00:36:53 (6256): No heartbeat from core client for 30 sec - exiting
00:36:54 (6256): No heartbeat from core client for 30 sec - exiting
00:36:55 (6256): No heartbeat from core client for 30 sec - exiting
00:36:56 (6256): No heartbeat from core client for 30 sec - exiting
03:08:58 (6872): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Atmos Hold Restart file rename failed on atmos_restart.hold
Ocean Restart file copy failed on 4a0sko.dan53e0
Ocean Restart file copy failed on 4a0sko.dan53f0
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5220, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5220, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5220, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5220, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5220, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5220, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>
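
The stderr text above repeats a handful of message types many times, so a quick tally can make the failure pattern easier to see. The following is a minimal sketch, not part of BOINC or CPDN: the function name `summarize` and the `SAMPLE` excerpt are hypothetical, and the patterns are only matched against the message wording that appears in this log.

```python
import re
from collections import Counter

# Short excerpt copied from the stderr text above (hypothetical sample input).
SAMPLE = (
    "17:50:34 (4924): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "17:51:52 (4924): No heartbeat from core client for 30 sec - exiting "
    "17:51:53 (4924): No heartbeat from core client for 30 sec - exiting "
    "18:39:39 (5948): No heartbeat from core client for 30 sec - exiting "
    "Atmos Hold Restart file rename failed on atmos_restart.hold "
    "Ocean Restart file copy failed on 4a0sko.dan53e0 "
    "Signal 22 received, exiting... "
    "Model crash detected, will try to restart... "
)

def summarize(text):
    """Count heartbeat losses per process ID, restart-file errors, and crashes."""
    # The number in parentheses before "No heartbeat" is taken to be the PID.
    pids = re.findall(r"\((\d+)\): No heartbeat from core client", text)
    return {
        "heartbeat_by_pid": Counter(int(p) for p in pids),
        "restart_file_errors": len(
            re.findall(r"Restart file (?:copy|rename) failed", text)
        ),
        "model_crashes": text.count("Model crash detected"),
    }

summary = summarize(SAMPLE)
print(summary)
```

Because the patterns do not rely on line breaks, the same function should work whether the log is stored one message per line or as a single run of text.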

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
16 Jun 2013 03:48:38	1251428	15826528	hadcm3n_4a0s_2020_40_008389478_0	544,320	807,188	1.4829
16 Jun 2013 00:12:11	1251428	15826528	hadcm3n_4a0s_2020_40_008389478_0	518,400	768,982	1.4834
15 Jun 2013 03:32:32	1251428	15826528	hadcm3n_4a0s_2020_40_008389478_0	492,480	731,609	1.4856
14 Jun 2013 13:50:42	1251428	15826528	hadcm3n_4a0s_2020_40_008389478_0	466,560	693,426	1.4863
14 Jun 2013 01:23:52	1251428	15826528	hadcm3n_4a0s_2020_40_008389478_0	440,640	655,347	1.4873
13 Jun 2013 12:05:05	1251428	15826528	hadcm3n_4a0s_2020_40_008389478_0	414,720	615,659	1.4845
12 Jun 2013 22:19:03	1251428	15826528	hadcm3n_4a0s_2020_40_008389478_0	388,800	575,283	1.4796
12 Jun 2013 05:09:30	1251428	15826528	hadcm3n_4a0s_2020_40_008389478_0	362,880	539,095	1.4856
11 Jun 2013 13:15:08	1251428	15826528	hadcm3n_4a0s_2020_40_008389478_0	336,960	500,879	1.4865
11 Jun 2013 00:34:35	1251428	15826528	hadcm3n_4a0s_2020_40_008389478_0	311,040	462,984	1.4885
10 Jun 2013 11:59:06	1251428	15826528	hadcm3n_4a0s_2020_40_008389478_0	285,120	424,845	1.4901
09 Jun 2013 22:32:38	1251428	15826528	hadcm3n_4a0s_2020_40_008389478_0	259,200	385,883	1.4887
09 Jun 2013 09:24:55	1251428	15826528	hadcm3n_4a0s_2020_40_008389478_0	233,280	346,342	1.4847
08 Jun 2013 19:12:00	1251428	15826528	hadcm3n_4a0s_2020_40_008389478_0	207,360	306,568	1.4784
07 Jun 2013 14:32:59	1251428	15826528	hadcm3n_4a0s_2020_40_008389478_0	181,440	265,792	1.4649
07 Jun 2013 03:13:05	1251428	15826528	hadcm3n_4a0s_2020_40_008389478_0	155,520	226,946	1.4593
06 Jun 2013 14:15:52	1251428	15826528	hadcm3n_4a0s_2020_40_008389478_0	129,600	188,111	1.4515
06 Jun 2013 03:34:00	1251428	15826528	hadcm3n_4a0s_2020_40_008389478_0	103,680	151,258	1.4589
05 Jun 2013 15:18:12	1251428	15826528	hadcm3n_4a0s_2020_40_008389478_0	77,760	114,545	1.4731
05 Jun 2013 03:58:23	1251428	15826528	hadcm3n_4a0s_2020_40_008389478_0	51,840	76,694	1.4794
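
The Average (sec/TS) column in the table above appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below checks that assumption against three rows copied from the table; the names `ROWS`, `to_int`, and `sec_per_timestep` are illustrative only and do not come from the CPDN site.

```python
# (time sent, timestep, CPU time in seconds, reported average) copied from the table.
ROWS = [
    ("16 Jun 2013 03:48:38", "544,320", "807,188", "1.4829"),
    ("16 Jun 2013 00:12:11", "518,400", "768,982", "1.4834"),
    ("05 Jun 2013 03:58:23", "51,840", "76,694", "1.4794"),
]

def to_int(field):
    """Parse an integer that uses commas as thousands separators."""
    return int(field.replace(",", ""))

def sec_per_timestep(cpu_seconds, timesteps):
    """Average CPU seconds spent per model timestep."""
    return cpu_seconds / timesteps

for sent, timestep, cpu_time, reported in ROWS:
    computed = sec_per_timestep(to_int(cpu_time), to_int(timestep))
    # The table shows four decimal places, so allow for rounding.
    matches = abs(computed - float(reported)) <= 5e-5
    print(f"{sent}: computed {computed:.4f}, reported {reported}, match={matches}")
```

If the assumption holds, every row should agree with the reported value to within rounding at the fourth decimal place.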