Task 18758732

Name	hadcm3n_xdpn_1940_40_010021554_1
Workunit	10019625
Created	7 Aug 2015, 17:42:12 UTC
Sent	7 Aug 2015, 17:42:18 UTC
Report deadline	7 Nov 2015, 1:09:29 UTC
Received	29 Aug 2015, 10:46:59 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1327724
Run time	14 days 6 hours 23 min 24 sec
CPU time	10 days 2 hours 43 min 28 sec
Validate state	Invalid
Credit	11,197.44
Device peak FLOPS	3.89 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.4.42</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3500, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3500, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3500, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3500, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3800, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1016, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1016, iMonCtr=1
Model crash detected, will try to restart...
11:05:47 (4544): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4692, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3568, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1280, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3168, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4644, iMonCtr=1
Model crash detected, will try to restart...
09:25:05 (3984): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4092, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4092, iMonCtr=1
Model crash detected, will try to restart...
Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy 2048
Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy 2048
Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy 2048
Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy 2048
Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy 2048
Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy 2048
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
29 Aug 2015 09:25:01	1327724	18758732	hadcm3n_xdpn_1940_40_010021554_1	933,120	952,015	1.0202
28 Aug 2015 23:53:59	1327724	18758732	hadcm3n_xdpn_1940_40_010021554_1	907,200	919,345	1.0134
28 Aug 2015 14:35:14	1327724	18758732	hadcm3n_xdpn_1940_40_010021554_1	881,280	886,814	1.0063
27 Aug 2015 16:48:01	1327724	18758732	hadcm3n_xdpn_1940_40_010021554_1	855,360	856,480	1.0013
26 Aug 2015 20:06:23	1327724	18758732	hadcm3n_xdpn_1940_40_010021554_1	829,440	825,910	0.9957
26 Aug 2015 10:21:21	1327724	18758732	hadcm3n_xdpn_1940_40_010021554_1	803,520	795,320	0.9898
25 Aug 2015 12:09:01	1327724	18758732	hadcm3n_xdpn_1940_40_010021554_1	777,600	855,003	1.0995
25 Aug 2015 01:52:53	1327724	18758732	hadcm3n_xdpn_1940_40_010021554_1	751,680	825,029	1.0976
24 Aug 2015 16:42:08	1327724	18758732	hadcm3n_xdpn_1940_40_010021554_1	725,760	795,402	1.0960
23 Aug 2015 18:44:42	1327724	18758732	hadcm3n_xdpn_1940_40_010021554_1	699,840	765,928	1.0944
23 Aug 2015 08:36:13	1327724	18758732	hadcm3n_xdpn_1940_40_010021554_1	673,920	735,932	1.0920
22 Aug 2015 11:50:58	1327724	18758732	hadcm3n_xdpn_1940_40_010021554_1	648,000	704,202	1.0867
21 Aug 2015 12:01:04	1327724	18758732	hadcm3n_xdpn_1940_40_010021554_1	622,080	672,418	1.0809
20 Aug 2015 14:11:20	1327724	18758732	hadcm3n_xdpn_1940_40_010021554_1	596,160	639,789	1.0732
19 Aug 2015 16:03:49	1327724	18758732	hadcm3n_xdpn_1940_40_010021554_1	570,240	607,852	1.0660
19 Aug 2015 06:10:23	1327724	18758732	hadcm3n_xdpn_1940_40_010021554_1	544,320	574,681	1.0558
18 Aug 2015 19:59:11	1327724	18758732	hadcm3n_xdpn_1940_40_010021554_1	518,400	543,428	1.0483
18 Aug 2015 09:05:22	1327724	18758732	hadcm3n_xdpn_1940_40_010021554_1	492,480	511,442	1.0385
17 Aug 2015 23:14:24	1327724	18758732	hadcm3n_xdpn_1940_40_010021554_1	466,560	478,816	1.0263
17 Aug 2015 14:06:32	1327724	18758732	hadcm3n_xdpn_1940_40_010021554_1	440,640	449,745	1.0207