Task 13555307

Name	hadcm3n_ykx8_1900_40_007524791_0
Workunit	7722266
Created	28 Oct 2011, 13:33:54 UTC
Sent	30 Oct 2011, 16:54:08 UTC
Report deadline	30 Jan 2012, 0:21:19 UTC
Received	31 Dec 2011, 12:27:13 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	-1073741819 (0xC0000005) STATUS_ACCESS_VIOLATION
Computer ID	1019661
Run time	12 days 15 hours 22 min 39 sec
CPU time	10 days 19 hours 40 min 23 sec
Validate state	Invalid
Credit	6,220.80
Device peak FLOPS	2.24 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.6.38</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1220, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2952, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4164, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5100, iMonCtr=1
Model crash detected, will try to restart...
18:34:47 (4972): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5912, iMonCtr=1
Model crash detected, will try to restart...
19:49:10 (4212): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:52:05 (3412): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
C20:00:55 (4372): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:57:22 (3592): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:13:25 (932): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:13:15 (4588): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4276, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
19:11:06 (4092): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
C12:01:33 (4712): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:14:12 (5088): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4140, iMonCtr=1
Model crash detected, will try to restart...
22:16:22 (2392): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Atmos Hold Restart file rename failed on atmos_restart.hold
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4516, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2192, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2192, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2192, iMonCtr=1
Model crash detected, will try to restart...
16:53:35 (5524): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
C19:12:00 (2876): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CCPDN Monitor - Quit request from BOINC...
18:24:20 (2880): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4232, iMonCtr=1
Model crash detected, will try to restart...
14:30:59 (1252): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:36:15 (2144): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:12:07 (4720): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:01:09 (4680): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5140, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4528, iMonCtr=1
Model crash detected, will try to restart...
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x776D1D6F read attempt to address 0xFFFFFFF8
Engaging BOINC Windows Runtime Debugger...
</stderr_txt>
]]>
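
The exit status above is shown both as a signed decimal (-1073741819) and as a hexadecimal code (0xC0000005). The two are the same 32-bit value read as signed and unsigned. The following is a minimal sketch of that conversion; the function name `exit_status_to_hex` is hypothetical and not part of BOINC or CPDN.

```python
def exit_status_to_hex(exit_status: int) -> str:
    """Reinterpret a signed 32-bit exit status as an unsigned hex code."""
    # Masking with 0xFFFFFFFF maps a negative value onto its
    # two's-complement unsigned 32-bit representation.
    return f"0x{exit_status & 0xFFFFFFFF:08X}"


# Value taken from the "Exit status" field of this task.
print(exit_status_to_hex(-1073741819))
```

For the value reported on this page, the function yields the 0xC0000005 code that the page labels STATUS_ACCESS_VIOLATION.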

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
31 Dec 2011 10:16:48	1019661	13555307	hadcm3n_ykx8_1900_40_007524791_0	518,400	934,576	1.8028
29 Dec 2011 21:11:01	1019661	13555307	hadcm3n_ykx8_1900_40_007524791_0	492,480	888,298	1.8037
28 Dec 2011 11:20:22	1019661	13555307	hadcm3n_ykx8_1900_40_007524791_0	466,560	842,063	1.8048
27 Dec 2011 11:48:13	1019661	13555307	hadcm3n_ykx8_1900_40_007524791_0	440,640	795,269	1.8048
24 Dec 2011 23:51:35	1019661	13555307	hadcm3n_ykx8_1900_40_007524791_0	414,720	749,606	1.8075
23 Dec 2011 09:11:23	1019661	13555307	hadcm3n_ykx8_1900_40_007524791_0	388,800	702,798	1.8076
21 Dec 2011 09:48:10	1019661	13555307	hadcm3n_ykx8_1900_40_007524791_0	362,880	655,698	1.8069
17 Dec 2011 16:40:38	1019661	13555307	hadcm3n_ykx8_1900_40_007524791_0	336,960	609,000	1.8073
12 Dec 2011 22:42:08	1019661	13555307	hadcm3n_ykx8_1900_40_007524791_0	311,040	561,942	1.8067
09 Dec 2011 10:23:02	1019661	13555307	hadcm3n_ykx8_1900_40_007524791_0	285,120	516,657	1.8121
07 Dec 2011 12:33:01	1019661	13555307	hadcm3n_ykx8_1900_40_007524791_0	259,200	469,173	1.8101
03 Dec 2011 16:23:12	1019661	13555307	hadcm3n_ykx8_1900_40_007524791_0	233,280	421,021	1.8048
29 Nov 2011 20:47:33	1019661	13555307	hadcm3n_ykx8_1900_40_007524791_0	207,360	372,573	1.7967
26 Nov 2011 11:20:13	1019661	13555307	hadcm3n_ykx8_1900_40_007524791_0	181,440	326,117	1.7974
19 Nov 2011 18:11:36	1019661	13555307	hadcm3n_ykx8_1900_40_007524791_0	155,520	279,462	1.7970
17 Nov 2011 19:49:33	1019661	13555307	hadcm3n_ykx8_1900_40_007524791_0	129,600	232,106	1.7909
15 Nov 2011 17:41:26	1019661	13555307	hadcm3n_ykx8_1900_40_007524791_0	103,680	184,919	1.7836
09 Nov 2011 20:38:32	1019661	13555307	hadcm3n_ykx8_1900_40_007524791_0	77,760	137,502	1.7683
05 Nov 2011 13:44:16	1019661	13555307	hadcm3n_ykx8_1900_40_007524791_0	51,840	90,646	1.7486
02 Nov 2011 20:51:49	1019661	13555307	hadcm3n_ykx8_1900_40_007524791_0	25,920	46,300	1.7863
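
The "Average (sec/TS)" column in the table above appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The sketch below checks that reading against three rows copied from the table; the helper name `average_sec_per_timestep` is hypothetical, and the formula is an inference from the figures rather than a documented definition.

```python
def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals."""
    return round(cpu_time_sec / timestep, 4)


# (timestep, CPU time in seconds, reported average) copied from the table.
rows = [
    (518_400, 934_576, 1.8028),
    (51_840, 90_646, 1.7486),
    (25_920, 46_300, 1.7863),
]

for timestep, cpu_time_sec, reported in rows:
    computed = average_sec_per_timestep(cpu_time_sec, timestep)
    print(timestep, computed, reported)
```

The rows also show that a trickle is sent every 25,920 timesteps, so the last trickle at timestep 518,400 is the twentieth.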