Task 15899015

Name	hadcm3n_o0ti_1940_40_008383185_2
Workunit	8534044
Created	21 Jul 2013, 17:14:59 UTC
Sent	21 Jul 2013, 17:15:24 UTC
Report deadline	21 Oct 2013, 0:42:35 UTC
Received	9 Nov 2013, 8:41:09 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	-1073741819 (0xC0000005) STATUS_ACCESS_VIOLATION
Computer ID	1279396
Run time	27 days 4 hours 36 min 15 sec
CPU time	26 days 7 hours 44 min 54 sec
Validate state	Invalid
Credit	11,819.52
Device peak FLOPS	2.59 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.64</core_client_version> <![CDATA[ <message> (unknown error) - exit code -1073741819 (0xc0000005) </message> <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3784, iMonCtr=1 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3576, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3512, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3440, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4816, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2828, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4064, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not rController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3904, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3896, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3572, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... 
Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3572, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3572, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3572, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3572, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3572, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish 21:56:00 (2932): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... BUFFIN: C I/O Error feof - Unit 63 - Return code = 16 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3280, iMonCtr=1 Model crash detected, will try to restart... Unhandled Exception Detected... 
- Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x779AFF6B write attempt to address 0x4334478F Engaging BOINC Windows Runtime Debugger... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x76F471B8 read attempt to address 0xFFFFFFF8 Engaging BOINC Windows Runtime Debugger... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x77D63AC3 read attempt to address 0x00000000 Engaging BOINC Windows Runtime Debugger... </stderr_txt> ]]>
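
Note on the exit status: the decimal value -1073741819 and the hex value 0xC0000005 reported above are the same 32-bit pattern, read as signed and unsigned respectively. The short sketch below shows the conversion in both directions. The helper names are made up for illustration and are not part of BOINC or CPDN.

```python
# Hypothetical helpers (not from BOINC/CPDN) for converting between the
# signed decimal and unsigned hex forms of a 32-bit Windows status code.

def to_unsigned32(value: int) -> int:
    """Reinterpret a (possibly negative) integer as an unsigned 32-bit value."""
    return value & 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


exit_status = -1073741819                      # as shown in the "Exit status" field
status_hex = hex(to_unsigned32(exit_status))   # unsigned hex form
round_trip = to_signed32(0xC0000005)           # back to the signed decimal form

print(status_hex)
print(round_trip)
```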

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
02 Oct 2013 01:45:37	1279396	15899015	hadcm3n_o0ti_1940_40_008383185_2	984,960	1,267,515	1.2869
01 Oct 2013 16:14:43	1279396	15899015	hadcm3n_o0ti_1940_40_008383185_2	959,040	1,234,490	1.2872
30 Sep 2013 22:36:08	1279396	15899015	hadcm3n_o0ti_1940_40_008383185_2	933,120	1,201,849	1.2880
30 Sep 2013 13:13:03	1279396	15899015	hadcm3n_o0ti_1940_40_008383185_2	907,200	1,169,048	1.2886
29 Sep 2013 11:19:43	1279396	15899015	hadcm3n_o0ti_1940_40_008383185_2	881,280	1,135,865	1.2889
27 Sep 2013 14:09:45	1279396	15899015	hadcm3n_o0ti_1940_40_008383185_2	855,360	1,102,582	1.2890
26 Sep 2013 10:32:51	1279396	15899015	hadcm3n_o0ti_1940_40_008383185_2	829,440	1,069,743	1.2897
25 Sep 2013 17:25:52	1279396	15899015	hadcm3n_o0ti_1940_40_008383185_2	803,520	1,036,914	1.2905
23 Sep 2013 23:19:13	1279396	15899015	hadcm3n_o0ti_1940_40_008383185_2	777,600	1,003,763	1.2908
23 Sep 2013 12:13:23	1279396	15899015	hadcm3n_o0ti_1940_40_008383185_2	751,680	970,224	1.2907
20 Sep 2013 19:46:27	1279396	15899015	hadcm3n_o0ti_1940_40_008383185_2	725,760	936,646	1.2906
20 Sep 2013 00:06:02	1279396	15899015	hadcm3n_o0ti_1940_40_008383185_2	699,840	902,710	1.2899
19 Sep 2013 11:12:55	1279396	15899015	hadcm3n_o0ti_1940_40_008383185_2	673,920	869,452	1.2901
17 Sep 2013 15:23:08	1279396	15899015	hadcm3n_o0ti_1940_40_008383185_2	648,000	836,689	1.2912
15 Sep 2013 14:00:35	1279396	15899015	hadcm3n_o0ti_1940_40_008383185_2	622,080	803,694	1.2919
12 Sep 2013 22:13:36	1279396	15899015	hadcm3n_o0ti_1940_40_008383185_2	596,160	770,668	1.2927
11 Sep 2013 14:02:25	1279396	15899015	hadcm3n_o0ti_1940_40_008383185_2	570,240	737,769	1.2938
10 Sep 2013 03:52:26	1279396	15899015	hadcm3n_o0ti_1940_40_008383185_2	544,320	704,947	1.2951
09 Sep 2013 18:23:21	1279396	15899015	hadcm3n_o0ti_1940_40_008383185_2	518,400	672,165	1.2966
08 Sep 2013 23:31:01	1279396	15899015	hadcm3n_o0ti_1940_40_008383185_2	492,480	639,079	1.2977
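
Note on the Average (sec/TS) column: the rows above are consistent with the average being cumulative CPU time divided by the timestep count. That relationship is inferred from the numbers, not taken from project documentation. The sketch below checks it on three rows copied from the table and also computes the rate over the most recent trickle interval. Variable and function names are illustrative only.

```python
# Assumption: Average (sec/TS) = cumulative CPU time (sec) / timestep.
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
rows = [
    (984_960, 1_267_515, 1.2869),
    (959_040, 1_234_490, 1.2872),
    (492_480, 639_079, 1.2977),
]


def average_sec_per_ts(timestep: int, cpu_time_sec: int) -> float:
    """Cumulative CPU seconds per model timestep."""
    return cpu_time_sec / timestep


for timestep, cpu_time, reported in rows:
    computed = average_sec_per_ts(timestep, cpu_time)
    print(f"{timestep:>9,d}  computed={computed:.4f}  reported={reported:.4f}")

# Rate over the latest interval only (difference of the two newest trickles).
(ts_new, cpu_new, _), (ts_old, cpu_old, _) = rows[0], rows[1]
interval_steps = ts_new - ts_old
interval_rate = (cpu_new - cpu_old) / interval_steps
print(f"latest interval: {interval_steps:,d} timesteps at {interval_rate:.4f} sec/TS")
```

The interval rate is slightly lower than the cumulative average, which matches the slow downward drift of the Average column over time.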