Task 13144646

Name	hadcm3n_yc7e_1900_40_007348676_2
Workunit	7546106
Created	18 Jul 2011, 5:41:41 UTC
Sent	18 Jul 2011, 5:50:01 UTC
Report deadline	17 Oct 2011, 13:17:12 UTC
Received	26 Aug 2011, 6:14:05 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	-1073741819 (0xC0000005) STATUS_ACCESS_VIOLATION
Computer ID	1115924
Run time	9 days 4 hours 7 min 58 sec
CPU time	8 days 19 hours 9 min 59 sec
Validate state	Invalid
Credit	6,220.80
Device peak FLOPS	2.71 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4524, iMonCtr=1
Model crash detected, will try to restart...
07:49:47 (5548): No heartbeat from core client for 30 sec - exiting
07:49:48 (5548): No heartbeat from core client for 30 sec - exiting
07:49:49 (5548): No heartbeat from core client for 30 sec - exiting
07:49:50 (5548): No heartbeat from core client for 30 sec - exiting
07:49:51 (5548): No heartbeat from core client for 30 sec - exiting
07:49:52 (5548): No heartbeat from core client for 30 sec - exiting
07:49:53 (5548): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4684, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1248, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1248, iMonCtr=1
Model crash detected, will try to restart...
07:53:40 (5376): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7248, iMonCtr=1
Model crash detected, will try to restart...
07:52:14 (956): No heartbeat from core client for 30 sec - exiting
07:52:15 (956): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4924, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4516, iMonCtr=1
Model crash detected, will try to restart...
07:51:32 (5208): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
08:20:33 (4236): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:49:55 (4140): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:32:11 (4432): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:52:50 (4944): No heartbeat from core client for 30 sec - exiting
07:52:51 (4944): No heartbeat from core client for 30 sec - exiting
07:52:52 (4944): No heartbeat from core client for 30 sec - exiting
07:52:53 (4944): No heartbeat from core client for 30 sec - exiting
07:52:54 (4944): No heartbeat from core client for 30 sec - exiting
07:52:55 (4944): No heartbeat from core client for 30 sec - exiting
07:52:56 (4944): No heartbeat from core client for 30 sec - exiting
07:52:57 (4944): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=544, iMonCtr=1
Model crash detected, will try to restart...
16:03:22 (4212): No heartbeat from core client for 30 sec - exiting
16:03:23 (4212): No heartbeat from core client for 30 sec - exiting
16:03:24 (4212): No heartbeat from core client for 30 sec - exiting
16:03:25 (4212): No heartbeat from core client for 30 sec - exiting
16:03:26 (4212): No heartbeat from core client for 30 sec - exiting
16:03:27 (4212): No heartbeat from core client for 30 sec - exiting
16:03:28 (4212): No heartbeat from core client for 30 sec - exiting
16:03:29 (4212): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5344, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5344, iMonCtr=1
Model crash detected, will try to restart...
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x77B26E0F read attempt to address 0x00000000
Engaging BOINC Windows Runtime Debugger...
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x77B06E0F read attempt to address 0x00000000
Engaging BOINC Windows Runtime Debugger...
</stderr_txt>
]]>
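
The exit status above is shown both as a signed decimal (-1073741819) and as a hexadecimal code (0xC0000005). A minimal sketch of how the two relate, assuming the decimal is simply the 32-bit two's-complement reading of the same value; the helper name `to_ntstatus_hex` is hypothetical and not part of BOINC:

```python
def to_ntstatus_hex(exit_status: int) -> str:
    """Reinterpret a signed 32-bit exit status as an unsigned hex code."""
    # Masking with 0xFFFFFFFF maps a negative value onto its unsigned
    # 32-bit two's-complement equivalent.
    return f"0x{exit_status & 0xFFFFFFFF:08X}"


# Value taken from the "Exit status" field of this task.
print(to_ntstatus_hex(-1073741819))  # prints 0xC0000005
```

The same mask leaves non-negative exit codes unchanged, so it can be applied to any exit status reported in this form.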

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
25 Aug 2011 13:36:47	1115924	13144646	hadcm3n_yc7e_1900_40_007348676_2	518,400	751,527	1.4497
24 Aug 2011 10:30:42	1115924	13144646	hadcm3n_yc7e_1900_40_007348676_2	492,480	713,457	1.4487
23 Aug 2011 08:50:08	1115924	13144646	hadcm3n_yc7e_1900_40_007348676_2	466,560	675,822	1.4485
19 Aug 2011 12:16:48	1115924	13144646	hadcm3n_yc7e_1900_40_007348676_2	440,640	637,710	1.4472
18 Aug 2011 09:23:13	1115924	13144646	hadcm3n_yc7e_1900_40_007348676_2	414,720	600,257	1.4474
17 Aug 2011 07:31:50	1115924	13144646	hadcm3n_yc7e_1900_40_007348676_2	388,800	562,530	1.4468
15 Aug 2011 07:55:20	1115924	13144646	hadcm3n_yc7e_1900_40_007348676_2	362,880	524,355	1.4450
12 Aug 2011 11:37:01	1115924	13144646	hadcm3n_yc7e_1900_40_007348676_2	336,960	487,072	1.4455
11 Aug 2011 08:39:22	1115924	13144646	hadcm3n_yc7e_1900_40_007348676_2	311,040	449,310	1.4445
10 Aug 2011 06:52:32	1115924	13144646	hadcm3n_yc7e_1900_40_007348676_2	285,120	411,553	1.4434
08 Aug 2011 13:38:48	1115924	13144646	hadcm3n_yc7e_1900_40_007348676_2	259,200	373,847	1.4423
05 Aug 2011 07:32:26	1115924	13144646	hadcm3n_yc7e_1900_40_007348676_2	233,280	336,026	1.4404
04 Aug 2011 06:14:44	1115924	13144646	hadcm3n_yc7e_1900_40_007348676_2	207,360	298,519	1.4396
29 Jul 2011 11:24:22	1115924	13144646	hadcm3n_yc7e_1900_40_007348676_2	181,440	260,839	1.4376
27 Jul 2011 14:07:22	1115924	13144646	hadcm3n_yc7e_1900_40_007348676_2	155,520	223,220	1.4353
26 Jul 2011 11:40:24	1115924	13144646	hadcm3n_yc7e_1900_40_007348676_2	129,600	187,036	1.4432
25 Jul 2011 22:53:59	1115924	13144646	hadcm3n_yc7e_1900_40_007348676_2	103,680	150,916	1.4556
25 Jul 2011 19:12:46	1115924	13144646	hadcm3n_yc7e_1900_40_007348676_2	77,760	112,920	1.4522
25 Jul 2011 18:54:35	1115924	13144646	hadcm3n_yc7e_1900_40_007348676_2	51,840	75,344	1.4534
25 Jul 2011 17:55:15	1115924	13144646	hadcm3n_yc7e_1900_40_007348676_2	25,920	37,649	1.4525
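
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimals. That is an assumption inferred from the listed values, not documented project behaviour. A short sketch that recomputes it for three rows of the table above; the function names are hypothetical:

```python
def parse_int(field: str) -> int:
    """Parse an integer printed with thousands separators, e.g. '518,400'."""
    return int(field.replace(",", ""))


def average_sec_per_timestep(cpu_time_sec: int, timesteps: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals."""
    return round(cpu_time_sec / timesteps, 4)


# (Timestep, CPU Time (sec), listed Average) copied from the trickle table.
rows = [
    ("518,400", "751,527", 1.4497),
    ("492,480", "713,457", 1.4487),
    ("25,920", "37,649", 1.4525),
]

for timestep, cpu_time, listed in rows:
    computed = average_sec_per_timestep(parse_int(cpu_time), parse_int(timestep))
    print(timestep, computed, listed)
```

For these rows the recomputed value matches the listed average, which supports the assumed definition.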