Task 15803134

Name	hadcm3n_o3dq_2020_40_008374532_0
Workunit	8525391
Created	29 May 2013, 21:49:09 UTC
Sent	29 May 2013, 21:50:48 UTC
Report deadline	29 Aug 2013, 5:17:59 UTC
Received	16 Aug 2013, 15:24:06 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	-1073741819 (0xC0000005) STATUS_ACCESS_VIOLATION
Computer ID	1053145
Run time	18 days 0 hours 23 min 51 sec
CPU time	14 days 3 hours 52 min 22 sec
Validate state	Invalid
Credit	6,220.80
Device peak FLOPS	2.64 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.64</core_client_version> <![CDATA[ <message> (unknown error) - exit code -1073741819 (0xc0000005) </message> <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4208, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 11:04:42 (3912): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 12:25:42 (4440): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2504, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4164, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4164, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4164, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4164, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4164, iMonCtr=1 Model crash detected, will try to restart... 10:40:27 (3432): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:44:48 (4876): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4712, iMonCtr=1 Model crash detected, will try to restart... Atmos Hold Restart file rename failed on atmos_restart.hold Atmos Hold Restart file rename failed on atmos_restart.hold Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3816, iMonCtr=1 Model crash detected, will try to restart... CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3868, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4596, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2128, iMonCtr=1 Model crash detected, will try to restart... 00:11:01 (5108): No heartbeat from core client for 30 sec - exiting 00:11:02 (5108): No heartbeat from core client for 30 sec - exiting 00:11:03 (5108): No heartbeat from core client for 30 sec - exiting 00:11:04 (5108): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4468, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3488, iMonCtr=1 Model crash detected, will try to restart... 12:10:53 (5004): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3204, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4656, iMonCtr=1 Model crash detected, will try to restart... Ocean Restart file copy failed on o3dqko.dan98f0 CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=924, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=924, iMonCtr=1 Model crash detected, will try to restart... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x7756FF2B write attempt to address 0xFFFFFFF8 Engaging BOINC Windows Runtime Debugger... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x77823AB3 read attempt to address 0x00000000 Engaging BOINC Windows Runtime Debugger... </stderr_txt> ]]>
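
Note on the exit status: the signed value -1073741819 and the hexadecimal 0xC0000005 shown above are the same 32-bit pattern read two ways. A minimal sketch of the conversion follows; the helper names `to_unsigned32` and `format_exit_status` are hypothetical and are not part of BOINC or the CPDN application.

```python
def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF


def format_exit_status(code: int) -> str:
    """Render an exit code as 'signed (0xHEX)', similar to the Exit status field."""
    return f"{code} (0x{to_unsigned32(code):08X})"


# Exit status reported for this task.
print(format_exit_status(-1073741819))  # prints "-1073741819 (0xC0000005)"
```

0xC0000005 is the Windows NTSTATUS code STATUS_ACCESS_VIOLATION, which matches the two "Access Violation (0xc0000005)" records at the end of the stderr output.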

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
16 Aug 2013 15:25:54	1053145	15803134	hadcm3n_o3dq_2020_40_008374532_0	518,400	1,214,659	2.3431
16 Aug 2013 15:25:54	1053145	15803134	hadcm3n_o3dq_2020_40_008374532_0	492,480	1,153,697	2.3426
26 Jul 2013 18:25:11	1053145	15803134	hadcm3n_o3dq_2020_40_008374532_0	466,560	1,094,278	2.3454
25 Jul 2013 13:17:10	1053145	15803134	hadcm3n_o3dq_2020_40_008374532_0	440,640	1,032,732	2.3437
23 Jul 2013 22:23:40	1053145	15803134	hadcm3n_o3dq_2020_40_008374532_0	414,720	971,441	2.3424
23 Jul 2013 21:44:59	1053145	15803134	hadcm3n_o3dq_2020_40_008374532_0	388,800	911,111	2.3434
23 Jul 2013 20:30:58	1053145	15803134	hadcm3n_o3dq_2020_40_008374532_0	362,880	848,274	2.3376
23 Jul 2013 20:30:58	1053145	15803134	hadcm3n_o3dq_2020_40_008374532_0	336,960	789,023	2.3416
23 Jul 2013 20:30:58	1053145	15803134	hadcm3n_o3dq_2020_40_008374532_0	311,040	727,685	2.3395
23 Jul 2013 20:30:58	1053145	15803134	hadcm3n_o3dq_2020_40_008374532_0	285,120	669,549	2.3483
08 Jul 2013 12:13:16	1053145	15803134	hadcm3n_o3dq_2020_40_008374532_0	259,200	604,769	2.3332
06 Jul 2013 06:14:55	1053145	15803134	hadcm3n_o3dq_2020_40_008374532_0	233,280	544,903	2.3358
02 Jul 2013 11:33:27	1053145	15803134	hadcm3n_o3dq_2020_40_008374532_0	207,360	484,375	2.3359
02 Jul 2013 09:44:30	1053145	15803134	hadcm3n_o3dq_2020_40_008374532_0	181,440	423,439	2.3338
26 Jun 2013 18:38:55	1053145	15803134	hadcm3n_o3dq_2020_40_008374532_0	155,520	360,986	2.3212
25 Jun 2013 09:31:29	1053145	15803134	hadcm3n_o3dq_2020_40_008374532_0	129,600	299,910	2.3141
19 Jun 2013 15:44:29	1053145	15803134	hadcm3n_o3dq_2020_40_008374532_0	103,680	241,772	2.3319
14 Jun 2013 18:49:57	1053145	15803134	hadcm3n_o3dq_2020_40_008374532_0	77,760	179,866	2.3131
05 Jun 2013 18:06:03	1053145	15803134	hadcm3n_o3dq_2020_40_008374532_0	51,840	119,144	2.2983
31 May 2013 04:19:27	1053145	15803134	hadcm3n_o3dq_2020_40_008374532_0	25,920	59,467	2.2943
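
Note on the last column: in the rows checked here, Average (sec/TS) matches the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The sketch below verifies three rows copied from the table; the function name `seconds_per_timestep` is hypothetical, and the rounding rule is an assumption inferred from the displayed values.

```python
def seconds_per_timestep(cpu_seconds: int, timestep: int) -> float:
    """Average CPU seconds spent per model timestep so far."""
    return cpu_seconds / timestep


# (timestep, cpu_time_sec, reported_average) copied from the trickle table.
rows = [
    (518_400, 1_214_659, 2.3431),
    (259_200, 604_769, 2.3332),
    (25_920, 59_467, 2.2943),
]

for timestep, cpu_seconds, reported in rows:
    computed = seconds_per_timestep(cpu_seconds, timestep)
    # Allow a small tolerance because the table shows only four decimals.
    matches = abs(computed - reported) < 1e-4
    print(f"{timestep:>7} TS: computed {computed:.4f}, reported {reported:.4f}, match={matches}")
```

Consecutive rows are 25,920 timesteps apart, so each trickle in this table marks the same amount of model progress.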