Task 15855049

Name	hadcm3n_3lb9_2020_40_008393420_0
Workunit	8544279
Created	21 Jun 2013, 22:18:56 UTC
Sent	23 Jun 2013, 17:36:44 UTC
Report deadline	23 Sep 2013, 1:03:55 UTC
Received	17 Aug 2013, 11:04:56 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	-1073741819 (0xC0000005) STATUS_ACCESS_VIOLATION
Computer ID	1103079
Run time	7 days 15 hours 43 min 36 sec
CPU time	6 days 3 hours 10 min 18 sec
Validate state	Invalid
Credit	3,110.40
Device peak FLOPS	2.48 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=61608, iMonCtr=1
Model crash detected, will try to restart...
19:24:57 (888): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4008, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6136, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6028, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5580, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4852, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3516, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4836, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1396, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5580, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4540, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3512, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/3lb9ko.pjm5c10
Error converting file to netcdf: dataout/3lb9ko.pim5c10
Error converting file to netcdf: dataout/3lb9ko.pfm5c10
Error converting file to netcdf: dataout/3lb9ka.phm5c10
Error converting file to netcdf: dataout/3lb9ka.pgm5c10
Error converting file to netcdf: dataout/3lb9ka.pem5c10
Error converting file to netcdf: dataout/3lb9ka.pdm5c10
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5004, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3616, iMonCtr=1
Model crash detected, will try to restart...
18:06:03 (5540): No heartbeat from core client for 30 sec - exiting
18:06:04 (5540): No heartbeat from core client for 30 sec - exiting
18:06:05 (5540): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
C17:31:08 (4540): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4864, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5524, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5452, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4836, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5412, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5452, iMonCtr=1
Model crash detected, will try to restart...
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x77203AB3 read attempt to address 0x40C2931E
Engaging BOINC Windows Runtime Debugger...
</stderr_txt>
]]>
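
The exit status above is shown both as a signed decimal (-1073741819) and as hexadecimal (0xC0000005). The two are the same 32-bit value read as signed or unsigned. A minimal sketch of the conversion follows; the helper names `to_unsigned32` and `to_signed32` are illustrative only and are not part of BOINC or the CPDN application.

```python
def to_unsigned32(status: int) -> int:
    """Reinterpret a signed 32-bit exit status as its unsigned value."""
    return status & 0xFFFFFFFF


def to_signed32(code: int) -> int:
    """Reinterpret an unsigned 32-bit code as a signed 32-bit integer."""
    code &= 0xFFFFFFFF
    # If the top bit is set, the value is negative in two's complement.
    return code - 0x100000000 if code & 0x80000000 else code


exit_status = -1073741819  # value reported in the "Exit status" field
print(hex(to_unsigned32(exit_status)))
print(to_signed32(0xC0000005))
```

Looking up the hexadecimal form is usually more convenient, since Windows status codes such as STATUS_ACCESS_VIOLATION are documented in that notation.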

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
17 Aug 2013 11:10:01	1103079	15855049	hadcm3n_3lb9_2020_40_008393420_0	259,200	529,802	2.0440
17 Aug 2013 11:10:01	1103079	15855049	hadcm3n_3lb9_2020_40_008393420_0	233,280	477,051	2.0450
26 Jul 2013 12:58:32	1103079	15855049	hadcm3n_3lb9_2020_40_008393420_0	207,360	424,812	2.0487
23 Jul 2013 19:00:05	1103079	15855049	hadcm3n_3lb9_2020_40_008393420_0	181,440	371,673	2.0485
11 Jul 2013 20:49:55	1103079	15855049	hadcm3n_3lb9_2020_40_008393420_0	155,520	319,444	2.0540
06 Jul 2013 05:38:38	1103079	15855049	hadcm3n_3lb9_2020_40_008393420_0	129,600	267,276	2.0623
02 Jul 2013 12:03:45	1103079	15855049	hadcm3n_3lb9_2020_40_008393420_0	103,680	215,807	2.0815
02 Jul 2013 11:22:39	1103079	15855049	hadcm3n_3lb9_2020_40_008393420_0	77,760	161,045	2.0711
02 Jul 2013 10:16:19	1103079	15855049	hadcm3n_3lb9_2020_40_008393420_0	51,840	105,576	2.0366
02 Jul 2013 09:49:26	1103079	15855049	hadcm3n_3lb9_2020_40_008393420_0	25,920	52,918	2.0416
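
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the timestep count. That relation is inferred from the figures in the table, not stated on the page. A small sketch that checks it against the rows above (the function name `average_sec_per_timestep` is illustrative):

```python
# (timestep, cpu_time_sec, reported_average) copied from the trickle table.
trickles = [
    (259_200, 529_802, 2.0440),
    (233_280, 477_051, 2.0450),
    (207_360, 424_812, 2.0487),
    (181_440, 371_673, 2.0485),
    (155_520, 319_444, 2.0540),
    (129_600, 267_276, 2.0623),
    (103_680, 215_807, 2.0815),
    (77_760, 161_045, 2.0711),
    (51_840, 105_576, 2.0366),
    (25_920, 52_918, 2.0416),
]


def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Assumed definition: cumulative CPU seconds per model timestep."""
    return cpu_time_sec / timestep


for timestep, cpu_time, reported in trickles:
    computed = average_sec_per_timestep(cpu_time, timestep)
    # Reported values carry four decimal places, so allow a small tolerance.
    status = "ok" if abs(computed - reported) < 1e-4 else "MISMATCH"
    print(f"{timestep:>8} {computed:.4f} {reported:.4f} {status}")
```

Each trickle is 25,920 timesteps after the previous one, so the average stays close to 2.04 to 2.08 seconds per timestep over the whole run.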