Task 13925199

Name	hadcm3n_u0uk_1980_40_007682869_1
Workunit	7837956
Created	16 Jan 2012, 2:14:47 UTC
Sent	16 Jan 2012, 2:18:36 UTC
Report deadline	16 Apr 2012, 9:45:47 UTC
Received	21 Jun 2012, 23:50:54 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	1092111
Run time	24 days 18 hours 20 min 17 sec
CPU time	20 days 15 hours 53 min 15 sec
Validate state	Invalid
Credit	6,220.80
Device peak FLOPS	1.56 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.58</core_client_version> <![CDATA[ <message> - exit code 193 (0xc1) </message> <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6044, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5844, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4640, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4952, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4208, iMonCtr=1 Model crash detected, will try to restart... CCPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4580, iMonCtr=1 Model crash detected, will try to restart... 02:19:57 (2928): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:19:58 (2928): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 19:17:14 (3712): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3852, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5104, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4384, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN procCPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2852, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CCPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5160, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3876, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2820, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4392, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 21:32:30 (1348): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CCPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5132, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5636, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5644, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5160, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3616, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6120, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3976, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5872, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1968, iMonCtr=1 Model crash detected, will try to restart... CCPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1116, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3632, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5096, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x771A6E5F read attempt to address 0x40C6D1EC Engaging BOINC Windows Runtime Debugger... No Process Handle Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=544, selfPID=544, iMonCtr=1 Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x771A8851 read attempt to address 0xFFFFFFF8 Engaging BOINC Windows Runtime Debugger... Cannot serialize file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_u0uk_1980_40_007682869/dataout/shmem_restart.day Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
21 Jun 2012 21:17:15	1092111	13925199	hadcm3n_u0uk_1980_40_007682869_1	518,400	1,784,174	3.4417
12 Jun 2012 00:48:09	1092111	13925199	hadcm3n_u0uk_1980_40_007682869_1	492,480	1,697,477	3.4468
04 Jun 2012 16:57:16	1092111	13925199	hadcm3n_u0uk_1980_40_007682869_1	466,560	1,611,669	3.4544
28 May 2012 20:38:38	1092111	13925199	hadcm3n_u0uk_1980_40_007682869_1	440,640	1,523,202	3.4568
20 May 2012 20:36:20	1092111	13925199	hadcm3n_u0uk_1980_40_007682869_1	414,720	1,435,791	3.4621
13 May 2012 03:11:12	1092111	13925199	hadcm3n_u0uk_1980_40_007682869_1	388,800	1,347,523	3.4659
03 May 2012 03:54:59	1092111	13925199	hadcm3n_u0uk_1980_40_007682869_1	362,880	1,258,143	3.4671
27 Apr 2012 03:45:07	1092111	13925199	hadcm3n_u0uk_1980_40_007682869_1	336,960	1,167,350	3.4644
12 Apr 2012 03:33:34	1092111	13925199	hadcm3n_u0uk_1980_40_007682869_1	311,040	1,076,944	3.4624
04 Apr 2012 00:18:09	1092111	13925199	hadcm3n_u0uk_1980_40_007682869_1	285,120	988,929	3.4685
22 Mar 2012 01:33:17	1092111	13925199	hadcm3n_u0uk_1980_40_007682869_1	259,200	893,617	3.4476
06 Mar 2012 21:01:04	1092111	13925199	hadcm3n_u0uk_1980_40_007682869_1	233,280	805,932	3.4548
01 Mar 2012 22:09:38	1092111	13925199	hadcm3n_u0uk_1980_40_007682869_1	207,360	720,775	3.4760
25 Feb 2012 02:33:53	1092111	13925199	hadcm3n_u0uk_1980_40_007682869_1	181,440	632,725	3.4872
19 Feb 2012 17:44:10	1092111	13925199	hadcm3n_u0uk_1980_40_007682869_1	155,520	537,842	3.4583
17 Feb 2012 01:32:42	1092111	13925199	hadcm3n_u0uk_1980_40_007682869_1	129,600	445,830	3.4400
09 Feb 2012 03:09:46	1092111	13925199	hadcm3n_u0uk_1980_40_007682869_1	103,680	355,816	3.4319
05 Feb 2012 23:04:16	1092111	13925199	hadcm3n_u0uk_1980_40_007682869_1	77,760	267,111	3.4351
28 Jan 2012 01:36:03	1092111	13925199	hadcm3n_u0uk_1980_40_007682869_1	51,840	179,077	3.4544
21 Jan 2012 22:23:41	1092111	13925199	hadcm3n_u0uk_1980_40_007682869_1	25,920	87,150	3.3623