Task 15480843

Name	hadcm3n_o2e9_2100_40_008256209_0
Workunit	8411333
Created	16 Dec 2012, 20:21:01 UTC
Sent	16 Dec 2012, 20:21:06 UTC
Report deadline	18 Mar 2013, 3:48:17 UTC
Received	25 Jan 2013, 21:06:34 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	-529697949 (0xE06D7363) Unknown error code
Computer ID	1075479
Run time	5 days 0 hours 56 min 21 sec
CPU time	4 days 10 hours 42 min 44 sec
Validate state	Invalid
Credit	3,110.40
Device peak FLOPS	3.12 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.12.33</core_client_version>
<![CDATA[
<message>
 - exit code -529697949 (0xe06d7363)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5648, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=924, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6052, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1772, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3592, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4136, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3756, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5180, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5476, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2396, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1392, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5632, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3960, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5148, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4132, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4308, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3948, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5280, iMonCtr=1
Model crash detected, will try to restart...
10:50:32 (5612): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5496, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6080, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4452, iMonCtr=1
Model crash detected, will try to restart...
09:23:02 (4220): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5856, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5488, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3128, iMonCtr=1
Model crash detected, will try to restart...
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x77A26737 read attempt to address 0xFFFFFFF8
Engaging BOINC Windows Runtime Debugger...
Cannot serialize file G:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_o2e9_2100_40_008256209/dataout/shmem_restart.day
Signal 11 received, exiting...
Called boinc_finish
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x77A263C6 read attempt to address 0xFFFFFFF8
Engaging BOINC Windows Runtime Debugger...
</stderr_txt>
]]>
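
The exit status is reported twice, as the signed decimal -529697949 and as the hexadecimal 0xE06D7363. Both are the same 32-bit value read as signed or unsigned. The sketch below is only an illustration of that conversion, not part of BOINC or CPDN; the helper names to_unsigned32 and to_signed32 are hypothetical. As far as is known, 0xE06D7363 is the exception code the Microsoft Visual C++ runtime raises for a thrown C++ exception (its low three bytes spell "msc"), which would explain why the server lists it as an unknown error code.

```python
def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit exit status as an unsigned value."""
    return code & 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed exit status."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


# Exit status exactly as shown in the task summary.
exit_status = -529697949

exit_hex = f"0x{to_unsigned32(exit_status):08X}"
print(exit_hex)

# The three low-order bytes of the code, decoded as ASCII.
low_bytes = to_unsigned32(exit_status).to_bytes(4, "big")[1:].decode("ascii")
print(low_bytes)
```

The same helpers apply to the 0xc0000005 access violation code that appears in the stderr output.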

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
25 Jan 2013 21:07:14	1075479	15480843	hadcm3n_o2e9_2100_40_008256209_0	259,200	384,157	1.4821
23 Jan 2013 22:08:42	1075479	15480843	hadcm3n_o2e9_2100_40_008256209_0	233,280	342,974	1.4702
19 Jan 2013 21:06:36	1075479	15480843	hadcm3n_o2e9_2100_40_008256209_0	207,360	303,856	1.4654
14 Jan 2013 20:54:47	1075479	15480843	hadcm3n_o2e9_2100_40_008256209_0	181,440	262,632	1.4475
10 Jan 2013 20:51:11	1075479	15480843	hadcm3n_o2e9_2100_40_008256209_0	155,520	225,688	1.4512
06 Jan 2013 20:01:17	1075479	15480843	hadcm3n_o2e9_2100_40_008256209_0	129,600	188,485	1.4544
03 Jan 2013 18:21:50	1075479	15480843	hadcm3n_o2e9_2100_40_008256209_0	103,680	151,019	1.4566
30 Dec 2012 22:21:02	1075479	15480843	hadcm3n_o2e9_2100_40_008256209_0	77,760	113,998	1.4660
27 Dec 2012 21:36:25	1075479	15480843	hadcm3n_o2e9_2100_40_008256209_0	51,840	75,350	1.4535
21 Dec 2012 20:02:37	1075479	15480843	hadcm3n_o2e9_2100_40_008256209_0	25,920	37,236	1.4366
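
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count, and successive trickles are 25,920 timesteps apart. The following sketch checks both observations against the rows above. It assumes that interpretation of the columns; the variable names are illustrative only.

```python
# (timestep, cpu_time_sec, reported_avg_sec_per_ts), copied from the table above.
trickles = [
    (259_200, 384_157, 1.4821),
    (233_280, 342_974, 1.4702),
    (207_360, 303_856, 1.4654),
    (181_440, 262_632, 1.4475),
    (155_520, 225_688, 1.4512),
    (129_600, 188_485, 1.4544),
    (103_680, 151_019, 1.4566),
    (77_760, 113_998, 1.4660),
    (51_840, 75_350, 1.4535),
    (25_920, 37_236, 1.4366),
]

# Assumed definition: average = cumulative CPU seconds / cumulative timesteps.
computed = [cpu / ts for ts, cpu, _ in trickles]

# Largest gap between the recomputed and the reported averages.
max_diff = max(abs(c - avg) for c, (_, _, avg) in zip(computed, trickles))

# Timestep gap between consecutive trickles (rows are newest first).
steps = {a[0] - b[0] for a, b in zip(trickles, trickles[1:])}

for (ts, cpu, avg), c in zip(trickles, computed):
    print(f"{ts:>7}  {cpu:>7}  reported={avg:.4f}  computed={c:.4f}")
```

Under this reading, every reported average agrees with the recomputed value to within the four-decimal rounding of the table.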