Task 15448507

Name	hadcm3n_zdgp_1880_40_008249096_1
Workunit	8404220
Created	21 Nov 2012, 20:34:50 UTC
Sent	21 Nov 2012, 20:34:53 UTC
Report deadline	21 Feb 2013, 4:02:04 UTC
Received	7 Feb 2013, 17:02:56 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	1179495
Run time	28 days 18 hours 39 min 5 sec
CPU time	24 days 21 hours 16 min 7 sec
Validate state	Invalid
Credit	12,441.60
Device peak FLOPS	2.48 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
 - exit code 193 (0xc1)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
04:09:12 (5164): No heartbeat from core client for 30 sec - exiting
04:09:13 (5164): No heartbeat from core client for 30 sec - exiting
04:09:14 (5164): No heartbeat from core client for 30 sec - exiting
04:09:15 (5164): No heartbeat from core client for 30 sec - exiting
04:09:16 (5164): No heartbeat from core client for 30 sec - exiting
04:09:17 (5164): No heartbeat from core client for 30 sec - exiting
04:09:18 (5164): No heartbeat from core client for 30 sec - exiting
04:09:19 (5164): No heartbeat from core client for 30 sec - exiting
04:09:20 (5164): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
04:09:21 (5164): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3360, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3344, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1436, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2092, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3116, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2204, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1636, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3000, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3000, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3188, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3352, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3548, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3548, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3008, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2620, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2268, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/zdgpko.pjb9c10
Error converting file to netcdf: dataout/zdgpko.pib9c10
Error converting file to netcdf: dataout/zdgpko.pfb9c10
Error converting file to netcdf: dataout/zdgpka.phb9c10
Error converting file to netcdf: dataout/zdgpka.pgb9c10
Error converting file to netcdf: dataout/zdgpka.peb9c10
Error converting file to netcdf: dataout/zdgpka.pdb9c10
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2868, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2868, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x77C27215 read attempt to address 0x40A8F5E7
Engaging BOINC Windows Runtime Debugger...
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x77C27373 read attempt to address 0xFFFFFFF8
Engaging BOINC Windows Runtime Debugger...
Cannot serialize file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_zdgp_1880_40_008249096/dataout/shmem_restart.day
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
06 Feb 2013 21:33:25	1179495	15448507	hadcm3n_zdgp_1880_40_008249096_1	1,036,800	2,149,282	2.0730
03 Feb 2013 21:08:58	1179495	15448507	hadcm3n_zdgp_1880_40_008249096_1	1,010,880	2,090,853	2.0683
03 Feb 2013 01:00:51	1179495	15448507	hadcm3n_zdgp_1880_40_008249096_1	984,960	2,030,066	2.0611
01 Feb 2013 20:56:48	1179495	15448507	hadcm3n_zdgp_1880_40_008249096_1	959,040	1,970,107	2.0542
31 Jan 2013 01:44:27	1179495	15448507	hadcm3n_zdgp_1880_40_008249096_1	933,120	1,918,500	2.0560
30 Jan 2013 01:07:10	1179495	15448507	hadcm3n_zdgp_1880_40_008249096_1	907,200	1,876,045	2.0680
27 Jan 2013 22:47:33	1179495	15448507	hadcm3n_zdgp_1880_40_008249096_1	881,280	1,836,171	2.0835
27 Jan 2013 11:26:34	1179495	15448507	hadcm3n_zdgp_1880_40_008249096_1	855,360	1,796,521	2.1003
26 Jan 2013 22:28:43	1179495	15448507	hadcm3n_zdgp_1880_40_008249096_1	829,440	1,753,115	2.1136
26 Jan 2013 00:38:34	1179495	15448507	hadcm3n_zdgp_1880_40_008249096_1	803,520	1,694,526	2.1089
24 Jan 2013 22:28:09	1179495	15448507	hadcm3n_zdgp_1880_40_008249096_1	777,600	1,646,367	2.1172
23 Jan 2013 17:36:07	1179495	15448507	hadcm3n_zdgp_1880_40_008249096_1	751,680	1,603,085	2.1327
21 Jan 2013 22:08:18	1179495	15448507	hadcm3n_zdgp_1880_40_008249096_1	725,760	1,558,030	2.1468
20 Jan 2013 09:44:48	1179495	15448507	hadcm3n_zdgp_1880_40_008249096_1	699,840	1,508,128	2.1550
18 Jan 2013 22:07:34	1179495	15448507	hadcm3n_zdgp_1880_40_008249096_1	673,920	1,446,662	2.1466
16 Jan 2013 17:21:51	1179495	15448507	hadcm3n_zdgp_1880_40_008249096_1	648,000	1,388,754	2.1431
14 Jan 2013 19:12:43	1179495	15448507	hadcm3n_zdgp_1880_40_008249096_1	622,080	1,332,497	2.1420
13 Jan 2013 05:33:58	1179495	15448507	hadcm3n_zdgp_1880_40_008249096_1	596,160	1,272,408	2.1343
12 Jan 2013 04:03:24	1179495	15448507	hadcm3n_zdgp_1880_40_008249096_1	570,240	1,218,605	2.1370
10 Jan 2013 17:36:13	1179495	15448507	hadcm3n_zdgp_1880_40_008249096_1	544,320	1,167,827	2.1455