Task 16053363

Name	hadcm3n_oc07_1900_40_008470682_1
Workunit	8621521
Created	2 Oct 2013, 7:17:04 UTC
Sent	2 Oct 2013, 8:26:08 UTC
Report deadline	1 Jan 2014, 15:53:19 UTC
Received	26 Nov 2013, 15:05:53 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	1157386
Run time	16 days 0 hours 10 min 37 sec
CPU time	15 days 21 hours 57 min 48 sec
Validate state	Invalid
Credit	12,441.60
Device peak FLOPS	2.74 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.2.28</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 193 (0xc1)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5000, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5000, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5000, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4824, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2540, iMonCtr=1
Model crash detected, will try to restart...
22:11:24 (528): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3576, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4452, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3880, iMonCtr=1
Model crash detected, will try to restart...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/oc07ko.pjc6c10
Error converting file to netcdf: dataout/oc07ko.pic6c10
Error converting file to netcdf: dataout/oc07ko.pfc6c10
Error converting file to netcdf: dataout/oc07ka.phc6c10
Error converting file to netcdf: dataout/oc07ka.pgc6c10
Error converting file to netcdf: dataout/oc07ka.pec6c10
Error converting file to netcdf: dataout/oc07ka.pdc6c10
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4628, iMonCtr=1
Model crash detected, wilCPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1944, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3324, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
23:05:59 (4876): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x777A7383 read attempt to address 0xFFFFFFF8
Engaging BOINC Windows Runtime Debugger...
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x779A3AC3 read attempt to address 0x00000000
Engaging BOINC Windows Runtime Debugger...
Cannot serialize file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_oc07_1900_40_008470682/dataout/shmem_restart.day
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
25 Nov 2013 13:52:36	1157386	16053363	hadcm3n_oc07_1900_40_008470682_1	1,036,800	1,374,639	1.3258
23 Nov 2013 11:47:48	1157386	16053363	hadcm3n_oc07_1900_40_008470682_1	1,010,880	1,338,832	1.3244
22 Nov 2013 13:31:59	1157386	16053363	hadcm3n_oc07_1900_40_008470682_1	984,960	1,303,531	1.3234
20 Nov 2013 13:33:05	1157386	16053363	hadcm3n_oc07_1900_40_008470682_1	959,040	1,271,995	1.3263
19 Nov 2013 14:39:59	1157386	16053363	hadcm3n_oc07_1900_40_008470682_1	933,120	1,240,625	1.3295
19 Nov 2013 00:27:21	1157386	16053363	hadcm3n_oc07_1900_40_008470682_1	907,200	1,209,307	1.3330
18 Nov 2013 02:33:14	1157386	16053363	hadcm3n_oc07_1900_40_008470682_1	881,280	1,177,957	1.3366
15 Nov 2013 02:34:31	1157386	16053363	hadcm3n_oc07_1900_40_008470682_1	855,360	1,144,566	1.3381
13 Nov 2013 13:12:10	1157386	16053363	hadcm3n_oc07_1900_40_008470682_1	829,440	1,113,144	1.3420
11 Nov 2013 14:18:55	1157386	16053363	hadcm3n_oc07_1900_40_008470682_1	803,520	1,081,114	1.3455
10 Nov 2013 11:23:33	1157386	16053363	hadcm3n_oc07_1900_40_008470682_1	777,600	1,046,331	1.3456
07 Nov 2013 14:32:17	1157386	16053363	hadcm3n_oc07_1900_40_008470682_1	751,680	1,012,470	1.3469
06 Nov 2013 00:39:51	1157386	16053363	hadcm3n_oc07_1900_40_008470682_1	725,760	979,429	1.3495
04 Nov 2013 12:09:53	1157386	16053363	hadcm3n_oc07_1900_40_008470682_1	699,840	944,886	1.3501
03 Nov 2013 07:47:48	1157386	16053363	hadcm3n_oc07_1900_40_008470682_1	673,920	910,970	1.3517
30 Oct 2013 07:52:25	1157386	16053363	hadcm3n_oc07_1900_40_008470682_1	648,000	877,085	1.3535
29 Oct 2013 15:15:29	1157386	16053363	hadcm3n_oc07_1900_40_008470682_1	622,080	844,131	1.3569
29 Oct 2013 00:54:19	1157386	16053363	hadcm3n_oc07_1900_40_008470682_1	596,160	811,023	1.3604
27 Oct 2013 16:24:33	1157386	16053363	hadcm3n_oc07_1900_40_008470682_1	570,240	777,210	1.3630
27 Oct 2013 06:47:31	1157386	16053363	hadcm3n_oc07_1900_40_008470682_1	544,320	742,857	1.3647
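
The Average (sec/TS) column above appears to be the cumulative CPU time divided by the cumulative timestep count. The page does not state this definition; it is an assumption inferred from the rows. A minimal Python sketch that recomputes the value for three rows copied from the table (the function name is hypothetical, for illustration only):

```python
# Rows copied from the trickle table: (timestep, cumulative CPU time in seconds).
trickles = [
    (1_036_800, 1_374_639),  # 25 Nov 2013 13:52:36
    (1_010_880, 1_338_832),  # 23 Nov 2013 11:47:48
    (544_320, 742_857),      # 27 Oct 2013 06:47:31
]


def average_sec_per_timestep(timestep, cpu_seconds):
    """Assumed definition: cumulative CPU seconds / cumulative timesteps."""
    return cpu_seconds / timestep


# Round to four decimals, matching the precision shown in the table.
averages = [round(average_sec_per_timestep(ts, cpu), 4) for ts, cpu in trickles]

for (ts, cpu), avg in zip(trickles, averages):
    print(f"{ts:>9,} timesteps  {cpu:>9,} s  {avg:.4f} sec/TS")
```

If the assumption holds, the recomputed values should match the table's 1.3258, 1.3244, and 1.3647 for these rows.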