Task 13680502

Name	hadcm3n_o4zi_1940_40_007548444_3
Workunit	7745676
Created	1 Dec 2011, 8:21:31 UTC
Sent	1 Dec 2011, 8:27:16 UTC
Report deadline	1 Mar 2012, 15:54:27 UTC
Received	29 Jan 2012, 9:43:08 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	925017
Run time	7 days 8 hours 57 min 29 sec
CPU time	5 days 23 hours 3 min 30 sec
Validate state	Invalid
Credit	3,110.40
Device peak FLOPS	2.78 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
 - exit code 193 (0xc1)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Atmos Hold Restart file rename failed on atmos_restart.hold
08:26:45 (4272): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3164, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3732, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3732, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3732, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3732, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3732, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3732, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3732, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3732, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3732, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3732, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3732, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3732, iMonCtr=1
Model crash detected, will try to restart...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/o4ziko.pje4c10
Error converting file to netcdf: dataout/o4ziko.pie4c10
Error converting file to netcdf: dataout/o4ziko.pfe4c10
Error converting file to netcdf: dataout/o4zika.phe4c10
Error converting file to netcdf: dataout/o4zika.pge4c10
Error converting file to netcdf: dataout/o4zika.pee4c10
Error converting file to netcdf: dataout/o4zika.pde4c10
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3884, iMonCtr=1
Model crash detected, will try to restart...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/o4ziko.pje8c10
Error converting file to netcdf: dataout/o4ziko.pie8c10
Error converting file to netcdf: dataout/o4ziko.pfe8c10
Error converting file to netcdf: dataout/o4zika.phe8c10
Error converting file to netcdf: dataout/o4zika.pge8c10
Error converting file to netcdf: dataout/o4zika.pee8c10
Error converting file to netcdf: dataout/o4zika.pde8c10
17:57:05 (4868): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:12:37 (3604): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:02:52 (1324): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:25:04 (4084): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:48:03 (3140): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:39:56 (4148): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x77E43800 read attempt to address 0x404CE638
Engaging BOINC Windows Runtime Debugger...
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x77492AB1 read attempt to address 0x404CE638
Engaging BOINC Windows Runtime Debugger...
Cannot serialize file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_o4zi_1940_40_007548444/dataout/shmem_restart.day
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
27 Jan 2012 10:56:06	925017	13680502	hadcm3n_o4zi_1940_40_007548444_3	259,200	514,994	1.9869
21 Jan 2012 12:56:41	925017	13680502	hadcm3n_o4zi_1940_40_007548444_3	233,280	470,585	2.0173
17 Jan 2012 19:35:34	925017	13680502	hadcm3n_o4zi_1940_40_007548444_3	207,360	426,427	2.0565
06 Jan 2012 20:59:25	925017	13680502	hadcm3n_o4zi_1940_40_007548444_3	181,440	373,242	2.0571
03 Jan 2012 20:27:20	925017	13680502	hadcm3n_o4zi_1940_40_007548444_3	155,520	319,264	2.0529
01 Jan 2012 18:20:29	925017	13680502	hadcm3n_o4zi_1940_40_007548444_3	129,600	265,532	2.0489
23 Dec 2011 18:51:02	925017	13680502	hadcm3n_o4zi_1940_40_007548444_3	103,680	212,335	2.0480
18 Dec 2011 12:38:17	925017	13680502	hadcm3n_o4zi_1940_40_007548444_3	77,760	157,736	2.0285
11 Dec 2011 12:17:38	925017	13680502	hadcm3n_o4zi_1940_40_007548444_3	51,840	105,523	2.0356
04 Dec 2011 14:46:24	925017	13680502	hadcm3n_o4zi_1940_40_007548444_3	25,920	52,996	2.0446
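
The Average (sec/TS) column in the trickle table above appears to be CPU Time (sec) divided by Timestep, rounded to four decimal places. The page does not state this, so treat it as an inference from the numbers. The minimal Python sketch below re-derives the column from the rows as listed; the function names `average_sec_per_timestep` and `mismatches` are hypothetical and are not part of BOINC or CPDN.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
TRICKLES = [
    (259_200, 514_994, 1.9869),
    (233_280, 470_585, 2.0173),
    (207_360, 426_427, 2.0565),
    (181_440, 373_242, 2.0571),
    (155_520, 319_264, 2.0529),
    (129_600, 265_532, 2.0489),
    (103_680, 212_335, 2.0480),
    (77_760, 157_736, 2.0285),
    (51_840, 105_523, 2.0356),
    (25_920, 52_996, 2.0446),
]


def average_sec_per_timestep(cpu_time_sec, timestep):
    """Assumed formula: cumulative CPU seconds per model timestep, 4 decimals."""
    return round(cpu_time_sec / timestep, 4)


def mismatches(rows):
    """Return the rows whose reported average differs from the recomputed one."""
    return [
        row for row in rows
        if average_sec_per_timestep(row[1], row[0]) != row[2]
    ]


if __name__ == "__main__":
    for timestep, cpu_time_sec, reported in TRICKLES:
        computed = average_sec_per_timestep(cpu_time_sec, timestep)
        print(f"{timestep:>7}  {cpu_time_sec:>7}  {reported:.4f}  {computed:.4f}")
    print("mismatching rows:", mismatches(TRICKLES))
```

Under this assumption every listed row reproduces its reported average, which suggests the column is a cumulative figure for the whole run rather than a per-trickle rate.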