Task 14723983

Name	hadcm3n_206c_1980_40_007954694_4
Workunit	8109806
Created	20 May 2012, 5:18:26 UTC
Sent	20 May 2012, 5:19:16 UTC
Report deadline	19 Aug 2012, 12:46:27 UTC
Received	15 Jun 2012, 21:34:55 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	1084876
Run time	10 days 16 hours 52 min
CPU time	8 days 21 hours 44 min 10 sec
Validate state	Invalid
Credit	6,220.80
Device peak FLOPS	3.04 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.56</core_client_version>
<![CDATA[
<message>
 - exit code 193 (0xc1)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3516, iMonCtr=1
Model crash detected, will try to restart...
06:24:41 (2436): No heartbeat from core client for 30 sec - exiting
06:24:42 (2436): No heartbeat from core client for 30 sec - exiting
06:24:43 (2436): No heartbeat from core client for 30 sec - exiting
06:24:44 (2436): No heartbeat from core client for 30 sec - exiting
06:24:45 (2436): No heartbeat from core client for 30 sec - exiting
06:24:46 (2436): No heartbeat from core client for 30 sec - exiting
06:24:47 (2436): No heartbeat from core client for 30 sec - exiting
06:24:48 (2436): No heartbeat from core client for 30 sec - exiting
06:24:49 (2436): No heartbeat from core client for 30 sec - exiting
06:24:50 (2436): No heartbeat from core client for 30 sec - exiting
06:24:51 (2436): No heartbeat from core client for 30 sec - exiting
06:24:52 (2436): No heartbeat from core client for 30 sec - exiting
06:24:53 (2436): No heartbeat from core client for 30 sec - exiting
06:24:54 (2436): No heartbeat from core client for 30 sec - exiting
06:24:55 (2436): No heartbeat from core client for 30 sec - exiting
06:24:56 (2436): No heartbeat from core client for 30 sec - exiting
06:24:57 (2436): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
15:05:28 (3964): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=992, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3540, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - QuNo Process Handle
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3416, selfPID=3416, iMonCtr=1
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3640, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3640, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3632, iMonCtr=1
Model crash detected, will try to restart...
CSuspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3548, iMonCtr=1
Model crash detected, will try to restart...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/206cko.pjj4c10
Error converting file to netcdf: dataout/206cko.pij4c10
Error converting file to netcdf: dataout/206cko.pfj4c10
Error converting file to netcdf: dataout/206cka.phj4c10
Error converting file to netcdf: dataout/206cka.pgj4c10
Error converting file to netcdf: dataout/206cka.pej4c10
Error converting file to netcdf: dataout/206cka.pdj4c10
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3564, iMonCtr=1
Model crash detected, will try to restart...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3344, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3404, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3476, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3404, iMonCtr=1
Model crash detected, will try to restart...
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x773D5B44 read attempt to address 0x00000000
Engaging BOINC Windows Runtime Debugger...
Cannot serialize file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_206c_1980_40_007954694/dataout/shmem_restart.day
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]>
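
The Run time and CPU time fields above are easier to compare once converted to seconds. The following is a minimal sketch, not part of the task page: the helper name `parse_duration` and the unit table are hypothetical, and the duration format is assumed from the two field values shown. It converts both fields and reports the fraction of wall-clock time spent on the CPU.

```python
import re

# Seconds per unit, keyed by the unit stems used in the task summary fields.
# (Assumed format: "<n> days <n> hours <n> min <n> sec", any part optional.)
UNIT_SECONDS = {"day": 86_400, "hour": 3_600, "min": 60, "sec": 1}


def parse_duration(text: str) -> int:
    """Convert a duration such as '10 days 16 hours 52 min' to seconds."""
    total = 0
    for amount, unit in re.findall(r"(\d+)\s*(day|hour|min|sec)", text):
        total += int(amount) * UNIT_SECONDS[unit]
    return total


run_time = parse_duration("10 days 16 hours 52 min")        # Run time field
cpu_time = parse_duration("8 days 21 hours 44 min 10 sec")  # CPU time field

print(run_time, cpu_time, round(cpu_time / run_time, 3))
```

The CPU time obtained this way (769,450 s) is within a few seconds of the CPU time in the last trickle received (769,441 s).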

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
15 Jun 2012 17:26:11	1084876	14723983	hadcm3n_206c_1980_40_007954694_4	518,400	769,441	1.4843
14 Jun 2012 10:03:21	1084876	14723983	hadcm3n_206c_1980_40_007954694_4	492,480	730,724	1.4838
13 Jun 2012 14:30:56	1084876	14723983	hadcm3n_206c_1980_40_007954694_4	466,560	691,418	1.4819
12 Jun 2012 07:51:02	1084876	14723983	hadcm3n_206c_1980_40_007954694_4	440,640	652,110	1.4799
11 Jun 2012 04:00:44	1084876	14723983	hadcm3n_206c_1980_40_007954694_4	414,720	612,850	1.4777
10 Jun 2012 15:20:46	1084876	14723983	hadcm3n_206c_1980_40_007954694_4	388,800	573,881	1.4760
06 Jun 2012 04:56:06	1084876	14723983	hadcm3n_206c_1980_40_007954694_4	362,880	534,392	1.4726
05 Jun 2012 15:45:18	1084876	14723983	hadcm3n_206c_1980_40_007954694_4	336,960	493,603	1.4649
03 Jun 2012 13:18:39	1084876	14723983	hadcm3n_206c_1980_40_007954694_4	311,040	455,341	1.4639
02 Jun 2012 23:45:15	1084876	14723983	hadcm3n_206c_1980_40_007954694_4	285,120	416,304	1.4601
02 Jun 2012 08:58:30	1084876	14723983	hadcm3n_206c_1980_40_007954694_4	259,200	377,720	1.4573
01 Jun 2012 20:21:32	1084876	14723983	hadcm3n_206c_1980_40_007954694_4	233,280	339,602	1.4558
30 May 2012 21:13:26	1084876	14723983	hadcm3n_206c_1980_40_007954694_4	207,360	300,988	1.4515
26 May 2012 23:57:49	1084876	14723983	hadcm3n_206c_1980_40_007954694_4	181,440	262,290	1.4456
26 May 2012 03:27:58	1084876	14723983	hadcm3n_206c_1980_40_007954694_4	155,520	224,764	1.4452
23 May 2012 23:37:31	1084876	14723983	hadcm3n_206c_1980_40_007954694_4	129,600	187,793	1.4490
23 May 2012 11:45:37	1084876	14723983	hadcm3n_206c_1980_40_007954694_4	103,680	150,190	1.4486
22 May 2012 15:23:14	1084876	14723983	hadcm3n_206c_1980_40_007954694_4	77,760	112,701	1.4493
21 May 2012 16:58:31	1084876	14723983	hadcm3n_206c_1980_40_007954694_4	51,840	75,057	1.4479
20 May 2012 20:56:53	1084876	14723983	hadcm3n_206c_1980_40_007954694_4	25,920	37,781	1.4576