Task 13401298

Name	hadcm3n_o3dt_1940_40_007443315_3
Workunit	7640818
Created	19 Sep 2011, 18:25:06 UTC
Sent	19 Sep 2011, 18:28:10 UTC
Report deadline	20 Dec 2011, 1:55:21 UTC
Received	7 Jan 2012, 10:53:41 UTC
Server state	Over
Outcome	Computation error
Client state	Aborted by user
Exit status	-197 (0xFFFFFF3B) ERR_ABORTED_VIA_GUI
Computer ID	804158
Run time	11 days 16 hours 36 min 43 sec
CPU time	8 days 21 hours 22 min 50 sec
Validate state	Invalid
Credit	4,043.52
Device peak FLOPS	2.33 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
aborted by user
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5488, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
12:47:08 (5772): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5984, iMonCtr=1
Model crash detected, will try to restart...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/o3dtko.pje1c10
Error converting file to netcdf: dataout/o3dtko.pie1c10
Error converting file to netcdf: dataout/o3dtko.pfe1c10
Error converting file to netcdf: dataout/o3dtka.phe1c10
Error converting file to netcdf: dataout/o3dtka.pge1c10
Error converting file to netcdf: dataout/o3dtka.pee1c10
Error converting file to netcdf: dataout/o3dtka.pde1c10
19:03:48 (6012): No heartbeat from core client for 30 sec - exiting
19:03:49 (6012): No heartbeat from core client for 30 sec - exiting
19:03:51 (6012): No heartbeat from core client for 30 sec - exiting
19:03:52 (6012): No heartbeat from core client for 30 sec - exiting
19:03:53 (6012): No heartbeat from core client for 30 sec - exiting
19:03:54 (6012): No heartbeat from core client for 30 sec - exiting
19:03:55 (6012): No heartbeat from core client for 30 sec - exiting
19:03:56 (6012): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4380, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5156, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6048, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6048, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
20:00:50 (5432): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5904, iMonCtr=1
Model crash detected, will try to restart...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5428, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4300, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3804, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5052, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5348, iMonCtr=1
Model crash detected, will try to restart...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5212, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5972, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
23:03:17 (6012): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CCPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4408, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
20:52:49 (4172): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5244, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5920, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2524, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5876, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4964, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5548, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5048, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5612, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Abort request from BOINC...
Called boinc_finish
</stderr_txt>
]]>
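
The Exit status field gives the same code in two notations: -197 as a signed integer and 0xFFFFFF3B in hexadecimal. The hex form matches the 32-bit two's-complement encoding of -197. A minimal sketch of that conversion follows; the function names are illustrative and not part of BOINC.

```python
def to_unsigned32_hex(code: int) -> str:
    """Return the 32-bit two's-complement hex form of a signed exit code."""
    # Masking with 0xFFFFFFFF keeps the low 32 bits, which maps a negative
    # value onto its unsigned two's-complement representation.
    return "0x{:08X}".format(code & 0xFFFFFFFF)


def from_unsigned32_hex(text: str) -> int:
    """Return the signed value for a 32-bit hex exit code such as 0xFFFFFF3B."""
    value = int(text, 16)
    # Values with the top bit set represent negative numbers.
    return value - (1 << 32) if value & 0x80000000 else value


print(to_unsigned32_hex(-197))
print(from_unsigned32_hex("0xFFFFFF3B"))
```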

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
04 Jan 2012 00:03:55	804158	13401298	hadcm3n_o3dt_1940_40_007443315_3	336,960	720,501	2.1382
25 Dec 2011 21:22:43	804158	13401298	hadcm3n_o3dt_1940_40_007443315_3	311,040	669,081	2.1511
24 Dec 2011 20:05:12	804158	13401298	hadcm3n_o3dt_1940_40_007443315_3	285,120	619,029	2.1711
13 Dec 2011 21:09:29	804158	13401298	hadcm3n_o3dt_1940_40_007443315_3	259,200	562,756	2.1711
06 Dec 2011 20:30:55	804158	13401298	hadcm3n_o3dt_1940_40_007443315_3	233,280	502,616	2.1546
22 Nov 2011 21:54:37	804158	13401298	hadcm3n_o3dt_1940_40_007443315_3	207,360	444,879	2.1454
15 Nov 2011 18:44:18	804158	13401298	hadcm3n_o3dt_1940_40_007443315_3	181,440	388,884	2.1433
15 Nov 2011 18:44:18	804158	13401298	hadcm3n_o3dt_1940_40_007443315_3	155,520	332,557	2.1384
05 Nov 2011 14:26:31	804158	13401298	hadcm3n_o3dt_1940_40_007443315_3	129,600	275,031	2.1222
31 Oct 2011 16:45:28	804158	13401298	hadcm3n_o3dt_1940_40_007443315_3	103,680	219,355	2.1157
17 Oct 2011 22:06:52	804158	13401298	hadcm3n_o3dt_1940_40_007443315_3	77,760	162,767	2.0932
10 Oct 2011 18:39:56	804158	13401298	hadcm3n_o3dt_1940_40_007443315_3	51,840	107,492	2.0735
28 Sep 2011 06:12:47	804158	13401298	hadcm3n_o3dt_1940_40_007443315_3	25,920	53,047	2.0466
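
For the rows checked, the Average (sec/TS) column appears to equal CPU Time (sec) divided by Timestep, rounded to four decimal places. This is inferred from the figures in the table, not from project documentation. A minimal sketch that recomputes it for three rows copied from the table (the variable and function names are illustrative):

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average)
TRICKLES = [
    (336960, 720501, 2.1382),
    (311040, 669081, 2.1511),
    (25920, 53047, 2.0466),
]


def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Average CPU seconds per model timestep, rounded to 4 decimals."""
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in TRICKLES:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    # Print the recomputed value next to the value reported in the table.
    print(timestep, computed, reported)
```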