Task 13358948

Name	hadcm3n_t2gc_1940_40_007447327_1
Workunit	7644830
Created	9 Sep 2011, 17:44:12 UTC
Sent	16 Sep 2011, 7:57:19 UTC
Report deadline	16 Dec 2011, 15:24:30 UTC
Received	26 Oct 2011, 16:02:35 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	1102379
Run time	11 days 17 hours 3 min 58 sec
CPU time	11 days 7 hours 7 min 23 sec
Validate state	Invalid
Credit	6,220.80
Device peak FLOPS	2.91 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.60</core_client_version>
<![CDATA[
<message>
 - exit code 193 (0xc1)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4204, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3468, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4608, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3456, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
00:01:46 (3768): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:02:47 (2480): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:15:01 (1836): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:53:51 (3516): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:54:52 (2324): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:01:57 (3872): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:02:34 (3940): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:27:04 (4780): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:46:59 (3724): No heartbeat from core client for 30 sec - exiting
07:47:00 (3724): No heartbeat from core client for 30 sec - exiting
07:48:43 (3652): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:49:43 (3768): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3972, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4068, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3692, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3476, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3564, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
31 Oct 2011 16:33:59	1102379	13358948	hadcm3n_t2gc_1940_40_007447327_1	518,400	976,037	1.8828
19 Oct 2011 09:22:52	1102379	13358948	hadcm3n_t2gc_1940_40_007447327_1	492,480	927,158	1.8826
18 Oct 2011 17:29:40	1102379	13358948	hadcm3n_t2gc_1940_40_007447327_1	466,560	878,150	1.8822
17 Oct 2011 13:26:27	1102379	13358948	hadcm3n_t2gc_1940_40_007447327_1	440,640	829,193	1.8818
16 Oct 2011 21:04:58	1102379	13358948	hadcm3n_t2gc_1940_40_007447327_1	414,720	780,152	1.8812
14 Oct 2011 18:47:30	1102379	13358948	hadcm3n_t2gc_1940_40_007447327_1	388,800	730,867	1.8798
13 Oct 2011 20:17:16	1102379	13358948	hadcm3n_t2gc_1940_40_007447327_1	362,880	681,781	1.8788
12 Oct 2011 18:05:37	1102379	13358948	hadcm3n_t2gc_1940_40_007447327_1	336,960	633,639	1.8805
11 Oct 2011 18:24:27	1102379	13358948	hadcm3n_t2gc_1940_40_007447327_1	311,040	584,578	1.8794
10 Oct 2011 18:56:49	1102379	13358948	hadcm3n_t2gc_1940_40_007447327_1	285,120	535,466	1.8780
07 Oct 2011 20:09:39	1102379	13358948	hadcm3n_t2gc_1940_40_007447327_1	259,200	486,390	1.8765
05 Oct 2011 23:54:43	1102379	13358948	hadcm3n_t2gc_1940_40_007447327_1	233,280	437,366	1.8749
05 Oct 2011 00:06:14	1102379	13358948	hadcm3n_t2gc_1940_40_007447327_1	207,360	388,582	1.8739
04 Oct 2011 02:10:26	1102379	13358948	hadcm3n_t2gc_1940_40_007447327_1	181,440	340,481	1.8765
03 Oct 2011 03:42:20	1102379	13358948	hadcm3n_t2gc_1940_40_007447327_1	155,520	290,098	1.8653
30 Sep 2011 20:57:07	1102379	13358948	hadcm3n_t2gc_1940_40_007447327_1	129,600	241,268	1.8616
30 Sep 2011 06:54:17	1102379	13358948	hadcm3n_t2gc_1940_40_007447327_1	103,680	192,225	1.8540
28 Sep 2011 20:53:02	1102379	13358948	hadcm3n_t2gc_1940_40_007447327_1	77,760	143,725	1.8483
26 Sep 2011 21:04:28	1102379	13358948	hadcm3n_t2gc_1940_40_007447327_1	51,840	95,363	1.8396
24 Sep 2011 21:05:37	1102379	13358948	hadcm3n_t2gc_1940_40_007447327_1	25,920	47,724	1.8412
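
The "Average (sec/TS)" column in the trickle table appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The page does not state this, so it is an assumption, but a short sketch can check it against a few rows copied from the table. The helper names to_int and sec_per_timestep are illustrative only and are not part of BOINC or CPDN.

```python
# Rows copied from the trickle table above: (Timestep, CPU Time (sec), Average (sec/TS)).
rows = [
    ("518,400", "976,037", "1.8828"),
    ("492,480", "927,158", "1.8826"),
    ("25,920", "47,724", "1.8412"),
]


def to_int(text):
    """Parse an integer written with thousands separators, e.g. '518,400'."""
    return int(text.replace(",", ""))


def sec_per_timestep(timesteps, cpu_seconds):
    """Assumed definition: cumulative CPU seconds per cumulative timestep, as a 4-decimal string."""
    return f"{to_int(cpu_seconds) / to_int(timesteps):.4f}"


for timesteps, cpu_seconds, reported in rows:
    computed = sec_per_timestep(timesteps, cpu_seconds)
    status = "matches" if computed == reported else "differs"
    print(timesteps, cpu_seconds, computed, status)
```

Under this assumption the three sampled rows reproduce the reported averages. The timestep counts in the table also rise by 25,920 between consecutive trickles.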