Task 13813069

Name	hadcm3n_t71v_1940_40_007617539_2
Workunit	7795669
Created	23 Dec 2011, 14:38:09 UTC
Sent	23 Dec 2011, 14:46:10 UTC
Report deadline	23 Mar 2012, 22:13:21 UTC
Received	28 Jan 2012, 12:48:29 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	1112451
Run time	14 days 11 hours 40 min 2 sec
CPU time	11 days 23 hours 19 min 47 sec
Validate state	Invalid
Credit	9,331.20
Device peak FLOPS	2.80 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
 - exit code 193 (0xc1)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2620, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4528, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CSuspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4204, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
02:23:03 (4860): No heartbeat from core client for 30 sec - exiting
02:23:04 (4860): No heartbeat from core client for 30 sec - exiting
02:23:05 (4860): No heartbeat from core client for 30 sec - exiting
02:23:06 (4860): No heartbeat from core client for 30 sec - exiting
02:23:07 (4860): No heartbeat from core client for 30 sec - exiting
02:23:09 (4860): No heartbeat from core client for 30 sec - exiting
02:23:10 (4860): No heartbeat from core client for 30 sec - exiting
02:23:11 (4860): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4928, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4700, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4408, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4576, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3892, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4740, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
C17:52:48 (4284): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1040, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
19:21:14 (4904): No heartbeat from core client for 30 sec - exiting
19:21:15 (4904): No heartbeat from core client for 30 sec - exiting
19:21:16 (4904): No heartbeat from core client for 30 sec - exiting
19:21:17 (4904): No heartbeat from core client for 30 sec - exiting
19:21:18 (4904): No heartbeat from core client for 30 sec - exiting
19:21:19 (4904): No heartbeat from core client for 30 sec - exiting
19:21:20 (4904): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
C20:08:17 (4808): No heartbeat from core client for 30 sec - exiting
20:08:19 (4808): No heartbeat from core client for 30 sec - exiting
20:08:20 (4808): No heartbeat from core client for 30 sec - exiting
20:08:21 (4808): No heartbeat from core client for 30 sec - exiting
20:08:22 (4808): No heartbeat from core client for 30 sec - exiting
20:08:23 (4808): No heartbeat from core client for 30 sec - exiting
20:08:24 (4808): No heartbeat from core client for 30 sec - exiting
20:08:25 (4808): No heartbeat from core client for 30 sec - exiting
20:08:26 (4808): No heartbeat from core client for 30 sec - exiting
20:08:27 (4808): No heartbeat from core client for 30 sec - exiting
20:08:29 (4808): No heartbeat from core client for 30 sec - exiting
20:08:30 (4808): No heartbeat from core client for 30 sec - exiting
20:08:31 (4808): No heartbeat from core client for 30 sec - exiting
20:08:32 (4808): No heartbeat from core client for 30 sec - exiting
20:08:33 (4808): No heartbeat from core client for 30 sec - exiting
20:08:34 (4808): No heartbeat from core client for 30 sec - exiting
20:08:35 (4808): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:10:02 (4440): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
19:31:37 (4652): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4988, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4092, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
28 Jan 2012 12:51:56	1112451	13813069	hadcm3n_t71v_1940_40_007617539_2	777,600	1,034,379	1.3302
28 Jan 2012 12:51:56	1112451	13813069	hadcm3n_t71v_1940_40_007617539_2	751,680	999,400	1.3296
25 Jan 2012 23:27:41	1112451	13813069	hadcm3n_t71v_1940_40_007617539_2	725,760	965,074	1.3297
24 Jan 2012 02:27:19	1112451	13813069	hadcm3n_t71v_1940_40_007617539_2	699,840	931,863	1.3315
23 Jan 2012 01:33:40	1112451	13813069	hadcm3n_t71v_1940_40_007617539_2	673,920	898,119	1.3327
22 Jan 2012 09:41:18	1112451	13813069	hadcm3n_t71v_1940_40_007617539_2	648,000	863,164	1.3320
20 Jan 2012 14:56:16	1112451	13813069	hadcm3n_t71v_1940_40_007617539_2	622,080	828,056	1.3311
20 Jan 2012 14:56:16	1112451	13813069	hadcm3n_t71v_1940_40_007617539_2	596,160	793,242	1.3306
18 Jan 2012 00:27:11	1112451	13813069	hadcm3n_t71v_1940_40_007617539_2	570,240	758,772	1.3306
16 Jan 2012 04:23:41	1112451	13813069	hadcm3n_t71v_1940_40_007617539_2	544,320	723,993	1.3301
15 Jan 2012 05:34:46	1112451	13813069	hadcm3n_t71v_1940_40_007617539_2	518,400	688,906	1.3289
15 Jan 2012 03:14:19	1112451	13813069	hadcm3n_t71v_1940_40_007617539_2	492,480	654,304	1.3286
15 Jan 2012 03:14:19	1112451	13813069	hadcm3n_t71v_1940_40_007617539_2	466,560	618,943	1.3266
13 Jan 2012 01:41:12	1112451	13813069	hadcm3n_t71v_1940_40_007617539_2	440,640	583,639	1.3245
13 Jan 2012 01:41:12	1112451	13813069	hadcm3n_t71v_1940_40_007617539_2	414,720	549,081	1.3240
13 Jan 2012 01:41:12	1112451	13813069	hadcm3n_t71v_1940_40_007617539_2	388,800	514,825	1.3241
08 Jan 2012 12:20:42	1112451	13813069	hadcm3n_t71v_1940_40_007617539_2	362,880	479,962	1.3226
08 Jan 2012 12:20:42	1112451	13813069	hadcm3n_t71v_1940_40_007617539_2	336,960	445,539	1.3222
07 Jan 2012 08:17:07	1112451	13813069	hadcm3n_t71v_1940_40_007617539_2	311,040	410,962	1.3213
05 Jan 2012 00:57:32	1112451	13813069	hadcm3n_t71v_1940_40_007617539_2	285,120	376,389	1.3201
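
The Average (sec/TS) column above appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The sketch below checks that reading against three rows copied from the table. The function name seconds_per_timestep is hypothetical and the formula is an assumption inferred from the figures, not something the page states.

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Assumed formula: cumulative CPU seconds per model timestep, to 4 places."""
    return round(cpu_time_sec / timestep, 4)


# (timestep, CPU time in seconds, average reported in the table)
rows = [
    (777_600, 1_034_379, 1.3302),
    (751_680, 999_400, 1.3296),
    (285_120, 376_389, 1.3201),
]

for timestep, cpu_time, reported in rows:
    computed = seconds_per_timestep(cpu_time, timestep)
    print(f"{timestep:>7} timesteps: computed {computed:.4f}, reported {reported:.4f}")
```

The same rows also show that consecutive trickles are 25,920 timesteps apart (for example 777,600 minus 751,680).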