Task 13655843

Name	hadcm3n_yhsm_1900_40_007515941_2
Workunit	7713416
Created	23 Nov 2011, 9:19:17 UTC
Sent	23 Nov 2011, 9:20:26 UTC
Report deadline	22 Feb 2012, 16:47:37 UTC
Received	6 Dec 2011, 15:31:15 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	1181124
Run time	5 days 21 hours 42 min 5 sec
CPU time	5 days 21 hours 42 min 5 sec
Validate state	Invalid
Credit	3,110.40
Device peak FLOPS	3.02 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.17</core_client_version>
<![CDATA[
<message>
 - exit code 193 (0xc1)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4256, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4328, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3932, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3272, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7632, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3508, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
03:07:48 (4104): No heartbeat from core client for 30 sec - exiting
03:07:49 (4104): No heartbeat from core client for 30 sec - exiting
03:07:50 (4104): No heartbeat from core client for 30 sec - exiting
03:07:51 (4104): No heartbeat from core client for 30 sec - exiting
03:07:52 (4104): No heartbeat from core client for 30 sec - exiting
03:07:53 (4104): No heartbeat from core client for 30 sec - exiting
03:07:54 (4104): No heartbeat from core client for 30 sec - exiting
03:07:55 (4104): No heartbeat from core client for 30 sec - exiting
03:07:56 (4104): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
03:07:57 (4104): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4268, iMonCtr=1
Model crash detected, will try to restart...
18:26:51 (5004): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:15:25 (3596): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2600, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3912, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4216, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CCController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3724, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3996, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4260, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
06 Dec 2011 15:32:04	1181124	13655843	hadcm3n_yhsm_1900_40_007515941_2	259,200	510,076	1.9679
05 Dec 2011 08:30:51	1181124	13655843	hadcm3n_yhsm_1900_40_007515941_2	233,280	459,706	1.9706
04 Dec 2011 08:14:48	1181124	13655843	hadcm3n_yhsm_1900_40_007515941_2	207,360	409,791	1.9762
03 Dec 2011 22:57:19	1181124	13655843	hadcm3n_yhsm_1900_40_007515941_2	181,440	360,970	1.9895
30 Nov 2011 09:41:51	1181124	13655843	hadcm3n_yhsm_1900_40_007515941_2	155,520	310,668	1.9976
29 Nov 2011 04:04:36	1181124	13655843	hadcm3n_yhsm_1900_40_007515941_2	129,600	255,831	1.9740
27 Nov 2011 16:52:52	1181124	13655843	hadcm3n_yhsm_1900_40_007515941_2	103,680	204,310	1.9706
26 Nov 2011 14:05:52	1181124	13655843	hadcm3n_yhsm_1900_40_007515941_2	77,760	153,695	1.9765
25 Nov 2011 11:51:47	1181124	13655843	hadcm3n_yhsm_1900_40_007515941_2	51,840	99,993	1.9289
24 Nov 2011 08:31:31	1181124	13655843	hadcm3n_yhsm_1900_40_007515941_2	25,920	50,495	1.9481
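
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The sketch below is not project code; it only checks that reading against the ten rows above. The function name `seconds_per_timestep` and the tuple layout are assumptions made for illustration.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
trickles = [
    (259_200, 510_076, 1.9679),
    (233_280, 459_706, 1.9706),
    (207_360, 409_791, 1.9762),
    (181_440, 360_970, 1.9895),
    (155_520, 310_668, 1.9976),
    (129_600, 255_831, 1.9740),
    (103_680, 204_310, 1.9706),
    (77_760, 153_695, 1.9765),
    (51_840, 99_993, 1.9289),
    (25_920, 50_495, 1.9481),
]


def seconds_per_timestep(cpu_time_sec, timestep):
    """Assumed formula: cumulative CPU seconds divided by timesteps completed."""
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in trickles:
    computed = seconds_per_timestep(cpu_time_sec, timestep)
    # Print the recomputed value next to the one shown in the table.
    print(f"{timestep:>7} TS  computed={computed:.4f}  reported={reported:.4f}")
```

Each trickle in the table is 25,920 timesteps after the previous one, so the average is a running figure over the whole task rather than a per-interval rate.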