Task 14068739

Name	hadcm3n_ylqy_1940_40_007751638_0
Workunit	7906747
Created	6 Feb 2012, 7:38:01 UTC
Sent	6 Feb 2012, 7:41:51 UTC
Report deadline	7 May 2012, 15:09:02 UTC
Received	29 Feb 2012, 16:29:11 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	1163640
Run time	14 days 19 hours 8 min 45 sec
CPU time	13 days 21 hours 54 min 48 sec
Validate state	Invalid
Credit	6,220.80
Device peak FLOPS	2.11 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.12.34</core_client_version> <![CDATA[ <message> - exit code 193 (0xc1) </message> <stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5016, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3400, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2356, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2408, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3908, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3908, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3908, iMonCtr=1
Model crash detected, will try to restart...
07:44:06 (156): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2964, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1664, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2628, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1724, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1724, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3716, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3716, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2340, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2340, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2340, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt> ]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
29 Feb 2012 15:30:26	1163640	14068739	hadcm3n_ylqy_1940_40_007751638_0	518,400	1,202,084	2.3188
28 Feb 2012 14:19:28	1163640	14068739	hadcm3n_ylqy_1940_40_007751638_0	492,480	1,140,858	2.3166
27 Feb 2012 10:59:00	1163640	14068739	hadcm3n_ylqy_1940_40_007751638_0	466,560	1,080,121	2.3151
25 Feb 2012 19:22:39	1163640	14068739	hadcm3n_ylqy_1940_40_007751638_0	440,640	1,019,334	2.3133
24 Feb 2012 17:08:33	1163640	14068739	hadcm3n_ylqy_1940_40_007751638_0	414,720	959,680	2.3140
23 Feb 2012 15:02:50	1163640	14068739	hadcm3n_ylqy_1940_40_007751638_0	388,800	899,587	2.3138
22 Feb 2012 12:22:20	1163640	14068739	hadcm3n_ylqy_1940_40_007751638_0	362,880	838,827	2.3116
21 Feb 2012 10:03:36	1163640	14068739	hadcm3n_ylqy_1940_40_007751638_0	336,960	777,791	2.3083
20 Feb 2012 07:46:08	1163640	14068739	hadcm3n_ylqy_1940_40_007751638_0	311,040	718,482	2.3099
18 Feb 2012 19:59:59	1163640	14068739	hadcm3n_ylqy_1940_40_007751638_0	285,120	657,783	2.3070
17 Feb 2012 14:32:00	1163640	14068739	hadcm3n_ylqy_1940_40_007751638_0	259,200	597,941	2.3069
16 Feb 2012 12:33:03	1163640	14068739	hadcm3n_ylqy_1940_40_007751638_0	233,280	538,599	2.3088
14 Feb 2012 23:29:02	1163640	14068739	hadcm3n_ylqy_1940_40_007751638_0	207,360	478,219	2.3062
14 Feb 2012 00:02:07	1163640	14068739	hadcm3n_ylqy_1940_40_007751638_0	181,440	418,578	2.3070
12 Feb 2012 21:24:47	1163640	14068739	hadcm3n_ylqy_1940_40_007751638_0	155,520	359,513	2.3117
11 Feb 2012 19:18:23	1163640	14068739	hadcm3n_ylqy_1940_40_007751638_0	129,600	300,678	2.3200
10 Feb 2012 12:37:25	1163640	14068739	hadcm3n_ylqy_1940_40_007751638_0	103,680	240,646	2.3210
09 Feb 2012 09:46:46	1163640	14068739	hadcm3n_ylqy_1940_40_007751638_0	77,760	180,868	2.3260
08 Feb 2012 09:49:13	1163640	14068739	hadcm3n_ylqy_1940_40_007751638_0	51,840	120,395	2.3224
07 Feb 2012 09:42:09	1163640	14068739	hadcm3n_ylqy_1940_40_007751638_0	25,920	59,897	2.3108
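
The "Average (sec/TS)" column above appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. That relationship is inferred from the numbers in the table, not stated by the project, so treat the following as a minimal sketch under that assumption. The function name `average_sec_per_timestep` is hypothetical; the three sample rows are copied from the table.

```python
# Sketch: check that Average (sec/TS) = CPU Time (sec) / Timestep.
# Assumption: the server rounds the quotient to 4 decimal places.

# (timestep, cpu_time_sec, reported_average) copied from three trickle rows
ROWS = [
    (518_400, 1_202_084, 2.3188),  # 29 Feb 2012 15:30:26
    (259_200, 597_941, 2.3069),    # 17 Feb 2012 14:32:00
    (25_920, 59_897, 2.3108),      # 07 Feb 2012 09:42:09
]


def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Return cumulative CPU seconds per model timestep, rounded to 4 places."""
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in ROWS:
    computed = average_sec_per_timestep(cpu_time_sec, timestep)
    # Compare with a small tolerance rather than exact float equality.
    matches = abs(computed - reported) < 5e-5
    print(f"{timestep:>7} TS  computed={computed:.4f}  reported={reported:.4f}  match={matches}")
```

Each trickle in the table also advances the timestep count by the same amount, 25,920, so the roughly 2.31 sec/TS figure corresponds to about 60,000 CPU seconds between consecutive trickles.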