Task 13708561

Name	hadam3p_saf_7k1h_2001_1_007582910_1
Workunit	7761040
Created	5 Dec 2011, 1:25:39 UTC
Sent	5 Dec 2011, 1:31:14 UTC
Report deadline	16 Nov 2012, 6:51:14 UTC
Received	21 Mar 2012, 22:22:36 UTC
Server state	Over
Outcome	Success
Client state	Done
Exit status	0 (0x00000000)
Computer ID	1039827
Run time	10 days 11 hours 33 min 3 sec
CPU time	8 days 10 hours 54 min 24 sec
Validate state	Workunit error - check skipped
Credit	2,244.09
Device peak FLOPS	1.57 GFLOPS
Application version	UK Met Office HadAM3P-HadRM3P Southern Africa v6.09 windows_intelx86
Stderr	<core_client_version>6.10.18</core_client_version> <![CDATA[ <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3092, selfPID=3364, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1320, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1420, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3384, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=388, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3352, selfPID=1332, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3880, selfPID=940, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=956, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1000, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1144, iMonCtr=2 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=872, iMonCtr=2 Model crash detected, will try to restart... 
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2180, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2388, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=288, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3568, selfPID=3224, iMonCtr=1 Model crash detected, will try to restart... 18:39:03 (2288): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1560, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1632, iMonCtr=2 Model crash detected, will try to restart... C13:17:00 (980): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:56:35 (3024): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=828, selfPID=828, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3016, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3156, selfPID=3324, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3696, selfPID=848, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2556, iMonCtr=2 Model crash detected, will try to restart... 19:48:40 (3708): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:41:06 (3476): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
18:47:47 (3316): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2444, selfPID=2444, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1040, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3028, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3980, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3768, selfPID=4028, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2660, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3608, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2876, iMonCtr=2 18:41:15 (2720): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:50:17 (2780): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
18:50:18 (2780): No heartbeat from core client for 30 sec - exiting 18:50:19 (2780): No heartbeat from core client for 30 sec - exiting 18:50:20 (2780): No heartbeat from core client for 30 sec - exiting 18:50:21 (2780): No heartbeat from core client for 30 sec - exiting 18:50:22 (2780): No heartbeat from core client for 30 sec - exiting 18:50:23 (2780): No heartbeat from core client for 30 sec - exiting 18:50:24 (2780): No heartbeat from core client for 30 sec - exiting 18:50:25 (2780): No heartbeat from core client for 30 sec - exiting 18:50:26 (2780): No heartbeat from core client for 30 sec - exiting 18:50:28 (2780): No heartbeat from core client for 30 sec - exiting 18:50:29 (2780): No heartbeat from core client for 30 sec - exiting 18:50:30 (2780): No heartbeat from core client for 30 sec - exiting 18:50:31 (2780): No heartbeat from core client for 30 sec - exiting 18:50:32 (2780): No heartbeat from core client for 30 sec - exiting 18:50:33 (2780): No heartbeat from core client for 30 sec - exiting 18:50:34 (2780): No heartbeat from core client for 30 sec - exiting 18:50:35 (2780): No heartbeat from core client for 30 sec - exiting 18:50:36 (2780): No heartbeat from core client for 30 sec - exiting 18:50:37 (2780): No heartbeat from core client for 30 sec - exiting 18:50:38 (2780): No heartbeat from core client for 30 sec - exiting 18:55:30 (2104): No heartbeat from core client for 30 sec - exiting 18:55:31 (2104): No heartbeat from core client for 30 sec - exiting 18:55:32 (2104): No heartbeat from core client for 30 sec - exiting 18:55:33 (2104): No heartbeat from core client for 30 sec - exiting 18:55:34 (2104): No heartbeat from core client for 30 sec - exiting 18:55:35 (2104): No heartbeat from core client for 30 sec - exiting 18:55:36 (2104): No heartbeat from core client for 30 sec - exiting 18:55:37 (2104): No heartbeat from core client for 30 sec - exiting 18:55:38 (2104): No heartbeat from core client for 30 sec - exiting 18:55:39 (2104): No heartbeat from core client for 30 sec - exiting 
18:55:40 (2104): No heartbeat from core client for 30 sec - exiting 18:55:42 (2104): No heartbeat from core client for 30 sec - exiting 18:55:43 (2104): No heartbeat from core client for 30 sec - exiting 18:55:44 (2104): No heartbeat from core client for 30 sec - exiting 18:55:45 (2104): No heartbeat from core client for 30 sec - exiting 18:55:46 (2104): No heartbeat from core client for 30 sec - exiting 18:55:47 (2104): No heartbeat from core client for 30 sec - exiting 18:55:48 (2104): No heartbeat from core client for 30 sec - exiting 18:55:49 (2104): No heartbeat from core client for 30 sec - exiting 18:55:50 (2104): No heartbeat from core client for 30 sec - exiting 18:55:52 (2104): No heartbeat from core client for 30 sec - exiting 18:55:53 (2104): No heartbeat from core client for 30 sec - exiting 18:55:54 (2104): No heartbeat from core client for 30 sec - exiting 18:55:55 (2104): No heartbeat from core client for 30 sec - exiting 18:55:56 (2104): No heartbeat from core client for 30 sec - exiting 18:55:57 (2104): No heartbeat from core client for 30 sec - exiting 18:55:58 (2104): No heartbeat from core client for 30 sec - exiting 18:55:59 (2104): No heartbeat from core client for 30 sec - exiting 18:56:00 (2104): No heartbeat from core client for 30 sec - exiting 18:56:01 (2104): No heartbeat from core client for 30 sec - exiting 18:56:03 (2104): No heartbeat from core client for 30 sec - exiting 18:56:04 (2104): No heartbeat from core client for 30 sec - exiting 18:56:05 (2104): No heartbeat from core client for 30 sec - exiting 18:56:06 (2104): No heartbeat from core client for 30 sec - exiting 18:56:07 (2104): No heartbeat from core client for 30 sec - exiting 18:56:08 (2104): No heartbeat from core client for 30 sec - exiting 18:56:09 (2104): No heartbeat from core client for 30 sec - exiting 18:56:10 (2104): No heartbeat from core client for 30 sec - exiting 18:56:11 (2104): No heartbeat from core client for 30 sec - exiting 
18:56:12 (2104): No heartbeat from core client for 30 sec - exiting 18:56:13 (2104): No heartbeat from core client for 30 sec - exiting 18:56:15 (2104): No heartbeat from core client for 30 sec - exiting 18:56:16 (2104): No heartbeat from core client for 30 sec - exiting 18:56:17 (2104): No heartbeat from core client for 30 sec - exiting 18:56:18 (2104): No heartbeat from core client for 30 sec - exiting 18:56:19 (2104): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3100, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3184, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3004, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1756, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4076, selfPID=2572, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3600, selfPID=2980, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1944, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1276, selfPID=3496, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2836, iMonCtr=2 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1952, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1288, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2204, iMonCtr=2 Model crash detected, will try to restart... CSuspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3500, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1796, selfPID=2636, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2660, selfPID=2236, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2908, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=200, selfPID=3340, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3344, iMonCtr=2 19:07:53 (2288): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt> ]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
19 Mar 2012 19:26:10	1039827	13708561	hadam3p_saf_7k1h_2001_1_007582910_1	138,336	729,190	5.2712
04 Mar 2012 23:27:37	1039827	13708561	hadam3p_saf_7k1h_2001_1_007582910_1	126,816	667,578	5.2641
25 Feb 2012 18:08:27	1039827	13708561	hadam3p_saf_7k1h_2001_1_007582910_1	115,296	606,942	5.2642
19 Feb 2012 05:28:40	1039827	13708561	hadam3p_saf_7k1h_2001_1_007582910_1	103,776	545,437	5.2559
13 Feb 2012 01:06:06	1039827	13708561	hadam3p_saf_7k1h_2001_1_007582910_1	92,256	485,972	5.2676
06 Feb 2012 01:01:19	1039827	13708561	hadam3p_saf_7k1h_2001_1_007582910_1	80,736	426,149	5.2783
19 Jan 2012 03:15:24	1039827	13708561	hadam3p_saf_7k1h_2001_1_007582910_1	69,216	365,841	5.2855
10 Jan 2012 02:11:52	1039827	13708561	hadam3p_saf_7k1h_2001_1_007582910_1	57,696	305,443	5.2940
06 Jan 2012 19:28:45	1039827	13708561	hadam3p_saf_7k1h_2001_1_007582910_1	46,176	245,219	5.3105
01 Jan 2012 01:25:48	1039827	13708561	hadam3p_saf_7k1h_2001_1_007582910_1	34,656	183,923	5.3071
26 Dec 2011 21:19:19	1039827	13708561	hadam3p_saf_7k1h_2001_1_007582910_1	23,136	123,344	5.3313
17 Dec 2011 01:22:01	1039827	13708561	hadam3p_saf_7k1h_2001_1_007582910_1	11,616	62,472	5.3781
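
The "Average (sec/TS)" column in the trickle table above appears to be the cumulative CPU time divided by the cumulative timestep count. The following is a minimal Python sketch of that arithmetic, checked against three rows copied from the table. The names `TRICKLES` and `sec_per_timestep` are hypothetical and chosen only for illustration, and rounding to four decimal places is an assumption based on how the values are displayed, not a documented server behaviour.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_avg).
# Only a sample of rows is included here.
TRICKLES = [
    (138_336, 729_190, 5.2712),
    (126_816, 667_578, 5.2641),
    (11_616, 62_472, 5.3781),
]


def sec_per_timestep(timestep: int, cpu_time_sec: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals.

    Assumption: the page rounds the ratio to four decimal places.
    """
    return round(cpu_time_sec / timestep, 4)


if __name__ == "__main__":
    for timestep, cpu_time_sec, reported in TRICKLES:
        computed = sec_per_timestep(timestep, cpu_time_sec)
        # Compare with a small tolerance rather than exact float equality.
        matches = abs(computed - reported) < 5e-5
        print(f"{timestep:>8,} TS  computed={computed:.4f}  "
              f"reported={reported:.4f}  match={matches}")
```

Read this way, the slowly falling average (about 5.38 sec/TS at the first trickle, about 5.27 at the last) reflects a running mean over the whole task, so it cannot be taken as the speed of any single reporting interval.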