Task 13402968

Name	hadam3p_saf_2fhj_1964_1_007149754_2
Workunit	7334534
Created	20 Sep 2011, 20:23:59 UTC
Sent	20 Sep 2011, 20:24:14 UTC
Report deadline	2 Sep 2012, 1:44:14 UTC
Received	24 Dec 2011, 13:09:33 UTC
Server state	Over
Outcome	Success
Client state	Done
Exit status	0 (0x00000000)
Computer ID	725427
Run time	10 days 5 hours 45 min 13 sec
CPU time	7 days 2 hours 29 min 48 sec
Validate state	Workunit error - check skipped
Credit	2,244.09
Device peak FLOPS	2.18 GFLOPS
Application version	UK Met Office HadAM3P-HadRM3P Southern Africa v6.09 windows_intelx86
Stderr	<core_client_version>6.6.36</core_client_version> <![CDATA[ <stderr_txt> Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2612, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8040, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 08:37:20 (7368): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6768, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5428, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5008, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3088, selfPID=4116, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1116, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3592, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7484, selfPID=7804, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7920, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6680, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9944, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5804, iMonCtr=2 Model crash detected, will try to restart... 
08:09:24 (6532): No heartbeat from core client for 30 sec - exiting 08:09:25 (6532): No heartbeat from core client for 30 sec - exiting 08:09:26 (6532): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7176, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3836, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6356, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9412, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8212, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8952, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8552, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6744, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=10024, selfPID=11716, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 20:05:57 (828): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6720, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4048, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9028, iMonCtr=2 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10376, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6160, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7476, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8412, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... GloController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4164, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8416, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8528, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9880, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
07:58:06 (7248): No heartbeat from core client for 30 sec - exiting 07:58:07 (7248): No heartbeat from core client for 30 sec - exiting 07:58:08 (7248): No heartbeat from core client for 30 sec - exiting 07:58:09 (7248): No heartbeat from core client for 30 sec - exiting 07:58:10 (7248): No heartbeat from core client for 30 sec - exiting 07:58:11 (7248): No heartbeat from core client for 30 sec - exiting 07:58:12 (7248): No heartbeat from core client for 30 sec - exiting 07:58:13 (7248): No heartbeat from core client for 30 sec - exiting 07:58:14 (7248): No heartbeat from core client for 30 sec - exiting 07:58:15 (7248): No heartbeat from core client for 30 sec - exiting 07:58:16 (7248): No heartbeat from core client for 30 sec - exiting 07:58:17 (7248): No heartbeat from core client for 30 sec - exiting 07:58:18 (7248): No heartbeat from core client for 30 sec - exiting 07:58:19 (7248): No heartbeat from core client for 30 sec - exiting 07:58:20 (7248): No heartbeat from core client for 30 sec - exiting 07:58:21 (7248): No heartbeat from core client for 30 sec - exiting 07:58:22 (7248): No heartbeat from core client for 30 sec - exiting 07:58:23 (7248): No heartbeat from core client for 30 sec - exiting 07:58:24 (7248): No heartbeat from core client for 30 sec - exiting 07:58:25 (7248): No heartbeat from core client for 30 sec - exiting 07:58:26 (7248): No heartbeat from core client for 30 sec - exiting 07:58:27 (7248): No heartbeat from core client for 30 sec - exiting 07:58:28 (7248): No heartbeat from core client for 30 sec - exiting 07:58:29 (7248): No heartbeat from core client for 30 sec - exiting 07:58:30 (7248): No heartbeat from core client for 30 sec - exiting 07:58:31 (7248): No heartbeat from core client for 30 sec - exiting 07:58:32 (7248): No heartbeat from core client for 30 sec - exiting 07:58:33 (7248): No heartbeat from core client for 30 sec - exiting 07:58:34 (7248): No heartbeat from core client for 30 sec - exiting 07:58:35 (7248): No 
heartbeat from core client for 30 sec - exiting 07:58:36 (7248): No heartbeat from core client for 30 sec - exiting 07:58:37 (7248): No heartbeat from core client for 30 sec - exiting 07:58:38 (7248): No heartbeat from core client for 30 sec - exiting 07:58:39 (7248): No heartbeat from core client for 30 sec - exiting 07:58:40 (7248): No heartbeat from core client for 30 sec - exiting 07:58:41 (7248): No heartbeat from core client for 30 sec - exiting 07:58:42 (7248): No heartbeat from core client for 30 sec - exiting 07:58:43 (7248): No heartbeat from core client for 30 sec - exiting 07:58:44 (7248): No heartbeat from core client for 30 sec - exiting 07:58:45 (7248): No heartbeat from core client for 30 sec - exiting 07:58:46 (7248): No heartbeat from core client for 30 sec - exiting 07:58:47 (7248): No heartbeat from core client for 30 sec - exiting 07:58:48 (7248): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6600, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6648, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8920, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
09:11:59 (508): No heartbeat from core client for 30 sec - exiting 09:12:00 (508): No heartbeat from core client for 30 sec - exiting 09:12:01 (508): No heartbeat from core client for 30 sec - exiting 09:12:02 (508): No heartbeat from core client for 30 sec - exiting 09:12:03 (508): No heartbeat from core client for 30 sec - exiting 09:12:04 (508): No heartbeat from core client for 30 sec - exiting 09:12:05 (508): No heartbeat from core client for 30 sec - exiting 09:12:06 (508): No heartbeat from core client for 30 sec - exiting 09:12:07 (508): No heartbeat from core client for 30 sec - exiting 09:12:08 (508): No heartbeat from core client for 30 sec - exiting 09:12:09 (508): No heartbeat from core client for 30 sec - exiting 09:12:10 (508): No heartbeat from core client for 30 sec - exiting 09:12:11 (508): No heartbeat from core client for 30 sec - exiting 09:12:12 (508): No heartbeat from core client for 30 sec - exiting 09:12:13 (508): No heartbeat from core client for 30 sec - exiting 09:12:14 (508): No heartbeat from core client for 30 sec - exiting 09:12:15 (508): No heartbeat from core client for 30 sec - exiting 09:12:16 (508): No heartbeat from core client for 30 sec - exiting 09:12:18 (508): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:12:19 (508): No heartbeat from core client for 30 sec - exiting 09:12:20 (508): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6356, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7812, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1420, iMonCtr=2 Model crash detected, will try to restart... 
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6712, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4632, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5768, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2328, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5788, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5696, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=9160, selfPID=8372, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6500, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6916, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 22:36:46 (5824): No heartbeat from core client for 30 sec - exiting 22:36:47 (5824): No heartbeat from core client for 30 sec - exiting 22:36:48 (5824): No heartbeat from core client for 30 sec - exiting 22:36:49 (5824): No heartbeat from core client for 30 sec - exiting 22:36:50 (5824): No heartbeat from core client for 30 sec - exiting 22:36:51 (5824): No heartbeat from core client for 30 sec - exiting 22:36:52 (5824): No heartbeat from core client for 30 sec - exiting 22:36:53 (5824): No heartbeat from core client for 30 sec - exiting 22:36:54 (5824): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
22:58:21 (7584): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4984, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9536, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9748, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10016, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6820, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4472, selfPID=7240, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10144, iMonCtr =2 el crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3456, iMonCtr=2 Model crash detected, will try to restart... GCobal Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8600, iMonCtr=2 ontroller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9424, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9928, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8376, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9688, iMonCtr=2 Model crash detected, will try to restart... 
Leaving CPDN_Main::Monitor... 07:40:52 (5800): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 07:40:54 (5800): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5148, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9408, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9700, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2580, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5088, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6280, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2680, selfPID=7320, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5628, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7252, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10000, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8436, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10012, iMonCtr=2 Model crash detected, will try to restart... 
Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8496, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8864, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9504, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2312, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5664, iMonCtr=2 Model crash detected, will try to restart... 09:55:37 (3564): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt> ]]>
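
The stderr text above repeats a small set of message types many times over. As a rough sketch of how they could be tallied, the Python below counts four of them in a short excerpt copied from the end of the log. The category names (`process_not_running`, `model_restart`, `no_heartbeat`, `finished`) and the `count_events` helper are hypothetical labels chosen for illustration; they are not part of BOINC or the CPDN application.

```python
import re
from collections import Counter

# Excerpt copied verbatim from the tail of the stderr text above.
SAMPLE = (
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=5664, iMonCtr=2 Model crash detected, will try to restart... "
    "09:55:37 (3564): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... Leaving CPDN_Main::Monitor... "
    "Called boinc_finish"
)

# Hypothetical category names mapped to literal phrases seen in the log.
PATTERNS = {
    "process_not_running": re.compile(r"CPDN process is not running"),
    "model_restart": re.compile(r"Model crash detected, will try to restart"),
    "no_heartbeat": re.compile(r"No heartbeat from core client"),
    "finished": re.compile(r"Called boinc_finish"),
}


def count_events(text):
    """Return a Counter of how often each phrase occurs in text."""
    return Counter({name: len(p.findall(text)) for name, p in PATTERNS.items()})


if __name__ == "__main__":
    for name, n in count_events(SAMPLE).items():
        print(f"{name}: {n}")
```

Run against the full stderr text instead of `SAMPLE`, the same function would give a total per message type. Lines where two processes wrote at once (for example `GloController::`) still match, because the patterns look only for the message body.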

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
23 Dec 2011 13:07:13	725427	13402968	hadam3p_saf_2fhj_1964_1_007149754_2	138,336	612,900	4.4305
13 Dec 2011 20:24:08	725427	13402968	hadam3p_saf_2fhj_1964_1_007149754_2	126,822	560,453	4.4192
12 Dec 2011 22:37:04	725427	13402968	hadam3p_saf_2fhj_1964_1_007149754_2	126,816	559,431	4.4114
04 Dec 2011 11:35:38	725427	13402968	hadam3p_saf_2fhj_1964_1_007149754_2	115,296	507,535	4.4020
20 Nov 2011 12:53:44	725427	13402968	hadam3p_saf_2fhj_1964_1_007149754_2	103,776	458,051	4.4138
06 Nov 2011 14:10:58	725427	13402968	hadam3p_saf_2fhj_1964_1_007149754_2	92,256	408,180	4.4244
05 Nov 2011 10:20:54	725427	13402968	hadam3p_saf_2fhj_1964_1_007149754_2	80,736	358,237	4.4371
31 Oct 2011 18:14:25	725427	13402968	hadam3p_saf_2fhj_1964_1_007149754_2	69,216	306,874	4.4336
31 Oct 2011 17:39:05	725427	13402968	hadam3p_saf_2fhj_1964_1_007149754_2	57,696	256,850	4.4518
08 Oct 2011 21:10:49	725427	13402968	hadam3p_saf_2fhj_1964_1_007149754_2	46,176	207,269	4.4887
03 Oct 2011 08:38:32	725427	13402968	hadam3p_saf_2fhj_1964_1_007149754_2	34,656	156,398	4.5129
01 Oct 2011 08:03:00	725427	13402968	hadam3p_saf_2fhj_1964_1_007149754_2	23,136	104,768	4.5284
25 Sep 2011 20:25:48	725427	13402968	hadam3p_saf_2fhj_1964_1_007149754_2	11,616	52,077	4.4832
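
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that reading against three rows copied from the table; the helper name `average_sec_per_timestep` is made up for this example, and the relationship is inferred from the numbers rather than stated by the project.

```python
# Rows copied from the trickle table above:
# (timestep, cumulative CPU time in seconds, reported average in sec/TS)
TRICKLES = [
    (138336, 612900, 4.4305),
    (126822, 560453, 4.4192),
    (11616, 52077, 4.4832),
]


def average_sec_per_timestep(timestep, cpu_seconds):
    """Cumulative CPU seconds per model timestep (assumed definition)."""
    return cpu_seconds / timestep


if __name__ == "__main__":
    for timestep, cpu_seconds, reported in TRICKLES:
        computed = average_sec_per_timestep(timestep, cpu_seconds)
        print(f"{timestep:>8,d}  {cpu_seconds:>8,d}  "
              f"computed {computed:.4f}  reported {reported:.4f}")
```

For these rows the computed value agrees with the reported one to four decimal places.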