Name | hadam3p_saf_7k1h_2001_1_007582910_1 |
Workunit | 7761040 |
Created | 5 Dec 2011, 1:25:39 UTC |
Sent | 5 Dec 2011, 1:31:14 UTC |
Report deadline | 16 Nov 2012, 6:51:14 UTC |
Received | 21 Mar 2012, 22:22:36 UTC |
Server state | Over |
Outcome | Success |
Client state | Done |
Exit status | 0 (0x00000000) |
Computer ID | 1039827 |
Run time | 10 days 11 hours 33 min 3 sec |
CPU time | 8 days 10 hours 54 min 24 sec |
Validate state | Workunit error - check skipped |
Credit | 2,244.09 |
Device peak FLOPS | 1.57 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Southern Africa v6.09 windows_intelx86 |
Stderr | <core_client_version>6.10.18</core_client_version> <![CDATA[ <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3092, selfPID=3364, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1320, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1420, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3384, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=388, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3352, selfPID=1332, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3880, selfPID=940, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=956, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1000, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1144, iMonCtr=2 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=872, iMonCtr=2 Model crash detected, will try to restart... 
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2180, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2388, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=288, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3568, selfPID=3224, iMonCtr=1 Model crash detected, will try to restart... 18:39:03 (2288): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1560, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1632, iMonCtr=2 Model crash detected, will try to restart... C13:17:00 (980): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:56:35 (3024): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=828, selfPID=828, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3016, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3156, selfPID=3324, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3696, selfPID=848, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2556, iMonCtr=2 Model crash detected, will try to restart... 19:48:40 (3708): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:41:06 (3476): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
18:47:47 (3316): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2444, selfPID=2444, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1040, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3028, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3980, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3768, selfPID=4028, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2660, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3608, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2876, iMonCtr=2 18:41:15 (2720): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:50:17 (2780): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
18:50:18 (2780): No heartbeat from core client for 30 sec - exiting 18:50:19 (2780): No heartbeat from core client for 30 sec - exiting 18:50:20 (2780): No heartbeat from core client for 30 sec - exiting 18:50:21 (2780): No heartbeat from core client for 30 sec - exiting 18:50:22 (2780): No heartbeat from core client for 30 sec - exiting 18:50:23 (2780): No heartbeat from core client for 30 sec - exiting 18:50:24 (2780): No heartbeat from core client for 30 sec - exiting 18:50:25 (2780): No heartbeat from core client for 30 sec - exiting 18:50:26 (2780): No heartbeat from core client for 30 sec - exiting 18:50:28 (2780): No heartbeat from core client for 30 sec - exiting 18:50:29 (2780): No heartbeat from core client for 30 sec - exiting 18:50:30 (2780): No heartbeat from core client for 30 sec - exiting 18:50:31 (2780): No heartbeat from core client for 30 sec - exiting 18:50:32 (2780): No heartbeat from core client for 30 sec - exiting 18:50:33 (2780): No heartbeat from core client for 30 sec - exiting 18:50:34 (2780): No heartbeat from core client for 30 sec - exiting 18:50:35 (2780): No heartbeat from core client for 30 sec - exiting 18:50:36 (2780): No heartbeat from core client for 30 sec - exiting 18:50:37 (2780): No heartbeat from core client for 30 sec - exiting 18:50:38 (2780): No heartbeat from core client for 30 sec - exiting 18:55:30 (2104): No heartbeat from core client for 30 sec - exiting 18:55:31 (2104): No heartbeat from core client for 30 sec - exiting 18:55:32 (2104): No heartbeat from core client for 30 sec - exiting 18:55:33 (2104): No heartbeat from core client for 30 sec - exiting 18:55:34 (2104): No heartbeat from core client for 30 sec - exiting 18:55:35 (2104): No heartbeat from core client for 30 sec - exiting 18:55:36 (2104): No heartbeat from core client for 30 sec - exiting 18:55:37 (2104): No heartbeat from core client for 30 sec - exiting 18:55:38 (2104): No heartbeat from core client for 30 sec - exiting 18:55:39 (2104): No 
heartbeat from core client for 30 sec - exiting 18:55:40 (2104): No heartbeat from core client for 30 sec - exiting 18:55:42 (2104): No heartbeat from core client for 30 sec - exiting 18:55:43 (2104): No heartbeat from core client for 30 sec - exiting 18:55:44 (2104): No heartbeat from core client for 30 sec - exiting 18:55:45 (2104): No heartbeat from core client for 30 sec - exiting 18:55:46 (2104): No heartbeat from core client for 30 sec - exiting 18:55:47 (2104): No heartbeat from core client for 30 sec - exiting 18:55:48 (2104): No heartbeat from core client for 30 sec - exiting 18:55:49 (2104): No heartbeat from core client for 30 sec - exiting 18:55:50 (2104): No heartbeat from core client for 30 sec - exiting 18:55:52 (2104): No heartbeat from core client for 30 sec - exiting 18:55:53 (2104): No heartbeat from core client for 30 sec - exiting 18:55:54 (2104): No heartbeat from core client for 30 sec - exiting 18:55:55 (2104): No heartbeat from core client for 30 sec - exiting 18:55:56 (2104): No heartbeat from core client for 30 sec - exiting 18:55:57 (2104): No heartbeat from core client for 30 sec - exiting 18:55:58 (2104): No heartbeat from core client for 30 sec - exiting 18:55:59 (2104): No heartbeat from core client for 30 sec - exiting 18:56:00 (2104): No heartbeat from core client for 30 sec - exiting 18:56:01 (2104): No heartbeat from core client for 30 sec - exiting 18:56:03 (2104): No heartbeat from core client for 30 sec - exiting 18:56:04 (2104): No heartbeat from core client for 30 sec - exiting 18:56:05 (2104): No heartbeat from core client for 30 sec - exiting 18:56:06 (2104): No heartbeat from core client for 30 sec - exiting 18:56:07 (2104): No heartbeat from core client for 30 sec - exiting 18:56:08 (2104): No heartbeat from core client for 30 sec - exiting 18:56:09 (2104): No heartbeat from core client for 30 sec - exiting 18:56:10 (2104): No heartbeat from core client for 30 sec - exiting 18:56:11 (2104): No heartbeat from core client 
for 30 sec - exiting 18:56:12 (2104): No heartbeat from core client for 30 sec - exiting 18:56:13 (2104): No heartbeat from core client for 30 sec - exiting 18:56:15 (2104): No heartbeat from core client for 30 sec - exiting 18:56:16 (2104): No heartbeat from core client for 30 sec - exiting 18:56:17 (2104): No heartbeat from core client for 30 sec - exiting 18:56:18 (2104): No heartbeat from core client for 30 sec - exiting 18:56:19 (2104): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3100, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3184, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3004, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1756, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4076, selfPID=2572, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3600, selfPID=2980, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1944, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1276, selfPID=3496, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2836, iMonCtr=2 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1952, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1288, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2204, iMonCtr=2 Model crash detected, will try to restart... CSuspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3500, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1796, selfPID=2636, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2660, selfPID=2236, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2908, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=200, selfPID=3340, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3344, iMonCtr=2 19:07:53 (2288): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
19 Mar 2012 19:26:10 | 1039827 | 13708561 | hadam3p_saf_7k1h_2001_1_007582910_1 | 138,336 | 729,190 | 5.2712 |
04 Mar 2012 23:27:37 | 1039827 | 13708561 | hadam3p_saf_7k1h_2001_1_007582910_1 | 126,816 | 667,578 | 5.2641 |
25 Feb 2012 18:08:27 | 1039827 | 13708561 | hadam3p_saf_7k1h_2001_1_007582910_1 | 115,296 | 606,942 | 5.2642 |
19 Feb 2012 05:28:40 | 1039827 | 13708561 | hadam3p_saf_7k1h_2001_1_007582910_1 | 103,776 | 545,437 | 5.2559 |
13 Feb 2012 01:06:06 | 1039827 | 13708561 | hadam3p_saf_7k1h_2001_1_007582910_1 | 92,256 | 485,972 | 5.2676 |
06 Feb 2012 01:01:19 | 1039827 | 13708561 | hadam3p_saf_7k1h_2001_1_007582910_1 | 80,736 | 426,149 | 5.2783 |
19 Jan 2012 03:15:24 | 1039827 | 13708561 | hadam3p_saf_7k1h_2001_1_007582910_1 | 69,216 | 365,841 | 5.2855 |
10 Jan 2012 02:11:52 | 1039827 | 13708561 | hadam3p_saf_7k1h_2001_1_007582910_1 | 57,696 | 305,443 | 5.2940 |
06 Jan 2012 19:28:45 | 1039827 | 13708561 | hadam3p_saf_7k1h_2001_1_007582910_1 | 46,176 | 245,219 | 5.3105 |
01 Jan 2012 01:25:48 | 1039827 | 13708561 | hadam3p_saf_7k1h_2001_1_007582910_1 | 34,656 | 183,923 | 5.3071 |
26 Dec 2011 21:19:19 | 1039827 | 13708561 | hadam3p_saf_7k1h_2001_1_007582910_1 | 23,136 | 123,344 | 5.3313 |
17 Dec 2011 01:22:01 | 1039827 | 13708561 | hadam3p_saf_7k1h_2001_1_007582910_1 | 11,616 | 62,472 | 5.3781 |
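
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count. That relationship is inferred from the figures in the table, not stated by the page, so treat it as an assumption. A minimal Python sketch that checks it against three of the rows above (the function name `sec_per_timestep` is hypothetical):

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
trickles = [
    (138336, 729190, 5.2712),  # 19 Mar 2012
    (126816, 667578, 5.2641),  # 04 Mar 2012
    (11616, 62472, 5.3781),    # 17 Dec 2011
]


def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per cumulative timestep, rounded to 4 places.

    Assumption: this is how the page derives its Average column.
    """
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_sec, reported in trickles:
    computed = sec_per_timestep(cpu_sec, timestep)
    print(f"{timestep:>7} TS  {cpu_sec:>7} s  "
          f"computed={computed:.4f}  reported={reported:.4f}")
```

Under this reading the average is a running figure over the whole task, which would explain why it drifts only slowly from one trickle to the next.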