Name | hadam3p_saf_2fhj_1964_1_007149754_2 |
Workunit | 7334534 |
Created | 20 Sep 2011, 20:23:59 UTC |
Sent | 20 Sep 2011, 20:24:14 UTC |
Report deadline | 2 Sep 2012, 1:44:14 UTC |
Received | 24 Dec 2011, 13:09:33 UTC |
Server state | Over |
Outcome | Success |
Client state | Done |
Exit status | 0 (0x00000000) |
Computer ID | 725427 |
Run time | 10 days 5 hours 45 min 13 sec |
CPU time | 7 days 2 hours 29 min 48 sec |
Validate state | Workunit error - check skipped |
Credit | 2,244.09 |
Device peak FLOPS | 2.18 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Southern Africa v6.09 windows_intelx86 |
Stderr | <core_client_version>6.6.36</core_client_version> <![CDATA[ <stderr_txt> Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2612, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8040, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 08:37:20 (7368): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6768, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5428, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5008, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3088, selfPID=4116, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1116, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3592, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7484, selfPID=7804, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7920, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6680, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9944, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5804, iMonCtr=2 Model crash detected, will try to restart... 
08:09:24 (6532): No heartbeat from core client for 30 sec - exiting 08:09:25 (6532): No heartbeat from core client for 30 sec - exiting 08:09:26 (6532): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7176, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3836, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6356, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9412, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8212, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8952, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8552, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6744, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=10024, selfPID=11716, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 20:05:57 (828): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6720, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4048, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9028, iMonCtr=2 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10376, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6160, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7476, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8412, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... GloController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4164, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8416, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8528, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9880, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
07:58:06 (7248): No heartbeat from core client for 30 sec - exiting 07:58:07 (7248): No heartbeat from core client for 30 sec - exiting 07:58:08 (7248): No heartbeat from core client for 30 sec - exiting 07:58:09 (7248): No heartbeat from core client for 30 sec - exiting 07:58:10 (7248): No heartbeat from core client for 30 sec - exiting 07:58:11 (7248): No heartbeat from core client for 30 sec - exiting 07:58:12 (7248): No heartbeat from core client for 30 sec - exiting 07:58:13 (7248): No heartbeat from core client for 30 sec - exiting 07:58:14 (7248): No heartbeat from core client for 30 sec - exiting 07:58:15 (7248): No heartbeat from core client for 30 sec - exiting 07:58:16 (7248): No heartbeat from core client for 30 sec - exiting 07:58:17 (7248): No heartbeat from core client for 30 sec - exiting 07:58:18 (7248): No heartbeat from core client for 30 sec - exiting 07:58:19 (7248): No heartbeat from core client for 30 sec - exiting 07:58:20 (7248): No heartbeat from core client for 30 sec - exiting 07:58:21 (7248): No heartbeat from core client for 30 sec - exiting 07:58:22 (7248): No heartbeat from core client for 30 sec - exiting 07:58:23 (7248): No heartbeat from core client for 30 sec - exiting 07:58:24 (7248): No heartbeat from core client for 30 sec - exiting 07:58:25 (7248): No heartbeat from core client for 30 sec - exiting 07:58:26 (7248): No heartbeat from core client for 30 sec - exiting 07:58:27 (7248): No heartbeat from core client for 30 sec - exiting 07:58:28 (7248): No heartbeat from core client for 30 sec - exiting 07:58:29 (7248): No heartbeat from core client for 30 sec - exiting 07:58:30 (7248): No heartbeat from core client for 30 sec - exiting 07:58:31 (7248): No heartbeat from core client for 30 sec - exiting 07:58:32 (7248): No heartbeat from core client for 30 sec - exiting 07:58:33 (7248): No heartbeat from core client for 30 sec - exiting 07:58:34 (7248): No heartbeat from core client for 30 sec - exiting 07:58:35 (7248): No 
heartbeat from core client for 30 sec - exiting 07:58:36 (7248): No heartbeat from core client for 30 sec - exiting 07:58:37 (7248): No heartbeat from core client for 30 sec - exiting 07:58:38 (7248): No heartbeat from core client for 30 sec - exiting 07:58:39 (7248): No heartbeat from core client for 30 sec - exiting 07:58:40 (7248): No heartbeat from core client for 30 sec - exiting 07:58:41 (7248): No heartbeat from core client for 30 sec - exiting 07:58:42 (7248): No heartbeat from core client for 30 sec - exiting 07:58:43 (7248): No heartbeat from core client for 30 sec - exiting 07:58:44 (7248): No heartbeat from core client for 30 sec - exiting 07:58:45 (7248): No heartbeat from core client for 30 sec - exiting 07:58:46 (7248): No heartbeat from core client for 30 sec - exiting 07:58:47 (7248): No heartbeat from core client for 30 sec - exiting 07:58:48 (7248): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6600, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6648, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8920, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
09:11:59 (508): No heartbeat from core client for 30 sec - exiting 09:12:00 (508): No heartbeat from core client for 30 sec - exiting 09:12:01 (508): No heartbeat from core client for 30 sec - exiting 09:12:02 (508): No heartbeat from core client for 30 sec - exiting 09:12:03 (508): No heartbeat from core client for 30 sec - exiting 09:12:04 (508): No heartbeat from core client for 30 sec - exiting 09:12:05 (508): No heartbeat from core client for 30 sec - exiting 09:12:06 (508): No heartbeat from core client for 30 sec - exiting 09:12:07 (508): No heartbeat from core client for 30 sec - exiting 09:12:08 (508): No heartbeat from core client for 30 sec - exiting 09:12:09 (508): No heartbeat from core client for 30 sec - exiting 09:12:10 (508): No heartbeat from core client for 30 sec - exiting 09:12:11 (508): No heartbeat from core client for 30 sec - exiting 09:12:12 (508): No heartbeat from core client for 30 sec - exiting 09:12:13 (508): No heartbeat from core client for 30 sec - exiting 09:12:14 (508): No heartbeat from core client for 30 sec - exiting 09:12:15 (508): No heartbeat from core client for 30 sec - exiting 09:12:16 (508): No heartbeat from core client for 30 sec - exiting 09:12:18 (508): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:12:19 (508): No heartbeat from core client for 30 sec - exiting 09:12:20 (508): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6356, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7812, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1420, iMonCtr=2 Model crash detected, will try to restart... 
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6712, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4632, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5768, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2328, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5788, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5696, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=9160, selfPID=8372, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6500, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6916, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 22:36:46 (5824): No heartbeat from core client for 30 sec - exiting 22:36:47 (5824): No heartbeat from core client for 30 sec - exiting 22:36:48 (5824): No heartbeat from core client for 30 sec - exiting 22:36:49 (5824): No heartbeat from core client for 30 sec - exiting 22:36:50 (5824): No heartbeat from core client for 30 sec - exiting 22:36:51 (5824): No heartbeat from core client for 30 sec - exiting 22:36:52 (5824): No heartbeat from core client for 30 sec - exiting 22:36:53 (5824): No heartbeat from core client for 30 sec - exiting 22:36:54 (5824): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
22:58:21 (7584): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4984, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9536, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9748, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10016, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6820, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4472, selfPID=7240, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10144, iMonCtr =2 el crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3456, iMonCtr=2 Model crash detected, will try to restart... GCobal Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8600, iMonCtr=2 ontroller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9424, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9928, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8376, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9688, iMonCtr=2 Model crash detected, will try to restart... 
Leaving CPDN_Main::Monitor... 07:40:52 (5800): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 07:40:54 (5800): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5148, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9408, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9700, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2580, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5088, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6280, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2680, selfPID=7320, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5628, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7252, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10000, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8436, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10012, iMonCtr=2 Model crash detected, will try to restart... 
Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8496, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8864, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9504, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2312, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5664, iMonCtr=2 Model crash detected, will try to restart... 09:55:37 (3564): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
23 Dec 2011 13:07:13 | 725427 | 13402968 | hadam3p_saf_2fhj_1964_1_007149754_2 | 138,336 | 612,900 | 4.4305 |
13 Dec 2011 20:24:08 | 725427 | 13402968 | hadam3p_saf_2fhj_1964_1_007149754_2 | 126,822 | 560,453 | 4.4192 |
12 Dec 2011 22:37:04 | 725427 | 13402968 | hadam3p_saf_2fhj_1964_1_007149754_2 | 126,816 | 559,431 | 4.4114 |
04 Dec 2011 11:35:38 | 725427 | 13402968 | hadam3p_saf_2fhj_1964_1_007149754_2 | 115,296 | 507,535 | 4.4020 |
20 Nov 2011 12:53:44 | 725427 | 13402968 | hadam3p_saf_2fhj_1964_1_007149754_2 | 103,776 | 458,051 | 4.4138 |
06 Nov 2011 14:10:58 | 725427 | 13402968 | hadam3p_saf_2fhj_1964_1_007149754_2 | 92,256 | 408,180 | 4.4244 |
05 Nov 2011 10:20:54 | 725427 | 13402968 | hadam3p_saf_2fhj_1964_1_007149754_2 | 80,736 | 358,237 | 4.4371 |
31 Oct 2011 18:14:25 | 725427 | 13402968 | hadam3p_saf_2fhj_1964_1_007149754_2 | 69,216 | 306,874 | 4.4336 |
31 Oct 2011 17:39:05 | 725427 | 13402968 | hadam3p_saf_2fhj_1964_1_007149754_2 | 57,696 | 256,850 | 4.4518 |
08 Oct 2011 21:10:49 | 725427 | 13402968 | hadam3p_saf_2fhj_1964_1_007149754_2 | 46,176 | 207,269 | 4.4887 |
03 Oct 2011 08:38:32 | 725427 | 13402968 | hadam3p_saf_2fhj_1964_1_007149754_2 | 34,656 | 156,398 | 4.5129 |
01 Oct 2011 08:03:00 | 725427 | 13402968 | hadam3p_saf_2fhj_1964_1_007149754_2 | 23,136 | 104,768 | 4.5284 |
25 Sep 2011 20:25:48 | 725427 | 13402968 | hadam3p_saf_2fhj_1964_1_007149754_2 | 11,616 | 52,077 | 4.4832 |
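
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. The page itself does not state this, so treat it as an assumption; the helper names below are hypothetical and not part of any CPDN or BOINC tooling. A minimal sketch that recomputes the value for the newest and oldest trickles:

```python
def parse_count(text: str) -> int:
    """Parse a thousands-separated integer such as '138,336'."""
    return int(text.replace(",", ""))


def avg_sec_per_timestep(cpu_time: str, timestep: str) -> float:
    """Assumed definition: cumulative CPU seconds / timesteps, rounded to 4 places."""
    return round(parse_count(cpu_time) / parse_count(timestep), 4)


# (Timestep, CPU Time (sec)) copied from the newest and oldest trickle rows.
rows = [("138,336", "612,900"), ("11,616", "52,077")]

for timestep, cpu_time in rows:
    print(timestep, avg_sec_per_timestep(cpu_time, timestep))
```

Under that assumption the two results match the listed averages of 4.4305 and 4.4832.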