Name | hadam3p_anz_f8nj_2013_1_009733131_0 |
Workunit | 9804976 |
Created | 8 Apr 2015, 21:05:00 UTC |
Sent | 19 Apr 2015, 6:46:19 UTC |
Report deadline | 31 Mar 2016, 12:06:19 UTC |
Received | 18 Jul 2015, 11:09:36 UTC |
Server state | Over |
Outcome | Success |
Client state | Done |
Exit status | 0 (0x00000000) |
Computer ID | 1297944 |
Run time | 10 days 12 hours 59 min 40 sec |
CPU time | 9 days 23 hours 47 min 7 sec |
Validate state | Workunit error - check skipped |
Credit | 5,974.74 |
Device peak FLOPS | 2.74 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Australia New Zealand v6.10 windows_intelx86 |
Stderr | <core_client_version>7.4.42</core_client_version> <![CDATA[ <stderr_txt> Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5300, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5328, selfPID=2796, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4436, selfPID=3624, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=284, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4696, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3604, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5688, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5716, selfPID=3716, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4988, selfPID=3184, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4624, selfPID=2316, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3324, iMonCtr=2 Mode l crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4856, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2340, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3352, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5448, selfPID=4716, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4356, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4452, iMonCtr=2 08:25:09 (916): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
08:25:11 (916): No heartbeat from core client for 30 sec - exiting 08:26:05 (4616): No heartbeat from core client for 30 sec - exiting 08:26:06 (4616): No heartbeat from core client for 30 sec - exiting 08:26:07 (4616): No heartbeat from core client for 30 sec - exiting 08:26:08 (4616): No heartbeat from core client for 30 sec - exiting 08:26:09 (4616): No heartbeat from core client for 30 sec - exiting 08:26:10 (4616): No heartbeat from core client for 30 sec - exiting 08:26:11 (4616): No heartbeat from core client for 30 sec - exiting 08:26:12 (4616): No heartbeat from core client for 30 sec - exiting 08:26:14 (4616): No heartbeat from core client for 30 sec - exiting 08:26:15 (4616): No heartbeat from core client for 30 sec - exiting 08:26:16 (4616): No heartbeat from core client for 30 sec - exiting 08:26:17 (4616): No heartbeat from core client for 30 sec - exiting 08:26:18 (4616): No heartbeat from core client for 30 sec - exiting 08:26:19 (4616): No heartbeat from core client for 30 sec - exiting 08:26:20 (4616): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5588, selfPID=2636, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 15:12:08 (4160): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:12:10 (4160): No heartbeat from core client for 30 sec - exiting Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5208, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5360, selfPID=744, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4728, iMonCtr=2 Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5672, iMonCtr=2 16:27:20 (1476): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:27:21 (1476): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 17:43:52 (2232): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:43:53 (2232): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 11:06:09 (1080): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:06:10 (1080): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CGlobal Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4228, iMonCtr=2 ontroller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2028, iMonCtr=2 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4704, iMonCtr=2 Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1984, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4248, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=632, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Precis Restart file copy #1 failed on f8njga.dal46q0 Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5944, iMonCtr=2 Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6116, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5364, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4464, selfPID=5460, iMonCtr=1 Model crash detected, will try to restart... 17:59:49 (4724): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1152, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4472, selfPID=5828, iMonCtr=1 Model crash detected, will try to restart... 14:26:46 (5168): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:26:47 (5168): No heartbeat from core client for 30 sec - exiting 14:27:21 (3296): No heartbeat from core client for 30 sec - exiting 14:27:22 (3296): No heartbeat from core client for 30 sec - exiting 14:27:23 (3296): No heartbeat from core client for 30 sec - exiting 14:27:24 (3296): No heartbeat from core client for 30 sec - exiting 14:27:25 (3296): No heartbeat from core client for 30 sec - exiting 14:27:26 (3296): No heartbeat from core client for 30 sec - exiting 14:27:27 (3296): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5476, selfPID=4708, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4688, iMonCtr=2 20:16:00 (5832): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:16:02 (5832): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Quit request from BOINC... GSuspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 18:21:03 (5772): No heartbeat from core client for 30 sec - exiting 18:21:04 (5772): No heartbeat from core client for 30 sec - exiting 18:21:06 (5772): No heartbeat from core client for 30 sec - exiting 18:21:07 (5772): No heartbeat from core client for 30 sec - exiting 18:21:08 (5772): No heartbeat from core client for 30 sec - exiting 18:21:09 (5772): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1848, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4460, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6100, iMonCtr=2 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 17:16:23 (5744): No heartbeat from core client for 30 sec - exiting 17:16:24 (5744): No heartbeat from core client for 30 sec - exiting 17:16:25 (5744): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5056, selfPID=5056, iMonCtr=2 CGntroller:: CPDN procbss is not running, exiting, bRet Val = 1, checkPID=0, selfPID=1320, iMonCtr=2 elfPID=5996, iMoncted, w ill try to restart... 
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5152, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5720, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5088, iMonCtr=2 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2556, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4680, iMonCtr=2 Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5092, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5124, selfPID=2064, iMonCtr=1 Model crash detected, will try to restart... 
18:09:39 (3592): No heartbeat from core client for 30 sec - exiting 18:09:40 (3592): No heartbeat from core client for 30 sec - exiting 18:09:41 (3592): No heartbeat from core client for 30 sec - exiting 18:09:42 (3592): No heartbeat from core client for 30 sec - exiting 18:09:43 (3592): No heartbeat from core client for 30 sec - exiting 18:09:44 (3592): No heartbeat from core client for 30 sec - exiting 18:09:45 (3592): No heartbeat from core client for 30 sec - exiting 18:09:46 (3592): No heartbeat from core client for 30 sec - exiting 18:09:48 (3592): No heartbeat from core client for 30 sec - exiting 18:09:49 (3592): No heartbeat from core client for 30 sec - exiting 18:09:50 (3592): No heartbeat from core client for 30 sec - exiting 18:09:51 (3592): No heartbeat from core client for 30 sec - exiting 18:09:52 (3592): No heartbeat from core client for 30 sec - exiting 18:09:53 (3592): No heartbeat from core client for 30 sec - exiting 18:09:54 (3592): No heartbeat from core client for 30 sec - exiting 18:09:55 (3592): No heartbeat from core client for 30 sec - exiting 18:09:56 (3592): No heartbeat from core client for 30 sec - exiting 18:09:57 (3592): No heartbeat from core client for 30 sec - exiting 18:09:58 (3592): No heartbeat from core client for 30 sec - exiting 18:10:00 (3592): No heartbeat from core client for 30 sec - exiting 18:10:01 (3592): No heartbeat from core client for 30 sec - exiting 18:10:02 (3592): No heartbeat from core client for 30 sec - exiting 18:10:03 (3592): No heartbeat from core client for 30 sec - exiting 18:10:04 (3592): No heartbeat from core client for 30 sec - exiting 18:10:05 (3592): No heartbeat from core client for 30 sec - exiting 18:10:06 (3592): No heartbeat from core client for 30 sec - exiting 18:10:07 (3592): No heartbeat from core client for 30 sec - exiting 18:10:08 (3592): No heartbeat from core client for 30 sec - exiting 18:10:09 (3592): No heartbeat from core client for 30 sec - exiting 18:10:10 (3592): No 
heartbeat from core client for 30 sec - exiting 18:10:12 (3592): No heartbeat from core client for 30 sec - exiting 18:10:13 (3592): No heartbeat from core client for 30 sec - exiting 18:10:14 (3592): No heartbeat from core client for 30 sec - exiting 18:10:15 (3592): No heartbeat from core client for 30 sec - exiting 18:10:16 (3592): No heartbeat from core client for 30 sec - exiting 18:10:17 (3592): No heartbeat from core client for 30 sec - exiting 18:10:18 (3592): No heartbeat from core client for 30 sec - exiting 18:10:19 (3592): No heartbeat from core client for 30 sec - exiting 18:10:20 (3592): No heartbeat from core client for 30 sec - exiting 18:10:21 (3592): No heartbeat from core client for 30 sec - exiting 18:10:23 (3592): No heartbeat from core client for 30 sec - exiting 18:10:24 (3592): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 02:19:32 (4920): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5728, selfPID=5728, iMonCtr=2 Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
17 Jul 2015 17:24:25 | 1297944 | 18286794 | hadam3p_anz_f8nj_2013_1_009733131_0 | 138,539 | 860,966 | 6.2146 |
12 Jul 2015 10:03:10 | 1297944 | 18286794 | hadam3p_anz_f8nj_2013_1_009733131_0 | 127,019 | 790,676 | 6.2249 |
03 Jul 2015 08:38:18 | 1297944 | 18286794 | hadam3p_anz_f8nj_2013_1_009733131_0 | 115,499 | 720,677 | 6.2397 |
20 Jun 2015 14:10:17 | 1297944 | 18286794 | hadam3p_anz_f8nj_2013_1_009733131_0 | 103,979 | 650,329 | 6.2544 |
15 Jun 2015 18:29:38 | 1297944 | 18286794 | hadam3p_anz_f8nj_2013_1_009733131_0 | 92,459 | 578,815 | 6.2602 |
09 Jun 2015 21:32:40 | 1297944 | 18286794 | hadam3p_anz_f8nj_2013_1_009733131_0 | 80,939 | 508,551 | 6.2831 |
19 May 2015 20:06:50 | 1297944 | 18286794 | hadam3p_anz_f8nj_2013_1_009733131_0 | 69,419 | 438,501 | 6.3167 |
10 May 2015 18:45:48 | 1297944 | 18286794 | hadam3p_anz_f8nj_2013_1_009733131_0 | 57,899 | 366,111 | 6.3233 |
08 May 2015 19:06:47 | 1297944 | 18286794 | hadam3p_anz_f8nj_2013_1_009733131_0 | 46,379 | 293,173 | 6.3212 |
29 Apr 2015 05:45:27 | 1297944 | 18286794 | hadam3p_anz_f8nj_2013_1_009733131_0 | 34,859 | 219,575 | 6.2989 |
24 Apr 2015 10:33:50 | 1297944 | 18286794 | hadam3p_anz_f8nj_2013_1_009733131_0 | 23,339 | 146,805 | 6.2901 |
21 Apr 2015 18:21:34 | 1297944 | 18286794 | hadam3p_anz_f8nj_2013_1_009733131_0 | 11,819 | 74,108 | 6.2702 |
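
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count, rounded to four decimals. The sketch below checks that reading against two rows copied from the table. The function name `sec_per_timestep` is hypothetical, and the formula is an inference from the numbers, not something the page states.

```python
# Rows copied from the trickle table above:
# (timestep, cumulative CPU time in seconds, reported average in sec/TS)
rows = [
    (138_539, 860_966, 6.2146),  # 17 Jul 2015 trickle
    (11_819, 74_108, 6.2702),    # 21 Apr 2015 trickle
]


def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Assumed formula: cumulative CPU seconds per model timestep."""
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in rows:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    print(f"timestep={timestep} computed={computed} reported={reported}")
```

If the assumption holds, the computed and reported values agree for every row, which makes the column a quick way to spot a slowdown between trickles.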
©2024 cpdn.org