Name | hadcm3n_yaii_1900_40_007346484_0 |
Workunit | 7543914 |
Created | 6 Jul 2011, 13:37:59 UTC |
Sent | 19 Jul 2011, 17:41:42 UTC |
Report deadline | 19 Oct 2011, 1:08:53 UTC |
Received | 2 Dec 2011, 21:13:08 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 1117742 |
Run time | 9 days 14 hours 0 min 4 sec |
CPU time | 8 days 15 hours 13 min 19 sec |
Validate state | Invalid |
Credit | 3,110.40 |
Device peak FLOPS | 2.27 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.10.58</core_client_version> <![CDATA[ <message> - exit code 193 (0xc1) </message> <stderr_txt> 22:25:50 (9604): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:25:51 (9604): No heartbeat from core client for 30 sec - exiting 22:25:52 (9604): No heartbeat from core client for 30 sec - exiting 22:25:53 (9604): No heartbeat from core client for 30 sec - exiting 22:25:54 (9604): No heartbeat from core client for 30 sec - exiting 22:25:55 (9604): No heartbeat from core client for 30 sec - exiting 22:25:56 (9604): No heartbeat from core client for 30 sec - exiting 22:25:57 (9604): No heartbeat from core client for 30 sec - exiting 22:25:58 (9604): No heartbeat from core client for 30 sec - exiting 22:25:59 (9604): No heartbeat from core client for 30 sec - exiting 22:26:00 (9604): No heartbeat from core client for 30 sec - exiting 12:17:48 (7232): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
12:17:57 (7232): No heartbeat from core client for 30 sec - exiting 12:17:59 (7232): No heartbeat from core client for 30 sec - exiting 12:18:00 (7232): No heartbeat from core client for 30 sec - exiting 12:18:01 (7232): No heartbeat from core client for 30 sec - exiting 12:18:02 (7232): No heartbeat from core client for 30 sec - exiting 12:18:03 (7232): No heartbeat from core client for 30 sec - exiting 12:18:04 (7232): No heartbeat from core client for 30 sec - exiting 12:18:05 (7232): No heartbeat from core client for 30 sec - exiting 12:18:06 (7232): No heartbeat from core client for 30 sec - exiting 12:18:07 (7232): No heartbeat from core client for 30 sec - exiting 12:18:08 (7232): No heartbeat from core client for 30 sec - exiting 12:18:09 (7232): No heartbeat from core client for 30 sec - exiting 12:18:10 (7232): No heartbeat from core client for 30 sec - exiting 12:18:11 (7232): No heartbeat from core client for 30 sec - exiting 12:18:12 (7232): No heartbeat from core client for 30 sec - exiting 12:18:13 (7232): No heartbeat from core client for 30 sec - exiting 12:18:14 (7232): No heartbeat from core client for 30 sec - exiting 12:18:15 (7232): No heartbeat from core client for 30 sec - exiting 12:18:16 (7232): No heartbeat from core client for 30 sec - exiting 12:18:17 (7232): No heartbeat from core client for 30 sec - exiting 12:18:18 (7232): No heartbeat from core client for 30 sec - exiting 12:18:19 (7232): No heartbeat from core client for 30 sec - exiting 12:18:20 (7232): No heartbeat from core client for 30 sec - exiting 12:18:21 (7232): No heartbeat from core client for 30 sec - exiting 12:18:22 (7232): No heartbeat from core client for 30 sec - exiting 12:18:23 (7232): No heartbeat from core client for 30 sec - exiting 12:18:24 (7232): No heartbeat from core client for 30 sec - exiting 12:18:25 (7232): No heartbeat from core client for 30 sec - exiting 12:18:26 (7232): No heartbeat from core client for 30 sec - exiting 12:18:27 (7232): No 
heartbeat from core client for 30 sec - exiting 12:18:28 (7232): No heartbeat from core client for 30 sec - exiting 12:18:29 (7232): No heartbeat from core client for 30 sec - exiting 12:18:30 (7232): No heartbeat from core client for 30 sec - exiting 12:18:31 (7232): No heartbeat from core client for 30 sec - exiting 12:18:32 (7232): No heartbeat from core client for 30 sec - exiting 12:18:33 (7232): No heartbeat from core client for 30 sec - exiting 12:18:34 (7232): No heartbeat from core client for 30 sec - exiting 12:18:35 (7232): No heartbeat from core client for 30 sec - exiting 12:18:36 (7232): No heartbeat from core client for 30 sec - exiting 12:18:37 (7232): No heartbeat from core client for 30 sec - exiting 12:18:38 (7232): No heartbeat from core client for 30 sec - exiting 12:18:39 (7232): No heartbeat from core client for 30 sec - exiting 12:18:40 (7232): No heartbeat from core client for 30 sec - exiting 12:18:41 (7232): No heartbeat from core client for 30 sec - exiting 12:18:42 (7232): No heartbeat from core client for 30 sec - exiting 12:18:43 (7232): No heartbeat from core client for 30 sec - exiting 12:18:44 (7232): No heartbeat from core client for 30 sec - exiting 12:18:47 (7232): No heartbeat from core client for 30 sec - exiting 12:18:49 (7232): No heartbeat from core client for 30 sec - exiting 12:18:50 (7232): No heartbeat from core client for 30 sec - exiting 12:18:51 (7232): No heartbeat from core client for 30 sec - exiting 12:18:52 (7232): No heartbeat from core client for 30 sec - exiting 12:18:53 (7232): No heartbeat from core client for 30 sec - exiting 12:18:54 (7232): No heartbeat from core client for 30 sec - exiting 12:18:55 (7232): No heartbeat from core client for 30 sec - exiting 12:18:56 (7232): No heartbeat from core client for 30 sec - exiting 12:18:57 (7232): No heartbeat from core client for 30 sec - exiting 12:18:58 (7232): No heartbeat from core client for 30 sec - exiting 12:18:59 (7232): No heartbeat from core client 
for 30 sec - exiting 12:19:00 (7232): No heartbeat from core client for 30 sec - exiting 12:19:01 (7232): No heartbeat from core client for 30 sec - exiting 12:19:02 (7232): No heartbeat from core client for 30 sec - exiting 12:19:03 (7232): No heartbeat from core client for 30 sec - exiting 12:19:04 (7232): No heartbeat from core client for 30 sec - exiting 12:19:05 (7232): No heartbeat from core client for 30 sec - exiting 12:19:06 (7232): No heartbeat from core client for 30 sec - exiting 12:19:07 (7232): No heartbeat from core client for 30 sec - exiting 12:19:08 (7232): No heartbeat from core client for 30 sec - exiting 12:19:09 (7232): No heartbeat from core client for 30 sec - exiting 12:19:10 (7232): No heartbeat from core client for 30 sec - exiting 12:19:11 (7232): No heartbeat from core client for 30 sec - exiting 12:19:12 (7232): No heartbeat from core client for 30 sec - exiting 12:19:13 (7232): No heartbeat from core client for 30 sec - exiting 12:19:14 (7232): No heartbeat from core client for 30 sec - exiting 12:19:15 (7232): No heartbeat from core client for 30 sec - exiting 12:19:16 (7232): No heartbeat from core client for 30 sec - exiting 12:19:17 (7232): No heartbeat from core client for 30 sec - exiting 12:19:18 (7232): No heartbeat from core client for 30 sec - exiting 12:19:19 (7232): No heartbeat from core client for 30 sec - exiting 12:19:20 (7232): No heartbeat from core client for 30 sec - exiting 12:19:21 (7232): No heartbeat from core client for 30 sec - exiting 12:19:22 (7232): No heartbeat from core client for 30 sec - exiting 12:19:23 (7232): No heartbeat from core client for 30 sec - exiting 12:19:24 (7232): No heartbeat from core client for 30 sec - exiting 12:19:25 (7232): No heartbeat from core client for 30 sec - exiting 12:19:26 (7232): No heartbeat from core client for 30 sec - exiting 12:19:27 (7232): No heartbeat from core client for 30 sec - exiting 12:19:28 (7232): No heartbeat from core client for 30 sec - exiting 
12:19:29 (7232): No heartbeat from core client for 30 sec - exiting 12:19:30 (7232): No heartbeat from core client for 30 sec - exiting 12:19:31 (7232): No heartbeat from core client for 30 sec - exiting 12:19:32 (7232): No heartbeat from core client for 30 sec - exiting 12:19:33 (7232): No heartbeat from core client for 30 sec - exiting 12:19:34 (7232): No heartbeat from core client for 30 sec - exiting 12:19:35 (7232): No heartbeat from core client for 30 sec - exiting 12:19:36 (7232): No heartbeat from core client for 30 sec - exiting 12:19:37 (7232): No heartbeat from core client for 30 sec - exiting 12:19:38 (7232): No heartbeat from core client for 30 sec - exiting 12:19:39 (7232): No heartbeat from core client for 30 sec - exiting 12:19:40 (7232): No heartbeat from core client for 30 sec - exiting 12:19:41 (7232): No heartbeat from core client for 30 sec - exiting 12:19:42 (7232): No heartbeat from core client for 30 sec - exiting 12:19:43 (7232): No heartbeat from core client for 30 sec - exiting 12:19:44 (7232): No heartbeat from core client for 30 sec - exiting 12:19:45 (7232): No heartbeat from core client for 30 sec - exiting 12:19:46 (7232): No heartbeat from core client for 30 sec - exiting 12:19:54 (7232): No heartbeat from core client for 30 sec - exiting 12:19:55 (7232): No heartbeat from core client for 30 sec - exiting 12:19:56 (7232): No heartbeat from core client for 30 sec - exiting 12:19:57 (7232): No heartbeat from core client for 30 sec - exiting 12:19:58 (7232): No heartbeat from core client for 30 sec - exiting 12:19:59 (7232): No heartbeat from core client for 30 sec - exiting 12:20:00 (7232): No heartbeat from core client for 30 sec - exiting 12:20:01 (7232): No heartbeat from core client for 30 sec - exiting 12:20:02 (7232): No heartbeat from core client for 30 sec - exiting 12:20:03 (7232): No heartbeat from core client for 30 sec - exiting 12:20:04 (7232): No heartbeat from core client for 30 sec - exiting 12:20:05 (7232): No 
heartbeat from core client for 30 sec - exiting 12:20:06 (7232): No heartbeat from core client for 30 sec - exiting 12:20:07 (7232): No heartbeat from core client for 30 sec - exiting 12:20:08 (7232): No heartbeat from core client for 30 sec - exiting 12:20:09 (7232): No heartbeat from core client for 30 sec - exiting 12:20:10 (7232): No heartbeat from core client for 30 sec - exiting 12:20:11 (7232): No heartbeat from core client for 30 sec - exiting 12:20:12 (7232): No heartbeat from core client for 30 sec - exiting 12:20:13 (7232): No heartbeat from core client for 30 sec - exiting 12:20:14 (7232): No heartbeat from core client for 30 sec - exiting 12:20:15 (7232): No heartbeat from core client for 30 sec - exiting 12:20:16 (7232): No heartbeat from core client for 30 sec - exiting 12:20:17 (7232): No heartbeat from core client for 30 sec - exiting 12:20:18 (7232): No heartbeat from core client for 30 sec - exiting 12:20:19 (7232): No heartbeat from core client for 30 sec - exiting 12:20:20 (7232): No heartbeat from core client for 30 sec - exiting 12:20:21 (7232): No heartbeat from core client for 30 sec - exiting 12:20:22 (7232): No heartbeat from core client for 30 sec - exiting 12:20:23 (7232): No heartbeat from core client for 30 sec - exiting 12:20:24 (7232): No heartbeat from core client for 30 sec - exiting 12:20:25 (7232): No heartbeat from core client for 30 sec - exiting 12:20:26 (7232): No heartbeat from core client for 30 sec - exiting 12:20:27 (7232): No heartbeat from core client for 30 sec - exiting 12:20:28 (7232): No heartbeat from core client for 30 sec - exiting 12:20:29 (7232): No heartbeat from core client for 30 sec - exiting 12:20:30 (7232): No heartbeat from core client for 30 sec - exiting 12:20:31 (7232): No heartbeat from core client for 30 sec - exiting 12:20:32 (7232): No heartbeat from core client for 30 sec - exiting 23:53:54 (1052): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
CPDN Monitor - Quit request from BOINC... 10:35:28 (8400): No heartbeat from core client for 30 sec - exiting 10:35:29 (8400): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7108, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4244, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7428, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6724, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7252, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7252, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7252, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7252, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7252, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7252, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish Suspended CPDN Monitor - Suspend request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7480, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7540, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7540, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7540, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6684, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7008, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7008, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7008, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7008, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7032, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7032, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7032, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7452, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=576, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6448, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6448, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7684, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7684, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7684, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7684, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7684, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6800, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7524, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7524, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8108, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8108, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8108, iMonCtr=1 Model crash detected, will try to restart... 20:04:25 (8176): No heartbeat from core client for 30 sec - exiting 20:04:26 (8176): No heartbeat from core client for 30 sec - exiting 20:04:27 (8176): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7228, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2344, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7596, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7396, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7944, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7844, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6932, iMonCtr=1 Model crash detected, will try to restart... Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]> |
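
The stderr output above is long and repetitive, so a per-message tally can be easier to read than the raw text. The following is a minimal sketch, not part of BOINC or CPDN: the category names and the `tally_events` helper are hypothetical, and the patterns are taken only from message wording that appears in this log. Whitespace is matched loosely because the captured text wraps in the middle of some messages.

```python
import re
from collections import Counter

# Hypothetical category names; the message wording is copied from the stderr text.
# \s+ is used between words because the captured log wraps mid-message.
PATTERNS = {
    "no_heartbeat": re.compile(r"No\s+heartbeat\s+from\s+core\s+client"),
    "model_crash": re.compile(r"Model\s+crash\s+detected"),
    "suspend_request": re.compile(r"Suspend\s+request\s+from\s+BOINC"),
    "quit_request": re.compile(r"Quit\s+request\s+from\s+BOINC"),
    "too_many_crashes": re.compile(r"too\s+many\s+model\s+crashes"),
    "signal_received": re.compile(r"Signal\s+\d+\s+received"),
}


def tally_events(stderr_text: str) -> Counter:
    """Count how many times each known message type occurs in the text."""
    counts = Counter()
    for name, pattern in PATTERNS.items():
        counts[name] = len(pattern.findall(stderr_text))
    return counts


if __name__ == "__main__":
    # A short excerpt in the same style as the log; paste the full
    # <stderr_txt> contents here to tally the real output.
    sample = (
        "22:25:50 (9604): No heartbeat from core client for 30 sec - exiting "
        "22:25:51 (9604): No\nheartbeat from core client for 30 sec - exiting "
        "Model crash detected, will try to restart... "
        "Signal 11 received, exiting... Called boinc_finish"
    )
    for name, count in tally_events(sample).items():
        print(f"{name}: {count}")
```

Counting rather than reading makes it easier to see whether heartbeat timeouts or model crashes dominate a given run.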
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
02 Dec 2011 21:17:34 | 1117742 | 13096834 | hadcm3n_yaii_1900_40_007346484_0 | 259,200 | 745,997 | 2.8781 |
01 Dec 2011 20:08:16 | 1117742 | 13096834 | hadcm3n_yaii_1900_40_007346484_0 | 233,280 | 696,569 | 2.9860 |
28 Nov 2011 22:17:26 | 1117742 | 13096834 | hadcm3n_yaii_1900_40_007346484_0 | 207,360 | 646,289 | 3.1167 |
27 Nov 2011 21:57:02 | 1117742 | 13096834 | hadcm3n_yaii_1900_40_007346484_0 | 181,440 | 596,994 | 3.2903 |
26 Nov 2011 18:49:06 | 1117742 | 13096834 | hadcm3n_yaii_1900_40_007346484_0 | 155,520 | 547,041 | 3.5175 |
26 Sep 2011 17:54:58 | 1117742 | 13096834 | hadcm3n_yaii_1900_40_007346484_0 | 129,600 | 494,389 | 3.8147 |
14 Sep 2011 18:31:28 | 1117742 | 13096834 | hadcm3n_yaii_1900_40_007346484_0 | 103,680 | 198,653 | 1.9160 |
10 Sep 2011 20:52:15 | 1117742 | 13096834 | hadcm3n_yaii_1900_40_007346484_0 | 77,760 | 145,834 | 1.8754 |
06 Aug 2011 21:35:32 | 1117742 | 13096834 | hadcm3n_yaii_1900_40_007346484_0 | 51,840 | 94,402 | 1.8210 |
06 Aug 2011 21:35:32 | 1117742 | 13096834 | hadcm3n_yaii_1900_40_007346484_0 | 25,920 | 47,670 | 1.8391 |
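
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below reproduces that arithmetic from rows in the table above. The function names are hypothetical, and the incremental rate between two trickles is an extra illustration, not a column on the page.

```python
# (timestep, cumulative CPU time in seconds), copied from the trickle table.
TRICKLES = [
    (25_920, 47_670),
    (103_680, 198_653),
    (129_600, 494_389),
    (259_200, 745_997),
]


def average_sec_per_ts(timestep: int, cpu_seconds: int) -> float:
    """Cumulative CPU seconds per timestep, rounded to 4 decimal places."""
    return round(cpu_seconds / timestep, 4)


def incremental_sec_per_ts(earlier: tuple, later: tuple) -> float:
    """CPU seconds per timestep for the interval between two trickles."""
    d_steps = later[0] - earlier[0]
    d_cpu = later[1] - earlier[1]
    return d_cpu / d_steps


if __name__ == "__main__":
    for timestep, cpu in TRICKLES:
        print(f"{timestep:>8,} steps: {average_sec_per_ts(timestep, cpu):.4f} sec/TS")

    # Rate over the interval between the 103,680 and 129,600 timestep trickles.
    rate = incremental_sec_per_ts(TRICKLES[1], TRICKLES[2])
    print(f"interval rate: {rate:.2f} sec/TS")
```

Because the reported average is cumulative, a slow interval shows up far more clearly in the interval rate than in the averaged column.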