Name | hadcm3n_yaen_1900_40_007346345_0 |
Workunit | 7543775 |
Created | 6 Jul 2011, 13:37:06 UTC |
Sent | 19 Jul 2011, 19:43:01 UTC |
Report deadline | 19 Oct 2011, 3:10:12 UTC |
Received | 2 Dec 2011, 21:13:08 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 1117742 |
Run time | 9 days 12 hours 26 min 19 sec |
CPU time | 8 days 13 hours 52 min 36 sec |
Validate state | Invalid |
Credit | 3,110.40 |
Device peak FLOPS | 2.27 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.10.58</core_client_version> <![CDATA[ <message> - exit code 193 (0xc1) </message> <stderr_txt> 22:25:50 (9580): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:17:48 (4976): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:17:57 (4976): No heartbeat from core client for 30 sec - exiting 12:17:58 (4976): No heartbeat from core client for 30 sec - exiting 12:17:59 (4976): No heartbeat from core client for 30 sec - exiting 12:18:01 (4976): No heartbeat from core client for 30 sec - exiting 12:18:02 (4976): No heartbeat from core client for 30 sec - exiting 12:18:03 (4976): No heartbeat from core client for 30 sec - exiting 12:18:04 (4976): No heartbeat from core client for 30 sec - exiting 12:18:05 (4976): No heartbeat from core client for 30 sec - exiting 12:18:06 (4976): No heartbeat from core client for 30 sec - exiting 12:18:07 (4976): No heartbeat from core client for 30 sec - exiting 12:18:08 (4976): No heartbeat from core client for 30 sec - exiting 12:18:09 (4976): No heartbeat from core client for 30 sec - exiting 12:18:10 (4976): No heartbeat from core client for 30 sec - exiting 12:18:11 (4976): No heartbeat from core client for 30 sec - exiting 12:18:12 (4976): No heartbeat from core client for 30 sec - exiting 12:18:13 (4976): No heartbeat from core client for 30 sec - exiting 12:18:14 (4976): No heartbeat from core client for 30 sec - exiting 12:18:15 (4976): No heartbeat from core client for 30 sec - exiting 12:18:16 (4976): No heartbeat from core client for 30 sec - exiting 12:18:17 (4976): No heartbeat from core client for 30 sec - exiting 12:18:18 (4976): No heartbeat from core client for 30 sec - exiting 12:18:19 (4976): No heartbeat from core client for 30 sec - exiting 12:18:20 (4976): No heartbeat from core client for 30 sec - exiting 12:18:21 (4976): No heartbeat from core client for 30 sec - exiting 12:18:22 (4976): 
No heartbeat from core client for 30 sec - exiting 12:18:23 (4976): No heartbeat from core client for 30 sec - exiting 12:18:24 (4976): No heartbeat from core client for 30 sec - exiting 12:18:25 (4976): No heartbeat from core client for 30 sec - exiting 12:18:26 (4976): No heartbeat from core client for 30 sec - exiting 12:18:27 (4976): No heartbeat from core client for 30 sec - exiting 12:18:28 (4976): No heartbeat from core client for 30 sec - exiting 12:18:29 (4976): No heartbeat from core client for 30 sec - exiting 12:18:30 (4976): No heartbeat from core client for 30 sec - exiting 12:18:31 (4976): No heartbeat from core client for 30 sec - exiting 12:18:32 (4976): No heartbeat from core client for 30 sec - exiting 12:18:33 (4976): No heartbeat from core client for 30 sec - exiting 12:18:34 (4976): No heartbeat from core client for 30 sec - exiting 12:18:35 (4976): No heartbeat from core client for 30 sec - exiting 12:18:36 (4976): No heartbeat from core client for 30 sec - exiting 12:18:37 (4976): No heartbeat from core client for 30 sec - exiting 12:18:38 (4976): No heartbeat from core client for 30 sec - exiting 12:18:39 (4976): No heartbeat from core client for 30 sec - exiting 12:18:40 (4976): No heartbeat from core client for 30 sec - exiting 12:18:41 (4976): No heartbeat from core client for 30 sec - exiting 12:18:42 (4976): No heartbeat from core client for 30 sec - exiting 12:18:43 (4976): No heartbeat from core client for 30 sec - exiting 12:18:44 (4976): No heartbeat from core client for 30 sec - exiting 12:18:45 (4976): No heartbeat from core client for 30 sec - exiting 12:18:46 (4976): No heartbeat from core client for 30 sec - exiting 12:18:47 (4976): No heartbeat from core client for 30 sec - exiting 12:18:49 (4976): No heartbeat from core client for 30 sec - exiting 12:18:50 (4976): No heartbeat from core client for 30 sec - exiting 12:18:51 (4976): No heartbeat from core client for 30 sec - exiting 12:18:52 (4976): No heartbeat from core 
client for 30 sec - exiting 12:18:53 (4976): No heartbeat from core client for 30 sec - exiting 12:18:54 (4976): No heartbeat from core client for 30 sec - exiting 12:18:55 (4976): No heartbeat from core client for 30 sec - exiting 12:18:56 (4976): No heartbeat from core client for 30 sec - exiting 12:18:57 (4976): No heartbeat from core client for 30 sec - exiting 12:18:58 (4976): No heartbeat from core client for 30 sec - exiting 12:18:59 (4976): No heartbeat from core client for 30 sec - exiting 12:19:00 (4976): No heartbeat from core client for 30 sec - exiting 12:19:01 (4976): No heartbeat from core client for 30 sec - exiting 12:19:02 (4976): No heartbeat from core client for 30 sec - exiting 12:19:03 (4976): No heartbeat from core client for 30 sec - exiting 12:19:04 (4976): No heartbeat from core client for 30 sec - exiting 12:19:05 (4976): No heartbeat from core client for 30 sec - exiting 12:19:06 (4976): No heartbeat from core client for 30 sec - exiting 12:19:07 (4976): No heartbeat from core client for 30 sec - exiting 12:19:08 (4976): No heartbeat from core client for 30 sec - exiting 12:19:09 (4976): No heartbeat from core client for 30 sec - exiting 12:19:10 (4976): No heartbeat from core client for 30 sec - exiting 12:19:11 (4976): No heartbeat from core client for 30 sec - exiting 12:19:12 (4976): No heartbeat from core client for 30 sec - exiting 12:19:13 (4976): No heartbeat from core client for 30 sec - exiting 12:19:14 (4976): No heartbeat from core client for 30 sec - exiting 12:19:15 (4976): No heartbeat from core client for 30 sec - exiting 12:19:16 (4976): No heartbeat from core client for 30 sec - exiting 12:19:17 (4976): No heartbeat from core client for 30 sec - exiting 12:19:18 (4976): No heartbeat from core client for 30 sec - exiting 12:19:19 (4976): No heartbeat from core client for 30 sec - exiting 12:19:20 (4976): No heartbeat from core client for 30 sec - exiting 12:19:21 (4976): No heartbeat from core client for 30 sec - exiting 
12:19:22 (4976): No heartbeat from core client for 30 sec - exiting 12:19:23 (4976): No heartbeat from core client for 30 sec - exiting 12:19:24 (4976): No heartbeat from core client for 30 sec - exiting 12:19:25 (4976): No heartbeat from core client for 30 sec - exiting 12:19:26 (4976): No heartbeat from core client for 30 sec - exiting 12:19:27 (4976): No heartbeat from core client for 30 sec - exiting 12:19:28 (4976): No heartbeat from core client for 30 sec - exiting 12:19:29 (4976): No heartbeat from core client for 30 sec - exiting 12:19:30 (4976): No heartbeat from core client for 30 sec - exiting 12:19:31 (4976): No heartbeat from core client for 30 sec - exiting 12:19:32 (4976): No heartbeat from core client for 30 sec - exiting 12:19:33 (4976): No heartbeat from core client for 30 sec - exiting 12:19:34 (4976): No heartbeat from core client for 30 sec - exiting 12:19:35 (4976): No heartbeat from core client for 30 sec - exiting 12:19:36 (4976): No heartbeat from core client for 30 sec - exiting 12:19:37 (4976): No heartbeat from core client for 30 sec - exiting 12:19:38 (4976): No heartbeat from core client for 30 sec - exiting 12:19:39 (4976): No heartbeat from core client for 30 sec - exiting 12:19:40 (4976): No heartbeat from core client for 30 sec - exiting 12:19:41 (4976): No heartbeat from core client for 30 sec - exiting 12:19:42 (4976): No heartbeat from core client for 30 sec - exiting 12:19:43 (4976): No heartbeat from core client for 30 sec - exiting 12:19:44 (4976): No heartbeat from core client for 30 sec - exiting 12:19:45 (4976): No heartbeat from core client for 30 sec - exiting 12:19:46 (4976): No heartbeat from core client for 30 sec - exiting 12:19:47 (4976): No heartbeat from core client for 30 sec - exiting 12:19:48 (4976): No heartbeat from core client for 30 sec - exiting 12:19:49 (4976): No heartbeat from core client for 30 sec - exiting 12:19:50 (4976): No heartbeat from core client for 30 sec - exiting 12:19:51 (4976): No 
heartbeat from core client for 30 sec - exiting 12:19:52 (4976): No heartbeat from core client for 30 sec - exiting 12:19:53 (4976): No heartbeat from core client for 30 sec - exiting 12:19:54 (4976): No heartbeat from core client for 30 sec - exiting 12:19:55 (4976): No heartbeat from core client for 30 sec - exiting 12:19:56 (4976): No heartbeat from core client for 30 sec - exiting 12:19:57 (4976): No heartbeat from core client for 30 sec - exiting 12:19:58 (4976): No heartbeat from core client for 30 sec - exiting 12:19:59 (4976): No heartbeat from core client for 30 sec - exiting 12:20:00 (4976): No heartbeat from core client for 30 sec - exiting 12:20:01 (4976): No heartbeat from core client for 30 sec - exiting 12:20:02 (4976): No heartbeat from core client for 30 sec - exiting 12:20:03 (4976): No heartbeat from core client for 30 sec - exiting 12:20:04 (4976): No heartbeat from core client for 30 sec - exiting 12:20:05 (4976): No heartbeat from core client for 30 sec - exiting 12:20:06 (4976): No heartbeat from core client for 30 sec - exiting 12:20:07 (4976): No heartbeat from core client for 30 sec - exiting 12:20:08 (4976): No heartbeat from core client for 30 sec - exiting 12:20:09 (4976): No heartbeat from core client for 30 sec - exiting 12:20:10 (4976): No heartbeat from core client for 30 sec - exiting 12:20:11 (4976): No heartbeat from core client for 30 sec - exiting 12:20:12 (4976): No heartbeat from core client for 30 sec - exiting 12:20:13 (4976): No heartbeat from core client for 30 sec - exiting 12:20:14 (4976): No heartbeat from core client for 30 sec - exiting 12:20:15 (4976): No heartbeat from core client for 30 sec - exiting 12:20:16 (4976): No heartbeat from core client for 30 sec - exiting 12:20:17 (4976): No heartbeat from core client for 30 sec - exiting 12:20:18 (4976): No heartbeat from core client for 30 sec - exiting 12:20:19 (4976): No heartbeat from core client for 30 sec - exiting 12:20:20 (4976): No heartbeat from core client 
for 30 sec - exiting 12:20:21 (4976): No heartbeat from core client for 30 sec - exiting 12:20:22 (4976): No heartbeat from core client for 30 sec - exiting 12:20:23 (4976): No heartbeat from core client for 30 sec - exiting 12:20:24 (4976): No heartbeat from core client for 30 sec - exiting 12:20:25 (4976): No heartbeat from core client for 30 sec - exiting 12:20:26 (4976): No heartbeat from core client for 30 sec - exiting 12:20:27 (4976): No heartbeat from core client for 30 sec - exiting 12:20:28 (4976): No heartbeat from core client for 30 sec - exiting 12:20:29 (4976): No heartbeat from core client for 30 sec - exiting 12:20:30 (4976): No heartbeat from core client for 30 sec - exiting 12:20:31 (4976): No heartbeat from core client for 30 sec - exiting 12:20:32 (4976): No heartbeat from core client for 30 sec - exiting 23:53:54 (6132): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:53:55 (6132): No heartbeat from core client for 30 sec - exiting 23:53:56 (6132): No heartbeat from core client for 30 sec - exiting 23:53:57 (6132): No heartbeat from core client for 30 sec - exiting 23:53:58 (6132): No heartbeat from core client for 30 sec - exiting 23:53:59 (6132): No heartbeat from core client for 30 sec - exiting 23:54:00 (6132): No heartbeat from core client for 30 sec - exiting 23:54:01 (6132): No heartbeat from core client for 30 sec - exiting 23:54:02 (6132): No heartbeat from core client for 30 sec - exiting 23:54:03 (6132): No heartbeat from core client for 30 sec - exiting 23:54:04 (6132): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6560, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6560, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7440, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6788, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7264, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7264, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7264, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7264, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7264, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7264, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7544, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7552, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7552, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7552, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4908, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4908, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4908, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7060, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7060, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7060, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7488, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6684, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6468, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7704, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7704, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7704, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7704, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7704, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6884, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7544, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7544, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8120, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8120, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8120, iMonCtr=1 Model crash detected, will try to restart... 20:04:25 (8184): No heartbeat from core client for 30 sec - exiting 20:04:26 (8184): No heartbeat from core client for 30 sec - exiting 20:04:27 (8184): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7908, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7608, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7408, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7960, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7856, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6944, iMonCtr=1 Model crash detected, will try to restart... Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
02 Dec 2011 21:17:33 | 1117742 | 13096556 | hadcm3n_yaen_1900_40_007346345_0 | 259,200 | 741,154 | 2.8594 |
01 Dec 2011 21:08:43 | 1117742 | 13096556 | hadcm3n_yaen_1900_40_007346345_0 | 233,280 | 691,605 | 2.9647 |
28 Nov 2011 22:17:26 | 1117742 | 13096556 | hadcm3n_yaen_1900_40_007346345_0 | 207,360 | 641,031 | 3.0914 |
27 Nov 2011 21:57:02 | 1117742 | 13096556 | hadcm3n_yaen_1900_40_007346345_0 | 181,440 | 591,581 | 3.2605 |
26 Nov 2011 18:49:05 | 1117742 | 13096556 | hadcm3n_yaen_1900_40_007346345_0 | 155,520 | 541,339 | 3.4808 |
26 Sep 2011 18:56:03 | 1117742 | 13096556 | hadcm3n_yaen_1900_40_007346345_0 | 129,600 | 488,507 | 3.7693 |
14 Sep 2011 20:54:36 | 1117742 | 13096556 | hadcm3n_yaen_1900_40_007346345_0 | 103,680 | 201,127 | 1.9399 |
11 Sep 2011 07:04:09 | 1117742 | 13096556 | hadcm3n_yaen_1900_40_007346345_0 | 77,760 | 147,825 | 1.9010 |
06 Aug 2011 21:35:32 | 1117742 | 13096556 | hadcm3n_yaen_1900_40_007346345_0 | 51,840 | 95,205 | 1.8365 |
06 Aug 2011 21:35:32 | 1117742 | 13096556 | hadcm3n_yaen_1900_40_007346345_0 | 25,920 | 48,202 | 1.8596 |
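
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below, which assumes that reading, recomputes the column from the rows above and also derives a per-interval rate between consecutive trickles. The function names `cumulative_average` and `interval_rates` are illustrative only and are not part of BOINC or the CPDN software.

```python
# (timestep, cpu_time_sec) pairs copied from the trickle table, oldest first.
TRICKLES = [
    (25_920, 48_202),
    (51_840, 95_205),
    (77_760, 147_825),
    (103_680, 201_127),
    (129_600, 488_507),
    (155_520, 541_339),
    (181_440, 591_581),
    (207_360, 641_031),
    (233_280, 691_605),
    (259_200, 741_154),
]


def cumulative_average(timestep, cpu_sec):
    """Assumed definition of the table's average: total CPU seconds per timestep."""
    return round(cpu_sec / timestep, 4)


def interval_rates(trickles):
    """CPU seconds per timestep between each pair of consecutive trickles."""
    rates = []
    for (ts0, cpu0), (ts1, cpu1) in zip(trickles, trickles[1:]):
        rates.append((ts1, round((cpu1 - cpu0) / (ts1 - ts0), 4)))
    return rates


if __name__ == "__main__":
    for ts, cpu in TRICKLES:
        print(f"{ts:>8,}  cumulative {cumulative_average(ts, cpu):.4f} sec/TS")
    for ts, rate in interval_rates(TRICKLES):
        print(f"{ts:>8,}  interval   {rate:.4f} sec/TS")
```

On these rows the interval ending at timestep 129,600 costs roughly 11 CPU seconds per timestep, against about 2 for every other interval, which accounts for the jump in the cumulative average from 1.9399 to 3.7693.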