Name | hadcm3n_y9ch_1900_40_007521656_0 |
Workunit | 7719131 |
Created | 28 Oct 2011, 13:13:57 UTC |
Sent | 2 Nov 2011, 7:51:55 UTC |
Report deadline | 1 Feb 2012, 15:19:06 UTC |
Received | 20 Nov 2011, 13:17:07 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 1134438 |
Run time | 8 days 16 hours 24 min 56 sec |
CPU time | 7 days 2 hours 57 min 7 sec |
Validate state | Invalid |
Credit | 3,110.40 |
Device peak FLOPS | 1.97 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.10.58</core_client_version> <![CDATA[ <message> - exit code 193 (0xc1) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 15:10:54 (5320): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5040, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5040, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5040, iMonCtr=1 Model crash detected, will try to restart... 19:16:22 (1096): No heartbeat from core client for 30 sec - exiting 19:16:23 (1096): No heartbeat from core client for 30 sec - exiting 19:16:24 (1096): No heartbeat from core client for 30 sec - exiting 19:16:25 (1096): No heartbeat from core client for 30 sec - exiting 19:16:26 (1096): No heartbeat from core client for 30 sec - exiting 19:16:27 (1096): No heartbeat from core client for 30 sec - exiting 19:16:28 (1096): No heartbeat from core client for 30 sec - exiting 19:16:29 (1096): No heartbeat from core client for 30 sec - exiting 19:16:30 (1096): No heartbeat from core client for 30 sec - exiting 19:16:31 (1096): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:16:32 (1096): No heartbeat from core client for 30 sec - exiting 19:16:33 (1096): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 10:17:04 (4116): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
08:02:08 (7476): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:02:09 (7476): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4592, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4592, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4592, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4592, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1628, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1628, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8136, iMonCtr=1 Model crash detected, will try to restart... 
12:07:06 (792): No heartbeat from core client for 30 sec - exiting 12:07:07 (792): No heartbeat from core client for 30 sec - exiting 12:07:08 (792): No heartbeat from core client for 30 sec - exiting 12:07:09 (792): No heartbeat from core client for 30 sec - exiting 12:07:10 (792): No heartbeat from core client for 30 sec - exiting 12:07:11 (792): No heartbeat from core client for 30 sec - exiting 12:07:12 (792): No heartbeat from core client for 30 sec - exiting 12:07:13 (792): No heartbeat from core client for 30 sec - exiting 12:07:14 (792): No heartbeat from core client for 30 sec - exiting 12:07:15 (792): No heartbeat from core client for 30 sec - exiting 12:07:16 (792): No heartbeat from core client for 30 sec - exiting 12:07:17 (792): No heartbeat from core client for 30 sec - exiting 12:07:18 (792): No heartbeat from core client for 30 sec - exiting 12:07:19 (792): No heartbeat from core client for 30 sec - exiting 12:07:20 (792): No heartbeat from core client for 30 sec - exiting 12:07:21 (792): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6572, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3544, iMonCtr=1 Model crash detected, will try to restart... 21:21:04 (4540): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
21:21:05 (4540): No heartbeat from core client for 30 sec - exiting 21:21:06 (4540): No heartbeat from core client for 30 sec - exiting 21:21:07 (4540): No heartbeat from core client for 30 sec - exiting 21:21:08 (4540): No heartbeat from core client for 30 sec - exiting 21:21:09 (4540): No heartbeat from core client for 30 sec - exiting 21:21:10 (4540): No heartbeat from core client for 30 sec - exiting 21:21:11 (4540): No heartbeat from core client for 30 sec - exiting 21:21:12 (4540): No heartbeat from core client for 30 sec - exiting 21:21:13 (4540): No heartbeat from core client for 30 sec - exiting 21:21:14 (4540): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7024, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 09:53:19 (4812): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:14:03 (5916): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:25:01 (4428): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:32:36 (2252): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
21:32:37 (2252): No heartbeat from core client for 30 sec - exiting 21:32:38 (2252): No heartbeat from core client for 30 sec - exiting 21:32:39 (2252): No heartbeat from core client for 30 sec - exiting 21:32:40 (2252): No heartbeat from core client for 30 sec - exiting 21:32:41 (2252): No heartbeat from core client for 30 sec - exiting 21:32:42 (2252): No heartbeat from core client for 30 sec - exiting 21:32:43 (2252): No heartbeat from core client for 30 sec - exiting 21:32:44 (2252): No heartbeat from core client for 30 sec - exiting 21:32:45 (2252): No heartbeat from core client for 30 sec - exiting 21:32:46 (2252): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 09:38:20 (6064): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
09:38:21 (6064): No heartbeat from core client for 30 sec - exiting 09:38:22 (6064): No heartbeat from core client for 30 sec - exiting 09:38:23 (6064): No heartbeat from core client for 30 sec - exiting 09:38:24 (6064): No heartbeat from core client for 30 sec - exiting 09:38:25 (6064): No heartbeat from core client for 30 sec - exiting 09:38:26 (6064): No heartbeat from core client for 30 sec - exiting 09:38:27 (6064): No heartbeat from core client for 30 sec - exiting 09:38:28 (6064): No heartbeat from core client for 30 sec - exiting 09:38:29 (6064): No heartbeat from core client for 30 sec - exiting 09:38:30 (6064): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5324, iMonCtr=1 Model crash detected, will try to restart... 17:47:44 (4288): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:47:45 (4288): No heartbeat from core client for 30 sec - exiting 17:47:46 (4288): No heartbeat from core client for 30 sec - exiting 17:47:47 (4288): No heartbeat from core client for 30 sec - exiting 17:47:48 (4288): No heartbeat from core client for 30 sec - exiting 17:47:49 (4288): No heartbeat from core client for 30 sec - exiting 17:47:50 (4288): No heartbeat from core client for 30 sec - exiting 17:47:51 (4288): No heartbeat from core client for 30 sec - exiting 17:47:52 (4288): No heartbeat from core client for 30 sec - exiting 17:47:53 (4288): No heartbeat from core client for 30 sec - exiting 17:47:54 (4288): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... C12:20:36 (5536): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4760, iMonCtr=1 Model crash detected, will try to restart... 
CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5748, iMonCtr=1 Model crash detected, will try to restart... 13:54:41 (5400): No heartbeat from core client for 30 sec - exiting 13:54:42 (5400): No heartbeat from core client for 30 sec - exiting 13:54:43 (5400): No heartbeat from core client for 30 sec - exiting 13:54:44 (5400): No heartbeat from core client for 30 sec - exiting 13:54:45 (5400): No heartbeat from core client for 30 sec - exiting 13:54:46 (5400): No heartbeat from core client for 30 sec - exiting 13:54:47 (5400): No heartbeat from core client for 30 sec - exiting 13:54:48 (5400): No heartbeat from core client for 30 sec - exiting 13:54:49 (5400): No heartbeat from core client for 30 sec - exiting 13:54:50 (5400): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:00:43 (4356): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7544, iMonCtr=1 Model crash detected, will try to restart... 09:51:39 (4176): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
19 Nov 2011 13:19:59 | 1134438 | 13549000 | hadcm3n_y9ch_1900_40_007521656_0 | 259,200 | 615,419 | 2.3743 |
17 Nov 2011 18:15:58 | 1134438 | 13549000 | hadcm3n_y9ch_1900_40_007521656_0 | 233,280 | 554,752 | 2.3781 |
15 Nov 2011 19:06:17 | 1134438 | 13549000 | hadcm3n_y9ch_1900_40_007521656_0 | 207,360 | 494,790 | 2.3861 |
15 Nov 2011 17:10:48 | 1134438 | 13549000 | hadcm3n_y9ch_1900_40_007521656_0 | 181,440 | 435,153 | 2.3983 |
15 Nov 2011 17:10:48 | 1134438 | 13549000 | hadcm3n_y9ch_1900_40_007521656_0 | 155,520 | 367,779 | 2.3648 |
15 Nov 2011 17:10:48 | 1134438 | 13549000 | hadcm3n_y9ch_1900_40_007521656_0 | 129,600 | 302,567 | 2.3346 |
08 Nov 2011 19:14:35 | 1134438 | 13549000 | hadcm3n_y9ch_1900_40_007521656_0 | 103,680 | 241,087 | 2.3253 |
07 Nov 2011 10:09:32 | 1134438 | 13549000 | hadcm3n_y9ch_1900_40_007521656_0 | 77,760 | 179,873 | 2.3132 |
05 Nov 2011 18:53:31 | 1134438 | 13549000 | hadcm3n_y9ch_1900_40_007521656_0 | 51,840 | 117,959 | 2.2754 |
04 Nov 2011 08:51:23 | 1134438 | 13549000 | hadcm3n_y9ch_1900_40_007521656_0 | 25,920 | 57,866 | 2.2325 |
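
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the timestep count. That reading is an assumption based on the column header and is not stated on the page. The sketch below recomputes the ratio from the Timestep and CPU Time values in the table and compares it with the reported average; the names `trickles` and `seconds_per_timestep` are illustrative only.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
trickles = [
    (259_200, 615_419, 2.3743),
    (233_280, 554_752, 2.3781),
    (207_360, 494_790, 2.3861),
    (181_440, 435_153, 2.3983),
    (155_520, 367_779, 2.3648),
    (129_600, 302_567, 2.3346),
    (103_680, 241_087, 2.3253),
    (77_760, 179_873, 2.3132),
    (51_840, 117_959, 2.2754),
    (25_920, 57_866, 2.2325),
]


def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Assumed definition of the Average column: CPU seconds per model timestep."""
    return cpu_time_sec / timestep


# Recompute each average, rounded to the four decimals shown in the table.
computed = [round(seconds_per_timestep(cpu, ts), 4) for ts, cpu, _ in trickles]

for (ts, cpu, reported), value in zip(trickles, computed):
    print(f"timestep {ts:>7}: computed {value:.4f}, reported {reported:.4f}")
```

If the assumption holds, every computed value should match the reported one to four decimal places.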
©2024 cpdn.org