Name | hadcm3n_ycf8_1900_40_007519413_1 |
Workunit | 7716888 |
Created | 28 Oct 2011, 13:02:54 UTC |
Sent | 5 Nov 2011, 4:18:25 UTC |
Report deadline | 4 Feb 2012, 11:45:36 UTC |
Received | 21 Nov 2011, 16:16:34 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1178052 |
Run time | 7 days 14 hours 23 min 21 sec |
CPU time | 7 days 8 hours 8 min 5 sec |
Validate state | Invalid |
Credit | 2,177.28 |
Device peak FLOPS | 1.62 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.10.58</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> 11:30:36 (3208): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 22:06:33 (3628): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 06:01:37 (3808): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 15:51:01 (3456): No heartbeat from core client for 30 sec - exiting 15:51:02 (3456): No heartbeat from core client for 30 sec - exiting 15:51:03 (3456): No heartbeat from core client for 30 sec - exiting 15:51:04 (3456): No heartbeat from core client for 30 sec - exiting 15:51:05 (3456): No heartbeat from core client for 30 sec - exiting 15:51:06 (3456): No heartbeat from core client for 30 sec - exiting 15:51:08 (3456): No heartbeat from core client for 30 sec - exiting 15:51:09 (3456): No heartbeat from core client for 30 sec - exiting 15:51:10 (3456): No heartbeat from core client for 30 sec - exiting 15:51:11 (3456): No heartbeat from core client for 30 sec - exiting 15:51:12 (3456): No heartbeat from core client for 30 sec - exiting 15:51:13 (3456): No heartbeat from core client for 30 sec - exiting 15:51:14 (3456): No heartbeat from core client for 30 sec - exiting 15:51:15 (3456): No heartbeat from core client for 30 sec - exiting 15:51:17 (3456): No heartbeat from core client for 30 sec - exiting 15:51:18 (3456): No heartbeat from 
core client for 30 sec - exiting 15:51:19 (3456): No heartbeat from core client for 30 sec - exiting 15:51:20 (3456): No heartbeat from core client for 30 sec - exiting 15:51:21 (3456): No heartbeat from core client for 30 sec - exiting 15:51:22 (3456): No heartbeat from core client for 30 sec - exiting 15:51:24 (3456): No heartbeat from core client for 30 sec - exiting 15:51:25 (3456): No heartbeat from core client for 30 sec - exiting 15:51:26 (3456): No heartbeat from core client for 30 sec - exiting 15:51:27 (3456): No heartbeat from core client for 30 sec - exiting 15:51:28 (3456): No heartbeat from core client for 30 sec - exiting 15:51:29 (3456): No heartbeat from core client for 30 sec - exiting 15:51:30 (3456): No heartbeat from core client for 30 sec - exiting 15:51:31 (3456): No heartbeat from core client for 30 sec - exiting 15:51:32 (3456): No heartbeat from core client for 30 sec - exiting 15:51:34 (3456): No heartbeat from core client for 30 sec - exiting 15:51:35 (3456): No heartbeat from core client for 30 sec - exiting 15:51:36 (3456): No heartbeat from core client for 30 sec - exiting 15:51:37 (3456): No heartbeat from core client for 30 sec - exiting 15:51:38 (3456): No heartbeat from core client for 30 sec - exiting 15:51:39 (3456): No heartbeat from core client for 30 sec - exiting 15:51:40 (3456): No heartbeat from core client for 30 sec - exiting 15:51:41 (3456): No heartbeat from core client for 30 sec - exiting 15:51:42 (3456): No heartbeat from core client for 30 sec - exiting 15:51:44 (3456): No heartbeat from core client for 30 sec - exiting 15:51:45 (3456): No heartbeat from core client for 30 sec - exiting 15:51:46 (3456): No heartbeat from core client for 30 sec - exiting 15:51:47 (3456): No heartbeat from core client for 30 sec - exiting 15:51:48 (3456): No heartbeat from core client for 30 sec - exiting 15:51:49 (3456): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2536, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 12:49:35 (4016): No heartbeat from core client for 30 sec - exiting 12:49:36 (4016): No heartbeat from core client for 30 sec - exiting 12:49:38 (4016): No heartbeat from core client for 30 sec - exiting 12:49:39 (4016): No heartbeat from core client for 30 sec - exiting 12:49:40 (4016): No heartbeat from core client for 30 sec - exiting 12:49:41 (4016): No heartbeat from core client for 30 sec - exiting 12:49:42 (4016): No heartbeat from core client for 30 sec - exiting 12:49:43 (4016): No heartbeat from core client for 30 sec - exiting 12:49:45 (4016): No heartbeat from core client for 30 sec - exiting 12:49:46 (4016): No heartbeat from core client for 30 sec - exiting 12:49:47 (4016): No heartbeat from core client for 30 sec - exiting 12:49:48 (4016): No heartbeat from core client for 30 sec - exiting 12:49:49 (4016): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 17:31:31 (2536): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:31:34 (2536): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... 22:40:33 (3944): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:41:25 (324): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
17:01:05 (3236): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 17:46:12 (2648): No heartbeat from core client for 30 sec - exiting 17:46:14 (2648): No heartbeat from core client for 30 sec - exiting 17:46:17 (2648): No heartbeat from core client for 30 sec - exiting 17:46:18 (2648): No heartbeat from core client for 30 sec - exiting 17:46:20 (2648): No heartbeat from core client for 30 sec - exiting 17:46:23 (2648): No heartbeat from core client for 30 sec - exiting 17:46:24 (2648): No heartbeat from core client for 30 sec - exiting 17:46:25 (2648): No heartbeat from core client for 30 sec - exiting 17:46:27 (2648): No heartbeat from core client for 30 sec - exiting 17:46:28 (2648): No heartbeat from core client for 30 sec - exiting 17:46:30 (2648): No heartbeat from core client for 30 sec - exiting 17:46:31 (2648): No heartbeat from core client for 30 sec - exiting 17:46:32 (2648): No heartbeat from core client for 30 sec - exiting 17:46:33 (2648): No heartbeat from core client for 30 sec - exiting 17:46:35 (2648): No heartbeat from core client for 30 sec - exiting 17:46:36 (2648): No heartbeat from core client for 30 sec - exiting 17:46:37 (2648): No heartbeat from core client for 30 sec - exiting 17:46:38 (2648): No heartbeat from core client for 30 sec - exiting 17:46:40 (2648): No heartbeat from core client for 30 sec - exiting 17:46:41 (2648): No heartbeat from core client for 30 sec - exiting 17:46:42 (2648): No heartbeat from core client for 30 sec - exiting 17:46:43 (2648): No heartbeat from core client for 30 sec - exiting 17:46:44 (2648): No heartbeat from core client for 30 sec - exiting 17:46:46 (2648): No heartbeat from core client for 30 sec - exiting 17:46:47 (2648): No heartbeat from core client for 30 sec - exiting 17:46:48 (2648): No heartbeat from core client for 30 sec - exiting 17:46:49 (2648): No heartbeat from core client for 30 sec - exiting 17:46:50 
(2648): No heartbeat from core client for 30 sec - exiting 17:46:52 (2648): No heartbeat from core client for 30 sec - exiting 17:46:53 (2648): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2732, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2732, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2732, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2732, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2732, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2732, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |

Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
19 Nov 2011 11:14:22 | 1178052 | 13544509 | hadcm3n_ycf8_1900_40_007519413_1 | 181,440 | 507,438 | 2.7967 |
18 Nov 2011 06:03:28 | 1178052 | 13544509 | hadcm3n_ycf8_1900_40_007519413_1 | 155,520 | 434,446 | 2.7935 |
17 Nov 2011 00:36:17 | 1178052 | 13544509 | hadcm3n_ycf8_1900_40_007519413_1 | 129,600 | 369,903 | 2.8542 |
15 Nov 2011 18:24:05 | 1178052 | 13544509 | hadcm3n_ycf8_1900_40_007519413_1 | 103,680 | 304,398 | 2.9359 |
15 Nov 2011 17:10:00 | 1178052 | 13544509 | hadcm3n_ycf8_1900_40_007519413_1 | 77,760 | 239,576 | 3.0810 |
15 Nov 2011 17:09:59 | 1178052 | 13544509 | hadcm3n_ycf8_1900_40_007519413_1 | 51,840 | 176,902 | 3.4125 |
07 Nov 2011 09:52:57 | 1178052 | 13544509 | hadcm3n_ycf8_1900_40_007519413_1 | 25,920 | 91,797 | 3.5416 |
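
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count. That relationship is inferred from the numbers and is not stated on the page itself. A minimal Python sketch that recomputes it from the rows above follows; the names `TRICKLES` and `seconds_per_timestep` are hypothetical and chosen only for illustration.

```python
# Rows copied from the trickle table: (timestep, cumulative CPU seconds, reported sec/TS).
TRICKLES = [
    (181_440, 507_438, 2.7967),
    (155_520, 434_446, 2.7935),
    (129_600, 369_903, 2.8542),
    (103_680, 304_398, 2.9359),
    (77_760, 239_576, 3.0810),
    (51_840, 176_902, 3.4125),
    (25_920, 91_797, 3.5416),
]


def seconds_per_timestep(timestep: int, cpu_seconds: int) -> float:
    """Assumed definition of the Average column: CPU seconds per model timestep."""
    return cpu_seconds / timestep


if __name__ == "__main__":
    for timestep, cpu_seconds, reported in TRICKLES:
        computed = seconds_per_timestep(timestep, cpu_seconds)
        # Show the recomputed value next to the one reported in the table.
        print(f"{timestep:>8} TS  computed {computed:.4f}  reported {reported:.4f}")
```

Under this assumption the recomputed values agree with the reported column to four decimal places, and the falling average shows the host taking fewer CPU seconds per timestep as the run progressed.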
©2024 cpdn.org