Name | hadcm3n_ykq8_1900_40_007359722_1 |
Workunit | 7557152 |
Created | 6 Jul 2011, 15:08:06 UTC |
Sent | 7 Jul 2011, 22:52:03 UTC |
Report deadline | 7 Oct 2011, 6:19:14 UTC |
Received | 2 Aug 2011, 22:53:45 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1114161 |
Run time | 6 days 4 hours 40 min 7 sec |
CPU time | 4 days 13 hours 12 min 39 sec |
Validate state | Invalid |
Credit | 2,488.32 |
Device peak FLOPS | 2.68 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.12.33</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 13:21:42 (4088): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:22:20 (3988): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 14:24:38 (3676): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... 14:40:55 (2040): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:41:30 (3528): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:41:31 (3528): No heartbeat from core client for 30 sec - exiting 14:41:33 (3528): No heartbeat from core client for 30 sec - exiting 14:41:34 (3528): No heartbeat from core client for 30 sec - exiting 14:41:35 (3528): No heartbeat from core client for 30 sec - exiting 14:41:36 (3528): No heartbeat from core client for 30 sec - exiting 14:41:37 (3528): No heartbeat from core client for 30 sec - exiting 14:41:38 (3528): No heartbeat from core client for 30 sec - exiting 14:41:39 (3528): No heartbeat from core client for 30 sec - exiting 14:41:40 (3528): No heartbeat from core client for 30 sec - exiting 14:41:41 (3528): No heartbeat from core client for 30 sec - exiting 14:49:45 (1484): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
14:49:53 (1484): No heartbeat from core client for 30 sec - exiting 14:49:54 (1484): No heartbeat from core client for 30 sec - exiting 14:49:55 (1484): No heartbeat from core client for 30 sec - exiting 15:03:25 (3556): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:04:02 (3460): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:04:03 (3460): No heartbeat from core client for 30 sec - exiting 15:04:04 (3460): No heartbeat from core client for 30 sec - exiting 15:04:05 (3460): No heartbeat from core client for 30 sec - exiting 15:04:06 (3460): No heartbeat from core client for 30 sec - exiting 15:04:07 (3460): No heartbeat from core client for 30 sec - exiting 15:04:08 (3460): No heartbeat from core client for 30 sec - exiting 15:05:12 (1900): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:05:50 (3816): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... forrtl: Access is denied. 16:07:12 (1080): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:58:57 (3680): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Atmos Hold Restart file rename failed on atmos_restart.hold Suspended CPDN Monitor - Suspend request from BOINC... 18:00:16 (1724): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 06:58:01 (4904): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
09:34:19 (1172): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 16:02:48 (1996): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... 02:44:18 (3364): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 12:43:08 (2800): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 14:57:17 (1172): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 17:26:49 (5032): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 21:09:18 (4484): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:09:19 (4484): No heartbeat from core client for 30 sec - exiting 21:09:20 (4484): No heartbeat from core client for 30 sec - exiting 21:09:21 (4484): No heartbeat from core client for 30 sec - exiting 21:09:22 (4484): No heartbeat from core client for 30 sec - exiting 21:09:23 (4484): No heartbeat from core client for 30 sec - exiting Atmos Hold Restart file rename failed on atmos_restart.hold 21:39:18 (3460): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:04:57 (5052): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
22:05:03 (5052): No heartbeat from core client for 30 sec - exiting 22:05:04 (5052): No heartbeat from core client for 30 sec - exiting 22:05:05 (5052): No heartbeat from core client for 30 sec - exiting 22:05:06 (5052): No heartbeat from core client for 30 sec - exiting 23:29:37 (4724): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 00:23:07 (1148): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:23:08 (1148): No heartbeat from core client for 30 sec - exiting 00:23:09 (1148): No heartbeat from core client for 30 sec - exiting 00:23:11 (1148): No heartbeat from core client for 30 sec - exiting 00:23:12 (1148): No heartbeat from core client for 30 sec - exiting 00:23:13 (1148): No heartbeat from core client for 30 sec - exiting 00:23:14 (1148): No heartbeat from core client for 30 sec - exiting 00:23:15 (1148): No heartbeat from core client for 30 sec - exiting 01:24:51 (2816): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 11:08:52 (1300): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
11:08:54 (1300): No heartbeat from core client for 30 sec - exiting 11:08:55 (1300): No heartbeat from core client for 30 sec - exiting 11:08:56 (1300): No heartbeat from core client for 30 sec - exiting 11:08:57 (1300): No heartbeat from core client for 30 sec - exiting 11:08:58 (1300): No heartbeat from core client for 30 sec - exiting 11:08:59 (1300): No heartbeat from core client for 30 sec - exiting 11:09:00 (1300): No heartbeat from core client for 30 sec - exiting 11:09:01 (1300): No heartbeat from core client for 30 sec - exiting 11:09:02 (1300): No heartbeat from core client for 30 sec - exiting 11:09:03 (1300): No heartbeat from core client for 30 sec - exiting 12:03:08 (3132): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:14:30 (2824): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:17:33 (4848): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:17:34 (4848): No heartbeat from core client for 30 sec - exiting 12:17:35 (4848): No heartbeat from core client for 30 sec - exiting 12:17:36 (4848): No heartbeat from core client for 30 sec - exiting 12:17:37 (4848): No heartbeat from core client for 30 sec - exiting 12:17:38 (4848): No heartbeat from core client for 30 sec - exiting 12:17:39 (4848): No heartbeat from core client for 30 sec - exiting 12:17:40 (4848): No heartbeat from core client for 30 sec - exiting 12:17:41 (4848): No heartbeat from core client for 30 sec - exiting 12:17:43 (4848): No heartbeat from core client for 30 sec - exiting 12:17:44 (4848): No heartbeat from core client for 30 sec - exiting 12:27:44 (340): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:58:30 (4304): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
13:58:34 (4304): No heartbeat from core client for 30 sec - exiting 13:58:35 (4304): No heartbeat from core client for 30 sec - exiting 13:58:36 (4304): No heartbeat from core client for 30 sec - exiting 13:58:37 (4304): No heartbeat from core client for 30 sec - exiting 13:58:38 (4304): No heartbeat from core client for 30 sec - exiting Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=640, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=640, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=640, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=640, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=640, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=640, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
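The stderr output above consists mostly of a handful of repeated messages, so a tally per message type can make the failure pattern easier to read. The following is a minimal sketch, assuming the stderr text has been copied into a Python string. The category names and the `tally_events` helper are hypothetical and are not part of BOINC or the CPDN application; the patterns are taken verbatim from the messages in the log.

```python
import re
from collections import Counter

# Hypothetical category names mapped to message fragments copied from the log above.
PATTERNS = {
    "suspend_request": r"Suspend request from BOINC",
    "no_heartbeat": r"No heartbeat from core client for 30 sec",
    "signal_22": r"Signal 22 received",
    "restart_attempt": r"Model crash detected, will try to restart",
    "rename_failed": r"Restart file rename failed",
}


def tally_events(stderr_text):
    """Count how often each known message fragment occurs in the stderr text."""
    return Counter(
        {name: len(re.findall(pattern, stderr_text)) for name, pattern in PATTERNS.items()}
    )


# A short excerpt in the same shape as the log above, used only for illustration.
sample = (
    "Suspended CPDN Monitor - Suspend request from BOINC... "
    "13:21:42 (4088): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "Signal 22 received, exiting... Called boinc_finish "
    "Model crash detected, will try to restart... "
)

for name, count in tally_events(sample).items():
    print(f"{name}: {count}")
```

Running the same function over the full stderr text would give the counts for this task; the excerpt here only demonstrates the matching.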
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
01 Aug 2011 05:40:22 | 1114161 | 13123316 | hadcm3n_ykq8_1900_40_007359722_1 | 207,360 | 369,890 | 1.7838 |
30 Jul 2011 01:06:22 | 1114161 | 13123316 | hadcm3n_ykq8_1900_40_007359722_1 | 181,440 | 332,150 | 1.8306 |
25 Jul 2011 19:49:03 | 1114161 | 13123316 | hadcm3n_ykq8_1900_40_007359722_1 | 155,520 | 283,801 | 1.8249 |
25 Jul 2011 19:07:30 | 1114161 | 13123316 | hadcm3n_ykq8_1900_40_007359722_1 | 129,600 | 235,197 | 1.8148 |
25 Jul 2011 16:35:48 | 1114161 | 13123316 | hadcm3n_ykq8_1900_40_007359722_1 | 103,680 | 188,422 | 1.8173 |
25 Jul 2011 14:25:05 | 1114161 | 13123316 | hadcm3n_ykq8_1900_40_007359722_1 | 77,760 | 143,313 | 1.8430 |
25 Jul 2011 14:25:05 | 1114161 | 13123316 | hadcm3n_ykq8_1900_40_007359722_1 | 51,840 | 96,978 | 1.8707 |
25 Jul 2011 14:25:05 | 1114161 | 13123316 | hadcm3n_ykq8_1900_40_007359722_1 | 25,920 | 48,507 | 1.8714 |
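
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the timestep count. The page does not state this, so treat it as an assumption. The sketch below checks that reading against the rows in the table; the `sec_per_timestep` helper is a hypothetical name used only for illustration.

```python
# Rows copied from the trickle table above:
# (timestep, CPU time in seconds, reported average in sec/TS).
trickles = [
    (207_360, 369_890, 1.7838),
    (181_440, 332_150, 1.8306),
    (155_520, 283_801, 1.8249),
    (129_600, 235_197, 1.8148),
    (103_680, 188_422, 1.8173),
    (77_760, 143_313, 1.8430),
    (51_840, 96_978, 1.8707),
    (25_920, 48_507, 1.8714),
]


def sec_per_timestep(cpu_seconds, timestep):
    """Assumed definition: cumulative CPU seconds divided by timesteps completed."""
    return cpu_seconds / timestep


for timestep, cpu_seconds, reported in trickles:
    computed = sec_per_timestep(cpu_seconds, timestep)
    # Compare against the reported value, allowing for four-decimal rounding.
    matches = abs(computed - reported) < 1e-4
    print(f"{timestep:>7} steps: computed {computed:.4f}, reported {reported:.4f}, match={matches}")
```

Under this assumption every row agrees with the reported value to four decimal places, and successive trickles are 25,920 timesteps apart.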
©2024 climateprediction.net