Name | hadcm3n_o4gx_1940_40_007445333_3 |
Workunit | 7642836 |
Created | 18 Sep 2011, 7:19:48 UTC |
Sent | 18 Sep 2011, 7:30:53 UTC |
Report deadline | 18 Dec 2011, 14:58:04 UTC |
Received | 21 Sep 2011, 5:05:09 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1523870 |
Run time | 2 days 3 hours 29 min 45 sec |
CPU time | 2 days 3 hours 3 min 31 sec |
Validate state | Invalid |
Credit | 1,244.16 |
Device peak FLOPS | 3.05 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.12.34</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 16:29:54 (5992): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:29:55 (5992): No heartbeat from core client for 30 sec - exiting 16:29:56 (5992): No heartbeat from core client for 30 sec - exiting 16:29:58 (5992): No heartbeat from core client for 30 sec - exiting 16:29:59 (5992): No heartbeat from core client for 30 sec - exiting 16:34:49 (3664): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:34:59 (3664): No heartbeat from core client for 30 sec - exiting 16:56:45 (3052): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:10:21 (4552): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:15:35 (1212): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
17:15:42 (1212): No heartbeat from core client for 30 sec - exiting 17:15:43 (1212): No heartbeat from core client for 30 sec - exiting 17:15:44 (1212): No heartbeat from core client for 30 sec - exiting 17:15:45 (1212): No heartbeat from core client for 30 sec - exiting 17:15:47 (1212): No heartbeat from core client for 30 sec - exiting 17:15:48 (1212): No heartbeat from core client for 30 sec - exiting 17:15:49 (1212): No heartbeat from core client for 30 sec - exiting 17:15:50 (1212): No heartbeat from core client for 30 sec - exiting 17:15:51 (1212): No heartbeat from core client for 30 sec - exiting 17:15:52 (1212): No heartbeat from core client for 30 sec - exiting 17:25:42 (4848): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Atmos Hold Restart file rename failed on atmos_restart.hold 17:33:35 (6136): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:45:09 (896): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:45:12 (896): No heartbeat from core client for 30 sec - exiting 17:45:13 (896): No heartbeat from core client for 30 sec - exiting 17:45:14 (896): No heartbeat from core client for 30 sec - exiting 17:45:15 (896): No heartbeat from core client for 30 sec - exiting 17:45:16 (896): No heartbeat from core client for 30 sec - exiting 17:45:17 (896): No heartbeat from core client for 30 sec - exiting 17:45:18 (896): No heartbeat from core client for 30 sec - exiting 17:45:19 (896): No heartbeat from core client for 30 sec - exiting 17:45:21 (896): No heartbeat from core client for 30 sec - exiting 17:45:22 (896): No heartbeat from core client for 30 sec - exiting 19:15:40 (5664): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
19:16:02 (5664): No heartbeat from core client for 30 sec - exiting 19:16:03 (5664): No heartbeat from core client for 30 sec - exiting 19:16:04 (5664): No heartbeat from core client for 30 sec - exiting 19:16:05 (5664): No heartbeat from core client for 30 sec - exiting 19:17:09 (3320): No heartbeat from core client for 30 sec - exiting 19:17:10 (3320): No heartbeat from core client for 30 sec - exiting 19:17:11 (3320): No heartbeat from core client for 30 sec - exiting 19:17:12 (3320): No heartbeat from core client for 30 sec - exiting 19:17:13 (3320): No heartbeat from core client for 30 sec - exiting 19:18:52 (3320): No heartbeat from core client for 30 sec - exiting 19:18:53 (3320): No heartbeat from core client for 30 sec - exiting 19:18:55 (3320): No heartbeat from core client for 30 sec - exiting 19:18:56 (3320): No heartbeat from core client for 30 sec - exiting 19:18:57 (3320): No heartbeat from core client for 30 sec - exiting 19:18:58 (3320): No heartbeat from core client for 30 sec - exiting 19:18:59 (3320): No heartbeat from core client for 30 sec - exiting 19:19:00 (3320): No heartbeat from core client for 30 sec - exiting 19:19:01 (3320): No heartbeat from core client for 30 sec - exiting 19:19:02 (3320): No heartbeat from core client for 30 sec - exiting 19:19:03 (3320): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Atmos Hold Restart file rename failed on atmos_restart.hold CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 01:52:47 (2396): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 01:56:49 (5600): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
01:56:53 (5600): No heartbeat from core client for 30 sec - exiting 01:56:54 (5600): No heartbeat from core client for 30 sec - exiting 01:56:55 (5600): No heartbeat from core client for 30 sec - exiting 01:56:56 (5600): No heartbeat from core client for 30 sec - exiting 01:56:57 (5600): No heartbeat from core client for 30 sec - exiting 01:58:19 (1764): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 01:58:21 (1764): No heartbeat from core client for 30 sec - exiting 01:58:22 (1764): No heartbeat from core client for 30 sec - exiting 01:58:23 (1764): No heartbeat from core client for 30 sec - exiting 01:58:24 (1764): No heartbeat from core client for 30 sec - exiting 01:59:32 (1564): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:00:26 (4356): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:04:10 (3392): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:04:59 (1620): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:06:40 (4252): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:09:23 (4356): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:13:23 (3440): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
02:13:28 (3440): No heartbeat from core client for 30 sec - exiting 02:13:29 (3440): No heartbeat from core client for 30 sec - exiting 02:13:30 (3440): No heartbeat from core client for 30 sec - exiting 02:13:31 (3440): No heartbeat from core client for 30 sec - exiting 02:13:32 (3440): No heartbeat from core client for 30 sec - exiting 02:13:33 (3440): No heartbeat from core client for 30 sec - exiting 02:13:34 (3440): No heartbeat from core client for 30 sec - exiting 02:13:35 (3440): No heartbeat from core client for 30 sec - exiting 02:13:36 (3440): No heartbeat from core client for 30 sec - exiting 02:13:37 (3440): No heartbeat from core client for 30 sec - exiting 02:14:52 (624): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Atmos Hold Restart file rename failed on atmos_restart.hold 02:16:48 (1760): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:17:22 (5452): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:18:16 (2948): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:19:05 (4264): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:19:59 (2436): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... forrtl: Access is denied. 02:20:59 (1104): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... A02:22:10 (3080): mo heartbeat from core client for 30 sec - exiting os Hold Restart file rename failed on atmos_restart.hold CPDN Monitor - No 'heartbeat' from BOINC... 
02:22:14 (3080): No heartbeat from core client for 30 sec - exiting 02:22:15 (3080): No heartbeat from core client for 30 sec - exiting 02:22:16 (3080): No heartbeat from core client for 30 sec - exiting 02:22:17 (3080): No heartbeat from core client for 30 sec - exiting 02:22:18 (3080): No heartbeat from core client for 30 sec - exiting 02:22:19 (3080): No heartbeat from core client for 30 sec - exiting 02:22:20 (3080): No heartbeat from core client for 30 sec - exiting 02:22:21 (3080): No heartbeat from core client for 30 sec - exiting 02:22:22 (3080): No heartbeat from core client for 30 sec - exiting 02:23:31 (3136): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:26:54 (4388): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:38:33 (5408): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:40:43 (4388): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:43:47 (5352): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:47:24 (260): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:52:25 (6004): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 05:22:24 (3776): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 05:27:42 (4732): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:01:17 (3580): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4100, iMonCtr=1 Model crash detected, will try to restart... 
Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4100, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4100, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1752, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1752, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3428, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received |
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
20 Sep 2011 08:23:20 | 1102903 | 13395546 | hadcm3n_o4gx_1940_40_007445333_3 | 103,680 | 161,616 | 1.5588 |
19 Sep 2011 17:53:47 | 1102903 | 13395546 | hadcm3n_o4gx_1940_40_007445333_3 | 77,760 | 119,355 | 1.5349 |
19 Sep 2011 06:26:22 | 1102903 | 13395546 | hadcm3n_o4gx_1940_40_007445333_3 | 51,840 | 78,833 | 1.5207 |
18 Sep 2011 18:56:32 | 1102903 | 13395546 | hadcm3n_o4gx_1940_40_007445333_3 | 25,920 | 38,327 | 1.4787 |
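
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The page does not state this; it is an assumption inferred from the four rows above. A minimal sketch that recomputes the column, with `sec_per_timestep` as a hypothetical helper name:

```python
# Trickle rows copied from the table above:
# (timestep, cumulative CPU time in seconds, reported average in sec/TS)
trickles = [
    (103_680, 161_616, 1.5588),
    (77_760, 119_355, 1.5349),
    (51_840, 78_833, 1.5207),
    (25_920, 38_327, 1.4787),
]


def sec_per_timestep(cpu_time_sec, timestep):
    """Assumed formula: cumulative CPU seconds per model timestep, to 4 decimals."""
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in trickles:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    # Compare the recomputed value with the value shown on the page.
    matches = abs(computed - reported) < 1e-9
    print(f"TS {timestep:>7}: computed {computed:.4f}, reported {reported:.4f}, match={matches}")
```

Under that assumption, all four recomputed values agree with the reported column. The average rises from 1.4787 to 1.5588 sec/TS across the four trickles.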