Name | hadcm3n_4bcf_1940_40_008311228_0 |
Workunit | 8462363 |
Created | 8 Feb 2013, 3:44:36 UTC |
Sent | 8 Feb 2013, 21:53:53 UTC |
Report deadline | 11 May 2013, 5:21:04 UTC |
Received | 13 Feb 2013, 23:51:56 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1237982 |
Run time | 3 days 9 hours 54 min 1 sec |
CPU time | 3 days 5 hours 41 min 4 sec |
Validate state | Invalid |
Credit | 4,976.64 |
Device peak FLOPS | 3.64 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.28</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
08:36:58 (6920): No heartbeat from core client for 30 sec - exiting 08:36:59 (6920): No heartbeat from core client for 30 sec - exiting 08:37:00 (6920): No heartbeat from core client for 30 sec - exiting 08:37:01 (6920): No heartbeat from core client for 30 sec - exiting 08:37:02 (6920): No heartbeat from core client for 30 sec - exiting 08:37:03 (6920): No heartbeat from core client for 30 sec - exiting 08:37:04 (6920): No heartbeat from core client for 30 sec - exiting 08:37:05 (6920): No heartbeat from core client for 30 sec - exiting 08:37:06 (6920): No heartbeat from core client for 30 sec - exiting 08:37:07 (6920): No heartbeat from core client for 30 sec - exiting 08:37:08 (6920): No heartbeat from core client for 30 sec - exiting 08:37:09 (6920): No heartbeat from core client for 30 sec - exiting 08:37:10 (6920): No heartbeat from core client for 30 sec - exiting 08:37:11 (6920): No heartbeat from core client for 30 sec - exiting 08:37:12 (6920): No heartbeat from core client for 30 sec - exiting 08:37:13 (6920): No heartbeat from core client for 30 sec - exiting 08:37:14 (6920): No heartbeat from core client for 30 sec - exiting 08:37:15 (6920): No heartbeat from core client for 30 sec - exiting 08:37:16 (6920): No heartbeat from core client for 30 sec - exiting 08:37:17 (6920): No heartbeat from core client for 30 sec - exiting 08:37:18 (6920): No heartbeat from core client for 30 sec - exiting 08:37:19 (6920): No heartbeat from core client for 30 sec - exiting 08:37:20 (6920): No heartbeat from core client for 30 sec - exiting 08:37:21 (6920): No heartbeat from core client for 30 sec - exiting 08:37:22 (6920): No heartbeat from core client for 30 sec - exiting 08:37:23 (6920): No heartbeat from core client for 30 sec - exiting 08:37:24 (6920): No heartbeat from core client for 30 sec - exiting 08:37:25 (6920): No heartbeat from core client for 30 sec - exiting 08:37:26 (6920): No heartbeat from core client for 30 sec - exiting 08:37:27 (6920): No 
heartbeat from core client for 30 sec - exiting 08:37:28 (6920): No heartbeat from core client for 30 sec - exiting 08:37:29 (6920): No heartbeat from core client for 30 sec - exiting 08:37:30 (6920): No heartbeat from core client for 30 sec - exiting 08:37:31 (6920): No heartbeat from core client for 30 sec - exiting 08:37:32 (6920): No heartbeat from core client for 30 sec - exiting 08:37:33 (6920): No heartbeat from core client for 30 sec - exiting 08:37:34 (6920): No heartbeat from core client for 30 sec - exiting 08:37:35 (6920): No heartbeat from core client for 30 sec - exiting 08:37:36 (6920): No heartbeat from core client for 30 sec - exiting 08:37:37 (6920): No heartbeat from core client for 30 sec - exiting 08:37:38 (6920): No heartbeat from core client for 30 sec - exiting 08:37:39 (6920): No heartbeat from core client for 30 sec - exiting 08:37:40 (6920): No heartbeat from core client for 30 sec - exiting 08:37:41 (6920): No heartbeat from core client for 30 sec - exiting 08:37:42 (6920): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
15:41:33 (5096): No heartbeat from core client for 30 sec - exiting 15:41:34 (5096): No heartbeat from core client for 30 sec - exiting 15:41:35 (5096): No heartbeat from core client for 30 sec - exiting 15:41:36 (5096): No heartbeat from core client for 30 sec - exiting 15:41:37 (5096): No heartbeat from core client for 30 sec - exiting 15:41:38 (5096): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 10:33:21 (3996): No heartbeat from core client for 30 sec - exiting 10:33:22 (3996): No heartbeat from core client for 30 sec - exiting 10:33:23 (3996): No heartbeat from core client for 30 sec - exiting 10:33:24 (3996): No heartbeat from core client for 30 sec - exiting 10:33:25 (3996): No heartbeat from core client for 30 sec - exiting 10:33:26 (3996): No heartbeat from core client for 30 sec - exiting 10:33:27 (3996): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 14:32:14 (5796): No heartbeat from core client for 30 sec - exiting 14:32:15 (5796): No heartbeat from core client for 30 sec - exiting 14:32:16 (5796): No heartbeat from core client for 30 sec - exiting 14:32:17 (5796): No heartbeat from core client for 30 sec - exiting 14:32:18 (5796): No heartbeat from core client for 30 sec - exiting 14:32:19 (5796): No heartbeat from core client for 30 sec - exiting 14:32:20 (5796): No heartbeat from core client for 30 sec - exiting 14:32:21 (5796): No heartbeat from core client for 30 sec - exiting 14:32:22 (5796): No heartbeat from core client for 30 sec - exiting 14:32:23 (5796): No heartbeat from core client for 30 sec - exiting 14:32:24 (5796): No heartbeat from core client for 30 sec - exiting 14:32:25 (5796): No heartbeat from core client for 30 sec - exiting 14:32:26 (5796): No heartbeat from core client for 30 sec - exiting 14:32:27 (5796): No heartbeat from core client for 30 sec - exiting 14:32:28 (5796): No heartbeat from core client for 30 sec - exiting 14:32:29 (5796): No heartbeat from core client for 30 sec - exiting 14:32:30 (5796): No heartbeat from core client for 30 sec - exiting 14:32:31 (5796): No heartbeat from core client for 30 sec - exiting 14:32:32 (5796): No heartbeat from core client for 30 sec - exiting 14:32:33 (5796): No heartbeat from core client for 30 sec - exiting 14:32:34 (5796): No heartbeat from core client for 30 sec - exiting 14:32:35 (5796): No heartbeat from core client for 30 sec - exiting 14:32:36 (5796): No heartbeat from core client for 30 sec - exiting 14:32:37 (5796): No heartbeat from core client for 30 sec - exiting 14:32:38 (5796): No heartbeat from core client for 30 sec - exiting 14:32:39 (5796): No heartbeat from core client for 30 sec - exiting 14:32:40 (5796): No heartbeat from core client for 30 sec - 
exiting 14:32:41 (5796): No heartbeat from core client for 30 sec - exiting 14:32:42 (5796): No heartbeat from core client for 30 sec - exiting 14:32:43 (5796): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 13:02:48 (7464): No heartbeat from core client for 30 sec - exiting 13:02:49 (7464): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7864, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7864, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7864, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=11248, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=11248, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... 
Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=11248, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |

Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
12 Feb 2013 21:18:43 | 1237982 | 15598784 | hadcm3n_4bcf_1940_40_008311228_0 | 414,720 | 264,977 | 0.6389 |
12 Feb 2013 16:26:34 | 1237982 | 15598784 | hadcm3n_4bcf_1940_40_008311228_0 | 388,800 | 248,457 | 0.6390 |
12 Feb 2013 11:35:15 | 1237982 | 15598784 | hadcm3n_4bcf_1940_40_008311228_0 | 362,880 | 231,930 | 0.6391 |
12 Feb 2013 06:39:09 | 1237982 | 15598784 | hadcm3n_4bcf_1940_40_008311228_0 | 336,960 | 215,417 | 0.6393 |
12 Feb 2013 01:27:54 | 1237982 | 15598784 | hadcm3n_4bcf_1940_40_008311228_0 | 311,040 | 198,851 | 0.6393 |
11 Feb 2013 19:48:56 | 1237982 | 15598784 | hadcm3n_4bcf_1940_40_008311228_0 | 285,120 | 182,301 | 0.6394 |
11 Feb 2013 15:00:39 | 1237982 | 15598784 | hadcm3n_4bcf_1940_40_008311228_0 | 259,200 | 165,774 | 0.6396 |
11 Feb 2013 10:11:29 | 1237982 | 15598784 | hadcm3n_4bcf_1940_40_008311228_0 | 233,280 | 149,319 | 0.6401 |
11 Feb 2013 05:15:25 | 1237982 | 15598784 | hadcm3n_4bcf_1940_40_008311228_0 | 207,360 | 132,938 | 0.6411 |
10 Feb 2013 22:48:47 | 1237982 | 15598784 | hadcm3n_4bcf_1940_40_008311228_0 | 181,440 | 116,341 | 0.6412 |
10 Feb 2013 06:11:22 | 1237982 | 15598784 | hadcm3n_4bcf_1940_40_008311228_0 | 155,520 | 99,726 | 0.6412 |
10 Feb 2013 01:20:12 | 1237982 | 15598784 | hadcm3n_4bcf_1940_40_008311228_0 | 129,600 | 83,079 | 0.6410 |
09 Feb 2013 20:32:12 | 1237982 | 15598784 | hadcm3n_4bcf_1940_40_008311228_0 | 103,680 | 66,442 | 0.6408 |
09 Feb 2013 15:50:45 | 1237982 | 15598784 | hadcm3n_4bcf_1940_40_008311228_0 | 77,760 | 49,852 | 0.6411 |
09 Feb 2013 11:02:23 | 1237982 | 15598784 | hadcm3n_4bcf_1940_40_008311228_0 | 51,840 | 33,276 | 0.6419 |
09 Feb 2013 06:11:12 | 1237982 | 15598784 | hadcm3n_4bcf_1940_40_008311228_0 | 25,920 | 16,672 | 0.6432 |
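
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The sketch below checks that reading against two rows of the table; the function name `seconds_per_timestep` is hypothetical and not part of BOINC or the CPDN server code.

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Average CPU seconds per model timestep, rounded to 4 decimals.

    Assumption: the page derives "Average (sec/TS)" as
    cumulative CPU time / cumulative timestep count.
    """
    if timestep <= 0:
        raise ValueError("timestep must be positive")
    return round(cpu_time_sec / timestep, 4)


# (timestep, CPU time in seconds) copied from the newest and oldest trickles above.
rows = [(414_720, 264_977), (25_920, 16_672)]

for timestep, cpu_time in rows:
    print(timestep, seconds_per_timestep(cpu_time, timestep))
```

Both values agree with the table (0.6389 and 0.6432), which is consistent with the assumed formula but does not confirm how the server actually computes the column.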