Name | hadcm3n_yjuy_1940_40_007959091_2 |
Workunit | 8114203 |
Created | 27 May 2012, 2:21:00 UTC |
Sent | 27 May 2012, 2:21:23 UTC |
Report deadline | 26 Aug 2012, 9:48:34 UTC |
Received | 23 Jun 2012, 21:28:25 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1129423 |
Run time | 6 days 1 hours 5 min 9 sec |
CPU time | 5 days 16 hours 53 min 4 sec |
Validate state | Invalid |
Credit | 3,110.40 |
Device peak FLOPS | 2.67 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.10.58</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> 22:55:09 (5324): No heartbeat from core client for 30 sec - exiting 22:55:10 (5324): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:03:04 (5576): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:32:44 (1004): No heartbeat from core client for 30 sec - exiting 20:32:45 (1004): No heartbeat from core client for 30 sec - exiting 20:32:46 (1004): No heartbeat from core client for 30 sec - exiting 20:32:47 (1004): No heartbeat from core client for 30 sec - exiting 20:32:49 (1004): No heartbeat from core client for 30 sec - exiting 20:32:50 (1004): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:32:51 (1004): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4412, iMonCtr=1 Model crash detected, will try to restart... 09:16:43 (4872): No heartbeat from core client for 30 sec - exiting 09:16:44 (4872): No heartbeat from core client for 30 sec - exiting 09:16:45 (4872): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 07:49:49 (5540): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
14:28:31 (3612): No heartbeat from core client for 30 sec - exiting 14:28:32 (3612): No heartbeat from core client for 30 sec - exiting 14:28:33 (3612): No heartbeat from core client for 30 sec - exiting 14:28:34 (3612): No heartbeat from core client for 30 sec - exiting 14:28:35 (3612): No heartbeat from core client for 30 sec - exiting 14:28:36 (3612): No heartbeat from core client for 30 sec - exiting 14:28:37 (3612): No heartbeat from core client for 30 sec - exiting 14:28:38 (3612): No heartbeat from core client for 30 sec - exiting 14:28:39 (3612): No heartbeat from core client for 30 sec - exiting 14:28:40 (3612): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5092, iMonCtr=1 Model crash detected, will try to restart... 09:48:21 (5228): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 13:54:34 (5548): No heartbeat from core client for 30 sec - exiting 13:54:35 (5548): No heartbeat from core client for 30 sec - exiting 13:54:36 (5548): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:54:37 (5548): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 16:12:26 (5372): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 16:44:52 (640): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 09:11:40 (2884): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 
16:00:14 (5720): No heartbeat from core client for 30 sec - exiting 16:00:15 (5720): No heartbeat from core client for 30 sec - exiting 16:00:16 (5720): No heartbeat from core client for 30 sec - exiting 16:00:17 (5720): No heartbeat from core client for 30 sec - exiting 16:00:18 (5720): No heartbeat from core client for 30 sec - exiting 16:00:19 (5720): No heartbeat from core client for 30 sec - exiting 16:00:20 (5720): No heartbeat from core client for 30 sec - exiting 16:00:21 (5720): No heartbeat from core client for 30 sec - exiting 16:00:22 (5720): No heartbeat from core client for 30 sec - exiting 16:00:23 (5720): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 08:59:54 (5164): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 00:30:50 (2656): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:00:39 (4788): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 10:30:59 (2964): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 16:20:18 (5176): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 09:46:58 (3656): No heartbeat from core client for 30 sec - exiting 09:47:00 (3656): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6096, iMonCtr=1 Model crash detected, will try to restart... 
16:38:53 (5680): No heartbeat from core client for 30 sec - exiting 16:38:54 (5680): No heartbeat from core client for 30 sec - exiting 16:38:55 (5680): No heartbeat from core client for 30 sec - exiting 16:38:56 (5680): No heartbeat from core client for 30 sec - exiting 16:38:57 (5680): No heartbeat from core client for 30 sec - exiting 16:38:58 (5680): No heartbeat from core client for 30 sec - exiting 16:38:59 (5680): No heartbeat from core client for 30 sec - exiting 16:39:00 (5680): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5004, iMonCtr=1 Model crash detected, will try to restart... 10:17:06 (2860): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 14:15:56 (5336): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1872, iMonCtr=1 Model crash detected, will try to restart... 10:07:38 (5468): No heartbeat from core client for 30 sec - exiting 10:07:39 (5468): No heartbeat from core client for 30 sec - exiting 10:07:40 (5468): No heartbeat from core client for 30 sec - exiting 10:07:41 (5468): No heartbeat from core client for 30 sec - exiting 10:07:42 (5468): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 16:17:18 (5660): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 13:21:23 (5628): No heartbeat from core client for 30 sec - exiting 13:21:24 (5628): No heartbeat from core client for 30 sec - exiting 13:21:25 (5628): No heartbeat from core client for 30 sec - exiting 13:21:26 (5628): No heartbeat from core client for 30 sec - exiting 13:21:27 (5628): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 08:52:22 (5652): No heartbeat from core client for 30 sec - exiting 08:52:23 (5652): No heartbeat from core client for 30 sec - exiting 08:52:24 (5652): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 12:18:00 (9636): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6564, iMonCtr=1 Model crash detected, will try to restart... 13:12:24 (1324): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish 01:29:46 (5692): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5636, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5636, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... 
Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5636, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5636, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5636, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5636, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
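
The Stderr field above repeats a small set of messages: heartbeat timeouts, monitor warnings, crash-and-restart notices, and a final give-up line. The following is a minimal sketch of how such a log could be tallied. The function name `summarize_stderr` and the dictionary keys are hypothetical, and `STDERR_EXCERPT` is an abbreviated excerpt assembled from fragments of the log above, not the full text.

```python
# Abbreviated excerpt for illustration only; each fragment is copied from the
# Stderr field above, but the full log is much longer.
STDERR_EXCERPT = (
    "22:55:09 (5324): No heartbeat from core client for 30 sec - exiting "
    "22:55:10 (5324): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=4412, iMonCtr=1 "
    "Model crash detected, will try to restart... "
    "Sorry, too many model crashes! :-( Called boinc_finish"
)


def summarize_stderr(text: str) -> dict:
    """Count the recurring message types in a stderr dump (hypothetical helper)."""
    return {
        # Timestamped lines written when no heartbeat arrives from the core client
        "heartbeat_timeouts": text.count("No heartbeat from core client"),
        # Matching warnings from the CPDN monitor process
        "monitor_heartbeat_warnings": text.count(
            "CPDN Monitor - No 'heartbeat' from BOINC"
        ),
        # Each of these marks one detected model crash followed by a restart attempt
        "crash_restarts": text.count("Model crash detected, will try to restart"),
        # True when the log records that the task gave up after repeated crashes
        "gave_up": "too many model crashes" in text,
    }


if __name__ == "__main__":
    for key, value in summarize_stderr(STDERR_EXCERPT).items():
        print(f"{key}: {value}")
```

Running the same function over the full Stderr text would give the totals for this task; those totals are deliberately not quoted here.
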
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
22 Jun 2012 19:26:42 | 1129423 | 14745456 | hadcm3n_yjuy_1940_40_007959091_2 | 259,200 | 483,167 | 1.8641 |
21 Jun 2012 16:30:01 | 1129423 | 14745456 | hadcm3n_yjuy_1940_40_007959091_2 | 233,280 | 434,469 | 1.8624 |
18 Jun 2012 21:14:43 | 1129423 | 14745456 | hadcm3n_yjuy_1940_40_007959091_2 | 207,360 | 386,172 | 1.8623 |
15 Jun 2012 01:47:53 | 1129423 | 14745456 | hadcm3n_yjuy_1940_40_007959091_2 | 181,440 | 337,804 | 1.8618 |
13 Jun 2012 23:42:14 | 1129423 | 14745456 | hadcm3n_yjuy_1940_40_007959091_2 | 155,520 | 289,785 | 1.8633 |
10 Jun 2012 17:54:25 | 1129423 | 14745456 | hadcm3n_yjuy_1940_40_007959091_2 | 129,600 | 241,705 | 1.8650 |
07 Jun 2012 21:24:56 | 1129423 | 14745456 | hadcm3n_yjuy_1940_40_007959091_2 | 103,680 | 193,459 | 1.8659 |
04 Jun 2012 01:55:19 | 1129423 | 14745456 | hadcm3n_yjuy_1940_40_007959091_2 | 77,760 | 145,182 | 1.8671 |
01 Jun 2012 20:31:58 | 1129423 | 14745456 | hadcm3n_yjuy_1940_40_007959091_2 | 51,840 | 97,212 | 1.8752 |
28 May 2012 00:29:53 | 1129423 | 14745456 | hadcm3n_yjuy_1940_40_007959091_2 | 25,920 | 48,693 | 1.8786 |
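
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The sketch below checks that assumption against the rows of the table; the function name `seconds_per_timestep` and the variable names are hypothetical, and the numbers are copied from the table above.

```python
# (timestep, cpu_time_sec, reported_average) copied from the trickle table,
# newest row first.
TRICKLES = [
    (259_200, 483_167, 1.8641),
    (233_280, 434_469, 1.8624),
    (207_360, 386_172, 1.8623),
    (181_440, 337_804, 1.8618),
    (155_520, 289_785, 1.8633),
    (129_600, 241_705, 1.8650),
    (103_680, 193_459, 1.8659),
    (77_760, 145_182, 1.8671),
    (51_840, 97_212, 1.8752),
    (25_920, 48_693, 1.8786),
]


def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Average CPU seconds per model timestep, rounded to 4 decimals."""
    return round(cpu_time_sec / timestep, 4)


if __name__ == "__main__":
    for timestep, cpu_time_sec, reported in TRICKLES:
        computed = seconds_per_timestep(cpu_time_sec, timestep)
        status = "ok" if abs(computed - reported) < 1e-9 else "MISMATCH"
        print(f"{timestep:>8} {cpu_time_sec:>8} {computed:.4f} {reported:.4f} {status}")
```

Each trickle in the table is 25,920 timesteps after the previous one, so the average is a running figure over the whole task rather than a per-interval rate.
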
©2024 climateprediction.net