Name | hadcm3n_3g5l_2020_40_008402385_3 |
Workunit | 8553241 |
Created | 25 Nov 2013, 22:20:43 UTC |
Sent | 25 Nov 2013, 23:37:23 UTC |
Report deadline | 25 Feb 2014, 7:04:34 UTC |
Received | 12 Dec 2013, 19:13:11 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1158176 |
Run time | 12 days 11 hours 6 min 5 sec |
CPU time | 9 days 0 hours 17 min 37 sec |
Validate state | Invalid |
Credit | 4,354.56 |
Device peak FLOPS | 2.91 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.64</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> 03:02:34 (30064): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:46:16 (5452): No heartbeat from core client for 30 sec - exiting 06:46:17 (5452): No heartbeat from core client for 30 sec - exiting 06:46:18 (5452): No heartbeat from core client for 30 sec - exiting 06:46:19 (5452): No heartbeat from core client for 30 sec - exiting 06:46:20 (5452): No heartbeat from core client for 30 sec - exiting 06:46:21 (5452): No heartbeat from core client for 30 sec - exiting 06:46:22 (5452): No heartbeat from core client for 30 sec - exiting 06:46:23 (5452): No heartbeat from core client for 30 sec - exiting 06:46:24 (5452): No heartbeat from core client for 30 sec - exiting 06:46:25 (5452): No heartbeat from core client for 30 sec - exiting 06:46:26 (5452): No heartbeat from core client for 30 sec - exiting 06:46:27 (5452): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:47:28 (6924): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:48:05 (4824): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:48:51 (6136): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:49:57 (3716): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:18:44 (7128): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:23:31 (7936): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:05:17 (7436): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
15:27:45 (21352): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:31:44 (19428): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:18:23 (8392): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:04:02 (21636): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:05:50 (28412): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:07:04 (29260): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:07:44 (30848): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 00:25:54 (30096): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:14:05 (44528): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:22:41 (73048): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:35:41 (27528): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:38:14 (52032): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:15:31 (79904): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:17:39 (3764): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
23:17:40 (3764): No heartbeat from core client for 30 sec - exiting 23:17:41 (3764): No heartbeat from core client for 30 sec - exiting 23:17:42 (3764): No heartbeat from core client for 30 sec - exiting 23:17:43 (3764): No heartbeat from core client for 30 sec - exiting 23:17:44 (3764): No heartbeat from core client for 30 sec - exiting 23:17:45 (3764): No heartbeat from core client for 30 sec - exiting 23:17:46 (3764): No heartbeat from core client for 30 sec - exiting 23:17:47 (3764): No heartbeat from core client for 30 sec - exiting 23:17:48 (3764): No heartbeat from core client for 30 sec - exiting 23:17:49 (3764): No heartbeat from core client for 30 sec - exiting 23:21:19 (99912): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:25:21 (104624): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:25:22 (104624): No heartbeat from core client for 30 sec - exiting 23:25:23 (104624): No heartbeat from core client for 30 sec - exiting 23:25:24 (104624): No heartbeat from core client for 30 sec - exiting 23:25:25 (104624): No heartbeat from core client for 30 sec - exiting 23:25:26 (104624): No heartbeat from core client for 30 sec - exiting 23:25:27 (104624): No heartbeat from core client for 30 sec - exiting 23:25:28 (104624): No heartbeat from core client for 30 sec - exiting 23:25:29 (104624): No heartbeat from core client for 30 sec - exiting 23:25:30 (104624): No heartbeat from core client for 30 sec - exiting 23:25:31 (104624): No heartbeat from core client for 30 sec - exiting 00:05:21 (79948): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:07:46 (103072): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:09:31 (104744): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
00:09:32 (104744): No heartbeat from core client for 30 sec - exiting 00:09:33 (104744): No heartbeat from core client for 30 sec - exiting 00:09:34 (104744): No heartbeat from core client for 30 sec - exiting 00:09:35 (104744): No heartbeat from core client for 30 sec - exiting 00:09:36 (104744): No heartbeat from core client for 30 sec - exiting 00:09:37 (104744): No heartbeat from core client for 30 sec - exiting 00:09:38 (104744): No heartbeat from core client for 30 sec - exiting 00:09:39 (104744): No heartbeat from core client for 30 sec - exiting 00:09:40 (104744): No heartbeat from core client for 30 sec - exiting 00:09:41 (104744): No heartbeat from core client for 30 sec - exiting 00:29:16 (104864): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:31:39 (102912): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:55:56 (6112): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=756, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=756, iMonCtr=1 Model crash detected, will try to restart... Called boinc_finish CPDN Monitor - Quit request from BOINC... 17:44:38 (4160): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:18:06 (3940): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:35:58 (996): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Signal 22 received, exiting... 
Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=11108, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=11108, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=11108, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=11108, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=11108, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=11108, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
04 Dec 2013 21:59:05 | 1158176 | 16083715 | hadcm3n_3g5l_2020_40_008402385_3 | 362,880 | 677,641 | 1.8674 |
04 Dec 2013 08:19:58 | 1158176 | 16083715 | hadcm3n_3g5l_2020_40_008402385_3 | 336,960 | 630,174 | 1.8702 |
03 Dec 2013 18:25:19 | 1158176 | 16083715 | hadcm3n_3g5l_2020_40_008402385_3 | 311,040 | 583,589 | 1.8763 |
03 Dec 2013 04:12:40 | 1158176 | 16083715 | hadcm3n_3g5l_2020_40_008402385_3 | 285,120 | 536,091 | 1.8802 |
02 Dec 2013 14:07:55 | 1158176 | 16083715 | hadcm3n_3g5l_2020_40_008402385_3 | 259,200 | 488,703 | 1.8854 |
01 Dec 2013 22:31:24 | 1158176 | 16083715 | hadcm3n_3g5l_2020_40_008402385_3 | 233,280 | 438,109 | 1.8780 |
01 Dec 2013 07:33:08 | 1158176 | 16083715 | hadcm3n_3g5l_2020_40_008402385_3 | 207,360 | 388,250 | 1.8723 |
30 Nov 2013 16:34:07 | 1158176 | 16083715 | hadcm3n_3g5l_2020_40_008402385_3 | 181,440 | 337,663 | 1.8610 |
30 Nov 2013 01:43:08 | 1158176 | 16083715 | hadcm3n_3g5l_2020_40_008402385_3 | 155,520 | 287,887 | 1.8511 |
29 Nov 2013 11:09:34 | 1158176 | 16083715 | hadcm3n_3g5l_2020_40_008402385_3 | 129,600 | 238,350 | 1.8391 |
28 Nov 2013 18:10:02 | 1158176 | 16083715 | hadcm3n_3g5l_2020_40_008402385_3 | 103,680 | 188,389 | 1.8170 |
27 Nov 2013 23:20:12 | 1158176 | 16083715 | hadcm3n_3g5l_2020_40_008402385_3 | 77,760 | 142,730 | 1.8355 |
27 Nov 2013 08:25:48 | 1158176 | 16083715 | hadcm3n_3g5l_2020_40_008402385_3 | 51,840 | 96,424 | 1.8600 |
26 Nov 2013 17:40:21 | 1158176 | 16083715 | hadcm3n_3g5l_2020_40_008402385_3 | 25,920 | 49,619 | 1.9143 |
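
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count. That relationship is inferred from the numbers in the table, not from any CPDN documentation. The sketch below checks it against three rows copied from the table; the function name `seconds_per_timestep` is hypothetical and chosen only for illustration.

```python
# Rows copied from the trickle table: (timestep, cpu_seconds, reported_average).
# Assumption: Average (sec/TS) = cumulative CPU seconds / cumulative timesteps.
TRICKLES = [
    (362_880, 677_641, 1.8674),  # 04 Dec 2013 21:59:05
    (103_680, 188_389, 1.8170),  # 28 Nov 2013 18:10:02
    (25_920, 49_619, 1.9143),    # 26 Nov 2013 17:40:21
]


def seconds_per_timestep(cpu_seconds: int, timestep: int) -> float:
    """Return the average CPU seconds spent per model timestep."""
    return cpu_seconds / timestep


for timestep, cpu_seconds, reported in TRICKLES:
    computed = seconds_per_timestep(cpu_seconds, timestep)
    print(f"{timestep:>8,} TS  computed={computed:.4f}  reported={reported:.4f}")
```

For these rows the computed value agrees with the reported column to four decimal places. The timestep column also advances by 25,920 between consecutive trickles.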