Name | hadcm3n_4i14_1980_40_008366393_0 |
Workunit | 8517252 |
Created | 11 May 2013, 5:05:39 UTC |
Sent | 11 May 2013, 7:44:26 UTC |
Report deadline | 10 Aug 2013, 15:11:37 UTC |
Received | 14 Aug 2013, 15:54:29 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1114586 |
Run time | 12 days 15 hours 32 min 45 sec |
CPU time | 11 days 18 hours 37 min 57 sec |
Validate state | Invalid |
Credit | 7,153.92 |
Device peak FLOPS | 2.69 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.10.58</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Ocean Restart file copy failed on 4i14ko.dai4c30 Suspended CPDN Monitor - Suspend request from BOINC... 12:23:36 (4104): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:23:38 (4104): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 22:55:58 (7868): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3420, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
14:43:36 (5148): No heartbeat from core client for 30 sec - exiting 14:43:37 (5148): No heartbeat from core client for 30 sec - exiting 14:43:38 (5148): No heartbeat from core client for 30 sec - exiting 14:43:39 (5148): No heartbeat from core client for 30 sec - exiting 14:43:40 (5148): No heartbeat from core client for 30 sec - exiting 14:43:41 (5148): No heartbeat from core client for 30 sec - exiting 14:43:42 (5148): No heartbeat from core client for 30 sec - exiting 14:43:43 (5148): No heartbeat from core client for 30 sec - exiting 14:43:44 (5148): No heartbeat from core client for 30 sec - exiting 14:43:45 (5148): No heartbeat from core client for 30 sec - exiting 14:43:46 (5148): No heartbeat from core client for 30 sec - exiting 14:43:47 (5148): No heartbeat from core client for 30 sec - exiting 14:43:48 (5148): No heartbeat from core client for 30 sec - exiting 14:43:49 (5148): No heartbeat from core client for 30 sec - exiting 14:43:50 (5148): No heartbeat from core client for 30 sec - exiting 14:43:51 (5148): No heartbeat from core client for 30 sec - exiting 14:43:52 (5148): No heartbeat from core client for 30 sec - exiting 14:43:53 (5148): No heartbeat from core client for 30 sec - exiting 14:43:54 (5148): No heartbeat from core client for 30 sec - exiting 14:43:55 (5148): No heartbeat from core client for 30 sec - exiting 14:43:56 (5148): No heartbeat from core client for 30 sec - exiting 14:43:57 (5148): No heartbeat from core client for 30 sec - exiting 14:43:58 (5148): No heartbeat from core client for 30 sec - exiting 14:43:59 (5148): No heartbeat from core client for 30 sec - exiting 14:44:00 (5148): No heartbeat from core client for 30 sec - exiting 14:44:01 (5148): No heartbeat from core client for 30 sec - exiting 14:44:02 (5148): No heartbeat from core client for 30 sec - exiting 14:44:03 (5148): No heartbeat from core client for 30 sec - exiting 14:44:04 (5148): No heartbeat from core client for 30 sec - exiting 14:44:05 (5148): No 
heartbeat from core client for 30 sec - exiting 14:44:06 (5148): No heartbeat from core client for 30 sec - exiting 14:44:07 (5148): No heartbeat from core client for 30 sec - exiting 14:44:08 (5148): No heartbeat from core client for 30 sec - exiting 14:44:09 (5148): No heartbeat from core client for 30 sec - exiting 14:44:10 (5148): No heartbeat from core client for 30 sec - exiting 14:44:11 (5148): No heartbeat from core client for 30 sec - exiting 14:44:12 (5148): No heartbeat from core client for 30 sec - exiting 14:44:13 (5148): No heartbeat from core client for 30 sec - exiting 14:44:14 (5148): No heartbeat from core client for 30 sec - exiting 14:44:15 (5148): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:45:35 (5900): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6096, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6096, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3400, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4308, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5040, iMonCtr=1 Model crash detected, will try to restart... 
CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=196, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |

Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
02 Jul 2013 10:04:26 | 1114586 | 15776682 | hadcm3n_4i14_1980_40_008366393_0 | 596,160 | 1,004,182 | 1.6844 |
28 Jun 2013 03:00:58 | 1114586 | 15776682 | hadcm3n_4i14_1980_40_008366393_0 | 570,240 | 962,542 | 1.6880 |
25 Jun 2013 19:02:52 | 1114586 | 15776682 | hadcm3n_4i14_1980_40_008366393_0 | 544,320 | 921,044 | 1.6921 |
21 Jun 2013 19:22:58 | 1114586 | 15776682 | hadcm3n_4i14_1980_40_008366393_0 | 518,400 | 878,626 | 1.6949 |
18 Jun 2013 01:20:21 | 1114586 | 15776682 | hadcm3n_4i14_1980_40_008366393_0 | 492,480 | 834,663 | 1.6948 |
17 Jun 2013 12:57:38 | 1114586 | 15776682 | hadcm3n_4i14_1980_40_008366393_0 | 466,560 | 791,416 | 1.6963 |
17 Jun 2013 00:42:22 | 1114586 | 15776682 | hadcm3n_4i14_1980_40_008366393_0 | 440,640 | 748,135 | 1.6978 |
16 Jun 2013 12:02:11 | 1114586 | 15776682 | hadcm3n_4i14_1980_40_008366393_0 | 414,720 | 704,405 | 1.6985 |
08 Jun 2013 13:45:45 | 1114586 | 15776682 | hadcm3n_4i14_1980_40_008366393_0 | 388,800 | 660,759 | 1.6995 |
07 Jun 2013 22:53:45 | 1114586 | 15776682 | hadcm3n_4i14_1980_40_008366393_0 | 362,880 | 617,432 | 1.7015 |
07 Jun 2013 08:35:27 | 1114586 | 15776682 | hadcm3n_4i14_1980_40_008366393_0 | 336,960 | 574,304 | 1.7044 |
06 Jun 2013 16:50:03 | 1114586 | 15776682 | hadcm3n_4i14_1980_40_008366393_0 | 311,040 | 530,782 | 1.7065 |
06 Jun 2013 03:03:47 | 1114586 | 15776682 | hadcm3n_4i14_1980_40_008366393_0 | 285,120 | 486,803 | 1.7074 |
05 Jun 2013 07:39:51 | 1114586 | 15776682 | hadcm3n_4i14_1980_40_008366393_0 | 259,200 | 443,526 | 1.7111 |
04 Jun 2013 16:23:34 | 1114586 | 15776682 | hadcm3n_4i14_1980_40_008366393_0 | 233,280 | 398,432 | 1.7080 |
04 Jun 2013 06:59:05 | 1114586 | 15776682 | hadcm3n_4i14_1980_40_008366393_0 | 207,360 | 353,861 | 1.7065 |
03 Jun 2013 13:27:39 | 1114586 | 15776682 | hadcm3n_4i14_1980_40_008366393_0 | 181,440 | 309,213 | 1.7042 |
03 Jun 2013 00:06:35 | 1114586 | 15776682 | hadcm3n_4i14_1980_40_008366393_0 | 155,520 | 264,199 | 1.6988 |
02 Jun 2013 10:34:22 | 1114586 | 15776682 | hadcm3n_4i14_1980_40_008366393_0 | 129,600 | 219,628 | 1.6947 |
01 Jun 2013 22:24:46 | 1114586 | 15776682 | hadcm3n_4i14_1980_40_008366393_0 | 103,680 | 176,061 | 1.6981 |
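
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. The short Python sketch below checks that reading against three rows copied from the table. The helper names `parse_count` and `sec_per_timestep` are hypothetical and are not part of BOINC or the CPDN software.

```python
def parse_count(text: str) -> int:
    """Convert a comma-grouped figure such as '1,004,182' to an int."""
    return int(text.replace(",", ""))


def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Average CPU seconds per model timestep (assumed definition of sec/TS)."""
    if timestep <= 0:
        raise ValueError("timestep must be positive")
    return cpu_time_sec / timestep


# (Timestep, CPU Time) pairs copied from the first, second, and last table rows.
rows = [
    ("596,160", "1,004,182"),
    ("570,240", "962,542"),
    ("103,680", "176,061"),
]

for timestep_text, cpu_text in rows:
    average = sec_per_timestep(parse_count(cpu_text), parse_count(timestep_text))
    print(f"{timestep_text} timesteps: {average:.4f} sec/TS")
```

Rounded to four decimal places, these come out to 1.6844, 1.6880, and 1.6981, matching the values reported for those rows.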
©2024 cpdn.org