Name | hadcm3n_ykoj_1900_40_007523635_3 |
Workunit | 7721110 |
Created | 24 Jan 2012, 7:22:06 UTC |
Sent | 24 Jan 2012, 7:22:43 UTC |
Report deadline | 24 Apr 2012, 14:49:54 UTC |
Received | 2 May 2012, 4:13:31 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1156442 |
Run time | 20 days 10 hours 49 min 53 sec |
CPU time | 18 days 0 hours 29 min 42 sec |
Validate state | Invalid |
Credit | 5,598.72 |
Device peak FLOPS | 1.47 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.25</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 12:55:24 (41896): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 01:38:38 (33632): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:55:51 (26376): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 16:58:09 (14436): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 13:07:13 (23764): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... 02:46:34 (36596): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 10:45:32 (11292): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:37:38 (30572): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 01:05:40 (22008): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 12:07:47 (13448): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... 12:07:48 (13448): No heartbeat from core client for 30 sec - exiting 12:07:49 (13448): No heartbeat from core client for 30 sec - exiting 12:07:50 (13448): No heartbeat from core client for 30 sec - exiting 12:07:51 (13448): No heartbeat from core client for 30 sec - exiting 12:07:52 (13448): No heartbeat from core client for 30 sec - exiting 12:07:53 (13448): No heartbeat from core client for 30 sec - exiting 12:07:54 (13448): No heartbeat from core client for 30 sec - exiting 12:07:55 (13448): No heartbeat from core client for 30 sec - exiting 12:07:56 (13448): No heartbeat from core client for 30 sec - exiting 12:07:57 (13448): No heartbeat from core client for 30 sec - exiting 12:07:58 (13448): No heartbeat from core client for 30 sec - exiting 12:07:59 (13448): No heartbeat from core client for 30 sec - exiting 12:08:00 (13448): No heartbeat from core client for 30 sec - exiting 12:08:01 (13448): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 06:32:52 (15420): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 00:47:30 (6672): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 01:14:39 (8104): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:00:37 (22984): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
04:25:33 (16204): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:21:39 (15748): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:54:59 (4188): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:58:02 (23756): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 16:00:27 (21228): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:51:40 (25804): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 17:35:20 (6272): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:35:21 (6272): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... 06:05:16 (12892): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:18:15 (10724): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:58:05 (9800): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 17:09:03 (18784): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 10:22:02 (31016): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
11:11:24 (10664): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:36:08 (23616): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:25:51 (21532): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:48:36 (32784): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:51:42 (5880): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:53:57 (31292): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 00:11:40 (8360): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 01:49:20 (17824): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:16:31 (7796): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:58:56 (14816): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:00:28 (19088): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 13:54:14 (8016): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 05:25:05 (13228): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 05:34:09 (11384): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 16:07:52 (22708): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
16:07:53 (22708): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... 13:43:53 (3028): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:13:03 (13976): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 15:40:15 (19364): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 00:00:12 (11072): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3656, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3656, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=13236, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=15208, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=15208, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... 
Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=15208, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
27 Apr 2012 21:47:29 | 1156442 | 13960049 | hadcm3n_ykoj_1900_40_007523635_3 | 466,560 | 1,533,661 | 3.2872 |
22 Apr 2012 19:38:54 | 1156442 | 13960049 | hadcm3n_ykoj_1900_40_007523635_3 | 440,640 | 1,456,263 | 3.3049 |
20 Apr 2012 08:53:29 | 1156442 | 13960049 | hadcm3n_ykoj_1900_40_007523635_3 | 414,720 | 1,378,590 | 3.3241 |
02 Apr 2012 03:58:36 | 1156442 | 13960049 | hadcm3n_ykoj_1900_40_007523635_3 | 388,800 | 1,290,283 | 3.3186 |
30 Mar 2012 04:31:48 | 1156442 | 13960049 | hadcm3n_ykoj_1900_40_007523635_3 | 362,880 | 1,196,065 | 3.2960 |
23 Mar 2012 06:07:44 | 1156442 | 13960049 | hadcm3n_ykoj_1900_40_007523635_3 | 336,960 | 1,100,936 | 3.2673 |
16 Mar 2012 15:22:32 | 1156442 | 13960049 | hadcm3n_ykoj_1900_40_007523635_3 | 311,040 | 1,009,425 | 3.2453 |
15 Mar 2012 06:23:36 | 1156442 | 13960049 | hadcm3n_ykoj_1900_40_007523635_3 | 285,120 | 927,455 | 3.2529 |
11 Mar 2012 14:11:05 | 1156442 | 13960049 | hadcm3n_ykoj_1900_40_007523635_3 | 259,200 | 834,414 | 3.2192 |
21 Feb 2012 06:29:43 | 1156442 | 13960049 | hadcm3n_ykoj_1900_40_007523635_3 | 233,280 | 742,888 | 3.1845 |
19 Feb 2012 12:16:59 | 1156442 | 13960049 | hadcm3n_ykoj_1900_40_007523635_3 | 207,360 | 664,484 | 3.2045 |
13 Feb 2012 19:20:44 | 1156442 | 13960049 | hadcm3n_ykoj_1900_40_007523635_3 | 181,440 | 586,584 | 3.2329 |
09 Feb 2012 18:55:48 | 1156442 | 13960049 | hadcm3n_ykoj_1900_40_007523635_3 | 155,520 | 493,624 | 3.1740 |
06 Feb 2012 14:07:31 | 1156442 | 13960049 | hadcm3n_ykoj_1900_40_007523635_3 | 129,600 | 391,595 | 3.0216 |
03 Feb 2012 18:26:47 | 1156442 | 13960049 | hadcm3n_ykoj_1900_40_007523635_3 | 103,680 | 313,020 | 3.0191 |
02 Feb 2012 05:28:35 | 1156442 | 13960049 | hadcm3n_ykoj_1900_40_007523635_3 | 77,760 | 234,961 | 3.0216 |
30 Jan 2012 12:27:09 | 1156442 | 13960049 | hadcm3n_ykoj_1900_40_007523635_3 | 51,840 | 156,331 | 3.0156 |
27 Jan 2012 11:31:47 | 1156442 | 13960049 | hadcm3n_ykoj_1900_40_007523635_3 | 25,920 | 78,075 | 3.0122 |
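
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The sketch below checks that reading against one row copied from the table. The helper names `parse_int` and `average_sec_per_timestep` are hypothetical and used only for illustration; they are not part of BOINC or CPDN.

```python
def parse_int(field: str) -> int:
    """Parse an integer that uses commas as thousands separators."""
    return int(field.replace(",", "").strip())


def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Return CPU seconds per model timestep, rounded to 4 decimals.

    Assumption: the table's Average column is cpu_time / timestep.
    """
    if timestep <= 0:
        raise ValueError("timestep must be positive")
    return round(cpu_time_sec / timestep, 4)


# One row copied verbatim from the trickle table.
row = ("27 Apr 2012 21:47:29 | 1156442 | 13960049 | "
       "hadcm3n_ykoj_1900_40_007523635_3 | 466,560 | 1,533,661 | 3.2872 |")

fields = [f.strip() for f in row.split("|")]
timestep = parse_int(fields[4])      # Timestep column
cpu_time_sec = parse_int(fields[5])  # CPU Time (sec) column
reported = float(fields[6])          # Average (sec/TS) column

computed = average_sec_per_timestep(cpu_time_sec, timestep)
print(computed, reported)
```

The same calculation applied to the earliest row (78,075 sec over 25,920 timesteps) also agrees with its listed average of 3.0122, which supports the assumed definition of the column.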