Name | hadcm3n_8ent_1980_40_008728420_0 |
Workunit | 8874398 |
Created | 23 Apr 2014, 14:06:44 UTC |
Sent | 25 Apr 2014, 21:34:11 UTC |
Report deadline | 26 Jul 2014, 5:01:22 UTC |
Received | 8 Jun 2014, 19:29:32 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1314929 |
Run time | 10 days 8 hours 37 min 58 sec |
CPU time | 9 days 19 hours 4 min 58 sec |
Validate state | Invalid |
Credit | 10,575.36 |
Device peak FLOPS | 3.63 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.2.42</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 06:31:49 (6100): No heartbeat from core client for 30 sec - exiting 06:31:50 (6100): No heartbeat from core client for 30 sec - exiting 06:31:51 (6100): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 07:34:52 (5836): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 07:34:53 (5836): No heartbeat from core client for 30 sec - exiting 07:34:54 (5836): No heartbeat from core client for 30 sec - exiting 07:34:55 (5836): No heartbeat from core client for 30 sec - exiting 07:34:56 (5836): No heartbeat from core client for 30 sec - exiting 07:34:57 (5836): No heartbeat from core client for 30 sec - exiting 07:34:58 (5836): No heartbeat from core client for 30 sec - exiting 07:34:59 (5836): No heartbeat from core client for 30 sec - exiting 07:35:00 (5836): No heartbeat from core client for 30 sec - exiting 07:35:01 (5836): No heartbeat from core client for 30 sec - exiting 07:35:02 (5836): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 20:46:49 (4320): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6868, iMonCtr=1 Model crash detected, will try to restart... 
08:26:10 (5856): No heartbeat from core client for 30 sec - exiting 08:26:11 (5856): No heartbeat from core client for 30 sec - exiting 08:26:12 (5856): No heartbeat from core client for 30 sec - exiting 08:26:13 (5856): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:16:28 (6064): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 07:18:02 (1124): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6964, iMonCtr=1 Model crash detected, will try to restart... 08:35:40 (5708): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 09:31:23 (5348): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 23:23:29 (3660): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7884, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 10:10:59 (5448): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... 14:47:11 (3560): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:08:43 (8368): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
17:56:38 (6452): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 19:24:02 (5440): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:27:56 (8040): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6216, iMonCtr=1 Model crash detected, will try to restart... 09:20:41 (4756): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:20:42 (4756): No heartbeat from core client for 30 sec - exiting 09:20:43 (4756): No heartbeat from core client for 30 sec - exiting 09:22:23 (8704): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:29:05 (3828): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 08:43:21 (948): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 17:45:36 (4776): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... 21:40:36 (624): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 21:30:53 (10120): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... 
06:53:35 (4348): No heartbeat from core client for 30 sec - exiting 06:53:36 (4348): No heartbeat from core client for 30 sec - exiting 06:53:37 (4348): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:53:38 (4348): No heartbeat from core client for 30 sec - exiting 06:55:10 (5568): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:55:46 (6920): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 18:27:19 (3204): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3388, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3388, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3388, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3388, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4656, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4656, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! 
:-( Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
10 Jun 2014 09:02:16 | 1314929 | 16593338 | hadcm3n_8ent_1980_40_008728420_0 | 881,280 | 842,743 | 0.9563 |
10 Jun 2014 09:01:10 | 1314929 | 16593338 | hadcm3n_8ent_1980_40_008728420_0 | 855,360 | 819,804 | 0.9584 |
10 Jun 2014 09:00:40 | 1314929 | 16593338 | hadcm3n_8ent_1980_40_008728420_0 | 829,440 | 796,790 | 0.9606 |
09 Jun 2014 23:29:52 | 1314929 | 16593338 | hadcm3n_8ent_1980_40_008728420_0 | 803,520 | 773,587 | 0.9627 |
07 Jun 2014 00:10:56 | 1314929 | 16593338 | hadcm3n_8ent_1980_40_008728420_0 | 777,600 | 750,864 | 0.9656 |
06 Jun 2014 11:03:07 | 1314929 | 16593338 | hadcm3n_8ent_1980_40_008728420_0 | 751,680 | 728,082 | 0.9686 |
03 Jun 2014 18:28:25 | 1314929 | 16593338 | hadcm3n_8ent_1980_40_008728420_0 | 725,760 | 705,634 | 0.9723 |
02 Jun 2014 00:15:35 | 1314929 | 16593338 | hadcm3n_8ent_1980_40_008728420_0 | 699,840 | 682,046 | 0.9746 |
31 May 2014 12:57:40 | 1314929 | 16593338 | hadcm3n_8ent_1980_40_008728420_0 | 673,920 | 658,073 | 0.9765 |
30 May 2014 17:01:53 | 1314929 | 16593338 | hadcm3n_8ent_1980_40_008728420_0 | 648,000 | 634,102 | 0.9786 |
29 May 2014 23:29:59 | 1314929 | 16593338 | hadcm3n_8ent_1980_40_008728420_0 | 622,080 | 610,220 | 0.9809 |
29 May 2014 16:17:25 | 1314929 | 16593338 | hadcm3n_8ent_1980_40_008728420_0 | 596,160 | 585,786 | 0.9826 |
29 May 2014 00:34:56 | 1314929 | 16593338 | hadcm3n_8ent_1980_40_008728420_0 | 570,240 | 561,836 | 0.9853 |
28 May 2014 02:30:14 | 1314929 | 16593338 | hadcm3n_8ent_1980_40_008728420_0 | 544,320 | 537,859 | 0.9881 |
27 May 2014 17:58:03 | 1314929 | 16593338 | hadcm3n_8ent_1980_40_008728420_0 | 518,400 | 513,800 | 0.9911 |
26 May 2014 11:43:33 | 1314929 | 16593338 | hadcm3n_8ent_1980_40_008728420_0 | 492,480 | 489,932 | 0.9948 |
22 May 2014 23:03:32 | 1314929 | 16593338 | hadcm3n_8ent_1980_40_008728420_0 | 466,560 | 465,435 | 0.9976 |
22 May 2014 16:05:11 | 1314929 | 16593338 | hadcm3n_8ent_1980_40_008728420_0 | 440,640 | 441,312 | 1.0015 |
21 May 2014 23:23:14 | 1314929 | 16593338 | hadcm3n_8ent_1980_40_008728420_0 | 414,720 | 417,277 | 1.0062 |
21 May 2014 16:27:54 | 1314929 | 16593338 | hadcm3n_8ent_1980_40_008728420_0 | 388,800 | 393,057 | 1.0109 |
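
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The page does not state this, so treat it as an assumption; the values in the table are consistent with it. A minimal sketch that recomputes the column for three rows copied from the table (the function name `seconds_per_timestep` is hypothetical, not part of any CPDN or BOINC tool):

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Assumed definition: cumulative CPU seconds per model timestep."""
    return round(cpu_time_sec / timestep, 4)


# (timestep, cpu_time_sec) pairs copied from the newest row and the two oldest rows.
rows = [
    (881_280, 842_743),
    (440_640, 441_312),
    (388_800, 393_057),
]

averages = [seconds_per_timestep(cpu, ts) for ts, cpu in rows]

# Consecutive trickles in the table are 25,920 timesteps apart.
trickle_interval = 881_280 - 855_360

for (ts, cpu), avg in zip(rows, averages):
    print(f"timestep={ts:>7,}  cpu={cpu:>7,} s  avg={avg:.4f} s/TS")
print(f"timesteps between trickles: {trickle_interval:,}")
```

Under this assumption, the falling average (from 1.0109 to 0.9563 sec/TS) means the later timesteps cost slightly less CPU time each than the earlier ones did.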