Name | hadcm3n_p7if_1900_40_007227319_0 |
Workunit | 7425559 |
Created | 26 Apr 2011, 15:39:53 UTC |
Sent | 26 Apr 2011, 18:36:39 UTC |
Report deadline | 27 Jul 2011, 2:03:50 UTC |
Received | 30 May 2011, 15:34:28 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1126512 |
Run time | 20 days 22 hours 29 min 51 sec |
CPU time | 20 days 5 hours 26 min 25 sec |
Validate state | Invalid |
Credit | 11,508.48 |
Device peak FLOPS | 2.58 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.10.60</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> 10:02:10 (2076): No heartbeat from core client for 30 sec - exiting 10:02:12 (2076): No heartbeat from core client for 30 sec - exiting 10:02:13 (2076): No heartbeat from core client for 30 sec - exiting 10:02:14 (2076): No heartbeat from core client for 30 sec - exiting 10:02:15 (2076): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:10:46 (1756): No heartbeat from core client for 30 sec - exiting 18:10:47 (1756): No heartbeat from core client for 30 sec - exiting 18:10:48 (1756): No heartbeat from core client for 30 sec - exiting 18:10:49 (1756): No heartbeat from core client for 30 sec - exiting 18:10:50 (1756): No heartbeat from core client for 30 sec - exiting 18:10:51 (1756): No heartbeat from core client for 30 sec - exiting 18:10:52 (1756): No heartbeat from core client for 30 sec - exiting 18:10:53 (1756): No heartbeat from core client for 30 sec - exiting 18:10:54 (1756): No heartbeat from core client for 30 sec - exiting 18:10:55 (1756): No heartbeat from core client for 30 sec - exiting 18:10:56 (1756): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:13:55 (224): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 02:00:31 (6196): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 21:13:33 (6152): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Signal 22 received, exiting... 
Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3328, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3328, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3328, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4116, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4116, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4116, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |

Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
30 May 2011 05:37:46 | 1126512 | 12834568 | hadcm3n_p7if_1900_40_007227319_0 | 959,040 | 1,717,573 | 1.7909 |
29 May 2011 14:54:46 | 1126512 | 12834568 | hadcm3n_p7if_1900_40_007227319_0 | 933,120 | 1,666,486 | 1.7859 |
29 May 2011 02:15:51 | 1126512 | 12834568 | hadcm3n_p7if_1900_40_007227319_0 | 907,200 | 1,620,988 | 1.7868 |
28 May 2011 12:50:48 | 1126512 | 12834568 | hadcm3n_p7if_1900_40_007227319_0 | 881,280 | 1,572,954 | 1.7849 |
27 May 2011 23:56:53 | 1126512 | 12834568 | hadcm3n_p7if_1900_40_007227319_0 | 855,360 | 1,527,387 | 1.7857 |
27 May 2011 06:17:47 | 1126512 | 12834568 | hadcm3n_p7if_1900_40_007227319_0 | 829,440 | 1,481,083 | 1.7856 |
26 May 2011 12:20:46 | 1126512 | 12834568 | hadcm3n_p7if_1900_40_007227319_0 | 803,520 | 1,434,723 | 1.7855 |
25 May 2011 18:28:33 | 1126512 | 12834568 | hadcm3n_p7if_1900_40_007227319_0 | 777,600 | 1,389,103 | 1.7864 |
24 May 2011 07:58:18 | 1126512 | 12834568 | hadcm3n_p7if_1900_40_007227319_0 | 751,680 | 1,343,052 | 1.7867 |
23 May 2011 04:50:41 | 1126512 | 12834568 | hadcm3n_p7if_1900_40_007227319_0 | 725,760 | 1,296,719 | 1.7867 |
22 May 2011 07:06:24 | 1126512 | 12834568 | hadcm3n_p7if_1900_40_007227319_0 | 699,840 | 1,250,279 | 1.7865 |
21 May 2011 17:37:26 | 1126512 | 12834568 | hadcm3n_p7if_1900_40_007227319_0 | 673,920 | 1,202,596 | 1.7845 |
21 May 2011 04:02:32 | 1126512 | 12834568 | hadcm3n_p7if_1900_40_007227319_0 | 648,000 | 1,154,083 | 1.7810 |
20 May 2011 14:13:32 | 1126512 | 12834568 | hadcm3n_p7if_1900_40_007227319_0 | 622,080 | 1,105,144 | 1.7765 |
19 May 2011 23:56:04 | 1126512 | 12834568 | hadcm3n_p7if_1900_40_007227319_0 | 596,160 | 1,056,390 | 1.7720 |
19 May 2011 10:04:50 | 1126512 | 12834568 | hadcm3n_p7if_1900_40_007227319_0 | 570,240 | 1,007,202 | 1.7663 |
16 May 2011 14:00:49 | 1126512 | 12834568 | hadcm3n_p7if_1900_40_007227319_0 | 544,320 | 958,011 | 1.7600 |
14 May 2011 08:39:17 | 1126512 | 12834568 | hadcm3n_p7if_1900_40_007227319_0 | 518,400 | 910,107 | 1.7556 |
13 May 2011 07:02:59 | 1126512 | 12834568 | hadcm3n_p7if_1900_40_007227319_0 | 492,480 | 865,145 | 1.7567 |
12 May 2011 12:50:51 | 1126512 | 12834568 | hadcm3n_p7if_1900_40_007227319_0 | 466,560 | 819,748 | 1.7570 |
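
The "Average (sec/TS)" column in the trickle table appears to be the cumulative CPU time divided by the timestep reached. The sketch below checks that reading against three rows copied from the table. The function name `seconds_per_timestep` and the tolerance are illustrative assumptions, not part of the CPDN or BOINC software.

```python
# Rows copied from the trickle table: (timestep, cumulative CPU seconds, reported average).
rows = [
    (959_040, 1_717_573, 1.7909),
    (933_120, 1_666_486, 1.7859),
    (466_560, 819_748, 1.7570),
]


def seconds_per_timestep(cpu_seconds: int, timestep: int) -> float:
    """Assumed definition of 'Average (sec/TS)': cumulative CPU seconds per model timestep."""
    return cpu_seconds / timestep


for timestep, cpu_seconds, reported in rows:
    computed = seconds_per_timestep(cpu_seconds, timestep)
    # The page shows four decimals, so allow half a unit in the last displayed place.
    matches = abs(computed - reported) < 5e-5
    print(f"{timestep:>9,d}  computed={computed:.4f}  reported={reported:.4f}  match={matches}")
```

Under this assumption the computed values agree with the reported column to the four decimals shown, and consecutive trickles in the table are 25,920 timesteps apart.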