Name | hadcm3n_zdv8_1920_40_008280966_1 |
Workunit | 8432101 |
Created | 3 Apr 2013, 6:48:20 UTC |
Sent | 3 Apr 2013, 6:48:33 UTC |
Report deadline | 3 Jul 2013, 14:15:44 UTC |
Received | 9 Jul 2013, 9:34:35 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 1195149 |
Run time | 18 days 21 hours 3 min 20 sec |
CPU time | 17 days 20 hours 53 min 7 sec |
Validate state | Invalid |
Credit | 12,441.60 |
Device peak FLOPS | 2.98 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 193 (0xc1)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4768, iMonCtr=1
Model crash detected, will try to restart...
09:33:37 (3964): No heartbeat from core client for 30 sec - exiting
09:33:38 (3964): No heartbeat from core client for 30 sec - exiting
09:33:39 (3964): No heartbeat from core client for 30 sec - exiting
09:33:40 (3964): No heartbeat from core client for 30 sec - exiting
09:33:41 (3964): No heartbeat from core client for 30 sec - exiting
09:33:43 (3964): No heartbeat from core client for 30 sec - exiting
09:33:44 (3964): No heartbeat from core client for 30 sec - exiting
09:33:45 (3964): No heartbeat from core client for 30 sec - exiting
09:33:46 (3964): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4808, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4748, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4748, iMonCtr=1
Model crash detected, will try to restart...
08:33:39 (3104): No heartbeat from core client for 30 sec - exiting
08:33:40 (3104): No heartbeat from core client for 30 sec - exiting
08:33:41 (3104): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:16:15 (4544): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3608, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3608, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3608, iMonCtr=1
Model crash detected, will try to restart...
08:23:43 (2408): No heartbeat from core client for 30 sec - exiting
08:23:45 (2408): No heartbeat from core client for 30 sec - exiting
08:23:46 (2408): No heartbeat from core client for 30 sec - exiting
08:23:47 (2408): No heartbeat from core client for 30 sec - exiting
08:23:48 (2408): No heartbeat from core client for 30 sec - exiting
08:23:49 (2408): No heartbeat from core client for 30 sec - exiting
08:23:50 (2408): No heartbeat from core client for 30 sec - exiting
08:23:51 (2408): No heartbeat from core client for 30 sec - exiting
08:23:52 (2408): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:34:08 (948): No heartbeat from core client for 30 sec - exiting
08:34:09 (948): No heartbeat from core client for 30 sec - exiting
08:34:10 (948): No heartbeat from core client for 30 sec - exiting
08:34:11 (948): No heartbeat from core client for 30 sec - exiting
08:34:12 (948): No heartbeat from core client for 30 sec - exiting
08:34:13 (948): No heartbeat from core client for 30 sec - exiting
08:34:14 (948): No heartbeat from core client for 30 sec - exiting
08:34:15 (948): No heartbeat from core client for 30 sec - exiting
08:34:16 (948): No heartbeat from core client for 30 sec - exiting
08:34:17 (948): No heartbeat from core client for 30 sec - exiting
08:34:18 (948): No heartbeat from core client for 30 sec - exiting
08:34:19 (948): No heartbeat from core client for 30 sec - exiting
08:34:20 (948): No heartbeat from core client for 30 sec - exiting
08:34:21 (948): No heartbeat from core client for 30 sec - exiting
08:34:22 (948): No heartbeat from core client for 30 sec - exiting
08:34:23 (948): No heartbeat from core client for 30 sec - exiting
08:34:24 (948): No heartbeat from core client for 30 sec - exiting
08:34:25 (948): No heartbeat from core client for 30 sec - exiting
08:34:26 (948): No heartbeat from core client for 30 sec - exiting
08:34:27 (948): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2480, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5064, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5064, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5064, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5064, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5064, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4424, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4424, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
17:13:57 (6004): Can't acquire lockfile (32) - waiting 35s
17:14:14 (1908): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6004, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
16:18:28 (4684): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:18:09 (4900): No heartbeat from core client for 30 sec - exiting
16:18:10 (4900): No heartbeat from core client for 30 sec - exiting
16:18:11 (4900): No heartbeat from core client for 30 sec - exiting
16:18:12 (4900): No heartbeat from core client for 30 sec - exiting
16:18:13 (4900): No heartbeat from core client for 30 sec - exiting
16:18:14 (4900): No heartbeat from core client for 30 sec - exiting
16:18:15 (4900): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1864, iMonCtr=1
Model crash detected, will try to restart...
08:21:27 (4684): No heartbeat from core client for 30 sec - exiting
08:21:28 (4684): No heartbeat from core client for 30 sec - exiting
08:21:29 (4684): No heartbeat from core client for 30 sec - exiting
08:21:30 (4684): No heartbeat from core client for 30 sec - exiting
08:21:31 (4684): No heartbeat from core client for 30 sec - exiting
08:21:32 (4684): No heartbeat from core client for 30 sec - exiting
08:21:33 (4684): No heartbeat from core client for 30 sec - exiting
08:21:35 (4684): No heartbeat from core client for 30 sec - exiting
08:21:36 (4684): No heartbeat from core client for 30 sec - exiting
08:21:37 (4684): No heartbeat from core client for 30 sec - exiting
08:21:38 (4684): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
09 Jul 2013 08:36:59 | 1195149 | 15700565 | hadcm3n_zdv8_1920_40_008280966_1 | 1,036,800 | 1,543,944 | 1.4891 |
08 Jul 2013 14:28:50 | 1195149 | 15700565 | hadcm3n_zdv8_1920_40_008280966_1 | 1,010,880 | 1,512,984 | 1.4967 |
04 Jul 2013 14:25:31 | 1195149 | 15700565 | hadcm3n_zdv8_1920_40_008280966_1 | 984,960 | 1,472,986 | 1.4955 |
27 Jun 2013 08:33:52 | 1195149 | 15700565 | hadcm3n_zdv8_1920_40_008280966_1 | 959,040 | 1,432,259 | 1.4934 |
25 Jun 2013 14:03:58 | 1195149 | 15700565 | hadcm3n_zdv8_1920_40_008280966_1 | 933,120 | 1,391,789 | 1.4915 |
24 Jun 2013 10:15:42 | 1195149 | 15700565 | hadcm3n_zdv8_1920_40_008280966_1 | 907,200 | 1,350,418 | 1.4886 |
20 Jun 2013 07:17:19 | 1195149 | 15700565 | hadcm3n_zdv8_1920_40_008280966_1 | 881,280 | 1,316,275 | 1.4936 |
19 Jun 2013 09:06:36 | 1195149 | 15700565 | hadcm3n_zdv8_1920_40_008280966_1 | 855,360 | 1,289,993 | 1.5081 |
18 Jun 2013 10:45:23 | 1195149 | 15700565 | hadcm3n_zdv8_1920_40_008280966_1 | 829,440 | 1,263,219 | 1.5230 |
17 Jun 2013 12:22:25 | 1195149 | 15700565 | hadcm3n_zdv8_1920_40_008280966_1 | 803,520 | 1,236,657 | 1.5390 |
14 Jun 2013 12:39:56 | 1195149 | 15700565 | hadcm3n_zdv8_1920_40_008280966_1 | 777,600 | 1,210,243 | 1.5564 |
13 Jun 2013 14:26:04 | 1195149 | 15700565 | hadcm3n_zdv8_1920_40_008280966_1 | 751,680 | 1,184,554 | 1.5759 |
13 Jun 2013 07:08:12 | 1195149 | 15700565 | hadcm3n_zdv8_1920_40_008280966_1 | 725,760 | 1,158,342 | 1.5960 |
12 Jun 2013 07:55:40 | 1195149 | 15700565 | hadcm3n_zdv8_1920_40_008280966_1 | 699,840 | 1,129,790 | 1.6144 |
11 Jun 2013 06:22:15 | 1195149 | 15700565 | hadcm3n_zdv8_1920_40_008280966_1 | 673,920 | 1,095,078 | 1.6249 |
06 Jun 2013 10:42:05 | 1195149 | 15700565 | hadcm3n_zdv8_1920_40_008280966_1 | 648,000 | 1,054,500 | 1.6273 |
05 Jun 2013 07:39:51 | 1195149 | 15700565 | hadcm3n_zdv8_1920_40_008280966_1 | 622,080 | 1,014,327 | 1.6305 |
03 Jun 2013 13:22:30 | 1195149 | 15700565 | hadcm3n_zdv8_1920_40_008280966_1 | 596,160 | 974,776 | 1.6351 |
30 May 2013 15:13:10 | 1195149 | 15700565 | hadcm3n_zdv8_1920_40_008280966_1 | 570,240 | 935,008 | 1.6397 |
29 May 2013 12:24:12 | 1195149 | 15700565 | hadcm3n_zdv8_1920_40_008280966_1 | 544,320 | 894,861 | 1.6440 |
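
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count. The page does not state this formula, so treat it as an assumption. The sketch below recomputes the value for three rows copied from the table; the helper names `parse_int` and `average_sec_per_timestep` are illustrative and not part of any BOINC or CPDN tool.

```python
# Recompute "Average (sec/TS)" from two other columns of a trickle row.
# Assumption (not stated on the page): Average = CPU Time (sec) / Timestep.


def parse_int(text: str) -> int:
    """Parse an integer written with commas as thousands separators."""
    return int(text.replace(",", ""))


def average_sec_per_timestep(cpu_time_sec: str, timestep: str) -> float:
    """Return CPU seconds per model timestep, rounded to 4 decimal places."""
    return round(parse_int(cpu_time_sec) / parse_int(timestep), 4)


# Rows copied from the trickle table: (Timestep, CPU Time (sec), listed Average).
rows = [
    ("1,036,800", "1,543,944", 1.4891),
    ("1,010,880", "1,512,984", 1.4967),
    ("544,320", "894,861", 1.6440),
]

for timestep, cpu_time, listed in rows:
    computed = average_sec_per_timestep(cpu_time, timestep)
    status = "matches" if abs(computed - listed) < 1e-4 else "differs"
    print(f"timestep {timestep}: computed {computed:.4f}, listed {listed:.4f} ({status})")
```

If the assumption holds, the falling average (from about 1.64 to about 1.49 sec/TS) means the later timesteps ran faster per step than the earlier ones on this host.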
©2024 cpdn.org