Name | hadcm3n_y8aw_1980_40_007956221_2 |
Workunit | 8111333 |
Created | 10 May 2012, 16:10:34 UTC |
Sent | 10 May 2012, 16:21:49 UTC |
Report deadline | 9 Aug 2012, 23:49:00 UTC |
Received | 1 Jun 2012, 6:53:49 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 1215127 |
Run time | 18 days 13 hours 4 min 19 sec |
CPU time | 15 days 14 hours 44 min 13 sec |
Validate state | Invalid |
Credit | 12,441.60 |
Device peak FLOPS | 3.11 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.25</core_client_version>
<![CDATA[
<message>
- exit code 193 (0xc1)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
06:52:53 (3204): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:26:49 (7344): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Atmos Hold Restart file rename failed on atmos_restart.hold
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6740, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4216, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3096, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4236, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
17:51:53 (6616): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:51:54 (6616): No heartbeat from core client for 30 sec - exiting
17:51:55 (6616): No heartbeat from core client for 30 sec - exiting
17:51:56 (6616): No heartbeat from core client for 30 sec - exiting
17:51:57 (6616): No heartbeat from core client for 30 sec - exiting
17:51:58 (6616): No heartbeat from core client for 30 sec - exiting
17:51:59 (6616): No heartbeat from core client for 30 sec - exiting
17:52:01 (6616): No heartbeat from core client for 30 sec - exiting
17:52:02 (6616): No heartbeat from core client for 30 sec - exiting
17:52:03 (6616): No heartbeat from core client for 30 sec - exiting
17:52:04 (6616): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6792, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7892, iMonCtr=1
Model crash detected, will try to restart...
16:22:01 (5732): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:46:31 (8104): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:46:32 (8104): No heartbeat from core client for 30 sec - exiting
16:46:33 (8104): No heartbeat from core client for 30 sec - exiting
16:46:34 (8104): No heartbeat from core client for 30 sec - exiting
16:46:35 (8104): No heartbeat from core client for 30 sec - exiting
16:46:36 (8104): No heartbeat from core client for 30 sec - exiting
16:46:37 (8104): No heartbeat from core client for 30 sec - exiting
16:46:38 (8104): No heartbeat from core client for 30 sec - exiting
16:46:39 (8104): No heartbeat from core client for 30 sec - exiting
16:46:40 (8104): No heartbeat from core client for 30 sec - exiting
16:46:41 (8104): No heartbeat from core client for 30 sec - exiting
08:05:36 (4224): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4324, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1692, iMonCtr=1
Model crash detected, will try to restart...
08:46:00 (4896): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:45:27 (8348): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
11:24:48 (4028): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
11:26:16 (6492): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
11:26:17 (6492): No heartbeat from core client for 30 sec - exiting
11:26:18 (6492): No heartbeat from core client for 30 sec - exiting
11:26:19 (6492): No heartbeat from core client for 30 sec - exiting
11:26:20 (6492): No heartbeat from core client for 30 sec - exiting
11:26:22 (6492): No heartbeat from core client for 30 sec - exiting
11:26:23 (6492): No heartbeat from core client for 30 sec - exiting
11:26:24 (6492): No heartbeat from core client for 30 sec - exiting
11:26:25 (6492): No heartbeat from core client for 30 sec - exiting
11:26:26 (6492): No heartbeat from core client for 30 sec - exiting
11:26:27 (6492): No heartbeat from core client for 30 sec - exiting
16:36:00 (6792): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:16:03 (3440): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:11:19 (1712): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:11:21 (1712): No heartbeat from core client for 30 sec - exiting
17:12:34 (4040): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:12:35 (4040): No heartbeat from core client for 30 sec - exiting
17:12:36 (4040): No heartbeat from core client for 30 sec - exiting
17:12:38 (4040): No heartbeat from core client for 30 sec - exiting
17:12:39 (4040): No heartbeat from core client for 30 sec - exiting
17:12:40 (4040): No heartbeat from core client for 30 sec - exiting
17:12:41 (4040): No heartbeat from core client for 30 sec - exiting
17:12:42 (4040): No heartbeat from core client for 30 sec - exiting
17:12:43 (4040): No heartbeat from core client for 30 sec - exiting
17:12:44 (4040): No heartbeat from core client for 30 sec - exiting
17:12:45 (4040): No heartbeat from core client for 30 sec - exiting
forrtl: Access is denied.
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2444, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2444, iMonCtr=1
Model crash detected, will try to restart...
11:08:58 (428): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
15:37:59 (9904): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:38:00 (9904): No heartbeat from core client for 30 sec - exiting
15:38:01 (9904): No heartbeat from core client for 30 sec - exiting
15:38:02 (9904): No heartbeat from core client for 30 sec - exiting
15:38:03 (9904): No heartbeat from core client for 30 sec - exiting
15:38:04 (9904): No heartbeat from core client for 30 sec - exiting
15:38:05 (9904): No heartbeat from core client for 30 sec - exiting
15:38:06 (9904): No heartbeat from core client for 30 sec - exiting
15:38:07 (9904): No heartbeat from core client for 30 sec - exiting
15:38:08 (9904): No heartbeat from core client for 30 sec - exiting
15:38:09 (9904): No heartbeat from core client for 30 sec - exiting
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
01 Jun 2012 04:05:59 | 1215127 | 14655424 | hadcm3n_y8aw_1980_40_007956221_2 | 1,036,800 | 1,349,050 | 1.3012 |
31 May 2012 18:26:03 | 1215127 | 14655424 | hadcm3n_y8aw_1980_40_007956221_2 | 1,010,880 | 1,315,565 | 1.3014 |
30 May 2012 17:14:14 | 1215127 | 14655424 | hadcm3n_y8aw_1980_40_007956221_2 | 984,960 | 1,281,598 | 1.3012 |
30 May 2012 04:38:03 | 1215127 | 14655424 | hadcm3n_y8aw_1980_40_007956221_2 | 959,040 | 1,248,040 | 1.3013 |
29 May 2012 18:08:15 | 1215127 | 14655424 | hadcm3n_y8aw_1980_40_007956221_2 | 933,120 | 1,214,920 | 1.3020 |
29 May 2012 11:43:17 | 1215127 | 14655424 | hadcm3n_y8aw_1980_40_007956221_2 | 907,200 | 1,181,349 | 1.3022 |
29 May 2012 11:43:17 | 1215127 | 14655424 | hadcm3n_y8aw_1980_40_007956221_2 | 881,280 | 1,147,722 | 1.3023 |
29 May 2012 11:43:17 | 1215127 | 14655424 | hadcm3n_y8aw_1980_40_007956221_2 | 855,360 | 1,113,921 | 1.3023 |
29 May 2012 11:43:17 | 1215127 | 14655424 | hadcm3n_y8aw_1980_40_007956221_2 | 829,440 | 1,079,998 | 1.3021 |
29 May 2012 11:43:17 | 1215127 | 14655424 | hadcm3n_y8aw_1980_40_007956221_2 | 803,520 | 1,046,413 | 1.3023 |
29 May 2012 11:43:17 | 1215127 | 14655424 | hadcm3n_y8aw_1980_40_007956221_2 | 777,600 | 1,012,499 | 1.3021 |
29 May 2012 11:43:17 | 1215127 | 14655424 | hadcm3n_y8aw_1980_40_007956221_2 | 751,680 | 978,684 | 1.3020 |
29 May 2012 11:43:17 | 1215127 | 14655424 | hadcm3n_y8aw_1980_40_007956221_2 | 725,760 | 944,728 | 1.3017 |
25 May 2012 19:57:51 | 1215127 | 14655424 | hadcm3n_y8aw_1980_40_007956221_2 | 699,840 | 910,682 | 1.3013 |
25 May 2012 05:12:53 | 1215127 | 14655424 | hadcm3n_y8aw_1980_40_007956221_2 | 673,920 | 876,796 | 1.3010 |
24 May 2012 20:56:09 | 1215127 | 14655424 | hadcm3n_y8aw_1980_40_007956221_2 | 648,000 | 842,678 | 1.3004 |
24 May 2012 04:59:39 | 1215127 | 14655424 | hadcm3n_y8aw_1980_40_007956221_2 | 622,080 | 808,640 | 1.2999 |
23 May 2012 20:52:53 | 1215127 | 14655424 | hadcm3n_y8aw_1980_40_007956221_2 | 596,160 | 774,475 | 1.2991 |
23 May 2012 06:59:30 | 1215127 | 14655424 | hadcm3n_y8aw_1980_40_007956221_2 | 570,240 | 740,556 | 1.2987 |
22 May 2012 21:01:50 | 1215127 | 14655424 | hadcm3n_y8aw_1980_40_007956221_2 | 544,320 | 706,485 | 1.2979 |
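
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below checks that reading against the three most recent rows of the table. The function name `seconds_per_timestep` and the comparison tolerance are illustrative assumptions, not part of the CPDN or BOINC software.

```python
# Rows copied from the trickle table above:
# (timestep, cumulative CPU time in seconds, reported average in sec/TS).
trickles = [
    (1_036_800, 1_349_050, 1.3012),
    (1_010_880, 1_315_565, 1.3014),
    (984_960, 1_281_598, 1.3012),
]


def seconds_per_timestep(cpu_seconds, timestep):
    """Hypothetical helper: cumulative CPU seconds per model timestep."""
    return cpu_seconds / timestep


def matches_reported(cpu_seconds, timestep, reported, tolerance=1e-4):
    """True when the computed average agrees with the table to ~4 decimals."""
    return abs(seconds_per_timestep(cpu_seconds, timestep) - reported) < tolerance


for timestep, cpu_seconds, reported in trickles:
    computed = seconds_per_timestep(cpu_seconds, timestep)
    status = "ok" if matches_reported(cpu_seconds, timestep, reported) else "mismatch"
    print(f"timestep {timestep:>9,}: computed {computed:.4f}, reported {reported:.4f} ({status})")
```

Under this assumption, each row's reported average agrees with the computed value to four decimal places.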
©2024 cpdn.org