Name | hadcm3n_3ijk_1940_40_008259260_2 |
Workunit | 8414384 |
Created | 22 Jan 2013, 22:38:07 UTC |
Sent | 22 Jan 2013, 22:38:14 UTC |
Report deadline | 24 Apr 2013, 6:05:25 UTC |
Received | 28 Mar 2013, 18:58:41 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 1137520 |
Run time | 5 days 16 hours 19 min 27 sec |
CPU time | 4 days 23 hours 24 min 57 sec |
Validate state | Invalid |
Credit | 3,110.40 |
Device peak FLOPS | 2.92 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.28</core_client_version> <![CDATA[ <message> - exit code 193 (0xc1) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=816, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3064, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5588, iMonCtr=1 Model crash detected, will try to restart... 22:28:08 (3332): No heartbeat from core client for 30 sec - exiting 22:28:09 (3332): No heartbeat from core client for 30 sec - exiting 22:28:10 (3332): No heartbeat from core client for 30 sec - exiting 22:28:11 (3332): No heartbeat from core client for 30 sec - exiting 22:28:12 (3332): No heartbeat from core client for 30 sec - exiting 22:28:13 (3332): No heartbeat from core client for 30 sec - exiting 22:28:14 (3332): No heartbeat from core client for 30 sec - exiting 22:28:15 (3332): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2952, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3216, iMonCtr=1 Model crash detected, will try to restart... 
22:31:53 (636): No heartbeat from core client for 30 sec - exiting 22:31:54 (636): No heartbeat from core client for 30 sec - exiting 22:31:55 (636): No heartbeat from core client for 30 sec - exiting 22:31:56 (636): No heartbeat from core client for 30 sec - exiting 22:31:57 (636): No heartbeat from core client for 30 sec - exiting 22:31:58 (636): No heartbeat from core client for 30 sec - exiting 22:31:59 (636): No heartbeat from core client for 30 sec - exiting 22:32:00 (636): No heartbeat from core client for 30 sec - exiting 22:32:01 (636): No heartbeat from core client for 30 sec - exiting 22:32:02 (636): No heartbeat from core client for 30 sec - exiting 22:32:03 (636): No heartbeat from core client for 30 sec - exiting 22:32:04 (636): No heartbeat from core client for 30 sec - exiting 22:32:05 (636): No heartbeat from core client for 30 sec - exiting 22:32:06 (636): No heartbeat from core client for 30 sec - exiting 22:32:07 (636): No heartbeat from core client for 30 sec - exiting 22:32:08 (636): No heartbeat from core client for 30 sec - exiting 22:32:09 (636): No heartbeat from core client for 30 sec - exiting 22:32:10 (636): No heartbeat from core client for 30 sec - exiting 22:32:11 (636): No heartbeat from core client for 30 sec - exiting 22:32:12 (636): No heartbeat from core client for 30 sec - exiting 22:32:13 (636): No heartbeat from core client for 30 sec - exiting 22:32:14 (636): No heartbeat from core client for 30 sec - exiting 22:32:15 (636): No heartbeat from core client for 30 sec - exiting 22:32:16 (636): No heartbeat from core client for 30 sec - exiting 22:32:17 (636): No heartbeat from core client for 30 sec - exiting 22:32:18 (636): No heartbeat from core client for 30 sec - exiting 22:32:19 (636): No heartbeat from core client for 30 sec - exiting 22:32:20 (636): No heartbeat from core client for 30 sec - exiting 22:32:21 (636): No heartbeat from core client for 30 sec - exiting 22:32:22 (636): No heartbeat from core client for 30 sec 
- exiting 22:32:23 (636): No heartbeat from core client for 30 sec - exiting 22:32:24 (636): No heartbeat from core client for 30 sec - exiting 22:32:25 (636): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CC22:14:37 (4772): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... C10:14:26 (5888): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3852, iMonCtr=1 Model crash detected, will try to restart... C11:04:31 (684): No heartbeat from core client for 30 sec - exiting 11:04:32 (684): No heartbeat from core client for 30 sec - exiting 11:04:33 (684): No heartbeat from core client for 30 sec - exiting 11:04:34 (684): No heartbeat from core client for 30 sec - exiting 11:04:35 (684): No heartbeat from core client for 30 sec - exiting 11:04:36 (684): No heartbeat from core client for 30 sec - exiting 11:04:37 (684): No heartbeat from core client for 30 sec - exiting 11:04:38 (684): No heartbeat from core client for 30 sec - exiting 11:04:39 (684): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=860, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2500, iMonCtr=1 Model crash detected, will try to restart... 14:45:27 (2272): No heartbeat from core client for 30 sec - exiting 14:45:28 (2272): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=564, iMonCtr=1 Model crash detected, will try to restart... 
22:37:28 (4956): No heartbeat from core client for 30 sec - exiting 22:37:29 (4956): No heartbeat from core client for 30 sec - exiting 22:37:30 (4956): No heartbeat from core client for 30 sec - exiting 22:37:31 (4956): No heartbeat from core client for 30 sec - exiting 22:37:32 (4956): No heartbeat from core client for 30 sec - exiting 22:37:33 (4956): No heartbeat from core client for 30 sec - exiting 22:37:34 (4956): No heartbeat from core client for 30 sec - exiting 22:37:35 (4956): No heartbeat from core client for 30 sec - exiting 22:37:36 (4956): No heartbeat from core client for 30 sec - exiting 22:37:37 (4956): No heartbeat from core client for 30 sec - exiting 22:37:38 (4956): No heartbeat from core client for 30 sec - exiting 22:37:39 (4956): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5396, iMonCtr=1 Model crash detected, will try to restart... 11:43:17 (2996): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 16:06:06 (3348): No heartbeat from core client for 30 sec - exiting 16:06:07 (3348): No heartbeat from core client for 30 sec - exiting 16:06:08 (3348): No heartbeat from core client for 30 sec - exiting 16:06:09 (3348): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:37:19 (1756): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5832, iMonCtr=1 Model crash detected, will try to restart... 
22:40:14 (4304): No heartbeat from core client for 30 sec - exiting 22:40:15 (4304): No heartbeat from core client for 30 sec - exiting 22:40:16 (4304): No heartbeat from core client for 30 sec - exiting 22:40:17 (4304): No heartbeat from core client for 30 sec - exiting 22:40:18 (4304): No heartbeat from core client for 30 sec - exiting 22:40:19 (4304): No heartbeat from core client for 30 sec - exiting 22:40:20 (4304): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:40:21 (4304): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2900, iMonCtr=1 Model crash detected, will try to restart... 21:42:48 (5016): No heartbeat from core client for 30 sec - exiting 21:42:49 (5016): No heartbeat from core client for 30 sec - exiting 21:42:50 (5016): No heartbeat from core client for 30 sec - exiting 21:42:51 (5016): No heartbeat from core client for 30 sec - exiting 21:42:52 (5016): No heartbeat from core client for 30 sec - exiting 21:42:53 (5016): No heartbeat from core client for 30 sec - exiting 21:42:54 (5016): No heartbeat from core client for 30 sec - exiting 21:42:55 (5016): No heartbeat from core client for 30 sec - exiting 21:42:56 (5016): No heartbeat from core client for 30 sec - exiting 21:42:57 (5016): No heartbeat from core client for 30 sec - exiting 21:42:58 (5016): No heartbeat from core client for 30 sec - exiting 21:42:59 (5016): No heartbeat from core client for 30 sec - exiting 21:43:00 (5016): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=948, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1976, iMonCtr=1 Model crash detected, will try to restart... 
16:58:31 (3944): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5908, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6120, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2856, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5640, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4220, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=776, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2692, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5592, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2952, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1412, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4508, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5540, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4296, iMonCtr=1 Model crash detected, will try to restart... Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
27 Mar 2013 13:35:03 | 1137520 | 15555142 | hadcm3n_3ijk_1940_40_008259260_2 | 259,200 | 429,888 | 1.6585 |
20 Mar 2013 22:46:21 | 1137520 | 15555142 | hadcm3n_3ijk_1940_40_008259260_2 | 233,280 | 387,607 | 1.6616 |
11 Mar 2013 20:05:56 | 1137520 | 15555142 | hadcm3n_3ijk_1940_40_008259260_2 | 207,360 | 345,347 | 1.6654 |
09 Mar 2013 12:08:03 | 1137520 | 15555142 | hadcm3n_3ijk_1940_40_008259260_2 | 181,440 | 301,668 | 1.6626 |
03 Mar 2013 11:06:23 | 1137520 | 15555142 | hadcm3n_3ijk_1940_40_008259260_2 | 155,520 | 258,848 | 1.6644 |
24 Feb 2013 11:51:40 | 1137520 | 15555142 | hadcm3n_3ijk_1940_40_008259260_2 | 129,600 | 216,838 | 1.6731 |
12 Feb 2013 22:19:03 | 1137520 | 15555142 | hadcm3n_3ijk_1940_40_008259260_2 | 103,680 | 174,236 | 1.6805 |
08 Feb 2013 18:47:49 | 1137520 | 15555142 | hadcm3n_3ijk_1940_40_008259260_2 | 77,760 | 131,742 | 1.6942 |
03 Feb 2013 22:50:11 | 1137520 | 15555142 | hadcm3n_3ijk_1940_40_008259260_2 | 51,840 | 89,428 | 1.7251 |
31 Jan 2013 11:51:11 | 1137520 | 15555142 | hadcm3n_3ijk_1940_40_008259260_2 | 25,920 | 44,082 | 1.7007 |
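
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. The page does not state this; it is an assumption inferred from the numbers. A minimal Python sketch that checks it against the rows above (the names `trickles` and `seconds_per_timestep` are illustrative, not part of BOINC or CPDN):

```python
# (timestep, cpu_seconds, reported_average) copied from the trickle table above
trickles = [
    (259_200, 429_888, 1.6585),
    (233_280, 387_607, 1.6616),
    (207_360, 345_347, 1.6654),
    (181_440, 301_668, 1.6626),
    (155_520, 258_848, 1.6644),
    (129_600, 216_838, 1.6731),
    (103_680, 174_236, 1.6805),
    (77_760, 131_742, 1.6942),
    (51_840, 89_428, 1.7251),
    (25_920, 44_082, 1.7007),
]


def seconds_per_timestep(cpu_seconds, timestep):
    """Assumed formula: cumulative CPU seconds per model timestep, to 4 places."""
    return round(cpu_seconds / timestep, 4)


for timestep, cpu_seconds, reported in trickles:
    computed = seconds_per_timestep(cpu_seconds, timestep)
    # Allow for rounding differences in the last printed digit
    status = "ok" if abs(computed - reported) < 1e-4 else "MISMATCH"
    print(f"{timestep:>8,} TS  computed={computed:.4f}  reported={reported:.4f}  {status}")

# The "CPU time" field (4 days 23 hours 24 min 57 sec) converted to seconds,
# for comparison with the last trickle's cumulative CPU time of 429,888 s.
cpu_time_field_seconds = 4 * 86_400 + 23 * 3_600 + 24 * 60 + 57
print(cpu_time_field_seconds - trickles[0][1], "seconds of CPU time after the last trickle")
```

Under that assumption, every reported average matches the computed value to four decimal places, and the task's total CPU time is only a few seconds past the last trickle.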