Name | hadcm3n_z8qe_1920_40_008316159_0 |
Workunit | 8467294 |
Created | 23 Feb 2013, 19:09:39 UTC |
Sent | 23 Feb 2013, 19:09:42 UTC |
Report deadline | 26 May 2013, 2:36:53 UTC |
Received | 1 Jul 2013, 17:51:39 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 1094300 |
Run time | 24 days 10 hours 40 min 15 sec |
CPU time | 20 days 18 hours 11 min 25 sec |
Validate state | Invalid |
Credit | 12,441.60 |
Device peak FLOPS | 2.72 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.10.56</core_client_version> <![CDATA[ <message> - exit code 193 (0xc1) </message> <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5884, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4592, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 19:19:59 (4564): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:21:55 (5124): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:41:24 (2496): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:41:25 (2496): No heartbeat from core client for 30 sec - exiting 19:42:59 (3872): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:43:00 (3872): No heartbeat from core client for 30 sec - exiting 20:06:04 (4252): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:06:05 (4252): No heartbeat from core client for 30 sec - exiting 20:23:04 (5236): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:23:05 (5236): No heartbeat from core client for 30 sec - exiting 20:30:51 (4340): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:30:52 (4340): No heartbeat from core client for 30 sec - exiting 20:54:04 (3588): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
20:54:05 (3588): No heartbeat from core client for 30 sec - exiting 21:17:37 (4592): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:17:38 (4592): No heartbeat from core client for 30 sec - exiting 22:10:34 (4128): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:10:35 (4128): No heartbeat from core client for 30 sec - exiting 22:36:12 (4004): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:36:13 (4004): No heartbeat from core client for 30 sec - exiting 23:03:56 (1392): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:03:57 (1392): No heartbeat from core client for 30 sec - exiting 23:29:39 (5680): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:29:40 (5680): No heartbeat from core client for 30 sec - exiting Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5696, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5696, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5696, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5696, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5696, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... 
Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5696, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish 19:14:09 (5520): No heartbeat from core client for 30 sec - exiting 19:14:10 (5520): No heartbeat from core client for 30 sec - exiting 19:14:11 (5520): No heartbeat from core client for 30 sec - exiting 19:14:12 (5520): No heartbeat from core client for 30 sec - exiting 19:14:13 (5520): No heartbeat from core client for 30 sec - exiting 19:14:14 (5520): No heartbeat from core client for 30 sec - exiting 19:14:15 (5520): No heartbeat from core client for 30 sec - exiting 19:14:16 (5520): No heartbeat from core client for 30 sec - exiting 19:14:17 (5520): No heartbeat from core client for 30 sec - exiting 19:14:18 (5520): No heartbeat from core client for 30 sec - exiting 19:14:19 (5520): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:16:33 (5332): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:17:26 (5964): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:18:40 (4856): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5624, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6080, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6080, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5928, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5928, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1824, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1824, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5852, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5852, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5852, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 19:45:04 (5612): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6084, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 
20:07:17 (5320): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 13:36:13 (6116): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5432, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5556, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5116, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5116, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5504, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5504, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5868, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4860, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3932, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3932, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5928, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5928, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4660, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4680, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5576, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5576, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5576, iMonCtr=1 Model crash detected, will try to restart... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2884, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2884, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x77736FCF read attempt to address 0x407D4797 Engaging BOINC Windows Runtime Debugger... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x77356FCF read attempt to address 0x407D4797 Engaging BOINC Windows Runtime Debugger... Cannot serialize file I:\BOINC\Data_dir/projects/climateprediction.net/hadcm3n_z8qe_1920_40_008316159/dataout/shmem_restart.day Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]> |
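
The stderr text above repeats a small set of messages many times, so a tally by message type is easier to read than the raw log. The following is a minimal sketch, assuming the log is available as a plain string; the category names and the `tally_stderr` helper are hypothetical and not part of BOINC or CPDN.

```python
import re
from collections import Counter

# Message fragments taken from the stderr text above. The category labels
# are hypothetical names chosen for this sketch.
PATTERNS = {
    "no_heartbeat": re.compile(r"No heartbeat from core client"),
    "model_crash": re.compile(r"Model crash detected"),
    "quit_request": re.compile(r"Quit request from BOINC"),
    "suspend_request": re.compile(r"Suspend request from BOINC"),
    "access_violation": re.compile(r"Access Violation \(0xc0000005\)"),
    "signal": re.compile(r"Signal \d+ received"),
}


def tally_stderr(text: str) -> Counter:
    """Count non-overlapping occurrences of each known message in text."""
    return Counter({name: len(p.findall(text)) for name, p in PATTERNS.items()})


# A few messages copied from the stderr text above, joined into one string.
sample = (
    "Model crash detected, will try to restart... "
    "Suspended CPDN Monitor - Suspend request from BOINC... "
    "CPDN Monitor - Quit request from BOINC... "
    "CPDN Monitor - Quit request from BOINC... "
    "19:19:59 (4564): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "Signal 11 received, exiting..."
)

counts = tally_stderr(sample)
for name, n in counts.items():
    print(f"{name}: {n}")
```

Running `tally_stderr` on the full stderr field instead of `sample` would give per-message totals for the whole task.
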
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
24 Jun 2013 09:20:10 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 1,036,800 | 1,793,400 | 1.7297 |
20 Jun 2013 21:01:09 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 1,010,880 | 1,752,719 | 1.7339 |
19 Jun 2013 22:02:22 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 984,960 | 1,713,147 | 1.7393 |
16 Jun 2013 20:23:06 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 959,040 | 1,672,599 | 1.7440 |
15 Jun 2013 11:56:02 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 933,120 | 1,632,686 | 1.7497 |
11 Jun 2013 18:44:39 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 907,200 | 1,590,539 | 1.7532 |
08 Jun 2013 12:10:00 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 881,280 | 1,551,000 | 1.7599 |
05 Jun 2013 17:19:38 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 855,360 | 1,511,029 | 1.7665 |
02 Jun 2013 18:50:46 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 829,440 | 1,470,839 | 1.7733 |
01 Jun 2013 20:28:40 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 803,520 | 1,431,314 | 1.7813 |
30 May 2013 17:43:30 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 777,600 | 1,392,011 | 1.7901 |
27 May 2013 18:50:59 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 751,680 | 1,350,992 | 1.7973 |
24 May 2013 16:00:40 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 725,760 | 1,310,198 | 1.8053 |
20 May 2013 19:20:46 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 699,840 | 1,270,706 | 1.8157 |
18 May 2013 20:32:03 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 673,920 | 1,230,907 | 1.8265 |
18 May 2013 09:18:48 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 648,000 | 1,191,129 | 1.8382 |
12 May 2013 10:51:44 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 622,080 | 1,150,763 | 1.8499 |
11 May 2013 11:39:07 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 596,160 | 1,111,145 | 1.8638 |
08 May 2013 18:10:09 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 570,240 | 1,072,157 | 1.8802 |
05 May 2013 16:36:02 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 544,320 | 1,032,276 | 1.8965 |
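
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that reading against three rows copied from the table; the helper names `parse_count` and `seconds_per_timestep` are hypothetical.

```python
def parse_count(field: str) -> int:
    """Parse a table field such as '1,036,800' into an integer."""
    return int(field.replace(",", ""))


def seconds_per_timestep(cpu_time_sec: str, timestep: str) -> float:
    """Return CPU seconds per model timestep for one trickle row."""
    return parse_count(cpu_time_sec) / parse_count(timestep)


# (Timestep, CPU Time (sec), Average (sec/TS)) copied from three rows above.
rows = [
    ("1,036,800", "1,793,400", "1.7297"),
    ("1,010,880", "1,752,719", "1.7339"),
    ("881,280", "1,551,000", "1.7599"),
]

for timestep, cpu_time, listed in rows:
    computed = seconds_per_timestep(cpu_time, timestep)
    # Format to four decimals, matching the precision shown in the table.
    print(f"timestep={timestep:>9}  computed={computed:.4f}  listed={listed}")
```

For these rows the computed value, rounded to four decimals, matches the listed average.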