Task 15629354

Name hadcm3n_z8qe_1920_40_008316159_0
Workunit 8467294
Created 23 Feb 2013, 19:09:39 UTC
Sent 23 Feb 2013, 19:09:42 UTC
Report deadline 26 May 2013, 2:36:53 UTC
Received 1 Jul 2013, 17:51:39 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 193 (0x000000C1) EXIT_SIGNAL
Computer ID 1094300
Run time 24 days 10 hours 40 min 15 sec
CPU time 20 days 18 hours 11 min 25 sec
Validate state Invalid
Credit 12,441.60
Device peak FLOPS 2.72 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 (windows_intelx86)
Stderr
<core_client_version>6.10.56</core_client_version>
<![CDATA[
<message>
 - exit code 193 (0xc1)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5884, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4592, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
19:19:59 (4564): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:21:55 (5124): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:41:24 (2496): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:41:25 (2496): No heartbeat from core client for 30 sec - exiting
19:42:59 (3872): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:43:00 (3872): No heartbeat from core client for 30 sec - exiting
20:06:04 (4252): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:06:05 (4252): No heartbeat from core client for 30 sec - exiting
20:23:04 (5236): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:23:05 (5236): No heartbeat from core client for 30 sec - exiting
20:30:51 (4340): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:30:52 (4340): No heartbeat from core client for 30 sec - exiting
20:54:04 (3588): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:54:05 (3588): No heartbeat from core client for 30 sec - exiting
21:17:37 (4592): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:17:38 (4592): No heartbeat from core client for 30 sec - exiting
22:10:34 (4128): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
22:10:35 (4128): No heartbeat from core client for 30 sec - exiting
22:36:12 (4004): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
22:36:13 (4004): No heartbeat from core client for 30 sec - exiting
23:03:56 (1392): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:03:57 (1392): No heartbeat from core client for 30 sec - exiting
23:29:39 (5680): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:29:40 (5680): No heartbeat from core client for 30 sec - exiting
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5696, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5696, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5696, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5696, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5696, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5696, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
19:14:09 (5520): No heartbeat from core client for 30 sec - exiting
19:14:10 (5520): No heartbeat from core client for 30 sec - exiting
19:14:11 (5520): No heartbeat from core client for 30 sec - exiting
19:14:12 (5520): No heartbeat from core client for 30 sec - exiting
19:14:13 (5520): No heartbeat from core client for 30 sec - exiting
19:14:14 (5520): No heartbeat from core client for 30 sec - exiting
19:14:15 (5520): No heartbeat from core client for 30 sec - exiting
19:14:16 (5520): No heartbeat from core client for 30 sec - exiting
19:14:17 (5520): No heartbeat from core client for 30 sec - exiting
19:14:18 (5520): No heartbeat from core client for 30 sec - exiting
19:14:19 (5520): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:16:33 (5332): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:17:26 (5964): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:18:40 (4856): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5624, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6080, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6080, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5928, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5928, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1824, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1824, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5852, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5852, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5852, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
19:45:04 (5612): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6084, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
20:07:17 (5320): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
13:36:13 (6116): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5432, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5556, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5116, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5116, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5504, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5504, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5868, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4860, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3932, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3932, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5928, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5928, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4660, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4680, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5576, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5576, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5576, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2884, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2884, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x77736FCF read attempt to address 0x407D4797

Engaging BOINC Windows Runtime Debugger...



Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x77356FCF read attempt to address 0x407D4797

Engaging BOINC Windows Runtime Debugger...

Cannot serialize file I:\BOINC\Data_dir/projects/climateprediction.net/hadcm3n_z8qe_1920_40_008316159/dataout/shmem_restart.day
Signal 11 received, exiting...
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
24 Jun 2013 09:20:10 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 1,036,800 | 1,793,400 | 1.7297
20 Jun 2013 21:01:09 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 1,010,880 | 1,752,719 | 1.7339
19 Jun 2013 22:02:22 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 984,960 | 1,713,147 | 1.7393
16 Jun 2013 20:23:06 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 959,040 | 1,672,599 | 1.7440
15 Jun 2013 11:56:02 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 933,120 | 1,632,686 | 1.7497
11 Jun 2013 18:44:39 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 907,200 | 1,590,539 | 1.7532
08 Jun 2013 12:10:00 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 881,280 | 1,551,000 | 1.7599
05 Jun 2013 17:19:38 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 855,360 | 1,511,029 | 1.7665
02 Jun 2013 18:50:46 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 829,440 | 1,470,839 | 1.7733
01 Jun 2013 20:28:40 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 803,520 | 1,431,314 | 1.7813
30 May 2013 17:43:30 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 777,600 | 1,392,011 | 1.7901
27 May 2013 18:50:59 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 751,680 | 1,350,992 | 1.7973
24 May 2013 16:00:40 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 725,760 | 1,310,198 | 1.8053
20 May 2013 19:20:46 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 699,840 | 1,270,706 | 1.8157
18 May 2013 20:32:03 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 673,920 | 1,230,907 | 1.8265
18 May 2013 09:18:48 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 648,000 | 1,191,129 | 1.8382
12 May 2013 10:51:44 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 622,080 | 1,150,763 | 1.8499
11 May 2013 11:39:07 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 596,160 | 1,111,145 | 1.8638
08 May 2013 18:10:09 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 570,240 | 1,072,157 | 1.8802
05 May 2013 16:36:02 | 1094300 | 15629354 | hadcm3n_z8qe_1920_40_008316159_0 | 544,320 | 1,032,276 | 1.8965
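
The Average (sec/TS) column in the trickle table above appears to be the cumulative CPU time divided by the cumulative timestep count. The following is a minimal sketch that checks this reading against a few rows. The function names `parse_trickle_row` and `avg_sec_per_timestep` are hypothetical helpers written for this illustration and are not part of BOINC or the CPDN software. The parser assumes the last three numeric fields of a row are the timestep, the CPU time in seconds, and the reported average.

```python
def parse_trickle_row(row: str) -> dict:
    """Extract the three trailing numeric fields from one trickle row.

    Assumption: the row ends with timestep, CPU time (sec), and the
    reported average, optionally separated by '|' characters.
    """
    parts = row.replace("|", " ").split()
    return {
        "timestep": int(parts[-3].replace(",", "")),
        "cpu_sec": int(parts[-2].replace(",", "")),
        "reported_avg": float(parts[-1]),
    }


def avg_sec_per_timestep(timestep: int, cpu_sec: int) -> float:
    """Cumulative CPU seconds per model timestep."""
    return cpu_sec / timestep


# Three rows copied from the table (newest, second newest, oldest).
rows = [
    "24 Jun 2013 09:20:10 1094300 15629354 hadcm3n_z8qe_1920_40_008316159_0 1,036,800 1,793,400 1.7297",
    "20 Jun 2013 21:01:09 1094300 15629354 hadcm3n_z8qe_1920_40_008316159_0 1,010,880 1,752,719 1.7339",
    "05 May 2013 16:36:02 1094300 15629354 hadcm3n_z8qe_1920_40_008316159_0 544,320 1,032,276 1.8965",
]

for row in rows:
    rec = parse_trickle_row(row)
    computed = avg_sec_per_timestep(rec["timestep"], rec["cpu_sec"])
    # Compare the recomputed value with the one the page reports.
    print(f"{rec['timestep']:>9} steps: computed {computed:.4f}, reported {rec['reported_avg']:.4f}")
```

For the rows checked here, the recomputed value agrees with the reported one to four decimal places, which is consistent with the column being a running average rather than a per-trickle rate.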


©2024 climateprediction.net