Task 15899015

Name hadcm3n_o0ti_1940_40_008383185_2
Workunit 8534044
Created 21 Jul 2013, 17:14:59 UTC
Sent 21 Jul 2013, 17:15:24 UTC
Report deadline 21 Oct 2013, 0:42:35 UTC
Received 9 Nov 2013, 8:41:09 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status -1073741819 (0xC0000005) STATUS_ACCESS_VIOLATION
Computer ID 1279396
Run time 27 days 4 hours 36 min 15 sec
CPU time 26 days 7 hours 44 min 54 sec
Validate state Invalid
Credit 11,819.52
Device peak FLOPS 2.59 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07
windows_intelx86
Stderr
<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3784, iMonCtr=1
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3576, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3512, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3440, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4816, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2828, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4064, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3904, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3896, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3572, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3572, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3572, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3572, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3572, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3572, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
21:56:00 (2932): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3280, iMonCtr=1
Model crash detected, will try to restart...


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x779AFF6B write attempt to address 0x4334478F

Engaging BOINC Windows Runtime Debugger...



Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x76F471B8 read attempt to address 0xFFFFFFF8

Engaging BOINC Windows Runtime Debugger...



Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x77D63AC3 read attempt to address 0x00000000

Engaging BOINC Windows Runtime Debugger...


</stderr_txt>
]]>
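The exit status is reported above in two forms, -1073741819 and 0xC0000005. These are the same 32-bit value read as a signed and as an unsigned integer. A minimal sketch of the conversion follows; the helper name `to_ntstatus_hex` is hypothetical and not part of BOINC or the CPDN application.

```python
def to_ntstatus_hex(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as unsigned and format it as hex.

    Hypothetical helper for illustration only.
    """
    # Masking with 0xFFFFFFFF maps a negative value onto its
    # two's-complement unsigned 32-bit equivalent.
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


print(to_ntstatus_hex(-1073741819))
```

On Windows, 0xC0000005 is STATUS_ACCESS_VIOLATION, which matches the "Access Violation" entries in the exception records in the stderr output.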
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
02 Oct 2013 01:45:37 1279396 15899015 hadcm3n_o0ti_1940_40_008383185_2 984,960 1,267,515 1.2869
01 Oct 2013 16:14:43 1279396 15899015 hadcm3n_o0ti_1940_40_008383185_2 959,040 1,234,490 1.2872
30 Sep 2013 22:36:08 1279396 15899015 hadcm3n_o0ti_1940_40_008383185_2 933,120 1,201,849 1.2880
30 Sep 2013 13:13:03 1279396 15899015 hadcm3n_o0ti_1940_40_008383185_2 907,200 1,169,048 1.2886
29 Sep 2013 11:19:43 1279396 15899015 hadcm3n_o0ti_1940_40_008383185_2 881,280 1,135,865 1.2889
27 Sep 2013 14:09:45 1279396 15899015 hadcm3n_o0ti_1940_40_008383185_2 855,360 1,102,582 1.2890
26 Sep 2013 10:32:51 1279396 15899015 hadcm3n_o0ti_1940_40_008383185_2 829,440 1,069,743 1.2897
25 Sep 2013 17:25:52 1279396 15899015 hadcm3n_o0ti_1940_40_008383185_2 803,520 1,036,914 1.2905
23 Sep 2013 23:19:13 1279396 15899015 hadcm3n_o0ti_1940_40_008383185_2 777,600 1,003,763 1.2908
23 Sep 2013 12:13:23 1279396 15899015 hadcm3n_o0ti_1940_40_008383185_2 751,680 970,224 1.2907
20 Sep 2013 19:46:27 1279396 15899015 hadcm3n_o0ti_1940_40_008383185_2 725,760 936,646 1.2906
20 Sep 2013 00:06:02 1279396 15899015 hadcm3n_o0ti_1940_40_008383185_2 699,840 902,710 1.2899
19 Sep 2013 11:12:55 1279396 15899015 hadcm3n_o0ti_1940_40_008383185_2 673,920 869,452 1.2901
17 Sep 2013 15:23:08 1279396 15899015 hadcm3n_o0ti_1940_40_008383185_2 648,000 836,689 1.2912
15 Sep 2013 14:00:35 1279396 15899015 hadcm3n_o0ti_1940_40_008383185_2 622,080 803,694 1.2919
12 Sep 2013 22:13:36 1279396 15899015 hadcm3n_o0ti_1940_40_008383185_2 596,160 770,668 1.2927
11 Sep 2013 14:02:25 1279396 15899015 hadcm3n_o0ti_1940_40_008383185_2 570,240 737,769 1.2938
10 Sep 2013 03:52:26 1279396 15899015 hadcm3n_o0ti_1940_40_008383185_2 544,320 704,947 1.2951
09 Sep 2013 18:23:21 1279396 15899015 hadcm3n_o0ti_1940_40_008383185_2 518,400 672,165 1.2966
08 Sep 2013 23:31:01 1279396 15899015 hadcm3n_o0ti_1940_40_008383185_2 492,480 639,079 1.2977
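
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that reading against three rows copied from the table; the variable and function names are illustrative assumptions, and the server may compute the figure differently.

```python
# (timestep, cpu_time_sec, reported_average) copied from three table rows.
rows = [
    (984_960, 1_267_515, 1.2869),
    (959_040, 1_234_490, 1.2872),
    (492_480, 639_079, 1.2977),
]


def average_sec_per_timestep(timestep: int, cpu_time_sec: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals."""
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in rows:
    computed = average_sec_per_timestep(timestep, cpu_time_sec)
    print(timestep, computed, computed == reported)
```

Consecutive trickles in the table are 25,920 timesteps apart, so each row adds roughly 33,000 CPU seconds at this rate.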