Task 13565616

Name hadcm3n_yfi4_1900_40_007526155_2
Workunit 7723630
Created 29 Oct 2011, 21:08:18 UTC
Sent 29 Oct 2011, 21:16:22 UTC
Report deadline 29 Jan 2012, 4:43:33 UTC
Received 22 Nov 2011, 15:05:41 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 25 (0x00000019) Unknown error code
Computer ID 1174717
Run time 4 days 6 hours 38 min 52 sec
CPU time 4 days 1 hours 24 min 18 sec
Validate state Invalid
Credit 2,488.32
Device peak FLOPS 3.31 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
The drive cannot locate a specific area or track on the disk. (0x19) - exit code 25 (0x19)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
13:47:12 (4328): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6876, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6876, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6844, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6844, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6844, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6844, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
06:34:08 (5464): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:50:30 (3104): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4952, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4952, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4952, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4180, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
16:58:07 (4432): No heartbeat from core client for 30 sec - exiting
16:58:08 (4432): No heartbeat from core client for 30 sec - exiting
16:58:09 (4432): No heartbeat from core client for 30 sec - exiting
16:58:10 (4432): No heartbeat from core client for 30 sec - exiting
16:58:11 (4432): No heartbeat from core client for 30 sec - exiting
16:58:12 (4432): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5688, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5688, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Called boinc_finish

</stderr_txt>
]]>
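The stderr log above repeats a small set of messages many times, so a tally per message type can make the failure pattern easier to read. The following is a minimal sketch, not part of BOINC or the CPDN tooling: the function name `tally_events` and the category labels are hypothetical, and the patterns are simply substrings copied from the log lines above.

```python
import re
from collections import Counter

# Hypothetical category labels; each pattern is a substring that appears
# verbatim in the stderr log above.
PATTERNS = {
    "heartbeat_lost": re.compile(r"No heartbeat from core client"),
    "model_crash": re.compile(r"Model crash detected"),
    "too_many_crashes": re.compile(r"too many model crashes"),
    "suspend_request": re.compile(r"Suspend request from BOINC"),
    "quit_request": re.compile(r"Quit request from BOINC"),
}


def tally_events(stderr_text: str) -> Counter:
    """Count how many log lines match each message pattern."""
    counts = Counter()
    for line in stderr_text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# A short excerpt of the log, used only to demonstrate the function.
sample = """\
Suspended CPDN Monitor - Suspend request from BOINC...
13:47:12 (4328): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
"""

for name, count in sorted(tally_events(sample).items()):
    print(f"{name}: {count}")
```

To analyse the whole task, pass the full contents of the `<stderr_txt>` element to `tally_events` instead of the excerpt.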
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
22 Nov 2011 13:20:06 | 1174717 | 13565616 | hadcm3n_yfi4_1900_40_007526155_2 | 207,360 | 350,722 | 1.6914
21 Nov 2011 05:32:38 | 1174717 | 13565616 | hadcm3n_yfi4_1900_40_007526155_2 | 181,440 | 317,710 | 1.7510
19 Nov 2011 12:34:41 | 1174717 | 13565616 | hadcm3n_yfi4_1900_40_007526155_2 | 155,520 | 284,560 | 1.8297
19 Nov 2011 02:57:11 | 1174717 | 13565616 | hadcm3n_yfi4_1900_40_007526155_2 | 129,600 | 251,381 | 1.9397
18 Nov 2011 17:21:31 | 1174717 | 13565616 | hadcm3n_yfi4_1900_40_007526155_2 | 103,680 | 218,083 | 2.1034
15 Nov 2011 17:59:13 | 1174717 | 13565616 | hadcm3n_yfi4_1900_40_007526155_2 | 77,760 | 185,720 | 2.3884
15 Nov 2011 17:59:13 | 1174717 | 13565616 | hadcm3n_yfi4_1900_40_007526155_2 | 51,840 | 153,903 | 2.9688
15 Nov 2011 17:59:13 | 1174717 | 13565616 | hadcm3n_yfi4_1900_40_007526155_2 | 25,920 | 32,088 | 1.2380
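
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that reading against the rows above. The interpretation is an assumption, not something the page states, and the names `TRICKLES` and `seconds_per_timestep` are hypothetical.

```python
# Each tuple: (timestep, cumulative CPU time in seconds, reported sec/TS),
# copied from the trickle table above.
TRICKLES = [
    (207_360, 350_722, 1.6914),
    (181_440, 317_710, 1.7510),
    (155_520, 284_560, 1.8297),
    (129_600, 251_381, 1.9397),
    (103_680, 218_083, 2.1034),
    (77_760, 185_720, 2.3884),
    (51_840, 153_903, 2.9688),
    (25_920, 32_088, 1.2380),
]


def seconds_per_timestep(timestep: int, cpu_seconds: int) -> float:
    """Assumed formula: cumulative CPU seconds divided by timesteps completed."""
    return cpu_seconds / timestep


for timestep, cpu_seconds, reported in TRICKLES:
    computed = seconds_per_timestep(timestep, cpu_seconds)
    # Print both values so any mismatch with the reported column is visible.
    print(f"{timestep:>7} TS  computed={computed:.4f}  reported={reported:.4f}")
```

Under this assumption the computed values match the reported column to four decimal places for every row.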