climateprediction.net home page
Task 13260411

Task 13260411

Name hadam3p_saf_2cg3_1975_1_007411012_1
Workunit 7608442
Created 16 Aug 2011, 1:30:11 UTC
Sent 16 Aug 2011, 1:35:13 UTC
Report deadline 28 Jul 2012, 6:55:13 UTC
Received 20 Nov 2011, 7:05:34 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 193 (0x000000C1) EXIT_SIGNAL
Computer ID 989394
Run time 4 days 6 hours 42 min 21 sec
CPU time 3 days 6 hours 9 min 53 sec
Validate state Invalid
Credit 1,122.82
Device peak FLOPS 1.48 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Southern Africa v6.09
windows_intelx86
Stderr
<core_client_version>6.6.36</core_client_version>
<![CDATA[
<message>
 - exit code 193 (0xc1)
</message>
<stderr_txt>
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2308, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2220, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5300, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5472, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5316, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4132, selfPID=4476, iMonCtr=1
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5832, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6008, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4960, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4696, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5616, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2780, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5036, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4772, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=384, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5528, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5728, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6044, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4568, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3724, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4604, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2040, iMonCtr=2
Model crash detected, will try to restart...
17:42:31 (5332): No heartbeat from core client for 30 sec - exiting
17:42:32 (5332): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4004, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2492, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4836, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4024, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4432, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4568, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5904, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5828, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2476, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4244, iMonCtr=2
Model crash detected, will try to restart...
09:02:05 (4888): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:02:19 (4888): No heartbeat from core client for 30 sec - exiting
09:02:20 (4888): No heartbeat from core client for 30 sec - exiting
09:02:22 (4888): No heartbeat from core client for 30 sec - exiting
09:02:23 (4888): No heartbeat from core client for 30 sec - exiting
09:02:24 (4888): No heartbeat from core client for 30 sec - exiting
09:02:25 (4888): No heartbeat from core client for 30 sec - exiting
09:02:26 (4888): No heartbeat from core client for 30 sec - exiting
09:02:27 (4888): No heartbeat from core client for 30 sec - exiting
09:02:28 (4888): No heartbeat from core client for 30 sec - exiting
09:02:29 (4888): No heartbeat from core client for 30 sec - exiting
09:02:31 (4888): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3768, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5308, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4356, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5684, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2020, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5740, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4448, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4372, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4140, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3516, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4824, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2204, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5760, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5052, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6036, selfPID=4212, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5348, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5864, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4276, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3080, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3592, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4216, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4496, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4520, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5376, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5220, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4428, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4536, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4252, iMonCtr=2
Model crash detected, will try to restart...
23:00:39 (4540): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=528, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5356, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4456, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5616, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3992, iMonCtr=2
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3044, iMonCtr=2
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
19 Nov 2011 23:35:42 989394 13260411 hadam3p_saf_2cg3_1975_1_007411012_1 69,216 280,609 4.0541
07 Nov 2011 01:25:01 989394 13260411 hadam3p_saf_2cg3_1975_1_007411012_1 57,696 235,205 4.0766
31 Oct 2011 18:12:18 989394 13260411 hadam3p_saf_2cg3_1975_1_007411012_1 46,176 189,742 4.1091
31 Oct 2011 15:20:57 989394 13260411 hadam3p_saf_2cg3_1975_1_007411012_1 34,656 143,917 4.1527
16 Oct 2011 19:48:20 989394 13260411 hadam3p_saf_2cg3_1975_1_007411012_1 23,136 97,976 4.2348
20 Sep 2011 20:32:51 989394 13260411 hadam3p_saf_2cg3_1975_1_007411012_1 11,616 51,194 4.4072


©2024 climateprediction.net