Task 12897033

Name hadam3p_eu_2qmp_1995_1_007236254_2
Workunit 7434494
Created 19 May 2011, 8:23:09 UTC
Sent 19 May 2011, 8:23:27 UTC
Report deadline 30 Apr 2012, 13:43:27 UTC
Received 4 Jun 2011, 20:47:26 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 194 (0x000000C2) EXIT_ABORTED_BY_CLIENT
Computer ID 1132548
Run time 6 days 1 hours 12 min 14 sec
CPU time 3 days 6 hours 35 min 27 sec
Validate state Invalid
Credit 1,194.02
Device peak FLOPS 1.68 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Europe v6.09 windows_intelx86
Stderr
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
Got ack for job that's till active
</message>
<stderr_txt>
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6956, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5708, iMonCtr=2
Model crash detected, will try to restart...
16:35:03 (4300): No heartbeat from core client for 30 sec - exiting
16:35:04 (4300): No heartbeat from core client for 30 sec - exiting
16:35:05 (4300): No heartbeat from core client for 30 sec - exiting
16:35:06 (4300): No heartbeat from core client for 30 sec - exiting
16:35:07 (4300): No heartbeat from core client for 30 sec - exiting
16:35:08 (4300): No heartbeat from core client for 30 sec - exiting
16:35:09 (4300): No heartbeat from core client for 30 sec - exiting
16:35:10 (4300): No heartbeat from core client for 30 sec - exiting
16:35:11 (4300): No heartbeat from core client for 30 sec - exiting
16:35:12 (4300): No heartbeat from core client for 30 sec - exiting
16:35:13 (4300): No heartbeat from core client for 30 sec - exiting
16:35:14 (4300): No heartbeat from core client for 30 sec - exiting
16:35:15 (4300): No heartbeat from core client for 30 sec - exiting
16:35:16 (4300): No heartbeat from core client for 30 sec - exiting
13:33:00 (368): No heartbeat from core client for 30 sec - exiting
13:33:01 (368): No heartbeat from core client for 30 sec - exiting
13:33:02 (368): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
13:34:45 (5724): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4396, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5964, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5316, iMonCtr=2
17:13:58 (4144): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3616, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5900, selfPID=4488, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1192, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1844, selfPID=4836, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6112, selfPID=4140, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3764, selfPID=3788, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=696, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1624, selfPID=4948, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3800, selfPID=5108, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5568, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2320, selfPID=4692, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3932, selfPID=4452, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...

BUFFOUT: Write Failed: No space left on device
BUFFOUT: C I/O Error - Return code = 32

Model crashed: WRITDUMP: BAD BUFFOUT OF DATA                                                                                                                                                                                                                                   tmp/xaakm.pipe_dummy                                                            2048    
Leaving CPDN_Main::Monitor...

zip error: Output file write failure (write error on zip file)
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                         Timestep  CPU Time (sec)  Average (sec/TS)
03 Jun 2011 13:15:06  1132548  12897033   hadam3p_eu_2qmp_1995_1_007236254_2  69,216    265,670         3.8383
02 Jun 2011 17:34:20  1132548  12897033   hadam3p_eu_2qmp_1995_1_007236254_2  57,696    223,486         3.8735
29 May 2011 16:57:35  1132548  12897033   hadam3p_eu_2qmp_1995_1_007236254_2  46,176    179,313         3.8833
28 May 2011 15:21:30  1132548  12897033   hadam3p_eu_2qmp_1995_1_007236254_2  34,656    132,326         3.8183
27 May 2011 04:59:19  1132548  12897033   hadam3p_eu_2qmp_1995_1_007236254_2  23,136    87,503          3.7821
21 May 2011 17:13:40  1132548  12897033   hadam3p_eu_2qmp_1995_1_007236254_2  11,616    43,916          3.7806
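
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. The page does not state this; it is inferred from the listed values. The sketch below recomputes the column from the trickle rows above to check that reading. The function name `seconds_per_timestep` is hypothetical and introduced only for illustration.

```python
# Trickle rows copied from the table above:
# (timestep, cpu_time_sec, reported_average_sec_per_ts)
trickles = [
    (69216, 265670, 3.8383),
    (57696, 223486, 3.8735),
    (46176, 179313, 3.8833),
    (34656, 132326, 3.8183),
    (23136, 87503, 3.7821),
    (11616, 43916, 3.7806),
]


def seconds_per_timestep(cpu_time_sec, timestep):
    """Assumed definition: cumulative CPU seconds per cumulative timestep."""
    return cpu_time_sec / timestep


# Recompute each average and round to the four decimals the page shows.
computed = [round(seconds_per_timestep(cpu, ts), 4) for ts, cpu, _ in trickles]

for (ts, cpu, reported), value in zip(trickles, computed):
    print(f"timestep={ts:>6}  cpu={cpu:>7}  reported={reported:.4f}  computed={value:.4f}")
```

If the assumption holds, the reported and computed columns agree on every row.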

