Task 12506947

Name hadam3p_pnw_zr1j_1966_1_007013567_1
Workunit 7216883
Created 20 Jan 2011, 0:21:46 UTC
Sent 20 Jan 2011, 2:20:32 UTC
Report deadline 2 Jan 2012, 7:40:32 UTC
Received 26 Jan 2011, 9:34:06 UTC
Server state Over
Outcome No reply
Client state Compute error
Exit status 194 (0x000000C2) EXIT_ABORTED_BY_CLIENT
Computer ID 1118002
Run time 5 days 7 hours 42 min 8 sec
CPU time 3 days 22 hours 20 min 34 sec
Validate state Invalid
Credit 1,503.98
Device peak FLOPS 1.97 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Pacific North West v6.08
windows_intelx86
Stderr
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
Got ack for job that's till active
</message>
<stderr_txt>
11:33:43 (4912): No heartbeat from core client for 30 sec - exiting
11:33:44 (4912): No heartbeat from core client for 30 sec - exiting
11:33:45 (4912): No heartbeat from core client for 30 sec - exiting
11:33:47 (4912): No heartbeat from core client for 30 sec - exiting
11:33:48 (4912): No heartbeat from core client for 30 sec - exiting
11:33:49 (4912): No heartbeat from core client for 30 sec - exiting
11:33:50 (4912): No heartbeat from core client for 30 sec - exiting
11:33:51 (4912): No heartbeat from core client for 30 sec - exiting
11:33:52 (4912): No heartbeat from core client for 30 sec - exiting
11:33:53 (4912): No heartbeat from core client for 30 sec - exiting
11:33:54 (4912): No heartbeat from core client for 30 sec - exiting
11:33:55 (4912): No heartbeat from core client for 30 sec - exiting
11:33:56 (4912): No heartbeat from core client for 30 sec - exiting
11:33:57 (4912): No heartbeat from core client for 30 sec - exiting
11:33:59 (4912): No heartbeat from core client for 30 sec - exiting
11:34:00 (4912): No heartbeat from core client for 30 sec - exiting
11:34:01 (4912): No heartbeat from core client for 30 sec - exiting
11:34:02 (4912): No heartbeat from core client for 30 sec - exiting
11:34:03 (4912): No heartbeat from core client for 30 sec - exiting
11:34:04 (4912): No heartbeat from core client for 30 sec - exiting
11:34:05 (4912): No heartbeat from core client for 30 sec - exiting
11:34:06 (4912): No heartbeat from core client for 30 sec - exiting
11:34:07 (4912): No heartbeat from core client for 30 sec - exiting
11:34:08 (4912): No heartbeat from core client for 30 sec - exiting
11:34:10 (4912): No heartbeat from core client for 30 sec - exiting
11:34:11 (4912): No heartbeat from core client for 30 sec - exiting
11:34:12 (4912): No heartbeat from core client for 30 sec - exiting
11:34:13 (4912): No heartbeat from core client for 30 sec - exiting
11:34:14 (4912): No heartbeat from core client for 30 sec - exiting
11:34:15 (4912): No heartbeat from core client for 30 sec - exiting
11:34:16 (4912): No heartbeat from core client for 30 sec - exiting
11:34:17 (4912): No heartbeat from core client for 30 sec - exiting
11:34:18 (4912): No heartbeat from core client for 30 sec - exiting
11:34:19 (4912): No heartbeat from core client for 30 sec - exiting
11:34:20 (4912): No heartbeat from core client for 30 sec - exiting
11:34:22 (4912): No heartbeat from core client for 30 sec - exiting
11:34:23 (4912): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5956, selfPID=4048, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 1
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7700, selfPID=5764, iMonCtr=1
Model crash detected, will try to restart...
Glontroller:: CPDN process is sot running, exiting, bRetVal = 1, checkPID=0, selfPID=5912, iMonCtr=2
300, iMonCtr=2
tected, will try to restart...
Leaving CPDN_Main::Monitor...
18:11:55 (5152): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3216, selfPID=5688, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 4
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6668, selfPID=4464, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5904, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8160, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6196, selfPID=4532, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
BUFFOUT: C I/O Error - Return code = 32

Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/xaakm.pipe_dummy 2048
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 6

zip error: Output file write failure (write error on zip file)
04:33:06 (6088): called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                          Timestep  CPU Time (sec)  Average (sec/TS)
25 Jan 2011 18:31:20  1118002  12506947   hadam3p_pnw_zr1j_1966_1_007013567_1  69,216    307,823         4.4473
24 Jan 2011 22:59:34  1118002  12506947   hadam3p_pnw_zr1j_1966_1_007013567_1  57,696    257,336         4.4602
24 Jan 2011 02:24:14  1118002  12506947   hadam3p_pnw_zr1j_1966_1_007013567_1  46,176    205,821         4.4573
23 Jan 2011 11:19:12  1118002  12506947   hadam3p_pnw_zr1j_1966_1_007013567_1  34,656    154,138         4.4477
22 Jan 2011 10:03:24  1118002  12506947   hadam3p_pnw_zr1j_1966_1_007013567_1  23,136    103,428         4.4704
21 Jan 2011 14:43:45  1118002  12506947   hadam3p_pnw_zr1j_1966_1_007013567_1  11,616    52,773          4.5431
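
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The page itself does not state this formula, so the sketch below treats it as an assumption and checks it against the six rows listed. The function name `average_sec_per_timestep` is hypothetical and used only for illustration.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
trickles = [
    (69216, 307823, 4.4473),
    (57696, 257336, 4.4602),
    (46176, 205821, 4.4573),
    (34656, 154138, 4.4477),
    (23136, 103428, 4.4704),
    (11616, 52773, 4.5431),
]


def average_sec_per_timestep(cpu_time_sec, timestep):
    """Assumed formula: cumulative CPU seconds per model timestep, 4 decimals."""
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in trickles:
    computed = average_sec_per_timestep(cpu_time_sec, timestep)
    # Print both values side by side so any mismatch is easy to spot.
    print(f"{timestep:>6} {cpu_time_sec:>7} computed={computed:.4f} reported={reported:.4f}")
```

Under this assumption the computed values agree with the reported column for every row, which suggests the model ran at a steady rate of roughly 4.45 to 4.54 CPU seconds per timestep up to the last trickle.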

