climateprediction.net home page
Task 15438568

Name hadam3p_pnw_780s_2001_1_007605636_1
Workunit 7783766
Created 18 Nov 2012, 19:49:58 UTC
Sent 18 Nov 2012, 19:50:05 UTC
Report deadline 1 Nov 2013, 1:10:05 UTC
Received 15 Mar 2013, 20:46:22 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 194 (0x000000C2) EXIT_ABORTED_BY_CLIENT
Computer ID 1067422
Run time 5 days 21 hours 11 min 4 sec
CPU time 5 days 21 hours 11 min 4 sec
Validate state Invalid
Credit 2,755.56
Device peak FLOPS 2.08 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Pacific North West v6.09
windows_intelx86
Stderr
<core_client_version>6.4.7</core_client_version>
<![CDATA[
<message>
Got ack for job that's till active
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4964, selfPID=3984, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5000, selfPID=5000, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5984, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4684, iMonCtr=2
Model crash detected, will try to restart...
21:48:39 (4548): No heartbeat from core client for 30 sec - exiting
21:48:40 (4548): No heartbeat from core client for 30 sec - exiting
21:48:41 (4548): No heartbeat from core client for 30 sec - exiting
21:48:42 (4548): No heartbeat from core client for 30 sec - exiting
21:48:43 (4548): No heartbeat from core client for 30 sec - exiting
21:48:44 (4548): No heartbeat from core client for 30 sec - exiting
21:48:45 (4548): No heartbeat from core client for 30 sec - exiting
21:48:46 (4548): No heartbeat from core client for 30 sec - exiting
21:48:47 (4548): No heartbeat from core client for 30 sec - exiting
21:48:48 (4548): No heartbeat from core client for 30 sec - exiting
21:48:50 (4548): No heartbeat from core client for 30 sec - exiting
21:48:51 (4548): No heartbeat from core client for 30 sec - exiting
21:48:52 (4548): No heartbeat from core client for 30 sec - exiting
21:48:53 (4548): No heartbeat from core client for 30 sec - exiting
21:48:54 (4548): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5884, selfPID=3412, iMonCtr=1
Model crash detected, will try to restart...
19:22:56 (2928): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
GCPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1396, selfPID=1396, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3752, selfPID=2848, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 6
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
07:33:16 (2776): No heartbeat from core client for 30 sec - exiting
07:33:17 (2776): No heartbeat from core client for 30 sec - exiting
07:33:19 (2776): No heartbeat from core client for 30 sec - exiting
07:33:20 (2776): No heartbeat from core client for 30 sec - exiting
07:33:21 (2776): No heartbeat from core client for 30 sec - exiting
07:33:22 (2776): No heartbeat from core client for 30 sec - exiting
07:33:23 (2776): No heartbeat from core client for 30 sec - exiting
07:33:24 (2776): No heartbeat from core client for 30 sec - exiting
07:33:25 (2776): No heartbeat from core client for 30 sec - exiting
07:33:26 (2776): No heartbeat from core client for 30 sec - exiting
07:33:27 (2776): No heartbeat from core client for 30 sec - exiting
07:33:28 (2776): No heartbeat from core client for 30 sec - exiting
07:33:29 (2776): No heartbeat from core client for 30 sec - exiting
07:33:31 (2776): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:33:32 (2776): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5408, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6880, selfPID=6880, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6996, selfPID=6416, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4420, selfPID=4352, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4072, iMonCtr=2
Model crash detected, will try to restart...

BUFFOUT: Write Failed: No space left on device
BUFFOUT: C I/O Error - Return code = 32

Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/xaakm.pipe_dummy 2048
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 11

zip error: Output file write failure (write error on zip file)
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                          Timestep  CPU Time (sec)  Average (sec/TS)
03 Mar 2013 21:44:39  1067422  15438568   hadam3p_pnw_780s_2001_1_007605636_1  126,816   475,856         3.7523
07 Feb 2013 00:30:08  1067422  15438568   hadam3p_pnw_780s_2001_1_007605636_1  115,296   433,119         3.7566
01 Feb 2013 20:31:20  1067422  15438568   hadam3p_pnw_780s_2001_1_007605636_1  103,776   389,479         3.7531
18 Jan 2013 10:45:47  1067422  15438568   hadam3p_pnw_780s_2001_1_007605636_1  92,256    344,392         3.7330
13 Jan 2013 15:05:25  1067422  15438568   hadam3p_pnw_780s_2001_1_007605636_1  80,736    301,854         3.7388
02 Jan 2013 17:27:23  1067422  15438568   hadam3p_pnw_780s_2001_1_007605636_1  69,216    259,529         3.7496
26 Dec 2012 16:15:38  1067422  15438568   hadam3p_pnw_780s_2001_1_007605636_1  57,696    217,036         3.7617
24 Dec 2012 02:14:48  1067422  15438568   hadam3p_pnw_780s_2001_1_007605636_1  46,176    174,826         3.7861
19 Dec 2012 20:47:17  1067422  15438568   hadam3p_pnw_780s_2001_1_007605636_1  34,656    131,193         3.7856
06 Dec 2012 20:08:41  1067422  15438568   hadam3p_pnw_780s_2001_1_007605636_1  23,144    87,811          3.7941
05 Dec 2012 21:45:55  1067422  15438568   hadam3p_pnw_780s_2001_1_007605636_1  23,137    87,346          3.7752
04 Dec 2012 22:01:58  1067422  15438568   hadam3p_pnw_780s_2001_1_007605636_1  23,136    86,891          3.7557
25 Nov 2012 18:32:27  1067422  15438568   hadam3p_pnw_780s_2001_1_007605636_1  11,616    44,220          3.8068
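
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count. The following is a minimal Python sketch of that check, assuming this interpretation of the columns; the names `trickles` and `sec_per_timestep` are illustrative and not part of any CPDN or BOINC tooling. Only three rows from the table are included.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_avg).
# Assumption: "Average (sec/TS)" = CPU time (sec) / timestep.
trickles = [
    (126816, 475856, 3.7523),  # 03 Mar 2013
    (115296, 433119, 3.7566),  # 07 Feb 2013
    (11616, 44220, 3.8068),    # 25 Nov 2012
]


def sec_per_timestep(cpu_time_sec, timestep):
    """Return average CPU seconds spent per model timestep."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in trickles:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    # Print the computed value next to the value reported on the page.
    print(f"timestep={timestep} computed={computed:.4f} reported={reported:.4f}")
```

For the rows checked here, the computed value agrees with the reported one to four decimal places, which is consistent with the assumed definition of the column.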

