Task 13969528

Name hadam3p_pnw_8l9l_1999_1_007700517_0
Workunit 7855625
Created 25 Jan 2012, 17:23:06 UTC
Sent 25 Jan 2012, 17:26:33 UTC
Report deadline 6 Jan 2013, 22:46:33 UTC
Received 10 Feb 2012, 0:07:37 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status -187 (0xFFFFFF45) ERR_RESULT_UPLOAD
Computer ID 1192939
Run time 6 days 0 hours 2 min 39 sec
CPU time 5 days 13 hours 11 min 4 sec
Validate state Invalid
Credit 1,754.30
Device peak FLOPS 1.55 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Pacific North West v6.09 (windows_intelx86)
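
The exit status in the field list above appears both as a signed decimal (-187) and as a hexadecimal value (0xFFFFFF45). The following is a minimal Python sketch of how the two relate, assuming the hexadecimal form is the 32-bit two's-complement representation of the signed code. The helper names are hypothetical and are not part of BOINC.

```python
def to_unsigned32(status: int) -> int:
    """Reinterpret a signed exit code as an unsigned 32-bit value."""
    return status & 0xFFFFFFFF


def format_exit_status(status: int) -> str:
    """Render a status as 'decimal (0xHEX)', mirroring the layout of the task page."""
    return f"{status} (0x{to_unsigned32(status):08X})"


if __name__ == "__main__":
    # Status taken from the "Exit status" field of this task.
    print(format_exit_status(-187))  # -> -187 (0xFFFFFF45)
```

Masking with 0xFFFFFFFF keeps the low 32 bits, which is why a small negative code shows up as a value close to 0xFFFFFFFF.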
Stderr
<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
upload failure
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3348, selfPID=2600, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4716, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
09:59:28 (4880): No heartbeat from core client for 30 sec - exiting
09:59:29 (4880): No heartbeat from core client for 30 sec - exiting
09:59:30 (4880): No heartbeat from core client for 30 sec - exiting
09:59:31 (4880): No heartbeat from core client for 30 sec - exiting
09:59:32 (4880): No heartbeat from core client for 30 sec - exiting
09:59:33 (4880): No heartbeat from core client for 30 sec - exiting
09:59:34 (4880): No heartbeat from core client for 30 sec - exiting
09:59:35 (4880): No heartbeat from core client for 30 sec - exiting
09:59:37 (4880): No heartbeat from core client for 30 sec - exiting
09:59:38 (4880): No heartbeat from core client for 30 sec - exiting
09:59:39 (4880): No heartbeat from core client for 30 sec - exiting
09:59:40 (4880): No heartbeat from core client for 30 sec - exiting
09:59:41 (4880): No heartbeat from core client for 30 sec - exiting
09:59:42 (4880): No heartbeat from core client for 30 sec - exiting
09:59:43 (4880): No heartbeat from core client for 30 sec - exiting
09:59:44 (4880): No heartbeat from core client for 30 sec - exiting
09:59:45 (4880): No heartbeat from core client for 30 sec - exiting
09:59:46 (4880): No heartbeat from core client for 30 sec - exiting
09:59:47 (4880): No heartbeat from core client for 30 sec - exiting
09:59:48 (4880): No heartbeat from core client for 30 sec - exiting
09:59:49 (4880): No heartbeat from core client for 30 sec - exiting
09:59:50 (4880): No heartbeat from core client for 30 sec - exiting
09:59:51 (4880): No heartbeat from core client for 30 sec - exiting
09:59:52 (4880): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:59:53 (4880): No heartbeat from core client for 30 sec - exiting
09:59:54 (4880): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6216, selfPID=5424, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4236, selfPID=4932, iMonCtr=1
Model crash detected, will try to restart...
22:56:16 (1600): No heartbeat from core client for 30 sec - exiting
22:56:17 (1600): No heartbeat from core client for 30 sec - exiting
22:56:18 (1600): No heartbeat from core client for 30 sec - exiting
22:56:19 (1600): No heartbeat from core client for 30 sec - exiting
22:56:20 (1600): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
22:56:21 (1600): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3356, selfPID=5848, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1780, selfPID=5624, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5764, selfPID=5236, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...

BUFFOUT: Write Failed: No space left on device
BUFFOUT: C I/O Error - Return code = 32

Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/xaakm.pipe_dummy 2048
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3148, selfPID=3148, iMonCtr=2
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 7

zip error: Output file write failure (write error on zip file)
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                          Timestep  CPU Time (sec)  Average (sec/TS)
08 Feb 2012 15:25:32  1192939  13969528   hadam3p_pnw_8l9l_1999_1_007700517_0  80,736    442,550         5.4814
06 Feb 2012 15:08:48  1192939  13969528   hadam3p_pnw_8l9l_1999_1_007700517_0  69,216    378,682         5.4710
04 Feb 2012 09:52:06  1192939  13969528   hadam3p_pnw_8l9l_1999_1_007700517_0  57,696    315,995         5.4769
01 Feb 2012 13:20:00  1192939  13969528   hadam3p_pnw_8l9l_1999_1_007700517_0  46,176    250,603         5.4271
30 Jan 2012 13:58:52  1192939  13969528   hadam3p_pnw_8l9l_1999_1_007700517_0  34,656    186,390         5.3783
28 Jan 2012 19:32:25  1192939  13969528   hadam3p_pnw_8l9l_1999_1_007700517_0  23,136    123,983         5.3589
27 Jan 2012 15:11:49  1192939  13969528   hadam3p_pnw_8l9l_1999_1_007700517_0  11,616    61,668          5.3089
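
The Average (sec/TS) column in the trickle table above appears to be the cumulative CPU time divided by the cumulative timestep count. The following is a minimal Python sketch that recomputes it from the rows shown, assuming that interpretation. The variable and function names are hypothetical, and the server's exact rounding is not known.

```python
# (timestep, cpu_time_sec, reported_average) copied from the trickle table.
TRICKLES = [
    (80_736, 442_550, 5.4814),
    (69_216, 378_682, 5.4710),
    (57_696, 315_995, 5.4769),
    (46_176, 250_603, 5.4271),
    (34_656, 186_390, 5.3783),
    (23_136, 123_983, 5.3589),
    (11_616, 61_668, 5.3089),
]


def seconds_per_timestep(timestep: int, cpu_time_sec: int) -> float:
    """Cumulative CPU seconds divided by cumulative model timesteps."""
    return cpu_time_sec / timestep


if __name__ == "__main__":
    for timestep, cpu_time, reported in TRICKLES:
        computed = seconds_per_timestep(timestep, cpu_time)
        print(f"{timestep:>6} TS  computed={computed:.4f}  reported={reported:.4f}")
```

For the rows listed, the recomputed values match the reported averages to within 0.0001, which is consistent with the assumed definition but does not confirm how the server calculates the column.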

