climateprediction.net home page

The world's largest climate forecasting experiment for the 21st century.

transient upload error


Message boards : Number crunching : transient upload error

Vepide
Joined: Aug 31 12
Posts: 40
Credit: 4,773
RAC: 0
Message 44988 - Posted 2 Oct 2012 16:59:45 UTC

    I've been getting the same thing for what must be over a week now, and there aren't any problems with my internet connection.


    10/2/2012 12:59:52 AM | climateprediction.net | [error] Error reported by file upload server: Server is out of disk space
    10/2/2012 12:59:52 AM | climateprediction.net | Temporarily failed upload of hadam3p_eu_6avw_2007_1_008175177_0_7.zip: transient upload error
    10/2/2012 12:59:52 AM | climateprediction.net | Backing off 5 hr 33 min 17 sec on upload of hadam3p_eu_6avw_2007_1_008175177_0_7.zip

    ____________

    Profile [B@H] Ray
    Joined: Aug 19 05
    Posts: 103
    Credit: 1,700,197
    RAC: 597
    Message 44990 - Posted 2 Oct 2012 17:27:30 UTC

      Are you having problems with your mouse sending extra clicks?

      That is a known disk problem in the upload server; a replacement is being worked on.
      ____________
      Keep on crunching Pizza@Home

      Profile astroWX
      Forum moderator
      Joined: Aug 5 04
      Posts: 1250
      Credit: 34,995,599
      RAC: 23,022
      Message 44991 - Posted 2 Oct 2012 18:43:25 UTC

        Last modified: 2 Oct 2012 19:14:57 UTC

        Eighteen redundant posts: 'Tis frustrating dealing with that, especially when the server is S-L-O-W.

        Edit: ... and 21 redundant posts in a separate thread. There is no point in posting in two places. 40 posts is overkill (to state the obvious).
        ____________
        "We have met the enemy and he is us." -- Pogo
        Greetings from coastal Washington state, the scenic US Pacific Northwest.

        Profile Dave Jackson
        Joined: May 15 09
        Posts: 605
        Credit: 581,731
        RAC: 157
        Message 45056 - Posted 11 Oct 2012 11:01:07 UTC

          Last modified: 11 Oct 2012 11:16:03 UTC

          The transient upload error is back:
          Thu 11 Oct 2012 11:37:32 BST | climateprediction.net | Started upload of hadam3p_eu_w7zl_1992_1_006805705_2_3.zip
          Thu 11 Oct 2012 11:51:30 BST | climateprediction.net | [error] Error reported by file upload server: EOF on socket read : asked for 16382, got 14028
          Thu 11 Oct 2012 11:51:30 BST | climateprediction.net | Temporarily failed upload of hadam3p_eu_w7zl_1992_1_006805705_2_3.zip: transient upload error
          Thu 11 Oct 2012 11:51:30 BST | climateprediction.net | Backing off 12 min 37 sec on upload of hadam3p_eu_w7zl_1992_1_006805705_2_3.zip

          Shortly before I looked at the messages, it was uploading, albeit at only 4.5 KB/s, an order of magnitude slower than I normally get.


          I noticed the line "file upload server: EOF on socket read : asked for 16382, got 14028". Does this indicate a problem with the work unit? An hour or so earlier, a zip file from the other task running on the machine went through OK, so it will be a few hours before I can check whether it is experiencing the same symptoms, and even longer until the next upload from my dual-core Atom machine!

          Edit: Watched the last try; the progress indicator on the upload gets to 100% before the error message. Do I abort the work unit?
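
For anyone wanting to pull the numbers out of that error line, here is a minimal sketch in Python. It assumes the message always has the wording quoted above; the function name `eof_shortfall` is made up for this example and is not part of BOINC. It only reports how many bytes were missing when the connection closed, and says nothing about why.

```python
import re

# Pattern for the message text exactly as quoted in the post above.
EOF_PATTERN = re.compile(r"EOF on socket read\s*:\s*asked for (\d+), got (\d+)")


def eof_shortfall(line):
    """Return (asked, got, missing) for an EOF error line, or None if no match."""
    match = EOF_PATTERN.search(line)
    if match is None:
        return None
    asked, got = int(match.group(1)), int(match.group(2))
    return asked, got, asked - got


sample = (
    "Thu 11 Oct 2012 11:51:30 BST | climateprediction.net | [error] Error reported by "
    "file upload server: EOF on socket read : asked for 16382, got 14028"
)
print(eof_shortfall(sample))  # (16382, 14028, 2354)
```

On the line from this post, the read came up 2,354 bytes short of the 16,382 requested.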

          Profile Dave Jackson
          Joined: May 15 09
          Posts: 605
          Credit: 581,731
          RAC: 157
          Message 45057 - Posted 11 Oct 2012 13:20:49 UTC

            OK, panic over! It went through at the 7th attempt. I am still curious as to what the problem might have been, however, and if it is transferring 100% each time before telling me it has failed, that is a fair amount of bandwidth to waste. Will check when the next zip from that task goes through and report back.

            Dave

            Les Bayliss
            Forum moderator
            Joined: Sep 5 04
            Posts: 5129
            Credit: 8,459,347
            RAC: 5,837
            Message 45058 - Posted 11 Oct 2012 14:38:23 UTC - in response to Message 45056.

              These errors are an indication of a server under heavy load.


              ____________
              Backups: Here

              Les Bayliss
              Forum moderator
              Joined: Sep 5 04
              Posts: 5129
              Credit: 8,459,347
              RAC: 5,837
              Message 45059 - Posted 11 Oct 2012 14:50:47 UTC - in response to Message 45057.

                Close examination of all the 'tell-tales' on my system (the hard disk, the LAN socket LED, the router, and the modem) indicates that the "data transmission" isn't. It seems that BOINC has to go through the entire zip file to get to where it left off, and THEN it starts actually sending data to the internet.
                A bit like looking for where you're up to on a video tape that rewinds to the start each time, rather than a DVD, which can just jump to anywhere in a fraction of a second.
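
To make the tape-versus-DVD comparison concrete, here is a small Python sketch of the two ways of getting to a position in a file. This is only an illustration of the analogy; it is not BOINC's code and makes no claim about how BOINC actually resumes an upload. The function names and the demo file are invented for the example.

```python
import os
import tempfile


def reach_offset_by_reading(path, offset, chunk=64 * 1024):
    """Tape-style: read and discard everything before `offset`."""
    bytes_read = 0
    with open(path, "rb") as f:
        while bytes_read < offset:
            data = f.read(min(chunk, offset - bytes_read))
            if not data:  # file is shorter than the requested offset
                break
            bytes_read += len(data)
        return f.read(4), bytes_read


def reach_offset_by_seeking(path, offset):
    """DVD-style: jump straight to `offset` without reading what comes before."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(4), 0


# Demo on a throwaway 256 KiB file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(256)) * 1024)
    demo_path = tmp.name

offset = 200_000
tape = reach_offset_by_reading(demo_path, offset)
disc = reach_offset_by_seeking(demo_path, offset)
os.remove(demo_path)

print(tape[0] == disc[0])  # True: both end up at the same data
print(tape[1], disc[1])    # 200000 0: bytes read on the way there
```

Both routes arrive at the same bytes; the difference is how much of the file has to be read first.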


                ____________
                Backups: Here

                Profile Dave Jackson
                Joined: May 15 09
                Posts: 605
                Credit: 581,731
                RAC: 157
                Message 45060 - Posted 11 Oct 2012 16:16:31 UTC - in response to Message 45059.

                  Thanks Les. I am not so sure about the data not being transmitted, though, as my impression from the lights on the router was that data was being transmitted. Anyway, I will check in a few hours' time or tomorrow morning to see what happens with the next one.

                  Dave

                  Les Bayliss
                  Forum moderator
                  Joined: Sep 5 04
                  Posts: 5129
                  Credit: 8,459,347
                  RAC: 5,837
                  Message 45061 - Posted 11 Oct 2012 19:36:37 UTC

                    And another thing: clicking the Abort button doesn't just stop the current transfer; it causes BOINC to delete the zip file from the computer.


                    ____________
                    Backups: Here

                    Profile Dave Jackson
                    Joined: May 15 09
                    Posts: 605
                    Credit: 581,731
                    RAC: 157
                    Message 45062 - Posted 11 Oct 2012 19:51:27 UTC - in response to Message 45061.

                      Thanks for the reminder, Les. I wasn't going to abort, at least until I saw what happened with the next zip. If it happens even once, I shall suspend network activity for a couple of days to see if things improve.

                      Dave

                      Profile Dave Jackson
                      Joined: May 15 09
                      Posts: 605
                      Credit: 581,731
                      RAC: 157
                      Message 45067 - Posted 12 Oct 2012 14:11:29 UTC

                        Fri 12 Oct 2012 13:33:06 BST | climateprediction.net | Temporarily failed upload of hadam3p_eu_w042_1961_1_006766698_2_6.zip: transient upload error
                        Fri 12 Oct 2012 14:23:40 BST | climateprediction.net | Temporarily failed upload of hadam3p_eu_w7zl_1992_1_006805705_2_4.zip: transient upload error

                        Got this along with the unexpected EOF on both of the latest zips, so I will try suspending internet activity for a while, though one did get through from my Atom box, albeit on the 2nd attempt.

                        Dave

                        Profile KWSN-Sir Papa Smurph
                        Joined: Sep 25 12
                        Posts: 4
                        Credit: 32,662
                        RAC: 0
                        Message 45126 - Posted 22 Oct 2012 13:15:06 UTC

                          Last modified: 22 Oct 2012 13:18:15 UTC

                          I am also getting an error; however, mine is slightly different.

                          10/22/2012 9:04:45 AM | climateprediction.net | Started upload of hadam3p_eu_2rhd_1982_1_008209863_0_11.zip
                          10/22/2012 9:04:45 AM | climateprediction.net | Started upload of hadam3p_eu_2kbg_1999_1_008210105_0_11.zip
                          10/22/2012 9:04:47 AM | climateprediction.net | Temporarily failed upload of hadam3p_eu_2rhd_1982_1_008209863_0_11.zip: transient HTTP error
                          10/22/2012 9:04:47 AM | climateprediction.net | Backing off 5 hr 6 min 30 sec on upload of hadam3p_eu_2rhd_1982_1_008209863_0_11.zip
                          10/22/2012 9:04:47 AM | climateprediction.net | Temporarily failed upload of hadam3p_eu_2kbg_1999_1_008210105_0_11.zip: transient HTTP error
                          10/22/2012 9:04:47 AM | climateprediction.net | Backing off 4 hr 19 min 15 sec on upload of hadam3p_eu_2kbg_1999_1_008210105_0_11.zip
                          10/22/2012 9:04:48 AM | climateprediction.net | Started upload of hadam3p_eu_2kbg_1999_1_008210105_0_12.zip
                          10/22/2012 9:04:49 AM | climateprediction.net | Temporarily failed upload of hadam3p_eu_2kbg_1999_1_008210105_0_12.zip: transient HTTP error
                          10/22/2012 9:04:49 AM | climateprediction.net | Backing off 4 hr 34 min 19 sec on upload of hadam3p_eu_2kbg_1999_1_008210105_0_12.zip


                          This has been going on for over a week. I thought that it was because uploader1.atm was not running; however, after reading this I am not so sure...


                          This is over 200 hours of work; I really don't want to lose it....
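
The event-log lines quoted in this thread all follow the same layout, so a short script can tally which files are failing and how long each is backed off. The following is a rough Python sketch, assuming the log wording is exactly as pasted above ("Temporarily failed upload of ...", "Backing off ... on upload of ..."). The names `summarize` and `backoff_seconds` are invented for the example.

```python
import re
from collections import defaultdict

# Wording taken from the log lines pasted in this thread.
FAILED = re.compile(r"Temporarily failed upload of (\S+): (.+)$")
BACKOFF = re.compile(
    r"Backing off (?:(\d+) hr )?(?:(\d+) min )?(?:(\d+) sec )?on upload of (\S+)"
)


def backoff_seconds(hr, mins, sec):
    """Convert the optional hr/min/sec captures to a number of seconds."""
    return int(hr or 0) * 3600 + int(mins or 0) * 60 + int(sec or 0)


def summarize(lines):
    """Count failures per file and record the latest back-off for each."""
    summary = defaultdict(
        lambda: {"failures": 0, "reasons": set(), "backoff_s": None}
    )
    for line in lines:
        line = line.strip()
        match = FAILED.search(line)
        if match:
            name, reason = match.groups()
            summary[name]["failures"] += 1
            summary[name]["reasons"].add(reason)
            continue
        match = BACKOFF.search(line)
        if match:
            hr, mins, sec, name = match.groups()
            summary[name]["backoff_s"] = backoff_seconds(hr, mins, sec)
    return dict(summary)


log = [
    "10/22/2012 9:04:45 AM | climateprediction.net | Started upload of hadam3p_eu_2rhd_1982_1_008209863_0_11.zip",
    "10/22/2012 9:04:47 AM | climateprediction.net | Temporarily failed upload of hadam3p_eu_2rhd_1982_1_008209863_0_11.zip: transient HTTP error",
    "10/22/2012 9:04:47 AM | climateprediction.net | Backing off 5 hr 6 min 30 sec on upload of hadam3p_eu_2rhd_1982_1_008209863_0_11.zip",
]
result = summarize(log)
entry = result["hadam3p_eu_2rhd_1982_1_008209863_0_11.zip"]
print(entry["failures"], entry["backoff_s"])  # 1 18390
```

Feeding it a saved copy of the whole event log would show at a glance whether the failures are all "transient HTTP error" or a mix with "transient upload error".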

                          Dave Roberts
                          Joined: Jan 15 11
                          Posts: 57
                          Credit: 1,232,854
                          RAC: 607
                          Message 45127 - Posted 22 Oct 2012 16:06:12 UTC

                            I might be wrong, but isn't a reasonable solution to suspend the project and wait until the server problems are sorted out? That way, network activity on other projects isn't affected.
                            I did this a while ago when there were similar problems. I checked each day to see if it was worth a try.
                            Worked for me and stopped possible loss of results. We can't get any more work at the moment anyway.

                            Les Bayliss
                            Forum moderator
                            Joined: Sep 5 04
                            Posts: 5129
                            Credit: 8,459,347
                            RAC: 5,837
                            Message 45131 - Posted 22 Oct 2012 18:43:14 UTC

                              Last modified: 22 Oct 2012 19:08:43 UTC

                              There is a new News post at the top of this section.

                              About the only thing that can be done on your own computer is to set Network activity off. But then it needs to be watched manually by those people with multiple projects that have short completion times.
                              Perhaps once an hour, switch it on briefly, and then off again when the other projects have uploaded.
                              And Suspend cpdn from running in Tasks, so you don't get any more zips.
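
For those who would rather script this than click in BOINC Manager, here is a hedged Python sketch that only builds the `boinccmd` command lines for the steps above. It assumes `boinccmd` is on the PATH, that `--set_network_mode` accepts `always`/`auto`/`never` with an optional duration in seconds after which the previous mode returns, and that `--project URL suspend` suspends a project. Check `boinccmd --help` on your own install before relying on any of this. `PROJECT_URL` is an assumption and must match the URL BOINC shows for your attached project; the function names are made up for the example.

```python
import subprocess

# Assumption: replace with the project URL exactly as your BOINC client lists it.
PROJECT_URL = "http://climateprediction.net/"


def disable_network_cmd():
    """Command to switch network activity off until changed again."""
    return ["boinccmd", "--set_network_mode", "never"]


def open_window_cmd(window_seconds=300):
    """Command to allow network activity for a short window (assumed to revert afterwards)."""
    return ["boinccmd", "--set_network_mode", "always", str(window_seconds)]


def suspend_project_cmd(project_url=PROJECT_URL):
    """Command to suspend the project so it produces no further zips."""
    return ["boinccmd", "--project", project_url, "suspend"]


def run(cmd):
    """Run one boinccmd command; raises CalledProcessError if it fails."""
    subprocess.run(cmd, check=True)


# Nothing is executed here; the commands are only printed for inspection.
for cmd in (disable_network_cmd(), suspend_project_cmd(), open_window_cmd(300)):
    print(" ".join(cmd))
```

Calling `run(disable_network_cmd())` and `run(suspend_project_cmd())` once, then `run(open_window_cmd(300))` from an hourly scheduled job, would approximate the "short on, then off again" routine, if the assumptions above hold for your BOINC version.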
                              ____________
                              Backups: Here

                              Profile KWSN-Sir Papa Smurph
                              Joined: Sep 25 12
                              Posts: 4
                              Credit: 32,662
                              RAC: 0
                              Message 45138 - Posted 23 Oct 2012 17:35:36 UTC

                                Andy is looking into whether there is anything that can be done to cancel these work units.


                                So what you're saying is that after spending 500 hours (the equivalent of ~21 days) crunching work for your project, your intention is to cancel the WU and give me no credit for 3 weeks of work?

                                Am I reading this correctly?

                                Profile astroWX
                                Forum moderator
                                Joined: Aug 5 04
                                Posts: 1250
                                Credit: 34,995,599
                                RAC: 23,022
                                Message 45139 - Posted 23 Oct 2012 19:06:19 UTC - in response to Message 45138.

                                  Andy is looking into whether there is anything that can be done to cancel these work units.


                                  So what you're saying is that after spending 500 hours (the equivalent of ~21 days) crunching work for your project, your intention is to cancel the WU and give me no credit for 3 weeks of work?

                                  Am I reading this correctly?

                                  Hardly.

                                  To start with, a time-line analysis would be useful. Was your task in the lot Les mentioned? If its release can be squeezed into the recent few days, you might have a concern, but only for scientific relevance (identical tasks run twice).

                                  As has been mentioned on these boards, over and over and over..., credits are awarded for Trickles returned, not at the completion of the task. No credits withdrawn, no exceptions.
                                  ____________
                                  "We have met the enemy and he is us." -- Pogo
                                  Greetings from coastal Washington state, the scenic US Pacific Northwest.

                                  skirtedrunner
                                  Joined: Sep 6 12
                                  Posts: 1
                                  Credit: 11,715
                                  RAC: 0
                                  Message 45149 - Posted 24 Oct 2012 9:56:02 UTC

                                    I am experiencing the same trouble as Dave Jackson. My hadam3p_eu files are not uploading. My user id is 684918.

                                    Profile Dave Jackson
                                    Joined: May 15 09
                                    Posts: 605
                                    Credit: 581,731
                                    RAC: 157
                                    Message 45150 - Posted 24 Oct 2012 10:21:47 UTC - in response to Message 45149.

                                      All of mine have gone through, though, and uploads have been going through again for a while now. I am a day or so away from requesting any new work, so with a bit of luck the problems on that score will be gone by the time I need some!

                                      Profile KWSN-Sir Papa Smurph
                                      Joined: Sep 25 12
                                      Posts: 4
                                      Credit: 32,662
                                      RAC: 0
                                      Message 45151 - Posted 24 Oct 2012 11:05:29 UTC

                                        Last modified: 24 Oct 2012 11:06:10 UTC

                                        Firstly, thanks for responding, astro.

                                        I have looked at this thread and the News thread mentioned here, and I can find no reference as to how I can determine whether my WUs are in a specific batch.
                                        I can tell you that they were both downloaded on 3 Oct 2012.

                                        As far as credits being awarded, I don't post or read the boards often & considering how long this project has been around there must be tens of thousands of posts. So I do apologize for not knowing your credit system.

                                        I have run this project on two or three occasions and I have always had some kind of problem; there is a reason that, out of over 250 million credits in 41 projects, I have less than 12k in this one. It seems that I just have bad luck in Climate.

                                        I will just wait until y'all sort this one out.

                                        Thanks again for responding...

                                        Profile KWSN-Sir Papa Smurph
                                        Joined: Sep 25 12
                                        Posts: 4
                                        Credit: 32,662
                                        RAC: 0
                                        Message 45178 - Posted 27 Oct 2012 13:47:45 UTC - in response to Message 45151.

                                          Last modified: 27 Oct 2012 13:51:01 UTC

                                          Woo Hoo, They are Gone!!!

                                          Good work Folks...

                                          Profile mo.v
                                          Forum moderator
                                          Joined: Sep 29 04
                                          Posts: 2354
                                          Credit: 6,492,442
                                          RAC: 2,144
                                          Message 45202 - Posted 31 Oct 2012 3:15:28 UTC

                                            Hi Smurph

                                            We can see which batch a WU belongs to from the date when the WU was created. Most batches consist of hundreds of WUs and take hours to be created, but all on the same day. Each batch is of a single model type. You can look at the following or previous WUs in a batch by editing the WU number in the address box. For example, here's one of mine that should, I hope, complete shortly:

                                            http://climateapps2.oerc.ox.ac.uk/cpdnboinc/workunit.php?wuid=8204632
                                            If you change the final 2 in the address box to 3 and press Go, you see the next WU:
                                            http://climateapps2.oerc.ox.ac.uk/cpdnboinc/workunit.php?wuid=8204633
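
The same address-box edit can be done in a few lines of Python if you want to step through several WUs in a batch. This is just a sketch that rewrites the `wuid` value in URLs of the form shown above; the function name `neighbouring_wu_url` is invented for the example, and it does not fetch anything from the server.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def neighbouring_wu_url(url, step=1):
    """Return the workunit URL whose wuid is `step` away from the one given."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    wuid = int(query["wuid"][0])
    query["wuid"] = [str(wuid + step)]
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))


start = "http://climateapps2.oerc.ox.ac.uk/cpdnboinc/workunit.php?wuid=8204632"
print(neighbouring_wu_url(start))      # ...workunit.php?wuid=8204633
print(neighbouring_wu_url(start, -1))  # ...workunit.php?wuid=8204631
```

Comparing the creation dates on the pages either side of your own WU then shows whether they belong to the same batch.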

                                            Occasionally a batch or part-batch is rereleased later if extra completed models from it are needed. You then often see a great disparity in the dates when different computers downloaded their tasks.
                                            ____________
                                            Cpdn news
                                            5 CPDN READMEs





                                            Copyright © 2002-2014 climateprediction.net