climateprediction.net home page

The world's largest climate forecasting experiment for the 21st century.

Climate prediction using twice CPU allocation


Ed Weber
Joined: Dec 11 05
Posts: 4
Credit: 383,991
RAC: 0
Message 43541 - Posted 10 Dec 2011 16:08:33 UTC

    BOINC setting 50% Climate & 50% SETI
    But Climate running High Priority in both 50%s
    SETI has been stopped and is waiting to run
    Problem has been around for over a month and new Climate jobs continue to displace SETI jobs

    How do I fix so there is proper sharing?

    Les Bayliss
    Forum moderator
    Joined: Sep 5 04
    Posts: 5127
    Credit: 8,436,525
    RAC: 4,382
    Message 43542 - Posted 10 Dec 2011 19:16:08 UTC - in response to Message 43541.

      Resource share is only an average over a long period of time, say 6 months.

      There can be several reasons why the climate models are constantly running in high priority, but the general reason is that BOINC currently considers that it's not going to have enough time to complete the 2 long models that you have before the deadline. (It doesn't know that there's no actual deadline with this project.)

      Two things to look at are your cache settings and the amount of time that the computer runs. The cache settings are "Computer is connected to the Internet about every" and "Maintain enough work for an additional". These should be kept low for this project so that you don't download too many models.

      One quick fix may be to make sure that you have the project set to No new tasks in the Projects tab, and then to Suspend one of the models, and keep it for later.



      Belfry
      Joined: Apr 19 08
      Posts: 170
      Credit: 3,303,045
      RAC: 1,214
      Message 43543 - Posted 10 Dec 2011 20:46:59 UTC - in response to Message 43542.

        Last modified: 10 Dec 2011 21:03:23 UTC

        Les Bayliss wrote: "Resource share is only an average over a long period of time, say 6 months."

        No, it won't take that long for two projects to reach equilibrium. Ed, the main reason you're seeing this behavior is that you're running two hadcm3n's (coupled, full-resolution oceans) on a single-core, hyper-threaded Pentium 4. Your seven seconds per timestep is forcing BOINC to run them in high priority, in an effort to finish them before the four-month deadline. You could turn off hyper-threading, which should lower your s/TS by around 40%, but this would limit your processing to one task at a time and somewhat diminish the machine's multi-tasking performance with other applications. Probably the best solution is to avoid hadcm3n's by changing your climateprediction.net preferences to select only the hadam3p models, of which there are plenty right now :)

        edit: grammar

        Belfry
        Joined: Apr 19 08
        Posts: 170
        Credit: 3,303,045
        RAC: 1,214
        Message 43544 - Posted 10 Dec 2011 20:57:09 UTC

          ... with regard to your computation errors, you may want to exclude your anti-virus from scanning the BOINC data directory. This is just my stab in the dark, but it's a fairly common reason for errors mid-task on Windows machines.

          Les Bayliss
          Forum moderator
          Joined: Sep 5 04
          Posts: 5127
          Credit: 8,436,525
          RAC: 4,382
          Message 43545 - Posted 10 Dec 2011 21:22:57 UTC - in response to Message 43543.

            6 months isn't about reaching equilibrium; it's about the mental attitude that people need to have when talking about "resource share" while running a mix of WUs, ranging from projects whose WUs take a few minutes or hours to the very long climate models that take months.



            Belfry
            Joined: Apr 19 08
            Posts: 170
            Credit: 3,303,045
            RAC: 1,214
            Message 43546 - Posted 10 Dec 2011 22:36:20 UTC

              Last modified: 10 Dec 2011 22:43:38 UTC

              I'm running a 1000-hour hadcm3n alongside 8-12 hour WCG tasks on my dual-core laptop right now, and each project gets 50% of CPU time, a core each. There just isn't enough oomph in the OP's machine to crunch one long task requiring 2016 hours (7s/TS) within seventeen weeks of shared time (2856 hours * 50% = 1428 hours). Another solution would be to change the resource allocation to 25% SETI, 75% CPDN. Then 2856 * 75% = 2142 hours would just make the squeeze, but the machine would need to be on 24/7.
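The arithmetic in this post can be checked with a small sketch. The function names below are made up for illustration, and the model is deliberately simple: it assumes the machine is on 24/7 and that a resource share translates directly into a fraction of wall-clock hours, which is how the post reasons but not necessarily how the BOINC scheduler behaves in detail.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def shared_hours(deadline_weeks: float, share: float) -> float:
    """Hours a project gets before the deadline at a given resource share (0..1)."""
    return deadline_weeks * HOURS_PER_WEEK * share


def fits(task_hours: float, deadline_weeks: float, share: float) -> bool:
    """True if the task can finish inside its share of the time before the deadline."""
    return task_hours <= shared_hours(deadline_weeks, share)


def min_share(task_hours: float, deadline_weeks: float) -> float:
    """Smallest resource share (0..1) that lets the task finish by the deadline."""
    return task_hours / (deadline_weeks * HOURS_PER_WEEK)


if __name__ == "__main__":
    # Figures from the post: a 2016-hour task and a seventeen-week deadline.
    for share in (0.50, 0.75):
        hours = shared_hours(17, share)
        print(f"share {share:.0%}: {hours:.0f} h available, fits: {fits(2016, 17, share)}")
    print(f"minimum share needed: {min_share(2016, 17):.1%}")
```

Under these assumptions a 50% share falls short and a 75% share just fits, matching the numbers quoted in the post.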

              Edit: by the way my first solution of turning off hyper-threading will not produce the desired sharing, since that faster s/TS is now only given half the time. I wasn't thinking.

              Belfry
              Joined: Apr 19 08
              Posts: 170
              Credit: 3,303,045
              RAC: 1,214
              Message 43547 - Posted 10 Dec 2011 23:00:41 UTC

                Last modified: 10 Dec 2011 23:10:03 UTC

                Hold on, I messed up. The hyper-threading of course allows you to run two tasks at the same time within BOINC, so the 7s/TS does fit within the 2856 hours available. But you're downloading two hadcm3n's at a time, and that is what's cutting your available time in half.

                Best solution still is to stick to the hadam3p's.

                Edited because it's Saturday.

                Les Bayliss
                Forum moderator
                Joined: Sep 5 04
                Posts: 5127
                Credit: 8,436,525
                RAC: 4,382
                Message 43548 - Posted 10 Dec 2011 23:17:44 UTC

                  And hyperthreading a P4 only gives about 1.25 processors' worth of cores. :(



                  Ed Weber
                  Joined: Dec 11 05
                  Posts: 4
                  Credit: 383,991
                  RAC: 0
                  Message 43549 - Posted 11 Dec 2011 5:12:26 UTC

                    Thanks for all your comments.

                    Unfortunately I don't have enough background to understand many of them.
                    I shall try the route of "no new jobs" and "suspend" one of the pair till the first is finished. Isn't automatic, but run time is long enough it isn't a burden.

                    Ed Weber
                    Joined: Dec 11 05
                    Posts: 4
                    Credit: 383,991
                    RAC: 0
                    Message 43550 - Posted 11 Dec 2011 5:17:44 UTC

                      Forgot to mention computer is running 24/7

                      Belfry
                      Joined: Apr 19 08
                      Posts: 170
                      Credit: 3,303,045
                      RAC: 1,214
                      Message 43551 - Posted 11 Dec 2011 17:37:19 UTC

                        Hi Ed, the crux of the problem is the downloading of two hadcm3n's. It would be great if BOINC would download one task at a time, but this is a shortcoming with BOINC. Even if you set your resource share to 99/1% and your work buffer to 0.1 days, as soon as BOINC requests work for the 1% project it will download the same number of tasks as cores in your machine (real or virtual). BOINC was originally developed at a time when dual-core machines were a dream, and I think the difficulty lies not in rewriting the client, but rather the hundreds of custom server applications and related database interactions.

                        A workaround is to manually request work: suspend network activity in BOINC, change the number of processors BOINC uses to one (50% for dual-core, 25% for quad-core, etc.), activate the network and update the project for which you need work, then suspend network activity again and set processors back to 100%. Yes, a pain.
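The manual-fetch workaround above can be written down as an ordered plan. This is only a sketch: the helper names are hypothetical, the `boinccmd` flags (`--set_network_mode`, `--project <URL> update`) are given as I understand them from the BOINC command-line tool and should be checked against your client version, and the processor-percentage steps are left as manual preference changes rather than guessed commands. The script only builds the plan; it does not run anything.

```python
from typing import List, Optional, Tuple

# A step is a description plus an optional boinccmd argument list.
Step = Tuple[str, Optional[List[str]]]


def one_core_percent(n_cpus: int) -> float:
    """Percentage of processors that corresponds to exactly one core."""
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    return 100.0 / n_cpus


def manual_fetch_plan(project_url: str, n_cpus: int) -> List[Step]:
    """Ordered steps for requesting work while BOINC sees only one core.

    project_url is whatever URL the project is attached under (placeholder here).
    n_cpus counts real or virtual cores, as the client reports them.
    """
    pct = one_core_percent(n_cpus)
    net_off = ["boinccmd", "--set_network_mode", "never"]
    net_on = ["boinccmd", "--set_network_mode", "always"]
    return [
        ("Suspend network activity", net_off),
        (f"Set processor usage to {pct:g}% (one core) in the computing preferences", None),
        ("Activate the network", net_on),
        ("Update the project that needs work", ["boinccmd", "--project", project_url, "update"]),
        ("Suspend network activity again", net_off),
        ("Set processor usage back to 100% in the computing preferences", None),
    ]


if __name__ == "__main__":
    # The URL below is a placeholder, not a real project address.
    for description, command in manual_fetch_plan("http://example.invalid/", n_cpus=2):
        print(description, "->", " ".join(command) if command else "(manual step)")
```

Keeping the plan as data rather than executing it makes it easy to review each step before typing it by hand, which seems sensible for something done only once per long model.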

                        Selecting only hadam3p's is the long-term answer (they're about one-fifth as long), but short-term you could suspend SETI and slog through the two hadcm3n's. The aliens can wait ;)





                        Copyright © 2002-2014 climateprediction.net