Backing up CPDN/BOINC
Author | Message |
---|---|
Joined: 11 Jan 06 Posts: 2 Credit: 448,290 RAC: 0 |
Couldn't find a single "official" answer to this in the posts... Other than the typical hard-drive-failure sorts of hazards, why would I want to back up my CPDN or other BOINC projects? Is there some danger of massive data loss (as in 50 days of work gone poof) if the PC is shut down improperly, say when the power is knocked out in a windstorm? Or would it just roll back to the last save point? Or is it possible/common for the last save point to get slagged? If backups really are in order, how do folks generally do them? I have a farm with about 16 machines, so non-manual solutions would be ideal... |
Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
The system will usually restart from the last checkpoint. However, if, for example, something went wrong during a checkpoint, or something corrupted the processing without causing the process to fail, then the checkpoint will be no use and your (20, 50, 100, ...) days of processing may be wasted. The procedure is to shut down BOINC, zip up the BOINC directory hierarchy, then restart. It would be easy enough to automate the zip process; I'm not sure about remotely shutting down / restarting BOINC, however. Any suggestions from anyone? Backups need not be done particularly frequently; it depends on how stable your system is (for example, if you usually run your machines for weeks or months between restarts, they're highly stable, but if they frequently crash or blue-screen, you should probably be doing very frequent backups...). The above backup routine only applies if you run only CPDN on a given machine (since a restore will confuse BOINC if another project is also rolled back to work which, according to that project's site, has already been completed). Backups are discussed in depth in the FAQ; I'm just going from memory, so expect the above description to be 'approximate'! :-) I'm a volunteer and my views are my own. News and Announcements and FAQ |
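The stop, zip, restart procedure described above could be automated roughly as follows. This is only a minimal sketch, not a script from this thread or from the FAQ: the function name `backup_boinc` and the `stop_client` / `start_client` callables are hypothetical placeholders, and how you actually stop and restart the BOINC client on each machine (locally or remotely) is left for you to supply. Only the zipping step uses a real call, Python's standard `shutil.make_archive`.

```python
import shutil
import time
from pathlib import Path
from typing import Callable


def backup_boinc(
    boinc_dir: Path,
    backup_dir: Path,
    stop_client: Callable[[], None],
    start_client: Callable[[], None],
) -> Path:
    """Stop the client, zip the whole BOINC directory, then restart the client.

    stop_client and start_client are hypothetical hooks supplied by the caller;
    this sketch does not assume any particular way of controlling BOINC.
    """
    boinc_dir = boinc_dir.resolve()
    backup_dir = backup_dir.resolve()

    # Refuse to write the archive inside the tree being archived,
    # otherwise the zip file would try to include itself.
    if backup_dir == boinc_dir or boinc_dir in backup_dir.parents:
        raise ValueError("backup_dir must be outside boinc_dir")

    backup_dir.mkdir(parents=True, exist_ok=True)
    # Timestamped names sort chronologically, e.g. boinc-backup-20060115-093000.zip
    stamp = time.strftime("%Y%m%d-%H%M%S")
    base_name = backup_dir / f"boinc-backup-{stamp}"

    stop_client()
    try:
        # Zip everything under boinc_dir; make_archive appends ".zip" itself.
        archive = shutil.make_archive(str(base_name), "zip", root_dir=str(boinc_dir))
    finally:
        # Restart the client even if the zip step failed.
        start_client()
    return Path(archive)
```

The `try`/`finally` is the one design choice worth noting: a failed backup should not leave a machine sitting idle. As the post says, restoring from such an archive is only straightforward when CPDN is the sole project on the machine.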
Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
It is recommended that CPDN be suspended before stopping BOINC. It takes a few seconds for CPDN to write everything to disk. Even stable machines go belly-up occasionally, as I'm sure you know. A bit of paranoia causes me to run backups on all eight machines every three days and keep two cycles of backups. (And today's the day!) Mine are done manually -- because I'm retired and have the time to wait until the Models pass Checkpoints, thereby minimizing rerun. (In Sulfur/Sulphur Cycle, Checkpoints are every 144 Time Steps. [Use the '8' key as a toggle in the display to get a countdown to the next Checkpoint.]) Case in point: We had a bit of a blow here on the US Washington coast about a week ago. Power outages and power interruptions. The interruptions were rapid on-off sequences that apparently triggered rapid UPS switching, which pushed a couple of the UPS units over the edge and took their machines down. The machines run Coupled Ocean Spinup (200-year models combining Sulfur Cycle and Ocean cycles) and were more than three months into four-month Runs. (P4 3.0s running stand-alone & single-thread.) After some 2500+ CPU hours, one Model restarted at zero, while keeping the time already burned. Fortunately, there was a day-old backup... Memory nags me that someone wrote and posted a script to automate backups. I think it was for Linux but am not sure. I'll give a look with "Search" but you may want to do that also. (I don't always use the proper search terms!) EDIT: One for Windoze CPDN here. (I haven't tried it.) The one I was thinking about was more complex, if I recall correctly. Perhaps it was on the other forum. http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=1559 "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest. |
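Keeping "two cycles of backups" as described above amounts to deleting everything but the newest two archives after each backup run. A minimal sketch of that rotation follows, under stated assumptions: the function name `prune_backups` and the file naming pattern `boinc-backup-<YYYYMMDD-HHMMSS>.zip` are hypothetical choices made for this example (timestamped names sort oldest to newest), not something taken from the thread.

```python
from pathlib import Path
from typing import List


def prune_backups(backup_dir: Path, keep: int = 2) -> List[Path]:
    """Delete all but the `keep` newest archives and return the deleted paths.

    Assumes archives are named boinc-backup-<YYYYMMDD-HHMMSS>.zip (a naming
    scheme invented for this sketch), so sorting by name is sorting by age.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")

    archives = sorted(backup_dir.glob("boinc-backup-*.zip"))
    stale = archives[:-keep]  # everything except the newest `keep` files
    for path in stale:
        path.unlink()
    return stale
```

Calling `prune_backups(Path("D:/backups"))` after each backup would leave the two most recent archives in place; the path shown is only an illustration.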
Joined: 11 Jan 06 Posts: 2 Credit: 448,290 RAC: 0 |
That Western WA blow was the first reminder that I needed a better story. My $5 power supplies on the farm didn't have the standup time to stay on through the flickers that my non-farm machines survived. None of them reset, thankfully. The second problem was a vine that had apparently been slowly pulling on an extension cord fed into an outside light fixture-adapter circuit. :) You don't hear THAT one every day... I'll take a look around for scripts, and read the FAQ; sorry about that. Thanks! Jeff |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
UK_Nick had a script on the Message board (hate that new name), and the subject was revisited late last year. As you say, what words to search for? |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
The thread containing UK_Nick's script is <a href="http://www.climateprediction.net/board/viewtopic.php?t=2377">here</a>, and there is more info in the BOINC Wiki <a href="http://boinc-doc.net/boinc-wiki/index.php?title=Backup_BOINC">here</a>. The hard part with backups is restoring them if it becomes necessary. Easy if running only one project, tedious and time-consuming if running more than one. (See the Wiki.) But VERY necessary if running climateprediction. Many's the time people have posted with tales of woe, having lost several hundred hours of crunching. :( |