climateprediction.net (CPDN) home page
Thread 'automatic backup would be nice'

Steve Wenner

Joined: 2 Mar 06
Posts: 27
Credit: 240,040
RAC: 0
Message 23367 - Posted: 24 Jun 2006, 2:11:01 UTC

After a recent crash of BOINC, when I lost six weeks and 85% of my sulphur run, I noticed on these boards that a lot of volunteers are a little frustrated and discouraged by crashes. I wonder if it would be possible for the software to automatically make a periodic backup of the critical files, so the program could restart and pick up where it left off after a crash. I would think that only the parameters for a recent state would need to be saved, so there should be little drain on disk space.

Thanks,
Steve
ID: 23367
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 23368 - Posted: 24 Jun 2006, 3:42:29 UTC

It's not the backup that's the problem, but the restore.

BOINC is a 'universal' program, used by a lot of people to run multiple projects.
Some of the info for ALL projects appears, intermingled, in a single file, so when you restore for one project, you restore for all of them.
The same thing applies for people with a single project, but running a hyperthreaded processor, or a dual core processor.

As far as I know, ALL the other projects are of short duration, only lasting hours to a few days, so there isn't the need for a backup.
As BOINC is written and supported by the University of California, Berkeley, the home of the biggest project SETI, they aren't very interested in building it in.

The files that need to be backed up are scattered through several folders, and it's best to simply stop and exit from BOINC, then copy the entire BOINC folder to somewhere else.
Some people on the BBC project use Suitcase for backups.
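
The manual approach described above (exit BOINC, then copy the whole folder) could be scripted roughly as follows. This is only a minimal sketch, not anything shipped with BOINC or CPDN: the function name, the timestamped naming scheme, and both paths are hypothetical, and it assumes BOINC has already been stopped and exited before it runs.

```python
import shutil
import time
from pathlib import Path


def backup_boinc_folder(boinc_dir: Path, backup_root: Path) -> Path:
    """Copy the entire BOINC folder into a new timestamped folder.

    Assumption: BOINC has been stopped and exited first, so no file
    changes while the copy is in progress.
    """
    if not boinc_dir.is_dir():
        raise FileNotFoundError(f"BOINC folder not found: {boinc_dir}")

    # Hypothetical naming scheme: <folder name>-backup-<date>-<time>
    stamp = time.strftime("%Y%m%d-%H%M%S")
    destination = backup_root / f"{boinc_dir.name}-backup-{stamp}"

    # copytree copies every subfolder and file, so the files that are
    # scattered through several folders stay consistent with each other.
    shutil.copytree(boinc_dir, destination)
    return destination
```

Copying the whole folder, rather than picking individual files, matches the point made above: the state for all projects is intermingled, so a backup is only useful if everything in it was captured at the same moment. Restoring is the reverse copy, again with BOINC shut down.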

For those with stable computers, a crash can often be automatically recovered from without problems. My computer has done this several times during power problems without my even being at home.


ID: 23368
astroWX
Volunteer moderator

Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 23369 - Posted: 24 Jun 2006, 3:57:51 UTC
Last modified: 24 Jun 2006, 4:00:25 UTC

It's a wee bit more complicated than you think, Steve. Many files are involved and the process is dynamic. It cannot be allowed to get out of sync. It's true that this is a CPU-heavy Project but it is also, despite Carl's efforts to reduce I/O, a heavy-hitter on the hard disk.

Manual backups are tried and true and are the safest bet.

That said, if you're running the BBC flavor of the Project, it's long been done -- and documented: BBC auto-backup:
http://bbc.cpdn.org/forum_thread.php?id=2748&PHPSESSID=6e4825ae7d7f83b1fad45b755f0ad6ec#11915
http://bbc.cpdn.org/forum_thread.php?id=2748

[Edit: Oops, you beat me to it, again, Les. My "typing" scores again (something like the US Soccer team).]
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
ID: 23369
Steve Wenner

Joined: 2 Mar 06
Posts: 27
Credit: 240,040
RAC: 0
Message 23404 - Posted: 29 Jun 2006, 1:37:13 UTC - in response to Message 23369.  

Thanks Astro and Les,

At least I understand now why it might not be easy to write an autobackup that will work well. I read the post from Richard Rodway about his autobackup for the BBC version (I think I am running that flavor, but I'm not entirely sure), but it seems like it might be more trouble than it is worth, and I don't think I want to do a beta test.

While I've got you here, I have a related question. I completed one sulphur run and two runs crashed. My question: is there any reason why I can't delete the three sulphur folders and any file in the containing folder with the name "sulphur ..." (my current project is "hadcm ...")? It looks like that would save me two or three gigs of hard disk space.

Thanks,
Steve


ID: 23404
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 23405 - Posted: 29 Jun 2006, 2:23:46 UTC

Hi Steve.

There's no problem with deleting folders of models that have failed or finished.
The new TCMs will clean up after themselves, provided that they don't crash, so that will be a bit easier.

ID: 23405
astroWX
Volunteer moderator

Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 23406 - Posted: 29 Jun 2006, 4:19:02 UTC


Careful, though, not to delete anything with 'SULPC'. (All those Sulphur Runs were to generate input files for the Coupled Models we're running now...)
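
For anyone who prefers to check before deleting by hand, the rule above could be sketched in Python along these lines. It is only an illustration under stated assumptions: the function name is hypothetical, the "sulphur" prefix is a guess at how the old model folders and files are named on a given machine, and the function only lists candidates; it deletes nothing.

```python
from __future__ import annotations

from pathlib import Path


def deletion_candidates(project_dir: Path, prefix: str = "sulphur") -> list[Path]:
    """List folders and files that look like leftovers from old sulphur runs.

    Assumptions: leftovers start with `prefix` (hypothetical default), and
    anything with 'SULPC' anywhere in its name must be kept.
    """
    keep_marker = "sulpc"  # compared in lower case, so it also matches 'SULPC'
    candidates = []
    for entry in sorted(project_dir.iterdir()):
        name = entry.name.lower()
        if name.startswith(prefix.lower()) and keep_marker not in name:
            candidates.append(entry)
    return candidates
```

Reviewing the returned list first, and only then deleting finished or failed model folders with BOINC shut down, keeps the 'SULPC' input files out of harm's way.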

Best of luck, Steve.
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
ID: 23406
Steve Wenner

Joined: 2 Mar 06
Posts: 27
Credit: 240,040
RAC: 0
Message 23551 - Posted: 9 Jul 2006, 12:06:41 UTC

After my first posting of this issue 10 days ago, I have been running a hadcm model. When I checked this morning the project had vanished from BOINC Manager and a new hadcm model had started running.

Sorry guys, of the four work units (three Sulphur and one Hadcm) I have attempted since I joined March 2 only one finished successfully. Until you can get a more robust way to run ClimatePrediction I'll have to bow out; I am afraid that my CPU is burning more BTUs than the information I can provide is worth!

I know the problems you face are immensely difficult, and I believe the ClimatePrediction enterprise is very important. Best of luck to you. I'll keep my eye on these message boards and try again in a few months if it seems things might have improved.

Kind regards,
Steve
ID: 23551
