Server out of space
Author | Message |
---|---|
Joined: 11 Sep 05 Posts: 5 Credit: 880,340 RAC: 0 |
Hi, for some hours now I haven't been able to upload the latest result my host just finished crunching; it looks like there's not enough space on the server's HD. Below is a transcript of the BOINC messages. Bye darkpella 08/06/2009 14.53.51 climateprediction.net [file_xfer] Started upload of file hadam3p_n7yu_1997_2_006158816_4_2.zip |
Joined: 20 Feb 06 Posts: 158 Credit: 1,251,176 RAC: 0 |
See "Unable to upload 2 zip files" for details |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Or read the News and Announcements thread at the top of Number Crunching. This, like all threads, can be subscribed to, but you do need to make sure that you have email messages ON. Backups: Here |
Joined: 1 Sep 04 Posts: 23 Credit: 5,124,321 RAC: 3,353 |
Maybe it's time for the project to archive? I have tasks from 2004 from long defunct (or at least transmogrified) computers still showing. If, in fact, ALL previous work is still on the servers, then perhaps a judicious moving of tasks to offline storage might be in order? Because this is obviously NOT a new, brilliant idea, I'm curious why it hasn't been done (if it hasn't already been done) :-) |
Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317 |
...I'm curious why it hasn't been done (if it hasn't already been done) :-)... it may be because the data is online for climate scientists at the CPDN data portal. Cunningly, there is only one copy of the data: as I understand it, the portal is really an index to the data uploaded from our models. However, that does mean that old data is kept. |
Joined: 1 Sep 04 Posts: 23 Credit: 5,124,321 RAC: 3,353 |
Thanks Iain, </choir preaching on> There's no doubt the researchers need access to the data, but does it have to be on the dynamic servers? I assume their access rate is quite a bit less than the processors (us). OK, money, people etc. figure into this, but Tolu just said he's buying a new server to handle the dynamic load. Perhaps the server might be better used as a researcher access server? Move the old stuff and voila, she works again! </choir preaching off> I don't pretend to know the funding and politics of the project, but I did work for the US Government for 30 years so I DO understand bureaucracy :-) All this just seems a bit strange to me. Of course I haven't had to tread the academic footsteps that Tolu et al. (if there are any) have to follow. My simple engineer's mind tells me that just adding servers will eventually fail to scale up to the load. |
Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317 |
There's no doubt the researchers need access to the data, but does it have to be on the dynamic servers? I assume their access rate is quite a bit less than the processors (us)...... indeed, there is another problem that hits us from time to time, which is that when some researcher develops an interest in the data, 'our' servers get well and truly thrashed. Not ideal, but if things were ideal then climate research would be the backwater it once was ... |
Joined: 1 Sep 04 Posts: 23 Credit: 5,124,321 RAC: 3,353 |
Thanks again Iain, but... Yes, I agree, but can we make it better? It's worth a try. Tolu...Milo? How off base am I? Rick |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Strangely enough, the project people do know about these ideas. They've been discussed privately with some of the moderators. If only they could get rid of the alligators, so they could concentrate on getting rid of the malaria carrying mosquitoes. Then they would be able to ..., erm, ..., what was it again? It's been so long that I forget what the original job was. |
Joined: 1 Sep 04 Posts: 23 Credit: 5,124,321 RAC: 3,353 |
Aye, yup. I understand. I knew I wasn't asking anything new (but perhaps revisiting old thoughts). I just thought that I'd prime the engine again. Sometimes it's worth tickling the process (for whatever that may be worth). Rick |
Joined: 2 Mar 06 Posts: 253 Credit: 363,646 RAC: 0 |
Yes, I agree, but can we make it better? It's worth a try. Tolu...Milo? How off base am I? Well, at present we have a total of around 22.5 TB of data and this is increasing quite rapidly. I don't know the current rate but should be able to calculate it within a week or two. All of these data are on-line so researchers can get to them; basically this means that the big RAID arrays upon which the data sit are set up so that they may be accessed via https. There are a few options I've considered to deal with this: 1. Buy more data servers. I'm doing this at the moment but I have to deal with much bureaucracy so it is quite slow, and it takes an actual lack of space to swing things into action. More space was requested when we started sending out hadam3p jobs but it has taken until now. 2. Archive data on tape. This has all the problems associated with (1) plus the additional one that the general view of tapes around here is that they are a very expensive nuisance and that it would be better to buy more RAID arrays. There is a central university tape backup system but they refuse to handle our data as the volume is too large and moving files around (as I'm having to do now) causes them enormous problems. 3. Persuade someone else to host the data. We are keen to have collaborating universities host upload servers, which means that the bureaucracy problems are simply off-loaded on to them. 4. Delete some of the data. This would be an easy option but no-one wants to do it because it is thought that any of our data could be potentially useful. Carrying on with plan 1 means that there is at least some scalability, even though it's not a perfect solution. |
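For anyone who wants to play with the numbers once a growth rate is known, here is a rough sketch of the arithmetic behind "how long until the disks fill up". Only the 22.5 TB figure comes from the post above; the earlier measurement, the time interval and the total capacity are made-up placeholders, and the function names are hypothetical. It assumes simple linear growth, which may well not hold for the project.

```python
def growth_rate_tb_per_day(size_then_tb, size_now_tb, days_between):
    """Average growth in TB/day between two size measurements."""
    if days_between <= 0:
        raise ValueError("days_between must be positive")
    return (size_now_tb - size_then_tb) / days_between


def days_until_full(capacity_tb, used_tb, rate_tb_per_day):
    """Days of headroom left at a constant growth rate.

    Returns None when the data set is not growing, because the
    storage then never fills under this simple linear model.
    """
    if rate_tb_per_day <= 0:
        return None
    free_tb = max(capacity_tb - used_tb, 0.0)
    return free_tb / rate_tb_per_day


if __name__ == "__main__":
    # Placeholder inputs: only 22.5 TB is taken from the thread.
    rate = growth_rate_tb_per_day(size_then_tb=20.0, size_now_tb=22.5,
                                  days_between=50)
    print("growth (TB/day):", rate)
    print("days until full:", days_until_full(capacity_tb=30.0,
                                              used_tb=22.5,
                                              rate_tb_per_day=rate))
```

Feeding in two real size measurements taken a week or two apart would give a first estimate of how far ahead new storage needs to be requested.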
Joined: 6 Aug 04 Posts: 264 Credit: 965,476 RAC: 0 |
What about buying some external hard disks connected via Ethernet, USB or Firewire? Here is what I read on The Register: Hard disks Tullio |
Joined: 2 Mar 06 Posts: 253 Credit: 363,646 RAC: 0 |
What about buying some external hard disks connected via Ethernet, USB or Firewire? Here is what I read on The Register: IIRC they don't have any RAID capability other than 0 or 1. The former cannot be countenanced and the latter reduces the space too much. There are also difficulties (although not insurmountable) in incorporating them in the results portal. |
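To put rough numbers on the RAID 0 versus RAID 1 remark, here is a small sketch of the usual usable-capacity arithmetic for arrays of identical disks. The function name and the disk counts and sizes are illustrative assumptions, not a description of the project's hardware; RAID 5 and RAID 6 are included only for comparison with what larger arrays typically offer.

```python
def usable_tb(level, disks, disk_tb):
    """Usable capacity in TB for an array of identical disks."""
    minimum_disks = {"raid0": 2, "raid1": 2, "raid5": 3, "raid6": 4}
    if level not in minimum_disks:
        raise ValueError("unknown RAID level: " + level)
    if disks < minimum_disks[level]:
        raise ValueError(level + " needs at least "
                         + str(minimum_disks[level]) + " disks")
    data_disks = {
        "raid0": disks,      # striping only: all space usable, no redundancy
        "raid1": 1,          # mirroring: every disk holds the same data
        "raid5": disks - 1,  # one disk's worth of space goes to parity
        "raid6": disks - 2,  # two disks' worth of space goes to parity
    }
    return data_disks[level] * disk_tb


if __name__ == "__main__":
    # Hypothetical two-disk external enclosure with 1 TB disks.
    for level in ("raid0", "raid1"):
        print(level, usable_tb(level, disks=2, disk_tb=1.0), "TB usable")
    # Hypothetical eight-disk array with 1 TB disks, for comparison.
    for level in ("raid5", "raid6"):
        print(level, usable_tb(level, disks=8, disk_tb=1.0), "TB usable")
```

With two disks, RAID 0 keeps all the space but loses everything if either disk fails, while RAID 1 survives a failure at the cost of half the raw space. Parity-based levels on larger arrays give up a much smaller fraction.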
Joined: 1 Sep 04 Posts: 23 Credit: 5,124,321 RAC: 3,353 |
Thanks for the update Milo. You're in the trenches trying to make all this work. I really appreciate that. You have been quite successful so far. Obviously, this is not an easy problem to solve. Is there ANY way the higher-ups can look at the architecture of CPDN and plan for the future? I think I know the answer to this question though...sigh. |
Joined: 2 Mar 06 Posts: 253 Credit: 363,646 RAC: 0 |
Thanks. There is some planning going on, one example of which is that we might get a new - and considerably more powerful - database server if all goes well (still working on this). However, on the science side of CPDN there are lots of physicists doing lots of different projects, all concerned with their own research and each with a separate equipment grant, so it will tend to become a bit disjointed. |
Joined: 15 Mar 09 Posts: 5 Credit: 187,665 RAC: 0 |
Good June! This present technical problem with the CPDN system irritated me slightly in the beginning: "Why don't these guys take the necessary steps in advance...?" However, I soon calmed down as I remembered an old story from my country. There was a certain extremely high-ranking military officer, Marshal of Finland C. G. E. Mannerheim, who was the Commander-in-Chief in all Finnish wars last century. It seems that sometime during WWII, when he was doing some travelling, his car broke down. The discussion went something like this. "Driver, why didn't you take this vehicle to the workshop in time?" "Sir, do you take your watch to the clocksmith before it stops?" A long pause ensued..., then in a different tone... "No, actually, I don't..." Regards Pasi Karonen Finland |
Joined: 28 Apr 09 Posts: 18 Credit: 575,431 RAC: 0 |
Does anyone know the current status of the uploader1atm server? The power supply was supposed to be installed on Tuesday, but it is now Friday in the UK and nothing has happened so far as I am aware. I have 17 zip files now which want to go and 6 finished WU which I would like to clear out. Any fresh info would be helpful. Cheers Bill |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
CPDN usually hasn't got as much funding as it needs for hardware or to employ enough programmers. There have only been two programmers since Carl left in June 2007. I once asked Carl whether in an emergency he could ask the Oxford Uni computing staff to help with server problems. He just laughed and said distributed computing is so different from their normal work that they completely refuse to help. This does not surprise me. Tolu, who develops the new models almost single-handed, once told me 'I prefer not to work at the weekend.'. This does not surprise me either! Milo is doing everything he can to find solutions not just for the immediate crisis but for the longer term. The moderators have been concerned for a long time about how BOINC handles files that temporarily cannot be uploaded from Transfers. This BOINC Trac ticket was opened two years ago by one of our moderators, MikeMarsUK. Cpdn news |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Hi Bill I expect you have HadAM3P files to upload. The files from each of these models upload to three different servers. Strange but true. These servers are the three that are disabled: * uploader.oerc (disk space) * cpdn-upload1.comlab (disk space) * uploader1.atm (power supply) On Thursday the power supply had not been delivered to the university store in Oxford. I do not think this server can be up and running before Monday. But in any case we cannot upload our HadAM3P files until these three servers are all running. Milo has been moving large quantities of data to make this possible and as quickly as possible. He said late on Friday that he has uploader.oerc working again, but I don't know when he will enable it for uploads. He knows about the BOINC 2-week file upload deadline. Cpdn news |
Joined: 31 Dec 07 Posts: 1152 Credit: 22,133,755 RAC: 2,026 |
I have a question about the server problem. I just downloaded one of the Mid-Holocene models. I have suspended network activity because I understand that trickles are not being accepted. Will there be a problem if the servers are still down when the HM model reaches the end of phase 1? Does it need to upload a .zip file containing the results of the phase at this point, and will it give up after a certain amount of time if it can't? |
©2024 climateprediction.net