Unable to upload 2 zip files

Profile old_user206071
Joined: 29 Oct 06
Posts: 14
Credit: 99,628
RAC: 0
Message 37184 - Posted: 12 Jun 2009, 20:42:39 UTC

One of the 3 .zip files of my HADAM3P model was uploaded some time during the last 24 hours...

??


My NEW BOINC-Site

Why people joined BOINC Synergy...
ID: 37184
Profile mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 37185 - Posted: 12 Jun 2009, 20:57:04 UTC
Last modified: 12 Jun 2009, 20:57:40 UTC

Milo has moved more than 3TB of data from uploader.oerc and has now got it up and running again. The files from each HadAM3P go to 3 different servers - the three that were down. But the other files from HadAM3Ps can't upload because the two servers they need are still down. So please don't press the Retry Now button in the Transfers tab; the other files cannot upload.

We are still advising members to suspend models before they complete, if this is possible, and to suspend BOINC network activity as much as possible if you have files that cannot upload.

Please watch the News thread for updates on the situation. There's a link in my signature. Subscribe to the News thread.
Cpdn news
ID: 37185
Richard Haselgrove

Joined: 1 Jan 07
Posts: 942
Credit: 34,147,377
RAC: 4,175
Message 37186 - Posted: 12 Jun 2009, 20:57:23 UTC - in response to Message 37184.  

One of the 3 .zip files of my HADAM3P model was uploaded some time during the last 24 hours...

Yes, one of the three upload servers with problems, uploader.oerc.ox.ac.uk, was returned to service last night.

cpdn-upload1.comlab.ox.ac.uk is still stuffed full of data - slowly being transferred to an alternative storage host in Canada - and uploader1.atm.ox.ac.uk is still awaiting a new power supply, delayed somewhere in the congested UK supply chain. The three .zip-files are uploaded to different servers to minimise network congestion: the other two files will have to wait until their respective hosts are re-invigorated or replaced.
ID: 37186
Helmer Bryd

Joined: 16 Aug 04
Posts: 147
Credit: 7,898,704
RAC: 8,343
Message 37189 - Posted: 12 Jun 2009, 21:57:48 UTC - in response to Message 37106.  
Last modified: 12 Jun 2009, 22:13:18 UTC


Until now every time there's been a file upload problem the programmers in Oxford have solved the server disk space issue before the BOINC 14-day limit. We know that Milo is already doing everything he can. I am confident that he will succeed again.


Thanks, Mo.
The current situation is that there is apparently some money that I can spend, and I have a quote for a suitable server, but before I can send the order I require a signature on a form. When I will be able to get that is at present unknown. Usually, orders for servers take about a week once sent.

Unfortunately, switching uploads to other systems isn't a possibility at the moment as there isn't enough space anywhere.

Ideally I would like to have new servers in place before we run out of space, but managing to secure money in time to do so is very difficult.


OK, Oxford has their priorities, well, I have a lot of results to upload and going on a vacation now so I guess most of them will be lost, well well
ID: 37189
Profile Iain Inglis

Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 37190 - Posted: 12 Jun 2009, 22:51:58 UTC - in response to Message 37189.  

... I have a lot of results to upload and going on a vacation now so I guess most of them will be lost, well well

Not so, I believe. Someone better informed may correct this, but, as I understand it, BOINC will delete an upload the first time it fails to upload after 14 days. So, if you suspend network activity before your vacation and on returning check that all the upload servers are available then you should get one attempt at uploading - which should succeed.
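The rule as described here can be sketched in a few lines of Python. This only illustrates the behaviour as understood in this thread; it is not BOINC's actual code. The function name `attempt_outcome`, the 14-day constant and the example times are assumptions made for the illustration.

```python
from datetime import datetime, timedelta

# Assumption from this thread: BOINC gives up on a file the first time an
# upload attempt fails more than 14 days after the first failure.
GIVE_UP_AFTER = timedelta(days=14)


def attempt_outcome(first_failure: datetime, now: datetime, upload_ok: bool) -> str:
    """Return what happens to a pending file when one upload attempt is made."""
    if upload_ok:
        return "uploaded"
    if now - first_failure > GIVE_UP_AFTER:
        # A failed attempt after the window has passed: the file is dropped.
        return "deleted"
    # Still inside the window: the file stays queued for another retry.
    return "retry later"


first_failure = datetime(2009, 6, 5, 12, 0)  # illustrative time of day

# Day 10, server still down: the file stays in the queue.
print(attempt_outcome(first_failure, datetime(2009, 6, 15, 12, 0), upload_ok=False))
# Day 20, network was suspended until the server came back: the attempt succeeds.
print(attempt_outcome(first_failure, datetime(2009, 6, 25, 12, 0), upload_ok=True))
# Day 20, attempt made while the server is still down: the file is lost.
print(attempt_outcome(first_failure, datetime(2009, 6, 25, 12, 0), upload_ok=False))
```

The point of suspending network activity is visible in the last two cases: after the window has passed, everything depends on whether that single attempt finds the server available.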
ID: 37190
metalius
Joined: 28 Nov 06
Posts: 89
Credit: 11,373,490
RAC: 2,720
Message 37197 - Posted: 13 Jun 2009, 12:54:01 UTC
Last modified: 13 Jun 2009, 13:00:12 UTC

An additional question.
What will happen to intermediate result files (HADSM3, HADSM3-MH) after 14 days? The same? Will they be lost?
And I really don't know what to do, because at the moment I have tasks from 4 projects on almost all of my machines, so I cannot suspend network activity.
Maybe suspend CPDN altogether and wait?
ID: 37197
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 37199 - Posted: 13 Jun 2009, 14:36:40 UTC

If by intermediate you mean zip files part way through the model, then the answer is: The same. It applies to ALL zip files.

It's possible to suspend in several ways: the whole project, or individual models, for instance. It's best to set the project to No new tasks first, so that you don't get any more work just yet to complicate things further.
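These steps are normally done from the BOINC Manager. For anyone who prefers to script them, here is a minimal sketch that builds the equivalent `boinccmd` command lines. The project URL is an assumption (use whatever your client shows for CPDN), and the tool name and its options vary between client versions, so check `boinccmd --help` before relying on it.

```python
import subprocess
from typing import List

# Assumption: the project URL as attached in your BOINC client; it may differ.
CPDN_URL = "http://climateprediction.net/"


def boinccmd_args(action: str, project_url: str = CPDN_URL) -> List[str]:
    """Build the boinccmd argument list for one of the actions discussed here."""
    if action in ("nomorework", "suspend", "resume"):
        # Project-level operations: no new tasks, suspend, resume.
        return ["boinccmd", "--project", project_url, action]
    if action == "network_off":
        return ["boinccmd", "--set_network_mode", "never"]
    if action == "network_auto":
        return ["boinccmd", "--set_network_mode", "auto"]
    raise ValueError(f"unknown action: {action}")


def run(action: str) -> None:
    """Run the command against the local client (needs boinccmd on the PATH)."""
    subprocess.run(boinccmd_args(action), check=True)


# "No new tasks" first, then suspend the project, in the order suggested above.
for step in ("nomorework", "suspend"):
    print(" ".join(boinccmd_args(step)))
```

Nothing is executed unless `run()` is called, so the script can be used just to print the commands and check them first.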


Backups: Here
ID: 37199
Profile old_user172201
Joined: 7 Mar 06
Posts: 5
Credit: 4,085,123
RAC: 0
Message 37203 - Posted: 13 Jun 2009, 17:37:44 UTC - in response to Message 37189.  
Last modified: 13 Jun 2009, 17:42:02 UTC

OK, Oxford has their priorities, well, I have a lot of results to upload and going on a vacation now so I guess most of them will be lost, well well


@cwhyl: If you will be returning from vacation after the 14-day limit has passed, this is a possible workaround to apply before you leave:

Stop BOINC and make a backup copy of your BOINC folder (or the BOINC and DATA folders if using BOINC 6.x). Open client_state.xml in a text editor and remove the whole climateprediction project. A CPDN project starts with the <project> section and ends with the <project_files> section. Don't forget the <active_task> group in the <active_task_set> section at the end of the client_state.xml file. This will only be there for tasks that are still running; the ones that have finished don't have one. To avoid errors being reported if you did something wrong, set the <user_network_request> option near the end of the file to 3, which is the value for "no network connection". Now save your new client_state.xml and restart BOINC.

Now every project should run as before except CPDN, which should be missing. If something strange happens (errors, for example), stop BOINC, delete the BOINC and DATA folders, restore from your backup and try again. In Germany we say "Versuch macht kluch", which means something like "Trial makes wise" ;-)

When you return from vacation, you can paste the complete climate project from the backup back into your active client_state.xml, including the <active_task> group(s). Now you can continue where you stopped before the vacation.

This is what I would do in your situation. Hope that helps and you're still at home!
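The two mechanical parts of this workaround, the backup and the network setting, could be sketched in Python as below. It is a rough illustration only: the tag name `<user_network_request>` and the value 3 are taken from this post and have not been checked against BOINC itself, the helper names are made up, and BOINC must be stopped before any file is touched. Removing the project sections is left as a manual step.

```python
import re
import shutil
import time
from pathlib import Path


def backup_boinc_dir(data_dir: Path) -> Path:
    """Copy the whole BOINC data directory next to itself with a timestamp."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = data_dir.with_name(f"{data_dir.name}-backup-{stamp}")
    shutil.copytree(data_dir, target)
    return target


def set_user_network_request(state_xml: str, value: int) -> str:
    """Replace the number inside <user_network_request>...</user_network_request>."""
    pattern = r"(<user_network_request>)\s*\d+\s*(</user_network_request>)"
    new_xml, count = re.subn(pattern, rf"\g<1>{value}\g<2>", state_xml)
    if count != 1:
        # Refuse to guess if the tag is missing or appears more than once.
        raise ValueError(f"expected exactly one <user_network_request>, found {count}")
    return new_xml


# Tiny stand-in for a real client_state.xml, just to show the replacement.
sample = "<client_state>\n<user_network_request>1</user_network_request>\n</client_state>\n"
print(set_user_network_request(sample, 3))
```

Working on the text rather than parsing the XML keeps the rest of the file byte-for-byte unchanged, which makes it easier to compare against the backup afterwards.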
ID: 37203
Profile Rick B

Joined: 17 Feb 09
Posts: 31
Credit: 1,412,186
RAC: 121
Message 37214 - Posted: 13 Jun 2009, 22:38:51 UTC - in response to Message 37197.  

... so I cannot suspend network activity.
Maybe suspend CPDN altogether and wait?


That is what I did. I suspended CPDN altogether and let the extra processor time work on my other project. I have had 2 files awaiting transfer for about 7 or 8 days now, and I notice that one of the two got through at 06:14:34 AM my time (UTC-4) today, but the other file has yet to transfer. I'm not sure I want to mess with client_state.xml, so if I lose this WU, so be it. I'll keep the project suspended until the server/storage problems are corrected, then go back to the WUs I already have. I'm sure it will get corrected, but a few work units may be lost.


Rick





ID: 37214
Profile Rick B

Joined: 17 Feb 09
Posts: 31
Credit: 1,412,186
RAC: 121
Message 37215 - Posted: 13 Jun 2009, 22:48:25 UTC - in response to Message 37186.  

... slowly being transferred to an alternative storage host in Canada ...


Hey, I'm in Canada. Can I just send my zip files somewhere closer? Maybe it's with Jimmy? Sally or Suzie? ...

I digress ... That's an American/Canadian thing ... Or maybe it's just a good beer thing. I am Canadian
Rick





ID: 37215
Profile mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 37217 - Posted: 14 Jun 2009, 1:51:07 UTC
Last modified: 14 Jun 2009, 1:59:35 UTC

Hi Rick

The data that's been transferred temporarily to Canada is from models that have already been completed, so I'm afraid your current models' zip files can't be mixed up with that. If you knew how slow the connection is between Oxford and the place in Canada, you wouldn't want to upload anything to them!

I don't think anyone's zip files will be lost. For people who've had HadAM3P files stuck in the Transfers tab, the only files left now will be those that go to uploader1.atm which still needs its new power supply. Its PSU failed on Friday 5 June, so the 2-week BOINC time limit for retrying failed uploads won't be reached until Friday 19 June.
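The date arithmetic can be checked quickly. This small sketch takes the failure date and the 2-week limit from the paragraph above; the variable names are only illustrative.

```python
from datetime import date, timedelta

psu_failed = date(2009, 6, 5)        # date of the PSU failure, as given above
retry_window = timedelta(days=14)    # the 2-week limit, as described in this thread

deadline = psu_failed + retry_window

# Print both dates with their weekday names.
print(psu_failed.strftime("%A %d %B %Y"))
print(deadline.strftime("%A %d %B %Y"))
```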

But after the 2-week limit BOINC still allows one final upload attempt. So if the worst comes to the worst, members with stuck zip files can suspend network activity before Friday and only allow the files to upload after it's certain that uploader1.atm is up and running.

I've still had some HadAM3P models running since the server problem started, but I've suspended them before they completed.

Hi XJR-Maniac

If cwhyl has turned off his computers or disabled BOINC network activity before going on holiday, my suggestion in this post should also save his zip files from extinction, even if they time out while he's away.

I don't think anyone needs to edit any BOINC files at the moment.
Cpdn news
ID: 37217
Richard Haselgrove

Joined: 1 Jan 07
Posts: 942
Credit: 34,147,377
RAC: 4,175
Message 37218 - Posted: 14 Jun 2009, 9:20:31 UTC - in response to Message 37217.  

But after the 2-week limit BOINC still allows one final upload attempt. So if the worst comes to the worst, members with stuck zip files can suspend network activity before Friday and only allow the files to upload after it's certain that uploader1.atm is up and running.

The trouble with this is that the (new?) (repaired?) servers are likely to be very busy playing 'catch-up' for the first few days after installation, and even if the servers can handle the load, congestion in the communications infrastructure may cause a few uploads to fail.

Don't all rush at once when the servers come back up - form an orderly queue and wait your turn!
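One common way for software to "form an orderly queue" is to randomise retry delays so that clients do not all come back at the same moment. The sketch below shows that general technique (exponential backoff with full jitter). It is not a description of what the BOINC client actually does, and the base delay, cap and function name are all assumptions.

```python
import random
from typing import Optional


def retry_delay(attempt: int, base: float = 60.0, cap: float = 4 * 3600.0,
                rng: Optional[random.Random] = None) -> float:
    """Seconds to wait before retry number `attempt` (1, 2, 3, ...)."""
    rng = rng or random.Random()
    # The ceiling doubles with every attempt until it reaches the cap.
    ceiling = min(cap, base * 2 ** (attempt - 1))
    # Full jitter: pick uniformly in [0, ceiling] so clients spread out.
    return rng.uniform(0.0, ceiling)


# Show a few sample delays for successive attempts by one client.
rng = random.Random()
for attempt in range(1, 6):
    print(attempt, round(retry_delay(attempt, rng=rng)))
```

With thousands of hosts waiting on the same server, spreading the retries like this reduces the chance that the first hour after a repair is one large burst of uploads.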
ID: 37218
Profile old_user217043

Joined: 3 Jan 07
Posts: 10
Credit: 634,737
RAC: 0
Message 37223 - Posted: 15 Jun 2009, 18:56:38 UTC

Query:

I have suspended my HADAM3P tasks, but my RETRY clock still seems to be running. Is that OK?

Thanks!

Dora
ID: 37223
old_user565985

Joined: 28 Apr 09
Posts: 18
Credit: 575,431
RAC: 0
Message 37224 - Posted: 15 Jun 2009, 20:07:10 UTC

Does anyone know where the new PSU for the uploader1.atm server is? For the love of God, it should have been there last Tuesday. We are nearly a week later on, and still no PSU. If I still lived in the UK I could have easily driven anywhere there and gotten it long ago. I think someone really needs to apply pressure to the company that supplies these units. Any info would be welcome.
Cheers
Bill
ID: 37224
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 37225 - Posted: 15 Jun 2009, 20:11:02 UTC
Last modified: 15 Jun 2009, 20:16:20 UTC

Dora

No, it's not OK.
Once the 14-day timer has started, it will continue to count down.
Setting Network access to off will stop the retries and the messages about them, but that's all.

Once again a reminder: Watch the News and Announcements thread at the top of this section for posts on advice. Subscribe to it if you want an email message about new posts.
Backups: Here
ID: 37225
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 37226 - Posted: 15 Jun 2009, 20:15:45 UTC

Bill

The delivery company went to the wrong address.
They say that they'll try again Tuesday, UK time.

I'll post about it in News shortly.
ID: 37226
Profile Milo Thurston
Volunteer moderator
Volunteer developer

Joined: 2 Mar 06
Posts: 253
Credit: 363,646
RAC: 0
Message 37228 - Posted: 15 Jun 2009, 21:21:24 UTC

Concerning the power supply, I filled in the order on Monday, it went off on Thursday and should have been delivered on Wednesday. Unfortunately it has fallen foul of dodgy couriers, which is not uncommon.

It will go in the moment I get it.
ID: 37228
old_user565985

Joined: 28 Apr 09
Posts: 18
Credit: 575,431
RAC: 0
Message 37229 - Posted: 15 Jun 2009, 21:41:03 UTC

Thx for the update Milo and Les. It is too bad that they could not send some eager grad student to go and get it lol. When I did my PhD at U. Edinburgh I loved driving around for various things. Anyway, I will keep my fingers crossed that all goes well and it comes tomorrow.
Cheers
Bill
ID: 37229
Profile mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 37230 - Posted: 15 Jun 2009, 22:24:56 UTC

If they still made Carry On films they could do Carry On Computing and the News thread would provide most of the screenplay.
Cpdn news
ID: 37230
old_user136803

Joined: 13 Dec 05
Posts: 1
Credit: 137,543
RAC: 0
Message 37231 - Posted: 16 Jun 2009, 3:08:52 UTC

6/15/2009 7:58:04 PM climateprediction.net Temporarily failed upload of hadam3p_n1j7_1982_2_006150477_5_3.zip: HTTP error

I also get this with another model that has finished. HELP
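For anyone wanting to collect these messages from a saved log, here is a small parsing sketch. The pattern is built from the single line quoted above, so the date format and wording are assumptions that may not hold for other client versions or locales; the function name is made up.

```python
import re
from typing import Dict, Optional

# Pattern modelled on the one message quoted in this post.
FAILED_UPLOAD = re.compile(
    r"^(?P<when>\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M) "
    r"(?P<project>\S+) Temporarily failed upload of "
    r"(?P<file>[^\s:]+): (?P<reason>.+)$"
)


def parse_failed_upload(line: str) -> Optional[Dict[str, str]]:
    """Return the fields of a 'Temporarily failed upload' message, or None."""
    match = FAILED_UPLOAD.match(line.strip())
    return match.groupdict() if match else None


line = ("6/15/2009 7:58:04 PM climateprediction.net Temporarily failed upload of "
        "hadam3p_n1j7_1982_2_006150477_5_3.zip: HTTP error")
print(parse_failed_upload(line))
```

The file name is the useful part: it identifies which zip file is stuck. "Temporarily failed" means the client still has the file queued for another attempt.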
ID: 37231

©2024 climateprediction.net