Unable to upload 2 zip files

Message boards : Number crunching : Unable to upload 2 zip files

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 37232 - Posted: 16 Jun 2009, 3:40:51 UTC

About the only help available at present is to suggest that you read the last half dozen or so posts in the News and Announcements thread at the top of this section of the board.

ID: 37232
old_user294426
Joined: 20 Feb 06
Posts: 158
Credit: 1,251,176
RAC: 0
Message 37233 - Posted: 16 Jun 2009, 8:03:36 UTC
Last modified: 16 Jun 2009, 8:06:49 UTC

I had 2 zip files held in the transfer tab for phases 1 & 2.
After several failures, they have now both been uploaded -- that's great.

But phase 3 zip is now held there.
Is the 14 day deadline still set from the original failed upload?
Or is it reset when the upload attempt for the 3rd phase fails?

If it is not reset, it looks as though I will have to resort to editing the file.

Trickles record is at:
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/trickle.php?resultid=7793392

Keith
ID: 37233
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 37234 - Posted: 16 Jun 2009, 8:48:40 UTC
Last modified: 16 Jun 2009, 8:50:49 UTC

The timer starts at the first attempt to upload a given zip. If this gets through, then the timer for that file is deleted.

In both "slab" and mid-holocene models, for instance, zips are created at intervals through the modelling, so it's possible to upload the first one or two, then turn off Network access and prevent the uploads of the 3rd (and 4th) zips from starting. If they don't start, then neither does the timer.

And you can have the situation where the 1st zip fails because Network access was on, but this was turned off before the subsequent zips were created. So only the 1st zip will be lost.

But the problem with this is that loss of any data reduces the usefulness of the model.

Why not just make a copy of the client_state.xml file, and have a look at the copy? If the word "persistent" is in there, then the associated zip will be lost.
ID: 37234
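The "copy the file and look at the copy" check can also be scripted. What follows is only a minimal sketch in Python, assuming the pending-upload entries look like the `<persistent_file_xfer>` block posted later in this thread. The function names are hypothetical, and a plain text scan is used rather than an XML parser because the full layout of client_state.xml is not shown here.

```python
import re
import shutil
from pathlib import Path

# Matches one <persistent_file_xfer> ... </persistent_file_xfer> block.
XFER_BLOCK = re.compile(
    r"<persistent_file_xfer>(.*?)</persistent_file_xfer>", re.DOTALL
)
# Matches simple <tag>value</tag> pairs inside such a block.
FIELD = re.compile(r"<(\w+)>([^<]*)</\1>")


def find_pending_uploads(xml_text: str) -> list[dict[str, str]]:
    """Return the fields of every pending-upload block, as strings."""
    uploads = []
    for block in XFER_BLOCK.finditer(xml_text):
        fields = {name: value.strip() for name, value in FIELD.findall(block.group(1))}
        uploads.append(fields)
    return uploads


def inspect_copy(state_file: Path, copy_to: Path) -> list[dict[str, str]]:
    """Copy client_state.xml first, then read only the copy (hypothetical helper)."""
    shutil.copy2(state_file, copy_to)
    text = copy_to.read_text(encoding="utf-8", errors="replace")
    return find_pending_uploads(text)
```

An empty list means no "persistent" entry was found in the copy; each dictionary otherwise holds the values of one block, such as `num_retries` and `first_request_time`.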
old_user294426
Joined: 20 Feb 06
Posts: 158
Credit: 1,251,176
RAC: 0
Message 37235 - Posted: 16 Jun 2009, 12:29:55 UTC - in response to Message 37234.  
Last modified: 16 Jun 2009, 12:34:24 UTC

My XML file shows only one "persistent" occurrence:
<persistent_file_xfer>
<num_retries>9</num_retries>
<first_request_time>1245038889.928497</first_request_time>
<next_request_time>1245110912.641051</next_request_time>
<time_so_far>682.988051</time_so_far>
<last_bytes_xferred>0.000000</last_bytes_xferred>
</persistent_file_xfer>

I have set my preferences to Network activity only from 2330 BST to 0100 BST.
  (This equates to 2230-0000 UTC in summer and 2330-0100 UTC in winter.)
This usually gives 4 time steps per day on each running task and lets me check that all is OK.

I assume that 683 seconds is the time elapsed, but would have thought it should be 2 or 3 days.
I can look up how to translate the first and next dates, but am confident I will find I have some time still to spare.

Thanks for your help, as ever.
Keith
ID: 37235
Iain Inglis
Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 37236 - Posted: 16 Jun 2009, 13:00:43 UTC - in response to Message 37235.  

Quote (Message 37235): I assume that 683 seconds is the time elapsed, but would have thought it should be 2 or 3 days.
The 683 seconds is the time spent trying to upload (I think): as you say, it doesn't seem to relate to the time since the upload was created.
Quote (Message 37235): I can look up how to translate the first and next dates, but am confident I will find I have some time still to spare.
Don't bother: just delete everything between and including <persistent_file_xfer> and </persistent_file_xfer> - it's a lot easier!
ID: 37236
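For anyone who prefers not to edit by hand, the deletion described above can be sketched in Python. This is an illustration only: the function name is hypothetical, it works on the file contents as a string, and it assumes BOINC has been stopped and a backup of client_state.xml made first. Whether removing the block really restarts the timer is the claim made in this thread, not something the sketch verifies.

```python
import re

# One whole block, including its indentation and the line break after it.
XFER_BLOCK = re.compile(
    r"[ \t]*<persistent_file_xfer>.*?</persistent_file_xfer>[ \t]*\r?\n?",
    re.DOTALL,
)


def drop_upload_timers(xml_text: str) -> tuple[str, int]:
    """Remove every persistent_file_xfer block.

    Returns the edited text and the number of blocks removed.
    """
    return XFER_BLOCK.subn("", xml_text)
```

The returned count makes it easy to confirm that only the expected number of blocks was removed before the edited text is written back.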
Richard Haselgrove
Joined: 1 Jan 07
Posts: 942
Credit: 34,151,518
RAC: 4,486
Message 37237 - Posted: 16 Jun 2009, 13:15:03 UTC

First request: Mon, 15 Jun 2009 04:08:09 UTC
Next request: Tue, 16 Jun 2009 00:08:32 UTC

Courtesy of http://www.onlineconversion.com/unix_time.htm
ID: 37237
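The two values in the XML are Unix timestamps (seconds since 1 Jan 1970 UTC), so the conversion can also be done locally. A minimal Python sketch follows, assuming the two-week retry limit discussed in this thread counts from `first_request_time`; the names `RETRY_WINDOW` and `describe` are made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Two-week retry limit as described in this thread (an assumption here).
RETRY_WINDOW = timedelta(days=14)


def to_utc(unix_seconds: float) -> datetime:
    """Convert seconds since the Unix epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)


def describe(unix_seconds: float) -> str:
    """Format a Unix timestamp in the style used in the post above."""
    return to_utc(unix_seconds).strftime("%a, %d %b %Y %H:%M:%S UTC")


first_request = 1245038889.928497  # <first_request_time> from the posted XML
next_request = 1245110912.641051   # <next_request_time> from the posted XML

print("First request:", describe(first_request))
print("Next request: ", describe(next_request))
print("Give-up time: ", describe(first_request + RETRY_WINDOW.total_seconds()))
```

This reproduces the two dates above and puts the end of the 14-day window at Mon, 29 Jun 2009 04:08:09 UTC. In a spreadsheet, the usual equivalent is `=A1/86400+DATE(1970,1,1)` with the cell formatted as a date and time.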
old_user294426
Joined: 20 Feb 06
Posts: 158
Credit: 1,251,176
RAC: 0
Message 37238 - Posted: 16 Jun 2009, 13:24:08 UTC - in response to Message 37236.  
Last modified: 16 Jun 2009, 13:27:30 UTC

Iain

Quite correct. It is the 11 mins 22 secs which appears in the BOINC Manager Transfers log.

I have tried unsuccessfully to find the right formula in Excel to convert those decimal numbers, but as you say, it will be easier to edit the XML file, and is only of academic interest.

But Richard has now satisfied my curiosity (Thanks, Richard).

And it seems I have until 29th June to spare, so I am fully confident that all "at the office" will be running smoothly again by then!!!

Keith
ID: 37238
Simplex0
Joined: 7 Sep 05
Posts: 12
Credit: 601,646
RAC: 0
Message 37241 - Posted: 16 Jun 2009, 17:13:54 UTC
Last modified: 16 Jun 2009, 17:15:16 UTC

This 2-week-long chaos reminds me of this.

I wonder if O'rily works for a delivery company nowadays?
ID: 37241
old_user172201
Joined: 7 Mar 06
Posts: 5
Credit: 4,085,123
RAC: 0
Message 37276 - Posted: 17 Jun 2009, 21:26:56 UTC - in response to Message 37203.  

Quote (Message 37203): OK, Oxford has their priorities, well, I have a lot of results to upload and going on a vacation now so I guess most of them will be lost, well well


@cwhyl: If you will be returning from vacation after the 14-day limit has expired, this is a possible workaround to apply before you leave:

1) Stop BOINC and make a backup copy of your BOINC folder (or the BOINC and DATA folders if using BOINC 6.x).
2) Open client_state.xml in a text editor and remove the whole climate prediction project. A CPDN project starts with the <project> section and ends with the <project_files> section.
3) Don't forget the <active_task> group in the <active_task_set> section at the end of the client_state.xml file. This will only be there for still-running projects. The ones that have finished don't have any.
4) To avoid errors being reported if you did something wrong, set the <user_network_request> option near the end of the file to 3, which is the value for "no network connection".
5) Save your new client_state.xml and restart BOINC.

Now every project should run as before except CPDN, which should be missing. If something strange happens, such as errors, stop BOINC, delete the BOINC and DATA folders, restore from your backup and try again. In Germany we say "Versuch macht kluch", which means something like "Trial makes wise" ;-)

When you return from vacation, you can paste the complete climate project from the backup back into your active client_state.xml, including the <active_task> group(s). Now you can continue where you stopped before the vacation.

This is what I would do in your situation. Hope that helps and you're still at home!


Sorry, it might be too late already, but I forgot to mention something important. You have to delete all CPDN-related files and folders before you restart the client. Remember, you've got a backup. Sorry for that.

ID: 37276
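Two of the steps above, the backup and the network setting, lend themselves to a small script. The Python below is a hedged sketch only: the function names are hypothetical, the value 3 for `<user_network_request>` is taken from the post rather than from BOINC documentation, and removing the project section itself is deliberately left as a manual edit. BOINC should be stopped before either function is used.

```python
import re
import shutil
import time
from pathlib import Path


def backup_folder(data_dir: Path, backup_root: Path) -> Path:
    """Copy a whole BOINC folder into a new time-stamped directory."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"{data_dir.name}-backup-{stamp}"
    shutil.copytree(data_dir, target)  # creates missing parent directories
    return target


NETWORK_TAG = re.compile(
    r"(<user_network_request>)\s*\d+\s*(</user_network_request>)"
)


def set_network_request(xml_text: str, value: int = 3) -> str:
    """Set <user_network_request>; 3 means 'no network connection' per the post."""
    new_text, count = NETWORK_TAG.subn(rf"\g<1>{value}\g<2>", xml_text)
    if count != 1:
        raise ValueError(f"expected one <user_network_request>, found {count}")
    return new_text
```

Raising an error when the tag is missing or duplicated is a design choice for the sketch: it is safer to stop than to write back a file that was not changed as intended.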
Rick B
Joined: 17 Feb 09
Posts: 31
Credit: 1,412,186
RAC: 121
Message 37280 - Posted: 18 Jun 2009, 8:10:51 UTC
Last modified: 18 Jun 2009, 8:19:25 UTC

Looks like I lost one of those files on Day 12:

17/06/2009 12:00:36 PM climateprediction.net Started upload of hadam3p_mwsa_1997_2_006074684_3_3.zip
17/06/2009 12:00:41 PM climateprediction.net [error] Error reported by file upload server: no command
17/06/2009 12:00:41 PM climateprediction.net Giving up on upload of hadam3p_mwsa_1997_2_006074684_3_3.zip: permanent upload error

(Times above are UTC-4)

I thought I had a couple more days, as mo.v said:
"Its PSU failed on Friday 5 June, so the 2-week BOINC time limit for retrying failed uploads won't be reached until Friday 19 June"

It is no longer in my transfer tab and the work unit is no longer on my task list but I did get these messages a few hours later ...

17/06/2009 7:54:16 PM climateprediction.net Reporting 1 completed tasks, not requesting new tasks
17/06/2009 7:56:23 PM Project communication failed: attempting access to reference site
17/06/2009 7:56:24 PM Internet access OK - project servers may be temporarily down.
17/06/2009 7:56:26 PM climateprediction.net Scheduler request failed: Failure when receiving data from the peer
17/06/2009 7:57:26 PM climateprediction.net Sending scheduler request: To send trickle-up message.
17/06/2009 7:57:26 PM climateprediction.net Reporting 1 completed tasks, not requesting new tasks
17/06/2009 8:00:01 PM climateprediction.net Scheduler request completed: got 0 new tasks


I will suspend the next one before it finishes but where do we stand currently with the server issues here at CPDN?

Edit: ... I have to learn to read the News and Announcements thread first!
Rick
ID: 37280
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 37281 - Posted: 18 Jun 2009, 8:32:02 UTC

1) There is still a server problem.
2) Milo was attempting to reroute the url to another server so that he could work on the faulty server.
3) The reroute didn't work.
4) Anyone rushing in to upload because they saw that the server status was green has probably lost the file.
5) Apache has been turned off on the servers to stop people sending more files.
6) Working hours start again soon at Oxford.
7) People WILL be told in the News threads here and here when it's OK to upload again.


Backups: Here
ID: 37281
tullio
Joined: 6 Aug 04
Posts: 264
Credit: 965,476
RAC: 0
Message 37283 - Posted: 18 Jun 2009, 10:59:21 UTC

I just let BOINC do its work and never interfere. It started an upload of file hadam3p_n3k9_1970_2_006153107_2_3.zip, which disappeared from my BOINC Manager Transfers tab, and I assumed all had gone well. Now I read that this file may have been lost because the upload server was not really working. I had to restart the manager due to a CPDN Beta bug and cannot check its past messages. But Task id 9057337 was received at 14:57:43 UTC on June 17 and was granted 1,982.64 credits, so all seems to be OK. Is it?
Tullio
ID: 37283
Milo Thurston
Volunteer moderator
Volunteer developer
Joined: 2 Mar 06
Posts: 253
Credit: 363,646
RAC: 0
Message 37285 - Posted: 18 Jun 2009, 11:06:12 UTC - in response to Message 37283.  
Last modified: 18 Jun 2009, 11:14:24 UTC

Quote (Message 37283): But Task id 9057337 was received at 14:57:43 UTC on June 17 and was granted 1,982.64 credits, so all seems to be OK. Is it?

Trickles go to a different server and so you'll get credits, although that particular file seems to have been lost along with a small number of others. Please don't worry about this missing data as we should still have enough; a very small percentage of files are occasionally lost or corrupted on upload anyway and this is not an insurmountable problem for the physicists.
ID: 37285
tullio
Joined: 6 Aug 04
Posts: 264
Credit: 965,476
RAC: 0
Message 37288 - Posted: 18 Jun 2009, 11:59:37 UTC

Thanks Milo. I am glad to know that nothing gets lost. CPDN tasks are among the longest running and I back up the BOINC directory every week on a flash memory cartridge. But I still miss the "dump" and "restore" commands of Berkeley UNIX I had in the eighties on an ONYX computer.
Tullio
ID: 37288
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 37290 - Posted: 18 Jun 2009, 12:48:59 UTC

Milo now has a replacement server for uploader1.atm up and running. Please see the News thread for updates on the current much improved situation and changed advice.
Cpdn news
ID: 37290
astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 37298 - Posted: 18 Jun 2009, 18:29:49 UTC - in response to Message 37283.  

Quote (Message 37283): . . . I had to restart the manager due to a CPDN Beta bug and cannot check its past messages. . .

Tullio,

Old messages are archived in the stdoutdae.txt file in the BOINC data folder. The file can get quite large. (When the file gets too big, BOINC renames it stdoutdae.old and starts a new one.)
ID: 37298
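Once the archived messages are to hand, abandoned uploads can be picked out by searching for the "Giving up on upload of" text seen earlier in this thread. A minimal Python sketch, assuming stdoutdae.txt records that message with the same wording as the Messages tab; the function names are hypothetical, and only the message text is matched, not the date or project columns.

```python
import re
from collections.abc import Iterable
from pathlib import Path

# File name is the run of non-space characters before the first colon.
GAVE_UP = re.compile(r"Giving up on upload of (\S+?):")


def lost_uploads(log_lines: Iterable[str]) -> list[str]:
    """Return file names from 'Giving up on upload of ...' messages."""
    names = []
    for line in log_lines:
        match = GAVE_UP.search(line)
        if match:
            names.append(match.group(1))
    return names


def lost_uploads_in_file(log_file: Path) -> list[str]:
    """Scan a saved message log such as stdoutdae.txt (path supplied by caller)."""
    with log_file.open(encoding="utf-8", errors="replace") as handle:
        return lost_uploads(handle)
```

Running `lost_uploads_in_file` on both stdoutdae.txt and stdoutdae.old would cover messages from before the last rollover.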
tullio
Joined: 6 Aug 04
Posts: 264
Credit: 965,476
RAC: 0
Message 37305 - Posted: 19 Jun 2009, 5:16:27 UTC - in response to Message 37298.  

Yes, thanks. It had signaled an error.
Tullio
ID: 37305

©2024 climateprediction.net