Missing Zip file 13?


Message boards : Number crunching : Missing Zip file 13?

JIM

Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 44798 - Posted: 3 Sep 2012, 5:45:59 UTC

I just completed WU Hadam3p_eu_2tlp_1996_1_008161279_0.

I know that one of the upload servers is down. Zip files 8, 9, 10, 11 and 12 are backed up in the transfer tab and will remain there until the upload server is back online. That's not my problem.

The problem (if there is one) is that I can't find any trace of zip file 13. It is not in the transfer tab, and I can't find any trace of it in the event log either. Did it somehow get lost?

I know that this is a very important file, as it contains the dump that allows the server to generate the next segment of the model. I have a recent backup, so I can do a restore and run it to the end again if that would create the missing file.

JIM

Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 44799 - Posted: 3 Sep 2012, 5:52:57 UTC

This is the event log for the above posting. As you can see, zip files 11 and 12 were created and tried to upload, but couldn't because of the non-working upload server. What I don't see is any mention of zip 13. The WU is still listed at 100% and "ready to report".

hadam3p_eu_2tlp_1996_1_008161279_0_11.zip
9/2/2012 12:29:37 PM | climateprediction.net | Started upload of hadam3p_eu_2tlp_1996_1_008161279_0_12.zip
9/2/2012 12:29:42 PM | climateprediction.net | Restarting task hadcm3n_u3sm_1980_40_008026584_4 using hadcm3n version 607 in slot 3
9/2/2012 12:29:42 PM | climateprediction.net | Restarting task hadam3p_eu_9c98_1960_1_008138976_0 using hadam3p_eu version 609 in slot 0
9/2/2012 12:29:59 PM | climateprediction.net | Temporarily failed upload of hadam3p_eu_2tlp_1996_1_008161279_0_11.zip: connect() failed
9/2/2012 12:29:59 PM | climateprediction.net | Backing off 1 hr 17 min 18 sec on upload of hadam3p_eu_2tlp_1996_1_008161279_0_11.zip
9/2/2012 12:29:59 PM | climateprediction.net | Temporarily failed upload of hadam3p_eu_2tlp_1996_1_008161279_0_12.zip: connect() failed
9/2/2012 12:29:59 PM | climateprediction.net | Backing off 16 min 52 sec on upload of hadam3p_eu_2tlp_1996_1_008161279_0_12.zip
9/2/2012 12:30:07 PM | | Project communication failed: attempting access to reference site
9/2/2012 12:30:09 PM | | Internet access OK - project servers may be temporarily down.
9/2/2012 12:37:13 PM | | Suspending network activity - user request


Richard Haselgrove

Joined: 1 Jan 07
Posts: 1058
Credit: 36,604,672
RAC: 15,671
Message 44800 - Posted: 3 Sep 2012, 9:08:24 UTC

The _13.zip files for EU models are sent to a different upload server from the _1 to _12 files. That server is still running, and my _13 files have uploaded too.

If you are running a recent version of BOINC, it will try every new file transfer (upload or download) once, even if other files and the project as a whole are in transfer backoff. That's designed to cope with exactly the situation we're in this morning, with one upload server out of action but others working.
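
To make that rule concrete, here is a toy Python model of it. This is not BOINC's actual code: the `Transfer` class and the `transfers_to_try` function are hypothetical names invented for illustration, and the real client's bookkeeping is certainly more involved. It only shows why a brand-new _13 file can still get its one attempt while _11 and _12 sit in backoff.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    """Hypothetical record of one pending file transfer."""
    name: str
    attempted: bool = False  # has the client tried this file at least once?


def transfers_to_try(transfers, project_in_backoff):
    """Toy policy: during backoff, only never-attempted files get a try."""
    if not project_in_backoff:
        return list(transfers)
    return [t for t in transfers if not t.attempted]


queue = [
    Transfer("hadam3p_eu_2tlp_1996_1_008161279_0_11.zip", attempted=True),
    Transfer("hadam3p_eu_2tlp_1996_1_008161279_0_12.zip", attempted=True),
    Transfer("hadam3p_eu_2tlp_1996_1_008161279_0_13.zip"),
]
for transfer in transfers_to_try(queue, project_in_backoff=True):
    print(transfer.name)
```

In this sketch only the _13 file is selected, and since it goes to a different (working) server, that single attempt would be enough.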

How thoroughly did you search your message log? I'd expect that the _13 file uploaded some time before the section you posted - it's generated before the task reaches completion.
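
If scrolling through the log by hand is tedious, a short Python script can do the search. This is only a sketch: it assumes you have copied the event log text into a list of lines (or a text file), that file names follow the `<task>_<n>.zip` pattern seen in this thread, and that an EU task produces zips 1 to 13. The function names `zip_events` and `missing_indices` are made up for this example.

```python
import re
from collections import defaultdict


def zip_events(log_lines, task_name):
    """Map each zip index of `task_name` to the log lines that mention it."""
    pattern = re.compile(re.escape(task_name) + r"_(\d+)\.zip")
    events = defaultdict(list)
    for line in log_lines:
        match = pattern.search(line)
        if match:
            events[int(match.group(1))].append(line.strip())
    return dict(events)


def missing_indices(events, expected=range(1, 14)):
    """Return the expected zip indices that never appear in the log."""
    return [i for i in expected if i not in events]


# A few lines taken from the log excerpt posted in this thread.
sample = [
    "9/2/2012 12:29:37 PM | climateprediction.net | Started upload of hadam3p_eu_2tlp_1996_1_008161279_0_12.zip",
    "9/2/2012 12:29:59 PM | climateprediction.net | Temporarily failed upload of hadam3p_eu_2tlp_1996_1_008161279_0_11.zip: connect() failed",
    "9/2/2012 12:29:59 PM | climateprediction.net | Backing off 16 min 52 sec on upload of hadam3p_eu_2tlp_1996_1_008161279_0_12.zip",
]
found = zip_events(sample, "hadam3p_eu_2tlp_1996_1_008161279_0")
print("indices seen:", sorted(found))
print("13 mentioned:", 13 in found)
```

Run it over the whole log rather than just the last screenful; an index that is absent from a short excerpt may simply have been logged earlier.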

On the other hand, I'm surprised that you say the task is 'ready to report'. Mine (with _13 uploaded but earlier files stuck) are still showing as 'uploading'. That sounds as if the model possibly crashed during the last few minutes. Try to avoid reporting it before the upload server is fixed (supposed to be today, if the delivery arrives in time) - then you can complete the uploads, and look at the outcome afterwards.

