News and Announcements

Message boards : Number crunching : News and Announcements

Profile Thyme Lawn
Volunteer moderator

Send message
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 41666 - Posted: 25 Feb 2011, 16:03:20 UTC

CPDN main project - important database maintenance on Monday 28 February 2011

The CPDN main project database will be offline on Monday in order to facilitate some much-needed maintenance.

During the maintenance period the BOINC forums will be inaccessible, it will not be possible to create accounts or attach new computers to the project and all scheduler requests will fail (this includes uploading trickles, reporting completed tasks and requesting new work). Upload of result files will be unaffected.

The aim is to reduce the time taken by the database backup, which is thought to be the current cause of both the BOINC board and scheduler being unavailable for long periods and the slow connections from clients when they are available.

The work will take many hours because it involves running the tortuously slow backup script and then archiving old data in the existing (huge) tables.

The database will certainly be offline all of Monday and possibly longer. Updates on the progress will be posted in the News thread on the phpBB forum.
ID: 41666 · Report as offensive
Profile Thyme Lawn
Volunteer moderator

Send message
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 41696 - Posted: 5 Mar 2011, 0:55:17 UTC

CPDN main project

We are now into the final stages of the database maintenance.

Although the BOINC message board is back online, the scheduler is still disabled. This means it is still not possible to create accounts or attach new computers to the project, and all scheduler requests will continue to fail (this includes uploading trickles, reporting completed tasks and requesting new work).

Upload of most result files is possible, but the final upload file generated by HadAM3P regional tasks (*_13.zip) can't be uploaded at the moment. These files contain the restart dumps required to generate follow-up tasks and are sent to climateapps1.oucs.ox.ac.uk.

By keeping these features disabled the project team can make a direct comparison between the credits calculated before the old database was archived and those calculated using the optimised database.

The project will not be brought fully back to service until the project team are confident that the credit script is working correctly.
ID: 41696 · Report as offensive
Profile Thyme Lawn
Volunteer moderator

Send message
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 41734 - Posted: 8 Mar 2011, 12:55:41 UTC

CPDN main project

The scheduler has been restarted and it is now possible to upload trickles, report completed tasks, request new work, create new accounts and attach new computers.

It is still not possible to upload the final file generated by HadAM3P regional tasks (*_13.zip) as climateapps1.oucs.ox.ac.uk is currently out of disk space. Jonathan is working to make more space available.

A significant number of users are currently affected by credit anomalies, mostly with credits below the level calculated before the database maintenance started. We've been here after previous major periods of database maintenance. As before these credit problems will be resolved as a background task by the project team.
ID: 41734 · Report as offensive
Profile Thyme Lawn
Volunteer moderator

Send message
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 41750 - Posted: 9 Mar 2011, 11:31:44 UTC

CPDN main project

We are fairly sure that the problem which resulted in 5,429 CPDN users losing varying amounts of credit after the database work has been identified. Jonathan is working to fix this.

Jonathan and Milo are still working to make more space available on climateapps1 to allow completion of stalled uploads of HadAM3P regional restart dumps (the *_13.zip files).
ID: 41750 · Report as offensive
Profile mo.v
Volunteer moderator
Avatar

Send message
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 41774 - Posted: 11 Mar 2011, 10:31:45 UTC

CPDN main project

Yesterday a bug was found in the most recent batch of regional EU models. A mistake was causing all new models of this type to fail during download with a checksum error. PNW and SAF models are not affected.

EU models have been deprecated (removed) from the server. More will be created. Until they become available you may need to add PNW and SAF to your selected models in your account.

For the time being, all regional model types are only available for computers with Windows. As soon as a version for Linux and Mac becomes available we will announce it here in this thread.

Here is the Server Status page.
Cpdn news
ID: 41774 · Report as offensive
Les Bayliss
Volunteer moderator

Send message
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 41794 - Posted: 13 Mar 2011, 2:29:26 UTC

It seems that climateapps1.oucs.ox.ac.uk is out of disk space again. Must have been a LOT of "13 zips" waiting.
Other upload servers may also be getting close.

Monday morning, UK time, is the earliest that this will start to get fixed.
ID: 41794 · Report as offensive
Les Bayliss
Volunteer moderator

Send message
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 41805 - Posted: 15 Mar 2011, 20:18:03 UTC

climateapps1 is currently accepting data, but may be under stress from large amounts of incoming data.


Backups: Here
ID: 41805 · Report as offensive
Profile Iain Inglis
Volunteer moderator

Send message
Joined: 16 Jan 10
Posts: 1079
Credit: 6,906,534
RAC: 6,466
Message 41856 - Posted: 23 Mar 2011, 13:51:22 UTC
Last modified: 23 Mar 2011, 19:23:55 UTC

The weatherathome applications (HADAM3P regional - EU/PNW/SAF) are now available on all platforms - i.e. Windows, Mac and Linux.

See here.
ID: 41856 · Report as offensive
Profile Thyme Lawn
Volunteer moderator

Send message
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 41860 - Posted: 23 Mar 2011, 19:20:06 UTC
Last modified: 23 Mar 2011, 19:25:18 UTC

CPDN Main Project

It appears that the v6.09 HadAM3P regional model graphics application is missing from the download mirror servers (definitely for Windows and Linux, probably for Mac too).

This means that all downloads of Weather at Home tasks will fail until the project team can resolve the problem.

In the meantime it appears that an extension of the HadCM3 Coupled Model Experiment Optimised File I/O geo-engineering experiment has been released for all platforms.

If you are happy to run these longer experiments you will have to modify your project preferences to restrict your application selection to only "UK Met Office HadCM3 Coupled Model Experiment" (the HadAM3P regional tasks seem to have a higher priority and your daily quota will be used up failing to download them). Otherwise you are advised to set CPDN to request no new work until the missing file has been added to the download mirror servers.
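
For anyone who prefers the command line, the "request no new work" setting can also be applied with boinccmd, the command-line tool that ships with the BOINC client. The following is only a minimal sketch: it assumes boinccmd is on your PATH and that the project is attached under the URL http://climateprediction.net/ (an assumption; use whatever URL your own client lists). The helper name project_command is hypothetical.

```python
# Assumed master URL for the CPDN main project; check the URL that
# your own BOINC client shows for the project before using it.
CPDN_URL = "http://climateprediction.net/"


def project_command(operation, url=CPDN_URL):
    """Build the boinccmd argument list for a per-project operation.

    "nomorework" stops the client requesting new tasks from the project;
    "allowmorework" turns work requests back on.
    """
    if operation not in ("nomorework", "allowmorework"):
        raise ValueError("unsupported operation: " + operation)
    return ["boinccmd", "--project", url, operation]


# Print the command that would stop CPDN work requests.
print(" ".join(project_command("nomorework")))
```

The printed line can be pasted into a terminal, or the list can be passed to subprocess.run on a machine where the BOINC client is running. Running the "allowmorework" variant later re-enables work requests once the missing file is back on the mirrors.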
ID: 41860 · Report as offensive
Profile Thyme Lawn
Volunteer moderator

Send message
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 42046 - Posted: 27 Apr 2011, 13:51:51 UTC
Last modified: 27 Apr 2011, 13:54:45 UTC

CPDN Main Project

Some users may have spotted that HadAM3P regional tasks were being shown with a status of "Didn't need". The text has been changed to "No Resubmission" to avoid any confusion and prevent users from thinking they ought to abort tasks which are definitely still required.

Jonathan has posted a full explanation here.

The resubmission job for HadAM3P EU and SAF workunits is currently suspended but the project team hope to resume it either today or tomorrow and start generating new tasks for those applications.

In the meantime, a new batch of tasks for the HadCM3N (RAPIT) experiment was generated yesterday. This is a second set of control work units with a known perturbation added (dtheta=1). The project scientists will soon be making a post to explain the purpose of this batch.
ID: 42046 · Report as offensive
Profile Thyme Lawn
Volunteer moderator

Send message
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 42047 - Posted: 27 Apr 2011, 15:21:52 UTC - in response to Message 42046.  

CPDN Main Project

The promised explanation of the new batch of HadCM3N tasks has been posted here.
ID: 42047 · Report as offensive
Les Bayliss
Volunteer moderator

Send message
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 42065 - Posted: 29 Apr 2011, 21:32:02 UTC

There has been a setback with the implementation of a new label for models.

No Resubmission is intended to be a flag to stop the automatic generation of the next set of years in a series for a data set, for the 3 varieties of regional models.
It is being used instead of Not needed, which was/is causing confusion due to this label already being used to indicate something else.

However, it seems that the new message, or more likely, some of the associated code changes to fit it into the BOINC server system, is causing BOINC to think that the model(s) was never sent to people's computers, so people now get a new 'worrying' message: Message from server: Completed result .....: this result wasn't sent (not needed)
Which is obviously wrong, as the model WAS sent AND completed.

As can be seen by looking at the server page for affected models:
The trickles are all there.
The graphs are there. (So the zips are there.)
Some, if not all, credits are there.

So the science data is safely stored on the project's servers.

What is missing:
The messages under Stderr
Run time
CPU time
Client state is still showing as New

So, more work needs to be done.
But there's a 4 day long weekend in the UK, so it'll be Tuesday morning UK time before we can expect anything to change.
Probably after several strong cups of coffee.
Oops. This is England. Make that several strong cups of tea.



Backups: Here
ID: 42065 · Report as offensive
Les Bayliss
Volunteer moderator

Send message
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 42169 - Posted: 13 May 2011, 22:50:17 UTC

The pool of models is just about empty.
The project people know about it getting low.

Now is a good time to start a new hobby while waiting for more.
Reading
Gardening
Long distance running



Backups: Here
ID: 42169 · Report as offensive
Profile Thyme Lawn
Volunteer moderator

Send message
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 42193 - Posted: 17 May 2011, 22:06:39 UTC

CPDN Main Project

The current lack of work available for download should hopefully be resolved within the next couple of days.

You may not be surprised to hear that the problem has, once again, been disk space. The amount of free space on the main database server started to become critical last week and to ease the process of recovering some disk space the pool of work was allowed to run dry. The project team have been moving data off of the server to make room for the extra data it will have to accommodate when new work is released. This work is ongoing.

In parallel they have also been transferring 2TB of data to the download mirror servers in preparation for the release of a new batch of weather@home (HadAM3P regional) tasks. Work for the PNW region will be included in this batch.

They are also investigating the possibility of moving the daily credit calculation process to the backup database server. If this is possible it would relieve the active server of its most significant load.
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 42193 · Report as offensive
Profile astroWX
Volunteer moderator

Send message
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 42201 - Posted: 18 May 2011, 19:42:59 UTC

A batch of EU Regional tasks was released today. I haven't tried to download any, so can't confirm that they're actually available. If you have trouble downloading, please post.

"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
ID: 42201 · Report as offensive
Profile Thyme Lawn
Volunteer moderator

Send message
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 42223 - Posted: 21 May 2011, 0:00:25 UTC

CPDN Main Project

The availability of work is likely to be very intermittent for a few days.

Although the server status page almost always shows no tasks available, the Weather@home resubmission process is running and new workunits are being generated from completed HadAM3P regional tasks. Due to the large number of hosts requesting work, getting hold of a task is very much down to luck; the tasks created from the resubmission workunits are disappearing as quickly as they become available.

A number of server configuration changes are taking place to reduce the load placed on climateapps2 and no new batches of work will be released until these have been completed:

  1. another server will be assuming the role of the third download mirror server. This will free up a lot of the disk space and network bandwidth on climateapps2. During the reconfiguration process files are being transferred off of climateapps2. This might result in some users experiencing download failures; apologies if you are one of those affected.

  2. as previously mentioned, the daily credit calculation will be moved from climateapps2 to relieve it of intense CPU and database loads required for that process. It will almost certainly be moved to the backup database server. This should stop the problems with scheduler requests and BOINC forum access which can occur when the credit script is running.

  3. climateapps2 will continue to serve as the scheduler and primary database server.


"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 42223 · Report as offensive
Profile Thyme Lawn
Volunteer moderator

Send message
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 42274 - Posted: 27 May 2011, 23:06:49 UTC
Last modified: 27 May 2011, 23:10:15 UTC

CPDN Main Project

The server status page is currently showing uploader1.atm.ox.ac.uk as not running. That's where the _3, _6, _9 and _12 upload files for Weather@home (HadAM3P regional) models are sent.
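
To see at a glance which of your pending uploads depend on which server, the routing described in this thread can be sketched as below. The two host names and the file indices come from the posts here (the _13 restart dumps go to climateapps1.oucs.ox.ac.uk); the function name upload_server, the example file names and the assumption that every upload file ends in _<n>.zip are hypothetical.

```python
import re

# Destinations as stated in this thread.
RESTART_DUMP_SERVER = "climateapps1.oucs.ox.ac.uk"
REGULAR_UPLOAD_SERVER = "uploader1.atm.ox.ac.uk"

# Assumed naming pattern: the file name ends in _<number>.zip.
_SUFFIX = re.compile(r"_(\d+)\.zip$")


def upload_server(filename):
    """Return the host a Weather@home upload file is sent to, or None."""
    match = _SUFFIX.search(filename)
    if match is None:
        return None
    index = int(match.group(1))
    if index == 13:
        # Final file holding the restart dumps for follow-up tasks.
        return RESTART_DUMP_SERVER
    if index in (3, 6, 9, 12):
        return REGULAR_UPLOAD_SERVER
    return None


# Hypothetical file names, for illustration only.
for name in ("example_task_9.zip", "example_task_13.zip"):
    print(name, "->", upload_server(name))
```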

uploader1 was running short of disk space yesterday and Milo took it down for a while to recover some space by moving some FAMOUS upload files to another server. He was hoping this would give sufficient leeway to allow a larger quantity of HadAM3P data to be moved while the system was accepting new uploads, but it looks like this hasn't been possible. Moving upload files between servers is never going to be a trivial process; the uploaded data is being used by researchers around the world and the integrity of the results database used to retrieve the data has to be maintained at all times.

Any completed Weather@home tasks will stay in the "Uploading" state until uploader1 is back up. Tasks in that state cannot be reported until those files have been uploaded but will already have received their full credit because that's determined from the trickles sent to the scheduler. Please do not abort any tasks stuck in this state.

Unfortunately the server reconfiguration exercise started a week ago is taking a lot longer than was anticipated. As mentioned last week, there will be no new batches of tasks until this has been completed. The recent reports of stuck downloads might be due to point #1 in the last news post.

A number of users have reported a discrepancy between the credits reported on the CPDN site and third party statistics sites. This is because the export file retrieved by third party statistics sites hasn't been updated for a couple of weeks. I'm not sure if this is intended or accidental.
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 42274 · Report as offensive
Les Bayliss
Volunteer moderator

Send message
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 42307 - Posted: 1 Jun 2011, 20:50:14 UTC

A test run has been made for running the credits calcs on the backup server.
This appears to have been successful.
At least, I've received a large increase on BOINCstats. :)

Backups: Here
ID: 42307 · Report as offensive
Profile Thyme Lawn
Volunteer moderator

Send message
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 42318 - Posted: 4 Jun 2011, 15:19:44 UTC
Last modified: 4 Jun 2011, 15:21:03 UTC

CPDN Main Project

The server reconfiguration work has almost been completed but unfortunately neither Jonathan nor Andy were aware that the new download server (manticore.oerc.ox.ac.uk) would be inaccessible to users until the relevant port had been opened up on the university's firewall. All files without an application version stamp appear to be affected and they will be stuck in BOINC's transfer queue until the firewall changes have been made. This is unlikely to happen until Monday.
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 42318 · Report as offensive
Profile Thyme Lawn
Volunteer moderator

Send message
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 42329 - Posted: 5 Jun 2011, 9:42:15 UTC

CPDN Main Project

If CPDN is the only BOINC project your computer is attached to and all of its tasks have file downloads waiting for the firewall changes, BOINC Manager might display a pop-up balloon advising you to run "Advanced - Do network communication". This will cancel all retry timers but leave any project backoff timers running. Any pending scheduler requests will be retried immediately, but uploads and downloads will only be retried for files with no project backoff. Forcing network communications as advised will have no effect; the pending downloads will still fail.
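
The difference between the two kinds of timer can be illustrated with a toy model. This is not BOINC's actual code; the Transfer class, its field names and the file names are hypothetical and only mirror the behaviour described above.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    name: str
    retry_in: float          # seconds left on the per-file retry timer
    project_backoff: float   # seconds left on the project backoff timer


def do_network_communication(transfers):
    """Cancel per-file retry timers but leave project backoff untouched.

    Returns the names of the transfers that would be retried immediately.
    """
    retried = []
    for transfer in transfers:
        transfer.retry_in = 0.0              # retry timer is cancelled
        if transfer.project_backoff <= 0.0:  # backoff timer is not
            retried.append(transfer.name)
    return retried


queue = [
    Transfer("download_a", retry_in=3600.0, project_backoff=0.0),
    Transfer("download_b", retry_in=1800.0, project_backoff=600.0),
]
print(do_network_communication(queue))  # only download_a is retried
```

Even the transfer that is retried will fail again while the download server is unreachable, which is why forcing network communication does not help here.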
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 42329 · Report as offensive

©2024 climateprediction.net