HadAM3P-PNW disappeared?
Author | Message |
---|---|
Send message Joined: 17 Nov 07 Posts: 142 Credit: 4,271,370 RAC: 0 |
As of yesterday the "Server Status" page has been showing 0 HadAM3P-PNW tasks. The day before, there were 50,000-odd. Should there be an announcement? BTW I have just received 6 HadAM3P-PNWs, at least two of which are 'new' - first task for the work unit was issued after the Server Status changed to 0. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
No news at the moment. It's the Easter long weekend, so most of Oxford would be closed, and I've been sleeping for the last few hours. I've asked about it, but it may take a while for a reply. The EU pool is down to 9 at the moment as well. Backups: Here |
Send message Joined: 5 Aug 04 Posts: 1283 Credit: 15,824,334 RAC: 0 |
It looks like there were download problems earlier today. hadam3p_pnw_yyam_2005_1_006899510_0 (from the same WU as one of Greg's tasks) reported the following error at 22 Apr 2011 20:14:20 UTC: app_version download error: couldn't get input files: I've run a browser check on http://climateapps2.oucs.ox.ac.uk/cpdnboinc/download/mirror.php?file=/hadam3p_pnw_graphics_6.09_i686-pc-linux-gnu and it now seems to be available on the 3 mirror servers I'm aware of (http://uploader1.atm.ox.ac.uk, http://climateprediction.net and http://climateapps2.oucs.ox.ac.uk). "The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer |
Send message Joined: 17 Nov 07 Posts: 142 Credit: 4,271,370 RAC: 0 |
Oh, OK. The drop from 50,000-plus down to 0 was so sudden, I thought that someone had "pulled the plug" on the PNW project. But I'd expect that they would tell the moderators if so. :-) If you guys don't know anything about the project being cancelled, it must be just a glitch in the server status page. |
Send message Joined: 5 May 10 Posts: 69 Credit: 1,169,103 RAC: 2,258 |
The PNW app has also disappeared from the Applications page. http://climateapps2.oucs.ox.ac.uk/cpdnboinc/apps.php |
Send message Joined: 17 Nov 07 Posts: 142 Credit: 4,271,370 RAC: 0 |
Now all my HadAM3P-PNWs have been marked "Didn't need". What's going on? Edit: correction - two of them are still "in progress", but the other four are "didn't need". Do I cancel the "didn't need"s? I see HadCM3N is back, on the "server status" page. Did I miss the memo? |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Presumably, when the data sets were removed from the download data pool, the BOINC server software took that to mean that none of the unreturned results were needed, and wrote the Not needed message into everyone's model pages. Note however, that Not needed isn't the same as not wanted by the researchers, who would still like to get their hands on as much data as possible, please. So if the models are still running, and haven't been killed off by some downloaded signal from Oxford, you should continue to crunch them. ****************** There's been no memo, possibly because it was still the 'weekend' in the UK. What's going on is anyone's guess. There is, however, THIS memo about upgraded security measures on the alternative PHP board. Backups: Here |
Send message Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
I received this message after #13 upload to Oxford. None of the earlier uploads (to U.Oregon) triggered a red message. "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest. |
Send message Joined: 28 Mar 11 Posts: 35 Credit: 82,588 RAC: 0 |
DIDN'T NEED flag Hi Everyone, Some of the work units that get processed contain particular parameters that are of interest to the CPDN project. The BOINC system has a method for allowing us to gather more info on certain parameter sets by resubmitting a work unit to the pool of available work units. The DIDN'T NEED flag means that the CPDN project did/do not need to resubmit the work unit for additional processing. The flag can mean a number of things, and is combined with other flags in the database to determine exactly why we don't need to reprocess it. One of the common reasons is that the current run gives us exactly the info we need. It is unfortunate that the flag gives the impression that we are not interested in the work unit - we certainly are interested. We are looking into how we can make this more clear on the work unit info pages. Please accept our apologies for any confusion or consternation this may have caused. EDIT: We have now altered this flag to read "No Resubmission" which is a more accurate reflection of the status of the work unit. Jonathan CPDN SysAdmin |
Send message Joined: 17 Nov 07 Posts: 142 Credit: 4,271,370 RAC: 0 |
Re: Didn't need / No resubmission: Fri 29 Apr 2011 10:02:11 NZST climateprediction.net Started upload of hadam3p_pnw_zjca_1969_1_006969986_2_13.zip Fri 29 Apr 2011 10:02:13 NZST climateprediction.net Computation for task hadam3p_pnw_zjca_1969_1_006969986_2 finished Fri 29 Apr 2011 10:10:59 NZST climateprediction.net Finished upload of hadam3p_pnw_zjca_1969_1_006969986_2_13.zip Fri 29 Apr 2011 11:19:59 NZST climateprediction.net Sending scheduler request: To send trickle-up message. Fri 29 Apr 2011 11:19:59 NZST climateprediction.net Reporting 1 completed tasks, not requesting new tasks Fri 29 Apr 2011 11:20:04 NZST climateprediction.net Scheduler request completed Fri 29 Apr 2011 11:20:04 NZST climateprediction.net Message from server: Completed result hadam3p_pnw_zjca_1969_1_006969986_2 refused: this result wasn't sent (not needed) That suggests that completed work is not getting through, whether or not the scientists want it. I'm still confused. I think I'd rather crunch something else. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
I've just received the same message. It looks like the attempt to introduce a new label into the BOINC system hasn't worked. :( I'll inform the project people. Backups: Here |
Send message Joined: 8 Nov 06 Posts: 18 Credit: 2,425,895 RAC: 0 |
I have had the same. When I look at my tasks, instead of saying "completed" they say "No Resubmission". I get the feeling I am completely wasting my computer time. This can be seen on other work units too. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Those people who feel that this new message means that they're wasting their time should stop crunching climate models, leave the project, and not come back! For the rest of us, the data is stored on the servers, but it's a 4 day long weekend in the UK, so we have to wait until Tuesday morning UK time for the project people to return and kick the BOINC code until it behaves. :) Backups: Here |
Send message Joined: 8 Nov 06 Posts: 18 Credit: 2,425,895 RAC: 0 |
Those people who feel that this new message means that they're wasting their time should stop crunching climate models, leave the project, and not come back! I have crunched this project since the BBC days, but if this is the new attitude I will take your advice, as the messages are no longer in plain English. I have had very few failures, most models have been successful, and I have put up with some of the project's problems. |
Send message Joined: 17 Nov 07 Posts: 142 Credit: 4,271,370 RAC: 0 |
Dave, it's likely that this is just a 'learning the ropes' problem for the new project staff. And possibly Les was short of painkillers when he wrote that. The HadCM3N models are working well. Just a couple of niggles: the initial duration estimate is about double the true figure (530 - 600 hours for your C2Qs), and they've a short deadline, which is really only indicative - the researchers will still use models that finish after the deadline. I'm crunching them, now. |
Send message Joined: 8 Nov 06 Posts: 18 Credit: 2,425,895 RAC: 0 |
Greg, I've calmed down now, but I was annoyed. I had noticed that I had 3 completed models with downloads stuck in the transfer tab, which I do not check often. I quickly realised it was due to a PNW running under Linux with the handler problem. I edited the client_state.xml file, which solved the problem but took hours off download. Anyway, when they were reported as completed I got the same message in the messages, so I went to my account to see if they were completed, and that left me confused. |
Send message Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
PNW's first twelve uploads go to the science database at U.Oregon. No red messages from them, eh? #13 upload, after task completion and full credits are awarded, is a restart dump sent to Oxford so the next segment of the sequence can start, supposedly where the last one ended. (Work was chopped into segments because people accustomed to tasks taking from minutes to a few hours elsewhere whined at length across the boards. Hence some of the current difficulties. [I don't envy the scientists working to understand segment differences run on different CPUs and OSs ...]) As I understand it, the new red message is a consequence of recent security changes to the boards to inhibit spam registrations. Not a secret, explanations were posted. I've been with CPDN since Original Beta, July 2003, and though, early-on (pre-boinc), we had to do some manual uploads, I don't recall any work being lost. (Early 14-day boinc timeout is another thing.) Despite a long history of under-staffing and a plethora of problems, many boinc-related, the Project has a good record of saving all our work. Hang in with Oxford's new IT team through its learning curve or bail out of a wounded but still-flying bird. Your choice. "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest. |
Send message Joined: 6 Apr 05 Posts: 17 Credit: 744,057 RAC: 0 |
Messages and server problems aside, I'm still wondering why there is no work for PNW, while there is for the other regionals. Any explanation from the science group? =Mike |
Send message Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
I have no new skinny but there was an issue with Linux tasks. Perhaps the new support team felt it safer to throttle PNW flavor (the one for my area of the planet) rather than tweak possibilities. I have all confidence that Andy and Jonathan will sort it all out in due time. Please hang in there! "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest. |
Send message Joined: 24 Apr 08 Posts: 6 Credit: 176,830 RAC: 0 |
Just verifying, but it sounds like the msg I got is not quite accurate and is being updated for items downloaded in the future. In the meantime, scary msg or not, the work is useful to you and should still be allowed to run if it's already going? After 24-48 hours of being unable to upload this task (other items uploaded OK, trickles I think) I finally got [code]Thu May 5 16:21:52 2011 climateprediction.net Message from server: Completed result hadam3p_eu_wczh_1988_1_006821781_0 refused: this result wasn't sent (not needed) [/code] And I should let my 3 other tasks go through to completion and not get nervous if I get the same (or a similar) error msg? The website doesn't show anything labeled "In progress" but does have some (4) labeled "no re-submission". BTW, referring back to a comment earlier in the thread: I personally have absolutely no problem with really long tasks, as long as they are labeled as such and have appropriate deadlines. Another project I worked with for a bit would give estimates of 5-7 hours and deadlines of a week, and then give you tasks that literally ran for weeks with systems dedicated 100% to them, without taking any checkpoints for most of that time. But you guys don't do that :-) |
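
For anyone wanting to check how many of their own tasks drew the "refused" reply quoted in this thread, here is a minimal sketch in Python. It assumes the event log has been saved as plain text with one message per line, in the same shape as the lines posted above; the function name `refused_results` and the sample list are hypothetical, made up for illustration, and are not part of BOINC.

```python
import re

# Shape of the server reply quoted in this thread:
# "Completed result <task name> refused: <reason>"
REFUSED = re.compile(r"Completed result (\S+) refused: (.+?)\s*$")


def refused_results(log_lines):
    """Return (task_name, reason) pairs for every refused result in the log."""
    found = []
    for line in log_lines:
        match = REFUSED.search(line)
        if match:
            found.append((match.group(1), match.group(2)))
    return found


# Two lines copied from the log posted earlier in the thread.
sample = [
    "Fri 29 Apr 2011 11:20:04 NZST climateprediction.net Scheduler request completed",
    "Fri 29 Apr 2011 11:20:04 NZST climateprediction.net Message from server: "
    "Completed result hadam3p_pnw_zjca_1969_1_006969986_2 refused: "
    "this result wasn't sent (not needed)",
]

for task, reason in refused_results(sample):
    print(task, "->", reason)
```

This only reports what the client was told. It says nothing about whether the uploaded data was kept on the project's servers, which is the question the moderators address above.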