Africa v7.22 Errors

ed2353

Joined: 15 Feb 06
Posts: 137
Credit: 33,452,399
RAC: 5,451
Message 51307 - Posted: 24 Jan 2015, 22:40:41 UTC - in response to Message 51306.  

Thanks again Les. You are ever helpful in explaining the vagaries of the system.
ID: 51307
Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1081
Credit: 6,982,827
RAC: 3,789
Message 51308 - Posted: 24 Jan 2015, 23:48:34 UTC - in response to Message 51305.  

However, the AFR models I downloaded at the beginning of January seem to mis-estimate the run time. About halfway through (about 6 trickles out of 12), the original estimated time has already elapsed and the Remaining time actually starts to increase!

I've not seen that happen before.


Errors in the time remaining counter aren't uncommon. They are also harmless and don't affect the outcome of the tasks. The hadam3p_afr currently running on my machine is going to take about 200 hours to finish. The time remaining counter reads 72 hours and the "elapsed" counter is at 28. It will probably reach "0" at the 50% point.

... not entirely harmless. In a mix of model types the AFR models when run will tend to inflate the estimates for other model types, which means that the host's work buffer will not be filled correctly. This is happening on a machine of mine at the moment.

If it's any consolation, the mis-estimate had been reported to the project - as has the too-generous credit allocation for that model type. No response on either question thus far.
ID: 51308
JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,063,325
RAC: 928
Message 51309 - Posted: 25 Jan 2015, 6:11:14 UTC - in response to Message 51308.  

[quote]
... not entirely harmless. In a mix of model types the AFR models when run will tend to inflate the estimates for other model types, which means that the host's work buffer will not be filled correctly. This is happening on a machine of mine at the moment.

If it's any consolation, the mis-estimate had been reported to the project - as has the too-generous credit allocation for that model type. No response on either question thus far.[/quote]

I see what you mean. The time estimates for the hadam3p_anz tasks that are waiting to start have increased by 15 hours, and those for hadam3p_pnw by 8 hours. I don't think this is a serious problem. I might have to wait a day or so longer to get new tasks.

ID: 51309
Ingleside

Joined: 5 Aug 04
Posts: 108
Credit: 19,072,610
RAC: 36,507
Message 51317 - Posted: 26 Jan 2015, 20:57:08 UTC
Last modified: 26 Jan 2015, 20:58:46 UTC

Hmm, does the Africa application have the same fatal bug as the PNW application, where a task ends up in the "crashes 100 times before giving up" state, or is something else going on with just some bad WUs?
ID: 51317
ed2353

Joined: 15 Feb 06
Posts: 137
Credit: 33,452,399
RAC: 5,451
Message 51341 - Posted: 31 Jan 2015, 10:39:00 UTC - in response to Message 51308.  

[quote]
... not entirely harmless. In a mix of model types the AFR models when run will tend to inflate the estimates for other model types, which means that the host's work buffer will not be filled correctly. This is happening on a machine of mine at the moment.

If it's any consolation, the mis-estimate had been reported to the project - as has the too-generous credit allocation for that model type. No response on either question thus far.
[/quote]

Has this been corrected on the latest batch of AFRs? The estimated times on all my tasks have more than doubled since I completed an older AFR task.

ID: 51341
Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1081
Credit: 6,982,827
RAC: 3,789
Message 51342 - Posted: 31 Jan 2015, 21:38:41 UTC - in response to Message 51341.  

... not entirely harmless. In a mix of model types the AFR models when run will tend to inflate the estimates for other model types, which means that the host's work buffer will not be filled correctly. This is happening on a machine of mine at the moment.

If it's any consolation, the mis-estimate had been reported to the project - as has the too-generous credit allocation for that model type. No response on either question thus far.


Has this been corrected on the latest batch of AFRs? The estimated times on all my tasks have more than doubled since I completed an older AFR task.


... if you run only AFR the run-times will tend to the correct value through the application of BOINC's normal per-machine adjustments. A mismatch only arises where there is a mix of model types. For example, running ANZ/EU/PNW will make AFR models seem "cheap" and you will get "too many" of them; if you then run AFR models you won't initially get ANZ/EU/PNW because they will be judged "expensive" - when there's a big enough work deficit then the ANZ/EU/PNW will indeed download but you will get "too few" of them.
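To put rough numbers on that, here is a minimal Python sketch. It assumes the client applies one correction factor to every model type in the project and fetches enough tasks to cover the work buffer at the cost it currently believes. The function name, the buffer size and the hour figures are all made up for illustration; this is not BOINC's actual work-fetch code.

```python
import math


def tasks_fetched(buffer_hours, raw_estimate_hours, correction):
    """Tasks needed to cover the buffer at the cost the client currently believes."""
    believed_hours = raw_estimate_hours * correction
    return math.ceil(buffer_hours / believed_hours)


# Hypothetical figures: both model types really take 200 hours, but the raw
# AFR estimate is only 100 hours while the raw ANZ estimate is an accurate 200.
BUFFER_HOURS = 400

# Correction factor tuned by ANZ/EU/PNW runs (about 1.0): AFR looks "cheap".
afr_count = tasks_fetched(BUFFER_HOURS, raw_estimate_hours=100, correction=1.0)

# Correction factor tuned by AFR runs (about 2.0): ANZ looks "expensive".
anz_count = tasks_fetched(BUFFER_HOURS, raw_estimate_hours=200, correction=2.0)

print(afr_count, "AFR tasks, really", afr_count * 200, "hours of work")
print(anz_count, "ANZ task(s), really", anz_count * 200, "hours of work")
```

With these invented numbers the first case downloads twice the work the buffer asked for and the second only half of it, which is the "too many" / "too few" pattern.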
ID: 51342
JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,063,325
RAC: 928
Message 51355 - Posted: 2 Feb 2015, 22:15:27 UTC

The hadam3p_afr WUs have completely screwed up the time remaining estimates. Since I completed 1 _afr task the day before yesterday, all other task estimates have more or less doubled. Hadam3p_anz tasks that take about 200 hours on that machine are reading 478 hours. Hadcm3n tasks are reading over 1000 hours, when I know from experience that they take about 475 hours. Every other type is about double its former estimate.

As these tasks run, will they self-correct, or do we need to edit the files somehow to fix this? And is there a fix in the works for the hadam3p_afr so it doesn't do this again?
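The size of the inflation can be read straight off those figures. A quick Python sketch, where the helper name is invented for illustration and the Hadcm3n result is only a lower bound because the display reads "over 1000":

```python
def inflation_factor(displayed_hours, typical_hours):
    """Ratio of the estimate now displayed to the run time seen in practice."""
    return displayed_hours / typical_hours


anz_factor = inflation_factor(478, 200)       # hadam3p_anz
hadcm3n_floor = inflation_factor(1000, 475)   # hadcm3n, lower bound

print(f"hadam3p_anz: x{anz_factor:.2f}")
print(f"hadcm3n: at least x{hadcm3n_floor:.2f}")
```

Both ratios come out a little above 2, which fits "more or less doubled" across the board.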

ID: 51355
Richard Haselgrove

Joined: 1 Jan 07
Posts: 943
Credit: 34,192,107
RAC: 7,153
Message 51356 - Posted: 2 Feb 2015, 22:29:58 UTC - in response to Message 51355.  
Last modified: 2 Feb 2015, 22:30:38 UTC

The hadam3p_afr WUs have completely screwed up the time remaining estimates. Since I completed 1 _afr task the day before yesterday, all other task estimates have more or less doubled. Hadam3p_anz tasks that take about 200 hours on that machine are reading 478 hours. Hadcm3n tasks are reading over 1000 hours, when I know from experience that they take about 475 hours. Every other type is about double its former estimate.

As these tasks run, will they self-correct, or do we need to edit the files somehow to fix this? And is there a fix in the works for the hadam3p_afr so it doesn't do this again?

All that will have happened is that your DCF (Duration Correction Factor) for this project will now be attuned to the AFR estimate, rather than the estimates appropriate to the other model types.

DCF is a BOINC tool designed to avoid tasks missing deadlines. As such, it isn't of much importance to CPDN, but it still exists. To avoid those deadlines, DCF is pessimistic - it assumes the worst possible case. So, the moment it sees a task running longer than its estimate, it assumes that every task will be as badly behaved.

As new tasks finish in less than the expected time, DCF will automatically be adjusted back down again - but more cautiously, at just a 10% adjustment each time. The graph is like a sawtooth.
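The sawtooth is easy to reproduce with a few lines of Python. This is a minimal sketch of the behaviour described above (jump up at once, come down 10% at a time), not the BOINC client's actual code. The function names and the task figures are invented for illustration.

```python
def update_dcf(dcf, raw_estimate_hours, actual_hours, down_step=0.10):
    """Return the new correction factor after one task finishes."""
    ratio = actual_hours / raw_estimate_hours
    if ratio > dcf:
        # The task overran its corrected estimate: jump straight to the worst case.
        return ratio
    # The task came in under: move only 10% of the way back down.
    return dcf + down_step * (ratio - dcf)


def simulate(finished_tasks, dcf=1.0):
    """Apply update_dcf to a sequence of (raw_estimate_hours, actual_hours) pairs."""
    history = []
    for raw_estimate_hours, actual_hours in finished_tasks:
        dcf = update_dcf(dcf, raw_estimate_hours, actual_hours)
        history.append(dcf)
    return history


# Hypothetical run: one AFR task whose raw estimate is too low by a factor of 2.4,
# followed by three tasks of another type whose raw estimates are accurate.
history = simulate([(100, 240), (200, 200), (200, 200), (200, 200)])
print([round(d, 3) for d in history])
```

In this made-up run one bad estimate pushes the factor to 2.4 in a single step, and three well-estimated tasks afterwards only bring it back to about 2.02, so the displayed times for everything else stay roughly doubled for quite a while.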
ID: 51356

©2024 climateprediction.net