Complete, but still running.
Author | Message |
---|---|
Joined: 31 Aug 04 Posts: 145 Credit: 2,061,673 RAC: 314 |
My current model completed some time this morning. When I looked at about 08:00 it was at 100%, remaining ---, but was still running. I've seen similar before, as finished jobs write and compress their result files etc., but it is still running now, some 10 hours later. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
Joined: 17 Aug 04 Posts: 289 Credit: 44,103,664 RAC: 0 |
Hi adrianxw, is this the model you're speaking about? hadcm3n_zi2g_1880_40_008249779 I put up this link in case it might help the forum moderators. |
Joined: 31 Aug 04 Posts: 145 Credit: 2,061,673 RAC: 314 |
Yes, that is the one. It is still running now. <edit> Something else I just noticed: it was sending trickles up regularly, but they seem to have stopped a few days ago. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
That model has failed well short of the finish. It's from 22 November 2012, and has failed on all computers that ran it. Just abort it and save electricity. |
Joined: 12 Feb 08 Posts: 66 Credit: 4,877,652 RAC: 0 |
The last trickle is at the 75% time-step. The model must have crashed at one of those "sensitive" 25, 50, 75% steps but still somehow continued, even though it was no longer doing any useful work. |
Joined: 24 Apr 08 Posts: 6 Credit: 176,830 RAC: 0 |
I appear to be having a similar problem and was wondering if mine might also be one of those problematic tasks. The detail page is at http://climateapps2.oerc.ox.ac.uk/cpdnboinc/result.php?resultid=15613742 and its name is hadcm3n_4it3_1940_40_008311085_1. It has been at 100% for a few days and is still running and consuming CPU (according to my laptop stats), but the numbers on the properties page are not increasing. I'm guessing this one should be taken out back and shot, but wanted to confirm first, just in case. |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
The last Timestep for that one is 1,036,800, so it's finished, but it just doesn't want to stop. There have been a few like that, and I've had 1 or 2 myself. So, yes, you'll have to kill it off yourself. Backups: Here |
Joined: 24 Apr 08 Posts: 6 Credit: 176,830 RAC: 0 |
Thought so, but thanks for the confirmation. |
Joined: 5 May 10 Posts: 69 Credit: 1,169,103 RAC: 2,258 |
I've recently aborted an hadcm3n which was pronounced finished after its penultimate trickle-up but which kept going for a day or so until I zapped it. There's still a 1.6 GB folder in the project folder with the name of the aborted task, which resetting the project hasn't cleared. I presume it's OK to delete this and the associated XML file manually? NG |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
That model didn't finish. But that's yet another failure mode with this model type: get very close to the end and then fail for some reason. Bad luck there. (I guess that you're talking about the BOINC manager saying finished; i.e. at 100%. This happens when it stops getting info back from the model, even if the model hasn't actually finished.) So yes, you have to get rid of that folder and file manually. |
Joined: 5 May 10 Posts: 69 Credit: 1,169,103 RAC: 2,258 |
Thanks, Les. Yes, I did mean it was showing in BOINC Manager as completed. "Bad luck there." While it would have been nice to have completed it properly and to have uploaded the final data, I suppose I did get a lot further with it than the two previous recipients! :) NG |
Joined: 5 May 10 Posts: 69 Credit: 1,169,103 RAC: 2,258 |
I wrote: "I've recently aborted an hadcm3n which was pronounced finished after its penultimate trickle-up but which kept going for a day or so until I zapped it." I notice that the information about the task has now been updated on the Web site. The trickle at timestep 1,036,800 was returned and I've received the full credit. It seems only the last batch of data wasn't uploaded. The stderr is full of raving about environment variables being ignored. http://climateapps2.oerc.ox.ac.uk/cpdnboinc/result.php?resultid=15793995 NG |
Joined: 16 Jan 10 Posts: 1084 Credit: 7,623,793 RAC: 4,729 |
"... The stderr is full of raving about environment variables being ignored." That's a Mac thing and neither CPDN- nor BOINC-specific - at least, last time I looked. |
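
The check used in this thread, comparing a task's last reported trickle timestep against the final timestep, can be sketched in a few lines of Python. This is only an illustration: the final timestep of 1,036,800 is the figure quoted above for these 40-year hadcm3n tasks, and the function names are made up for the example, not part of BOINC or the CPDN software.

```python
# Final timestep quoted in this thread for these hadcm3n tasks.
# Assumption: other model types or run lengths will have a different value.
FINAL_TIMESTEP = 1_036_800


def progress(last_trickle_timestep, final_timestep=FINAL_TIMESTEP):
    """Return the fraction of the run covered by the last trickle (0.0 to 1.0)."""
    if not 0 <= last_trickle_timestep <= final_timestep:
        raise ValueError("timestep is outside the expected range")
    return last_trickle_timestep / final_timestep


def reached_end(last_trickle_timestep, final_timestep=FINAL_TIMESTEP):
    """True only if the last trickle is at the final timestep."""
    return last_trickle_timestep == final_timestep


if __name__ == "__main__":
    # A task whose trickles stopped three quarters of the way through.
    print(f"{progress(777_600):.0%}", reached_end(777_600))
    # A task whose last trickle is at the final timestep.
    print(f"{progress(1_036_800):.0%}", reached_end(1_036_800))
```

A task that shows 100% in BOINC Manager but whose last trickle is short of the final timestep has failed rather than finished; one that has reached the final timestep but keeps running still has to be aborted by hand, as described above.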
©2024 cpdn.org