climateprediction.net home page
Should I continue crunching this work unit?

Message boards : Number crunching : Should I continue crunching this work unit?

vdquang

Joined: 5 Mar 06
Posts: 5
Credit: 405,068
RAC: 0
Message 38327 - Posted: 20 Nov 2009, 3:06:20 UTC

I started crunching "hadsm3mh_kp2_006022174_2" a long time ago. Initially it indicated that this work unit would take about 800 hours. But now, after 2481 hours of crunching, progress stands at only 4.451% and the estimated time remaining has grown to 3515 hours! (See the details below.)
---------------------------
Application: UK Met Office HADSM3 Mid-Holocene 6.02
Name: hadsm3mh_kp2_006022174_2
CPU time: 2481:27:53
Progress: 4.451%
To complete: 3515:09:58
Report deadline: 2/12/2010
---------------------------
By the way, the progress remains around 4.451% while the time to complete is increasing all the time. Should I continue crunching this work?
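
As a rough sanity check on the figures above, the reported CPU time and progress can be extrapolated linearly and compared with the initial estimate. This is only a sketch: the helper names are made up for illustration, and a linear extrapolation assumes progress is still advancing at all, which may not be the case here.

```python
def hms_to_hours(text: str) -> float:
    """Convert a BOINC-style 'HHHH:MM:SS' duration string to hours."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours + minutes / 60 + seconds / 3600


def projected_total_hours(cpu_time: str, progress_percent: float) -> float:
    """Linearly extrapolate the total run time from CPU time and progress."""
    return hms_to_hours(cpu_time) / (progress_percent / 100)


# Figures quoted in the post; 800 h is the poster's recollection of the
# initial estimate, not an exact value.
initial_estimate_hours = 800
projected = projected_total_hours("2481:27:53", 4.451)
ratio = projected / initial_estimate_hours

print(f"Projected total: {projected:,.0f} h, about {ratio:.0f}x the initial estimate")
```

A projection that is many times larger than the initial estimate, combined with progress that no longer moves, suggests the task is not advancing normally.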
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2184
Credit: 64,822,615
RAC: 5,275
Message 38328 - Posted: 20 Nov 2009, 5:24:05 UTC

I would not continue crunching that work. I would abort it. It appears to be looping, i.e. continually restarting at the same point and never making any progress.
vdquang

Joined: 5 Mar 06
Posts: 5
Credit: 405,068
RAC: 0
Message 38329 - Posted: 20 Nov 2009, 6:33:17 UTC - in response to Message 38328.  
Last modified: 20 Nov 2009, 6:34:12 UTC

Thanks, I have aborted it.
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 38331 - Posted: 20 Nov 2009, 8:53:41 UTC

That's interesting. The three computers that progressed further, and in one case completed the task, were all AMD, whereas vdquang has an Intel. The workunit is here.
vdquang

Joined: 5 Mar 06
Posts: 5
Credit: 405,068
RAC: 0
Message 39300 - Posted: 23 Mar 2010, 5:52:24 UTC - in response to Message 38331.  

Hi,
I may have hit another looping work unit. Initially it stated that crunching the unit "hadsm3fub_k95o_006469046" would take about 1000 hours (sorry, I don't remember the exact number). But now it seems it will crunch indefinitely. See the 2 notes below (I recorded them 5 days ago and today):
1/ 18 March 2010
CPU time: 981:36:05
Progress: 21.252%
To completion: 1078:17:30
2/ Today (23 March 2010)
CPU time: 1061:35:01
Progress: 21.551%
To completion: 1141:03:51
Is it really a looping work unit or not?
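
One way to read the two notes above is to compute how much progress was gained per CPU hour between them, and compare that with the average rate up to the first note. The sketch below does this arithmetic; the function and variable names are hypothetical, and the result is only as good as the two readings it is based on.

```python
def hms_to_hours(text: str) -> float:
    """Convert a BOINC-style 'HHHH:MM:SS' duration string to hours."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours + minutes / 60 + seconds / 3600


def progress_rate(first, second) -> float:
    """Percent of progress gained per CPU hour between two (cpu_time, percent) notes."""
    (time_1, percent_1), (time_2, percent_2) = first, second
    return (percent_2 - percent_1) / (hms_to_hours(time_2) - hms_to_hours(time_1))


note_18_march = ("981:36:05", 21.252)
note_23_march = ("1061:35:01", 21.551)

recent_rate = progress_rate(note_18_march, note_23_march)
# Average rate from the start of the task up to the first note.
early_rate = note_18_march[1] / hms_to_hours(note_18_march[0])

remaining_hours = (100 - note_23_march[1]) / recent_rate
slowdown = early_rate / recent_rate

print(f"Recent rate: {recent_rate:.5f} %/h (about {slowdown:.1f}x slower than before)")
print(f"Remaining at the recent rate: {remaining_hours:,.0f} CPU hours")
```

If the remaining time at the recent rate is far larger than the client's own "To completion" figure, the task has slowed down sharply, even if it has not stopped completely.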

transient

Joined: 3 Oct 06
Posts: 43
Credit: 8,017,057
RAC: 0
Message 39301 - Posted: 23 Mar 2010, 6:10:51 UTC

Is this the task you're talking about? http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=10336578 It trickled for the last time on January 6th. It used to trickle about once a day. I would abort it.
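
The trickle check described above amounts to comparing the gap since the last trickle with the usual interval between trickles. Here is a minimal sketch of that comparison. It assumes the last trickle was on 6 January 2010 (the year is inferred from the date of this post) and uses an arbitrary tolerance factor of 7; both are assumptions, not project rules.

```python
from datetime import date


def trickle_gap(last_trickle: date, today: date, usual_interval_days: float,
                tolerance: float = 7.0):
    """Return (gap in days, whether the gap exceeds tolerance x usual interval).

    The tolerance factor is a hypothetical threshold chosen for illustration.
    """
    gap_days = (today - last_trickle).days
    return gap_days, gap_days > usual_interval_days * tolerance


# Dates from the thread: last trickle on January 6th (year assumed to be 2010),
# checked on 23 March 2010; the task used to trickle about once a day.
gap_days, overdue = trickle_gap(date(2010, 1, 6), date(2010, 3, 23),
                                usual_interval_days=1)

print(f"Days since last trickle: {gap_days}, overdue: {overdue}")
```

A gap of many weeks for a task that used to trickle daily is a strong hint that the model is no longer making useful progress.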
metalius
Joined: 28 Nov 06
Posts: 89
Credit: 11,968,919
RAC: 2,997
Message 39303 - Posted: 23 Mar 2010, 7:46:52 UTC - in response to Message 39300.  
Last modified: 23 Mar 2010, 7:47:22 UTC

Is it really a looping work unit or not?

It may be a rewinding task; we are talking about such tasks here.
Look at the current speed: if it is not slow, your task may still finish with "Success".
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 39308 - Posted: 23 Mar 2010, 10:52:35 UTC
Last modified: 23 Mar 2010, 11:04:58 UTC

Hi Vdquang

You have an 'iceworld'. The phenomenon is described here by Geophi.

All computers on the same platform with the same operating system can be expected to hit this problem at the same point. That's what has happened with this workunit, and you're not the only person with exactly the same iceworld problem at the same point. If you look at the model's graphics, they will be monochrome, showing only the default colour.

The workunit is here; Vdquang's model is #4 in the list. One very fast computer is managing to send in a few trickles. But look at the sec/TS, i.e. the speed (or, rather, how slow it is).

If you restore a backup the same problem will happen again at the same point. Please abort the model.

If you can, please look quickly at the graphics for your HadSM or HadSM MH models at least twice a week to check that you can see all the normal colours. Normal graphics indicate normal progress producing good data.
vdquang

Joined: 5 Mar 06
Posts: 5
Credit: 405,068
RAC: 0
Message 39325 - Posted: 24 Mar 2010, 4:47:55 UTC - in response to Message 39308.  

Reply to transient (message ID 39301):
Is this the task you're talking about? http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=10336578 It trickled for the last time January 6th. It used to trickle about once a day. I would abort it.

Oh, I have never paid attention to the trickle information. It is true that this work unit trickled for the last time on 6th January.
-----------------

Reply to mo.v (message ID 39308):
Hi Vdquang
You have an 'iceworld'. The phenomenon is described here by Geophi.
............
Please abort the model.
............

OK, I am going to abort it now.

