climateprediction.net home page
Relation between CPU L2 cache size and crunching speed

old_user462689

Joined: 23 Jul 07
Posts: 4
Credit: 387,306
RAC: 0
Message 34218 - Posted: 4 Jul 2008, 14:38:11 UTC
Last modified: 4 Jul 2008, 14:39:58 UTC

How much does the amount of L2 cache on a CPU affect crunching speed on climateprediction.net? Assume that all other factors are equal.

On climateprediction.net, how much faster would an Intel E4400 (2 GHz and 2 MB L2 cache) be than an Intel E2180 (2 GHz and 1 MB L2 cache)? Has anyone done any tests investigating the impact of L2 cache size on crunching speed on climateprediction.net?

tullio
Joined: 6 Aug 04
Posts: 264
Credit: 965,476
RAC: 0
Message 34220 - Posted: 5 Jul 2008, 4:41:02 UTC

I am running a hadam3h model on an Opteron 1210 with 2 cores, each with 1 MB L2 cache. RAM is 2 GB, OS is Linux. The model's RAM usage swings sharply between 5% and 20%. Of the 6 BOINC applications I run (Einstein, SETI, QMC, climateprediction.net, CPDN Beta, LHC), this is the only one whose RAM usage varies so much over short periods, and I am not able to explain it.
Tullio

astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 34221 - Posted: 5 Jul 2008, 4:58:52 UTC

It's inherent in the design of the application. It's also the reason participants won't receive one unless they have at least 1.5 GB RAM.

It's a higher resolution Model than the others (so far) and it grabs what it needs when it needs it, then releases what it doesn't need when the "surge" (sorry) is done. Until next time. (It is beautifully displayed in Linux memory-use graphics.)

"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.

mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 34229 - Posted: 6 Jul 2008, 14:17:23 UTC

It's discussed in the final posts by MikeMars and Geophi in this thread.
Cpdn news

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2184
Credit: 64,822,615
RAC: 5,275
Message 34232 - Posted: 6 Jul 2008, 19:38:00 UTC

It's been a long time, but when running the original spinups that preceded the coupled models (hadcm3's), I had an Athlon64 3400+ and 3700+ (both Socket 754). Both ran at 2.4 GHz, had RAM running the same memory timings, and the only difference was the L2 cache on the 3700+ was 1 MB vs. 512 KB on the 3400+. The 3700+ ran the spinup at about 1.74 s/TS while the 3400+ ran it at 1.83 s/TS. So, assuming no significant difference in speed due to model parameters, the 3700+ was about 5% faster.

On the original seasonal experiment using the hadam3 model, I ran a 3 GHz Pentium 4 with 512 KB L2 cache, and later a 3 GHz Pentium 4 with 2 MB L2 cache. The 512 KB cache processor ran it at about 22.5 s/TS while the 2 MB cache processor ran it at 19.3 s/TS. Of course there was a difference in RAM as well, as the CPU with the larger cache was also paired with DDR2 memory as opposed to DDR1.
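
For anyone who wants to repeat this comparison with their own figures, here is a minimal Python sketch of the arithmetic. The function name `percent_faster` is made up for illustration, and it assumes that speed is simply the inverse of s/TS (timesteps completed per CPU second).

```python
def percent_faster(slow_s_per_ts: float, fast_s_per_ts: float) -> float:
    """Return how much faster (in %) the machine with the lower s/TS runs.

    Assumption: speed is timesteps per second, i.e. 1 / (s/TS).
    """
    return (slow_s_per_ts / fast_s_per_ts - 1.0) * 100.0


# s/TS figures quoted in the post above
athlon_gain = percent_faster(1.83, 1.74)   # 3400+ (512 KB) vs 3700+ (1 MB)
pentium_gain = percent_faster(22.5, 19.3)  # P4 512 KB vs P4 2 MB (also DDR1 vs DDR2)

print(f"Athlon64 3700+ vs 3400+: {athlon_gain:.1f}% faster")
print(f"Pentium 4 2 MB vs 512 KB: {pentium_gain:.1f}% faster")
```

With the figures quoted above this works out to roughly 5.2% for the Athlon64 pair and roughly 16.6% for the Pentium 4 pair, although the Pentium 4 number also includes the DDR2 vs. DDR1 difference, so it cannot be credited to cache alone.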

DJStarfox
Joined: 27 Jan 07
Posts: 300
Credit: 3,288,263
RAC: 26,370
Message 34245 - Posted: 8 Jul 2008, 17:56:58 UTC - in response to Message 34229.  

It's discussed in the final posts by MikeMars and Geophi in this thread.


I also asked and got a reply about that here.
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=5935&nowrap=true

old_user534678
Joined: 1 Sep 08
Posts: 5
Credit: 2,509
RAC: 0
Message 34857 - Posted: 1 Sep 2008, 16:22:14 UTC

Maybe I'm OT, but what does s/TS mean, and where can I read this value for my system?
Thanks

KAMasud
Joined: 6 Oct 06
Posts: 204
Credit: 7,608,986
RAC: 0
Message 34858 - Posted: 1 Sep 2008, 16:44:46 UTC

In BOINC Manager, click on "Show graphics"; when the graphics window is open, press the "Z" key. If you would like to, press the "H" key for help.
Regards
Masud.

Iain Inglis
Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 34859 - Posted: 1 Sep 2008, 17:17:50 UTC - in response to Message 34857.  

Maybe I'm OT, but what does s/TS mean, and where can I read this value for my system?
Thanks


... KAMasud has suggested one good place; another is on the web page for your model, after the model has sent a 'trickle' (after one model year).

s/TS is simply 'seconds per timestep' - i.e. the number of CPU seconds used by the model, divided by the number of timesteps the model has completed. It gives an estimate of the speed of the model on your computer. Different computers have different values for s/TS and different model types also have different values (HADSM3 smallest, HADCM3 larger, HADAM3 largest).

If you run your model all the time, then s/TS multiplied by the number of timesteps in the completed model would give the minimum time that the model would take to finish.
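
As a rough illustration of those two calculations, here is a small Python sketch. The function names and all the numbers in it (18,000 CPU seconds, 10,000 timesteps done, 500,000 timesteps in total) are invented for the example and are not taken from any real model.

```python
SECONDS_PER_DAY = 86_400


def seconds_per_timestep(cpu_seconds: float, timesteps_done: int) -> float:
    """s/TS: CPU seconds used so far divided by timesteps completed so far."""
    return cpu_seconds / timesteps_done


def minimum_runtime_days(s_per_ts: float, total_timesteps: int) -> float:
    """Minimum time to finish, assuming the model runs all the time."""
    return s_per_ts * total_timesteps / SECONDS_PER_DAY


# Hypothetical figures, for illustration only
s_ts = seconds_per_timestep(cpu_seconds=18_000, timesteps_done=10_000)
days = minimum_runtime_days(s_ts, total_timesteps=500_000)

print(f"{s_ts:.2f} s/TS, at least {days:.1f} days of CPU time")
```

The real total number of timesteps depends on the model type, so substitute the value for your own model.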

mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 34860 - Posted: 1 Sep 2008, 17:20:00 UTC
Last modified: 1 Sep 2008, 17:20:37 UTC

The 'sec/TS' abbreviation means seconds per timestep. As well as seeing the value in each model's graphics display, it's in the last column of the web page for each model. For example, when your new model has produced a trickle, you'll see it there.

The 3 types of model typically produce different sec/TS values.

Edit - Iain answered first!
Cpdn news

old_user534678
Joined: 1 Sep 08
Posts: 5
Credit: 2,509
RAC: 0
Message 34894 - Posted: 4 Sep 2008, 15:49:20 UTC - in response to Message 34860.  
Last modified: 4 Sep 2008, 15:50:37 UTC

Thanks for the answers. So (if I have understood correctly :)) the lower the value, the faster the machine, is that right?
Now it says 1.28 s/TS, but it's going down.

Iain Inglis
Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 34895 - Posted: 4 Sep 2008, 16:49:47 UTC
Last modified: 4 Sep 2008, 16:51:29 UTC

Yes, that's right.

The figure shown on the trickle page (e.g. 7606940) is the average over the whole model run, so it changes quite slowly when things cause the PC to speed up or slow down. If you want to find out what's happening right now, then take the difference between the CPU Time figures and divide by the number of steps in a trickle.

Model 7606940 is a 'slab' model (HADSM3), which has 10,802 timesteps per trickle, so for the trickles submitted at 04 Sep 2008 14:48:32 and 04 Sep 2008 09:47:03:

- the average sec/TS is 152,546 / 118,822 = 1.2838 sec/TS

- the 'current' sec/TS is (152,546 - 140,186) / 10,802 = 1.1442 sec/TS

So, the machine has sped up quite a bit, and the average speed is catching up with the real speed.
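
The same two calculations can be sketched in Python, in case anyone wants to plug in their own trickle figures. The function names are made up for illustration, and the count of 11 trickles is inferred from 118,822 / 10,802 rather than read off the trickle page.

```python
TIMESTEPS_PER_TRICKLE = 10_802  # slab model (HADSM3), as stated above


def average_s_per_ts(cpu_time_latest: float, trickles_done: int) -> float:
    """Average over the whole run: cumulative CPU time / all timesteps so far."""
    return cpu_time_latest / (trickles_done * TIMESTEPS_PER_TRICKLE)


def current_s_per_ts(cpu_time_latest: float, cpu_time_previous: float) -> float:
    """Speed over the most recent trickle only: CPU time difference / steps per trickle."""
    return (cpu_time_latest - cpu_time_previous) / TIMESTEPS_PER_TRICKLE


# CPU Time figures (seconds) for the two trickles quoted above
latest, previous = 152_546, 140_186

avg = average_s_per_ts(latest, trickles_done=11)  # 11 trickles is an inferred value
cur = current_s_per_ts(latest, previous)

print(f"average: {avg:.4f} sec/TS, current: {cur:.4f} sec/TS")
```

Because the average is taken over all trickles, a recent speed-up shows in the 'current' figure straight away and only gradually in the average.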

old_user534678
Joined: 1 Sep 08
Posts: 5
Credit: 2,509
RAC: 0
Message 34904 - Posted: 5 Sep 2008, 12:14:45 UTC

I've now pushed my CPU (Athlon X2 4850e) up to 2.8 GHz and s/TS has gone down to 1-1.01.
Is that good? Does a Core 2 CPU get better figures at the same speed?

Ananas
Volunteer moderator
Joined: 31 Oct 04
Posts: 336
Credit: 3,316,482
RAC: 0
Message 34907 - Posted: 5 Sep 2008, 14:51:07 UTC
Last modified: 5 Sep 2008, 14:56:56 UTC

Hard to tell. On a Q9450 I had ~ 0.8580 (FSB set to 400), which would be about 4%-5% faster (calculated for 2.8GHz), but it has fast dual channel RAM, which sure plays a role too.

I don't think that the AMD CPUs are really slower, as long as a program is not especially optimized for Intel.

old_user534678
Joined: 1 Sep 08
Posts: 5
Credit: 2,509
RAC: 0
Message 34908 - Posted: 5 Sep 2008, 15:32:19 UTC - in response to Message 34907.  

Hard to tell. On a Q9450 I had ~ 0.8580 (FSB set to 400), which would be about 4%-5% faster (calculated for 2.8GHz), but it has fast dual channel RAM, which sure plays a role too.

I don't think that the AMD CPUs are really slower, as long as a program is not especially optimized for Intel.


Currently I have one stick of 2 GB DDR2-800, so single-channel only...
On Rosetta 2 months ago I didn't see any change in performance between single and dual channel (2x2 GB).
What about CPDN?
Aren't the Intel CPUs faster, even with the other two types of WUs?

P.S. Excuse my English; it isn't my native language. I'm Italian ;)

Ananas
Volunteer moderator
Joined: 31 Oct 04
Posts: 336
Credit: 3,316,482
RAC: 0
Message 34909 - Posted: 5 Sep 2008, 16:27:41 UTC
Last modified: 5 Sep 2008, 16:29:16 UTC

From my experience I would say that there is a relation between memory throughput and crunching speed.

If I crunch two CPDN models on a dual CPU or dual core computer, each one crunches a bit slower than when only one CPDN model runs alongside a different project.

This effect was quite strong on old P3 Tualatin boxes and even worse on Athlon MP, but I can still see it on current dual-core and quad-core 45nm CPUs.

So if you have two main projects, it is more efficient to run CPDN and the other project concurrently instead of alternating between the projects.

POEM might be an exception, as POEM needs high memory throughput itself for good speed. Very good combinations are SIMAP+CPDN and Spinhenge+CPDN. As you can see the change within 2 trickles, it's fairly easy to figure out which "co-project" pairs well with CPDN.

As I cannot think of any other limited resource that (without graphics usage) two concurrent workunits would compete for, my conclusion is that it has to be the RAM throughput.

Of course, HD throughput is a shared resource as well, but CPDN does not torture the HD, so that cannot be the factor.

P.S. I'm not a native English speaker either :-)

old_user534678
Joined: 1 Sep 08
Posts: 5
Credit: 2,509
RAC: 0
Message 34910 - Posted: 5 Sep 2008, 16:45:59 UTC
Last modified: 5 Sep 2008, 16:50:50 UTC

For the moment I've chosen to give one core to CPDN and the other to Folding@Home, so there is no switching between the two projects.

In October or November this system will become an HTPC, and I am thinking of getting a Q6600 platform to improve the work/power-consumption ratio.

But once the RAC is stable, I'll run some tests with the other 2 GB stick added.

©2024 cpdn.org