Relation between CPU L2 cache size and crunching speed

Author	Message
old_user462689 Joined: 23 Jul 07 Posts: 4 Credit: 387,306 RAC: 0	Message 34218 - Posted: 4 Jul 2008, 14:38:11 UTC Last modified: 4 Jul 2008, 14:39:58 UTC How much does the amount of L2 cache on a CPU affect crunching speed on climateprediction.net? Assume that all other factors are equal. On climateprediction.net, how much faster will an Intel E4400 (2 GHz and 2 MB L2 cache) be than an Intel E2180 (2 GHz and 1 MB L2 cache)? Has anyone done any tests investigating the impact of L2 cache size on crunching speed on climateprediction.net?

tullio Joined: 6 Aug 04 Posts: 264 Credit: 965,476 RAC: 0	Message 34220 - Posted: 5 Jul 2008, 4:41:02 UTC I am running a hadam3h model on an Opteron 1210 with 2 cores, each with 1 MB of L2 cache. RAM is 2 GB, OS is Linux. The RAM usage of the model varies violently from 5% to 20%. This is the only one of 6 BOINC applications (Einstein, SETI, QMC, climateprediction.net, CPDN Beta, LHC) in which I see the RAM usage vary so much over short times, and I am not able to explain it. Tullio

astroWX Volunteer moderator Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0	Message 34221 - Posted: 5 Jul 2008, 4:58:52 UTC It's inherent in the design of the application. It's also the reason participants won't receive one unless they have at least 1.5 GB RAM. It's a higher resolution Model than the others (so far) and it grabs what it needs when it needs it, then releases what it doesn't need when the "surge" (sorry) is done. Until next time. (It is beautifully displayed in Linux memory-use graphics.) "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest.

mo.v Volunteer moderator Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0	Message 34229 - Posted: 6 Jul 2008, 14:17:23 UTC It's discussed in the final posts by MikeMars and Geophi in this thread. Cpdn news

geophi Volunteer moderator Joined: 7 Aug 04 Posts: 2184 Credit: 64,822,615 RAC: 5,275	Message 34232 - Posted: 6 Jul 2008, 19:38:00 UTC It's been a long time, but when running the original spinups that preceded the coupled models (hadcm3's), I had an Athlon64 3400+ and 3700+ (both Socket 754). Both ran at 2.4 GHz with the same memory timings, and the only difference was the L2 cache: 1 MB on the 3700+ vs. 512 KB on the 3400+. The 3700+ ran the spinup at about 1.74 s/TS while the 3400+ ran it at 1.83 s/TS. So, assuming no significant difference in speed due to model parameters, the 3700+ was about 5% faster. On the original seasonal experiment using the hadam3 model, I ran a 3 GHz Pentium 4 with 512 KB L2 cache, and later a 3 GHz Pentium 4 with 2 MB L2 cache. The 512 KB cache processor ran it at about 22.5 s/TS while the 2 MB cache processor ran it at 19.3 s/TS. Of course there was a difference in RAM as well, as the CPU with the larger cache was also paired with DDR2 memory as opposed to DDR1.
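
The percentages behind those figures can be checked with a short calculation. This is only a sketch: the function name `speedup_percent` is made up for illustration, and "faster" is taken here to mean more timesteps completed per CPU second. The Pentium 4 result also reflects the DDR1 vs. DDR2 difference mentioned in the post, so it should not be read as a pure cache effect.

```python
def speedup_percent(slow_s_per_ts: float, fast_s_per_ts: float) -> float:
    """Return how much faster (in %) the machine with the lower s/TS is.

    Speed is taken as timesteps per second, i.e. 1 / (s/TS).
    """
    if slow_s_per_ts <= 0 or fast_s_per_ts <= 0:
        raise ValueError("s/TS values must be positive")
    return (slow_s_per_ts / fast_s_per_ts - 1.0) * 100.0


# Figures quoted in the post above.
athlon_gain = speedup_percent(1.83, 1.74)    # 512 KB vs. 1 MB L2, same clock
pentium4_gain = speedup_percent(22.5, 19.3)  # 512 KB vs. 2 MB L2, RAM also differs

print(f"Athlon64 3700+ over 3400+: {athlon_gain:.1f}% faster")
print(f"Pentium 4 2 MB over 512 KB: {pentium4_gain:.1f}% faster")
```

Under this definition the Athlon64 pair comes out at roughly 5%, matching the post, and the Pentium 4 pair at roughly 17%.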

DJStarfox Joined: 27 Jan 07 Posts: 300 Credit: 3,288,263 RAC: 26,370	Message 34245 - Posted: 8 Jul 2008, 17:56:58 UTC - in response to Message 34229. "It's discussed in the final posts by MikeMars and Geophi in this thread." I also asked and got a reply about that here. http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=5935&nowrap=true

old_user534678 Joined: 1 Sep 08 Posts: 5 Credit: 2,509 RAC: 0	Message 34857 - Posted: 1 Sep 2008, 16:22:14 UTC Maybe I'm OT, but what does s/TS mean, and where can I read this value for my system? Thanks.

KAMasud Joined: 6 Oct 06 Posts: 204 Credit: 7,608,986 RAC: 0	Message 34858 - Posted: 1 Sep 2008, 16:44:46 UTC In BOINC Manager, click on "show graphics"; when the graphics window is open, press the "Z" key. If you would like to, press the "H" key for help. Regards Masud.

Iain Inglis Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317	Message 34859 - Posted: 1 Sep 2008, 17:17:50 UTC - in response to Message 34857. "Maybe I'm OT, but what does s/TS mean, and where can I read this value for my system? Thanks." ... KAMasud has suggested one good place; another is on the Web page for your model, after the model has sent a 'trickle' (after one model year). s/TS is simply 'seconds per timestep', i.e. the number of CPU seconds used by the model divided by the number of timesteps the model has completed. It gives an estimate of the speed of the model on your computer. Different computers have different values for s/TS, and different model types also have different values (HADSM3 smallest, HADCM3 larger, HADAM3 largest). If you run your model all the time, then s/TS multiplied by the number of timesteps in the completed model would give the minimum time that the model would take to finish.
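
The definition above can be written out as a small calculation. This is a sketch only: the function names are invented for illustration, and the CPU time and timestep counts in the example are hypothetical placeholders, not values for any real CPDN model.

```python
SECONDS_PER_DAY = 86_400


def seconds_per_timestep(cpu_seconds: float, timesteps_done: int) -> float:
    """s/TS: CPU seconds used so far divided by timesteps completed so far."""
    if timesteps_done <= 0:
        raise ValueError("at least one timestep must be completed")
    return cpu_seconds / timesteps_done


def minimum_runtime_days(s_per_ts: float, total_timesteps: int) -> float:
    """Lower bound on run time, assuming the model runs all the time."""
    return s_per_ts * total_timesteps / SECONDS_PER_DAY


# Hypothetical numbers, chosen only to keep the arithmetic easy to follow.
s_ts = seconds_per_timestep(cpu_seconds=20_000.0, timesteps_done=10_000)
days = minimum_runtime_days(s_ts, total_timesteps=432_000)

print(f"{s_ts:.2f} s/TS, at least {days:.1f} days to finish")
```

The result is a minimum because it counts CPU time only; any time the machine spends switched off or working on other projects comes on top.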

mo.v Volunteer moderator Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0	Message 34860 - Posted: 1 Sep 2008, 17:20:00 UTC Last modified: 1 Sep 2008, 17:20:37 UTC The 'sec/TS' abbreviation means seconds per timestep. As well as seeing the value in each model's graphics display, it's in the last column of the web page for each model. For example, when your new model has produced a trickle, you'll see it there. The 3 types of model typically produce different sec/TS values. Edit - Iain answered first! Cpdn news

old_user534678 Joined: 1 Sep 08 Posts: 5 Credit: 2,509 RAC: 0	Message 34894 - Posted: 4 Sep 2008, 15:49:20 UTC - in response to Message 34860. Last modified: 4 Sep 2008, 15:50:37 UTC Thanks for the answers. So (if I have understood correctly :)) the lower the value, the faster the machine, is that right? It now says 1.28 s/TS, but it's going down.

Iain Inglis Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317	Message 34895 - Posted: 4 Sep 2008, 16:49:47 UTC Last modified: 4 Sep 2008, 16:51:29 UTC Yes, that's right. The figure shown on the trickle page (e.g. 7606940) is the average over the whole model run, so it changes quite slowly when things cause the PC to speed up or slow down. If you want to find out what's happening right now, then take the difference between the CPU Time figures of two consecutive trickles and divide by the number of steps in a trickle. Model 7606940 is a 'slab' model (HADSM3), which has 10,802 timesteps per trickle, so for the trickles submitted at 04 Sep 2008 14:48:32 and 04 Sep 2008 09:47:03:
- the average sec/TS is 152,546 / 118,822 = 1.2838 sec/TS
- the 'current' sec/TS is (152,546 - 140,186) / 10,802 = 1.1442 sec/TS
So, the machine has sped up quite a bit, and the average speed is catching up with the real speed.
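
The same arithmetic as a short script, using the CPU Time figures quoted in the post. This is a sketch: the function names are illustrative, and the trickle count of 11 is inferred from 118,822 / 10,802 rather than stated in the post.

```python
TIMESTEPS_PER_TRICKLE = 10_802  # HADSM3 'slab' model, as stated in the post


def average_s_per_ts(cpu_seconds_total: float, trickles_sent: int) -> float:
    """Average over the whole run: total CPU time / total timesteps."""
    return cpu_seconds_total / (trickles_sent * TIMESTEPS_PER_TRICKLE)


def current_s_per_ts(cpu_latest: float, cpu_previous: float) -> float:
    """Speed over the most recent trickle only."""
    return (cpu_latest - cpu_previous) / TIMESTEPS_PER_TRICKLE


# CPU Time figures from the two trickles of model 7606940.
# 11 trickles is an inference: 11 * 10,802 = 118,822 timesteps.
average = average_s_per_ts(152_546, trickles_sent=11)
current = current_s_per_ts(152_546, 140_186)

print(f"average: {average:.4f} sec/TS")
print(f"current: {current:.4f} sec/TS")
```

Because the average includes every earlier trickle, it moves toward the current value only gradually, which is why the displayed figure keeps drifting down after the machine speeds up.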

old_user534678 Joined: 1 Sep 08 Posts: 5 Credit: 2,509 RAC: 0	Message 34904 - Posted: 5 Sep 2008, 12:14:45 UTC I've now pushed my CPU (Athlon X2 4850e) up to 2.8 GHz and s/TS has gone down to 1.00-1.01. Is that good? Do Core 2 CPUs get better figures at the same clock speed?

Ananas Volunteer moderator Joined: 31 Oct 04 Posts: 336 Credit: 3,316,482 RAC: 0	Message 34907 - Posted: 5 Sep 2008, 14:51:07 UTC Last modified: 5 Sep 2008, 14:56:56 UTC Hard to tell. On a Q9450 I had ~0.8580 s/TS (FSB set to 400), which would be about 4%-5% faster (calculated for 2.8 GHz), but it has fast dual-channel RAM, which surely plays a role too. I don't think that the AMD CPUs are really slower, as long as a program is not specifically optimized for Intel.

old_user534678 Joined: 1 Sep 08 Posts: 5 Credit: 2,509 RAC: 0	Message 34908 - Posted: 5 Sep 2008, 15:32:19 UTC - in response to Message 34907. "Hard to tell. On a Q9450 I had ~0.8580 s/TS (FSB set to 400), which would be about 4%-5% faster (calculated for 2.8 GHz), but it has fast dual-channel RAM, which surely plays a role too. I don't think that the AMD CPUs are really slower, as long as a program is not specifically optimized for Intel." Currently I have one stick of 2 GB DDR2-800, so single-channel only. On Rosetta two months ago I saw no change in performance between single and dual channel (2x2 GB). What about CPDN? Aren't the Intel CPUs faster even with the other two types of WUs? P.S. Excuse my English; it isn't my native language. I'm Italian ;)

Ananas Volunteer moderator Joined: 31 Oct 04 Posts: 336 Credit: 3,316,482 RAC: 0	Message 34909 - Posted: 5 Sep 2008, 16:27:41 UTC Last modified: 5 Sep 2008, 16:29:16 UTC From my experience I would say that there is a relation between memory throughput and crunching speed. If I crunch two CPDN models on a dual-CPU or dual-core computer, each crunches a bit slower than when only one CPDN model is combined with a different project. This effect was quite strong on old P3 Tualatin boxes and even worse on Athlon MP, but I can see it even on current dual-core and quad-core 45 nm CPUs. So if you have two main projects, it is more efficient to run CPDN and the other project concurrently instead of alternating between the projects. POEM might be an exception, as POEM itself needs high memory throughput for good speed. Very good combinations are SIMAP+CPDN and Spinhenge+CPDN. As you can see the change within 2 trickles, it's fairly easy to figure out which "co-project" matches CPDN well. As I cannot think of any other limited resource that (without graphics usage) two concurrent workunits would compete for, my conclusion is that it has to be the RAM throughput. Of course, HD throughput is a shared resource as well, but CPDN does not torture the HD, so that cannot be the factor. P.S.: I'm not a native English speaker either :-)

old_user534678 Joined: 1 Sep 08 Posts: 5 Credit: 2,509 RAC: 0	Message 34910 - Posted: 5 Sep 2008, 16:45:59 UTC Last modified: 5 Sep 2008, 16:50:50 UTC For the moment I've chosen to give one core to CPDN and the other to Folding@Home, so there is no switching between the two projects. In October or November this system will become an HTPC, and I plan to get a Q6600 platform to improve the work/power-consumption ratio. But once the RAC is stable, I'll do some tests with the other 2 GB stick added.