More FPU or Integer Power needed?
ojum-le

Joined: 5 May 07
Posts: 27
Credit: 6,369,307
RAC: 0
Message 43613 - Posted: 27 Dec 2011, 11:29:44 UTC - in response to Message 43595.  

Your calculation is not correct, I think.
You have to multiply the results by the number of cores:
Integer:
11954 (4core) = 11954x4 = 47816
6690 (8core) = 6690x8 = 53520 (a small increase)

FPU:
4360 (4core) = 4360x4 = 17440
3767 (8core) = 3767x8 = 30136 (giant increase, same TDP)

It seems Intel SMT can boost your computing power; FPU calculations in particular benefit most from SMT.
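
The multiplication above can be checked with a short sketch. Python is used only for illustration; the names `RESULTS`, `aggregate_score` and `smt_gain_percent` are hypothetical, and the only inputs are the per-core figures quoted in this post.

```python
# Per-core benchmark figures quoted above: (per-core score, number of cores/threads).
RESULTS = {
    "integer": {"4core": (11954, 4), "8core": (6690, 8)},
    "fpu": {"4core": (4360, 4), "8core": (3767, 8)},
}


def aggregate_score(per_core, cores):
    """Total benchmark throughput: per-core score times the number of cores."""
    return per_core * cores


def smt_gain_percent(kind):
    """Percentage change in total throughput from the 4-core run to the 8-core run."""
    base = aggregate_score(*RESULTS[kind]["4core"])
    smt = aggregate_score(*RESULTS[kind]["8core"])
    return 100.0 * (smt - base) / base


if __name__ == "__main__":
    for kind in RESULTS:
        print(f"{kind}: {smt_gain_percent(kind):.1f}% change in total throughput")
```

With these figures the totals rise by roughly 12% for integer and roughly 73% for FPU, which is the "small" versus "giant" increase described above.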

But this is a theoretical benchmark and, as we know, HadCM3N models 'hate' Intel SMT.


Maybe the best benchmark for CPDN is 481.wrf from the SPEC CPU benchmark suite: http://www.spec.org/cpu2006/Docs/481.wrf.html

Intel's i7-3960X scores 233 wrf points! Best value ever for a single CPU.
http://www.spec.org/cpu2006/results/res2011q4/cpu2006-20111121-18924.html

Otherwise, I think the CPDN executables don't use all the compiler optimizations available for modern and different CPU types. Is there any prospect of running the models on nVidia's Tesla chips?
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 43614 - Posted: 27 Dec 2011, 19:19:21 UTC

Is there any prospect of running the models on nVidia's Tesla chips?

Only if the UK Met Office, whose programs we run, stop using supercomputers for their weather and climate modelling and research, and re-write everything for graphics processors.
Then there'd be a chance of us using them too.


astroWX
Volunteer moderator

Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 43615 - Posted: 27 Dec 2011, 20:41:59 UTC

Adding to Les' point, and his point is fundamental, CPDN is compiled to run on the widest possible range of personal computer hardware. The project is not focused on the latest technology, which would exclude many participants. (That said, sometimes a technology becomes so old, and is used by so few participants, that it becomes a drag on maintenance and is dropped. Pre-SSE2 machines are a case in point.)

"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
old_user662966

Joined: 22 Sep 11
Posts: 24
Credit: 243,945
RAC: 0
Message 43617 - Posted: 28 Dec 2011, 0:00:35 UTC - in response to Message 43613.  

Hi ojum-le,

I think we're making the same point, though I didn't follow through with the per-core multiplications, as you have, to get the total benchmark performance. But, methinks there's still one more comparative computation that needs to be made...

Had there been no hyperthreading penalty on either the integer or FPU rates, we would end up with the following numbers...

Integer:
11954 (4core)= 11954x4=47816
6690 (8core) = 6690x8=53520 (some little increase)
11954 per core with no hyperthread penalty would = 11954x8=95632 (Again, 53520 = ~44% performance hit)

FPU:
4360 (4core) = 4360x4=17440
3767 (8core) = 3767x8=30136 (giant increase, same TDP)
4360 per core with no hyperthread penalty would = 4360x8=34880 (Again, 30136 = ~14% performance hit)

So, based on these results, two side-by-side i7 2600K @ 3.4GHz machines running 4 WUs each will outperform one i7 2600K @ 3.4GHz running 8 WUs. Though I should think that, on cost, it would be less expensive per WU to run 8 WUs on one machine than across two... :)
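
For anyone who wants to check the penalty figures, here is a minimal sketch. The function name `ht_penalty_percent` is hypothetical, and the "ideal" total is an assumption: it supposes each of the 8 threads would keep the per-core score measured in the 4-core run.

```python
def ht_penalty_percent(per_core_4, per_core_8, threads=8):
    """Shortfall of the measured 8-thread total against an ideal 8-thread total.

    The ideal total assumes every thread keeps the 4-core per-core score.
    """
    ideal = per_core_4 * threads      # e.g. 4360 x 8 for FPU
    measured = per_core_8 * threads   # e.g. 3767 x 8 for FPU
    return 100.0 * (ideal - measured) / ideal


if __name__ == "__main__":
    print(f"Integer: ~{ht_penalty_percent(11954, 6690):.0f}% hit")
    print(f"FPU:     ~{ht_penalty_percent(4360, 3767):.0f}% hit")
```

Note that the thread count cancels out, so the penalty is simply the drop in the per-core score between the two runs.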

Expositions of any mathematical errors or poor logic on my part are always welcomed!

:)
bernard_ivo

Joined: 18 Jul 13
Posts: 438
Credit: 24,535,832
RAC: 2,045
Message 52117 - Posted: 27 Jun 2015, 14:17:45 UTC

Hello fellows,

I've been trying to compare CPUs running CPDN to get some sense of possible upgrades. I looked at the WUprop@Home data and found it not very useful, so I checked the tasks reported here at CPDN. I'm mostly interested in the heavy Linux models, so here are some numbers, ordered best to worst. FPS = floating point speed as measured and reported on CPDN.

UK Met Office HadAM3P and HadRM3P model with MOSES II and TRIFFID Europe v7.01 - timesteps 138539

i7-4770K CPU @ 3.50GHz - running 6.07 days (FPS 4134.52 - 4280.02)
i7-3770 CPU @ 3.40GHz - running 9.44 days (FPS 3682.65)
Core 2 Duo CPU T8300 @ 2.40GHz - running 12.88 - 15.33 days (FPS 2563.6)
Core 2 Duo CPU E7300 @ 2.66GHz - running 13.81 days (FPS 3081.75)
i7-4790K CPU @ 4.00GHz - running 11.11 - 14.26 days (FPS 2606.34 - 4747.16)

UK Met Office HadAM3P (global only) with MOSES II landsurface scheme v7.03 - timesteps 348548

i7-4770K CPU @ 3.50GHz - running 4.71 days (FPS 4280.02)
i7-3770K CPU @ 3.50GHz - running 5.03 days (FPS 3682.65)
i7-4790K CPU @ 4.00GHz - running 6.02 days (FPS 4134.52)
i7-3770 CPU @ 3.40GHz - running 6.08 days (FPS 3224.14)
Core 2 Duo CPU T8300 @ 2.40GHz - running 9.68 days (FPS 2563.6)
Core 2 Duo CPU E7300 @ 2.66GHz - running 12.15 - 15.9 days (FPS 3081.75)

According to my search i7-4770K is performing better than i7-4790K (sample 4 CPUs-2x2, checked numerous tasks). It would be nice if we could have more CPUs added. As I said, the WUprop@Home data is insufficient (or I am just looking in the wrong places), which is why I am posting this here.

cheers


Jim1348

Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 52118 - Posted: 27 Jun 2015, 17:25:13 UTC - in response to Message 52117.  

According to my search i7-4770K is performing better than i7-4790K (sample 4 CPUs-2x2, checked numerous tasks).

That data can't be accurate. I have both an i7-4770 and an i7-4790 (non-K models) that run at their default speeds, 3.7 GHz and 3.8 GHz respectively. At present, they are running identical tasks, 3 ATLAS and 3 WCG/CEP2, along with supporting two identical GPUs each on GPUGrid. The difference in performance is just as you would expect: a small advantage (a couple of percent or so) for the i7-4790.

I have run them both on CPDN also, and it is a little hard to make direct comparisons due to the variations in the work units, but you are welcome to try:
i7-4770: http://climateapps2.oerc.ox.ac.uk/cpdnboinc/results.php?hostid=1350074
i7-4790: http://climateapps2.oerc.ox.ac.uk/cpdnboinc/results.php?hostid=1351652

They look just as you would expect to me.
bernard_ivo

Joined: 18 Jul 13
Posts: 438
Credit: 24,535,832
RAC: 2,045
Message 52119 - Posted: 27 Jun 2015, 17:52:17 UTC - in response to Message 52118.  
Last modified: 27 Jun 2015, 18:00:07 UTC

Well, the numbers I gave are from Linux machines, and from only the 4 CPUs I observed running the heavy models. It seems odd to me as well, which is why I am asking people to share their numbers.

In fact, there is a ~30% difference in the run times of the UK Met Office HadCM3 short v7.24 models on your i7-4790, and some ran slower than on your i7-4770:
http://climateapps2.oerc.ox.ac.uk/cpdnboinc/results.php?hostid=1351652


Cheers
edit: You can see here how one i7-4790K is performing compared to your i7-4770 on the UK Met Office HadCM3 short v7.24
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 52120 - Posted: 27 Jun 2015, 21:27:03 UTC

A better comparison may be the Average (sec/TS) figure.
This is what the model is actually achieving, not what BOINC is reporting as FPS.


bernard_ivo

Joined: 18 Jul 13
Posts: 438
Credit: 24,535,832
RAC: 2,045
Message 52121 - Posted: 28 Jun 2015, 16:31:26 UTC - in response to Message 52120.  

The RunAverage (sec/TS) can be calculated as running_time/Timesteps. It differs from the Average (sec/TS) (CPU Time/Timesteps) by between -5.61% and +15.20%, so the order does not really change.
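
As a rough sketch of that calculation: the function names `sec_per_timestep` and `percent_difference` are hypothetical, and the run times are the v7.03 global-only figures (348548 timesteps) listed earlier in this thread. The per-task CPU times are not given in the thread, so `percent_difference` is shown only as a helper and is not applied to real data here.

```python
SECONDS_PER_DAY = 86400


def sec_per_timestep(run_days, timesteps):
    """Run average in seconds per timestep: elapsed run time / timesteps."""
    return run_days * SECONDS_PER_DAY / timesteps


def percent_difference(run_avg, cpu_avg):
    """How far the run average sits above (+) or below (-) the CPU-time average."""
    return 100.0 * (run_avg - cpu_avg) / cpu_avg


if __name__ == "__main__":
    timesteps = 348548  # HadAM3P (global only) v7.03
    runs = [("i7-4770K", 4.71), ("i7-3770K", 5.03), ("i7-3770", 6.08)]
    for cpu, days in runs:
        print(f"{cpu}: {sec_per_timestep(days, timesteps):.3f} sec/TS")
```

Because every task of a given model has the same number of timesteps, sorting by sec/TS gives the same order as sorting by run time in days.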

However, I noticed that for the global-only model I mistakenly reported
i7-4790K CPU @ 4.00GHz - running 6.02 days (FPS 4134.52). The correct entry is
i7-4770K CPU @ 3.50GHz - running 6.02 days (FPS 4134.52) (range 4.71-6.02 days)

I could not find an i7-4790K Linux machine running UK Met Office HadAM3P (global only) with MOSES II landsurface scheme v7.03.
©2024 climateprediction.net