ATI - 6.10.13 - GFLOPS - How accurate?



Message boards : Number crunching : ATI - 6.10.13 - GFLOPS - How accurate?

Author Message
Profile BigWaveSurfer
Send message
Joined: 29 Nov 01
Posts: 166
Credit: 8,131,737
RAC: 9,108
United States
Message 939894 - Posted: 14 Oct 2009, 17:19:21 UTC

I installed 6.10.13 on my one system with an ATI card (yes, I know SETI does not have an app for ATI cards... yet), but I was surprised by the GFLOPS the card pulled. BOINC says the card (ATI Radeon HD 2600 1GB) is 174 GFLOPS. Is that accurate?! My OC'ed 9600GT is only 41 GFLOPS; that is a huge difference. Just curious, thanks!
____________

Profile ignorance is no excuse
Avatar
Send message
Joined: 4 Oct 00
Posts: 9529
Credit: 44,433,274
RAC: 0
Korea, North
Message 939895 - Posted: 14 Oct 2009, 17:23:19 UTC - in response to Message 939894.

I was using my 2600XT 1GB on Collatz until it blew up a few weeks ago. It could run a Collatz WU in about 2 hours; my 4770 can do the same work in about 12 minutes using their optimized 2.05b app. It's free credits, so I wouldn't knock how slow that card is. I would upgrade the HSF, though, if you intend to run Collatz or Milkyway on it.
____________
In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope

End terrorism by building a school

Richard HaselgroveProject donor
Volunteer tester
Send message
Joined: 4 Jul 99
Posts: 8500
Credit: 49,940,486
RAC: 50,946
United Kingdom
Message 939905 - Posted: 14 Oct 2009, 18:13:19 UTC - in response to Message 939894.

I installed 6.10.13 on my one system with an ATI card (yes, I know SETI does not have an app for ATI cards... yet), but I was surprised by the GFLOPS the card pulled. BOINC says the card (ATI Radeon HD 2600 1GB) is 174 GFLOPS. Is that accurate?! My OC'ed 9600GT is only 41 GFLOPS; that is a huge difference. Just curious, thanks!

I think that this is probably yet another example of the difference between "marketing" flops and "working" (BOINC) flops.

According to that GPUGrid chart, a 9600GT is rated by NVidia at 312 GFlops - no allowance for the overclocking. Those are what I call "marketing flops".

Unfortunately, I suspect that BOINC v6.10.13 is reporting "marketing" flops for ATI cards, and "working" flops for NVidia cards.

This is unfair, and is going to cause confusion for a long time to come - until there is a project which can process the same work on either card, and where we have some degree of confidence that the two applications are compiled with the same degree of optimisation. I doubt that will happen until OpenCL compilers are available for both manufacturers, and a project develops an application in OpenCL that can be compiled for both cards from a common codebase. Only then will we have a true comparison.
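As a rough illustration of the gap, here is a back-of-the-envelope sketch. The shader count and shader clock below are the commonly published stock 9600 GT specs (64 shaders at 1.625 GHz), which are assumptions on my part, not figures from this thread; the 41 GFLOPS figure is BOINC's reported value from the opening post.

```python
# "Marketing" flops: the manufacturer's peak-throughput formula, counting
# 3 flops per shader per cycle (MAD plus the disputed extra MUL).
shaders = 64              # stock 9600 GT (assumed spec)
shader_clock_ghz = 1.625  # stock shader clock (assumed spec)

marketing_gflops = shaders * 3 * shader_clock_ghz   # 312.0, matching the chart figure
boinc_reported_gflops = 41                          # "working" flops from the post above

print(f"marketing: {marketing_gflops:.0f} GFLOPS, BOINC: {boinc_reported_gflops} GFLOPS")
print(f"ratio: {marketing_gflops / boinc_reported_gflops:.1f}x")
```

The two numbers differ by roughly a factor of 7.6, which is the scale of the "marketing vs. working" discrepancy being discussed.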

Profile ignorance is no excuse
Avatar
Send message
Joined: 4 Oct 00
Posts: 9529
Credit: 44,433,274
RAC: 0
Korea, North
Message 939907 - Posted: 14 Oct 2009, 18:38:32 UTC - in response to Message 939905.

Agreed. The Collatz project uses optimized apps for both the ATI and CUDA cards; Crunch3r has done a great job on them. It appears that he's put a lot of effort into maximizing the ATI cards, though. My 4770 runs Collatz much faster than a CUDA 260 or 275. I imagine they are working on a better app for the CUDA cards as we speak.
____________
In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope

End terrorism by building a school

Profile Ageless
Avatar
Send message
Joined: 9 Jun 99
Posts: 12300
Credit: 2,598,508
RAC: 1,119
Netherlands
Message 939937 - Posted: 14 Oct 2009, 20:23:14 UTC - in response to Message 939894.

Funny, I asked Andreas (Gipsel/Cluster P.) about that the other day. His answer to me:

"The numbers for nvidia cards are based on some kind of characterization, or benchmarking if you want, with the SETI application, if I remember right.

The ATI numbers are the theoretical single-precision peak performance. And yes, in that sense they are correct. So it is basically the same as saying one CPU core running at 3 GHz is capable of 6 GFlops in single precision using the x87 FPU (that holds for more recent CPUs; a Pentium 3 is only capable of 1 flop per cycle), or 24 GFlops using SSE (only Phenom/Athlon II and Core 2/Core i7; for older ones it is 12 GFlops).

To sum it up, the nvidia and ATI numbers are not comparable right now. I guess the CUDA runtime also gives some information about the number of cores in the GPU, so it should be easy to get numbers more comparable to ATI's. The peak SP performance is simply: cores * 2 * clock frequency.

One could include the apocryphal "missing MUL" of the nvidia cards by using cores * 3 * frequency (for the GTX2xx line it has some credibility, but prior GPUs didn't have the capability, even when nvidia's marketing claimed otherwise); the next generation, however, will officially be back to 2 flops per core and cycle. In ATI's case one has to deduce the number of units in the GPU from the number of SIMD blocks and their size; maybe one has to do something similar for nvidia cards too. In the end, one should arrive, for instance for a GTX285 running stock clocks, at:

240 cores * 3 flops * 1.476 GHz = 1,063 GFlops

If one also wants to include double-precision numbers, these are 1/5 of the SP peak in the case of ATI, and 1/12 of the SP throughput for nvidia (it will be more with the next generation).

So a HD5870 has 2720 GFlops in single precision and 544 GFlops in double precision. The Milkyway app is actually capable of really using about 400 DP GFlops of it, so it is a real figure and not just made up by the marketing department. Such a card is a real monster for those puny WUs ;). A GTX285 has the already mentioned 1063 GFlops in SP and a measly 88.6 GFlops with DP (and can use roughly 50GFlops of it in Milkyway after an nvidia engineer helped the project a bit)."
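The arithmetic in the quote above can be checked with a quick sketch. The shader counts and stock clocks below are the commonly published specs for these cards (an assumption for the HD 5870: 1600 stream processors at 850 MHz, which reproduces the 2720 GFlops quoted):

```python
# Peak single-precision GFLOPS, per the formulas quoted above.
def peak_sp_gflops(cores: int, flops_per_cycle: int, clock_ghz: float) -> float:
    """Peak SP throughput: cores * flops per cycle * clock (GHz)."""
    return cores * flops_per_cycle * clock_ghz

# GTX 285: 240 shaders, 3 flops/cycle (counting the "missing MUL"), 1.476 GHz
gtx285_sp = peak_sp_gflops(240, 3, 1.476)   # ~1063 GFLOPS
gtx285_dp = gtx285_sp / 12                  # ~88.6 GFLOPS (DP is 1/12 of SP)

# HD 5870: 1600 stream processors, 2 flops/cycle, 0.85 GHz (assumed stock specs)
hd5870_sp = peak_sp_gflops(1600, 2, 0.85)   # 2720 GFLOPS
hd5870_dp = hd5870_sp / 5                   # 544 GFLOPS (DP is 1/5 of SP)

print(f"GTX 285: {gtx285_sp:.0f} SP / {gtx285_dp:.1f} DP GFLOPS")
print(f"HD 5870: {hd5870_sp:.0f} SP / {hd5870_dp:.0f} DP GFLOPS")
```

Both results match the figures in the quote, so the quoted numbers are internally consistent.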
____________
Jord

Fighting for the correct use of the apostrophe, together with Weird Al Yankovic

OzzFan
Volunteer tester
Avatar
Send message
Joined: 9 Apr 02
Posts: 13614
Credit: 30,360,707
RAC: 21,233
United States
Message 939992 - Posted: 14 Oct 2009, 23:37:27 UTC

Very interesting information. Thanks Jord!
____________

Profile ignorance is no excuse
Avatar
Send message
Joined: 4 Oct 00
Posts: 9529
Credit: 44,433,274
RAC: 0
Korea, North
Message 940157 - Posted: 15 Oct 2009, 14:37:00 UTC

Yes, surprising information, to say the least.
____________
In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope

End terrorism by building a school

Steve
Send message
Joined: 18 May 99
Posts: 94
Credit: 68,888
RAC: 0
United States
Message 940195 - Posted: 15 Oct 2009, 18:35:30 UTC

More info on the ATI 5870 internals, for those who understand the technical details of what may limit, or allow for, what. (src: beyond3d)

Best non-gaming review I've found so far, anyhow, though they do cover gaming slightly. I'm not in the "understand the intricate tech details" category, but it was useful to learn what marketing specs versus actual specs are, once sliced and diced.

madasczik
Avatar
Send message
Joined: 13 May 09
Posts: 12
Credit: 1,693,704
RAC: 0
United States
Message 940223 - Posted: 15 Oct 2009, 21:42:09 UTC

Cool, I was wondering why only NVIDIA cards were showing up in the computer stats; I felt left out with my dual ATI HD 4870 X2s. It's great to see that it's being worked on. It will be nice having more cores working for the cause... going to give the 6.10.13 build a try. I wonder if it will see all 4 GPUs, since the cards are running in quad CrossFire mode... Collatz looks very promising; going to give that a whirl too.

Profile ML1
Volunteer tester
Send message
Joined: 25 Nov 01
Posts: 8418
Credit: 4,134,797
RAC: 1,483
United Kingdom
Message 940226 - Posted: 15 Oct 2009, 21:54:01 UTC - in response to Message 940195.
Last modified: 15 Oct 2009, 21:54:31 UTC

More info on the ATI 5870 internals, for those who understand the technical details of what may limit, or allow for, what. (src: beyond3d)

Best non-gaming review I've found so far, anyhow, though they do cover gaming slightly. I'm not in the "understand the intricate tech details" category, but it was useful to learn what marketing specs versus actual specs are, once sliced and diced.

At 40nm, over 2 billion transistors, and 188W peak power for a single piece of silicon, that all adds up to an impressive feat of design.

The question there, though, is how well the various bottlenecks balance out. Also, how flexible is that architecture for performing more general OpenCL (CUDA-esque) operations?

One aspect I noticed is that ATI appear to have more of a dedicated pipeline architecture, whereas the nVidia architecture appears to be nearer to a more general-purpose, highly parallel array processor. Any GPU programmers able to comment on the pros/cons of programming them?


Happy fast crunchin',
Martin
____________
See new freedom: Mageia4
Linux Voice See & try out your OS Freedom!
The Future is what We make IT (GPLv3)

Profile Chirag Patel
Volunteer tester
Avatar
Send message
Joined: 13 Sep 05
Posts: 48
Credit: 8,165,270
RAC: 7,493
India
Message 941284 - Posted: 19 Oct 2009, 7:53:57 UTC

Doesn't 6.10.13 report nvidia's "marketing" FLOPS as well? The numbers in both cases (ATI and nVidia) are the peak single-precision float performance.
____________

Profile MarkJProject donor
Volunteer tester
Avatar
Send message
Joined: 17 Feb 08
Posts: 939
Credit: 24,083,884
RAC: 74,852
Australia
Message 941307 - Posted: 19 Oct 2009, 10:09:15 UTC - in response to Message 941284.
Last modified: 19 Oct 2009, 10:13:20 UTC

Doesn't 6.10.13 report nvidia's "marketing" FLOPS as well? The numbers in both cases (ATI and nVidia) are the peak single-precision float performance.


No. The ATI figure was the theoretical peak speed, while the nvidia figure was a BOINC estimate based upon the speed a "reference" card could achieve. Or, as others referred to them, marketing flops and BOINC flops. This is one reason why the ATI appears faster if you look at just the numbers given at BOINC startup.

It's been changed in 6.10.14. From the change log:

- client/scheduler: standardize the FLOPS estimate between NVIDIA and ATI. Make them both peak FLOPS, according to the formula supplied by the manufacturer.
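For what it's worth, the 174 GFLOPS figure from the opening post is consistent with that same peak formula. Assuming the card is an RV630 with 120 stream processors (an assumption; the exact HD 2600 variant isn't stated in the thread), the reported figure can be back-solved for the implied core clock:

```python
# Peak SP FLOPS = stream processors * 2 flops/cycle * clock (GHz).
# Back-solve the core clock implied by BOINC's reported 174 GFLOPS.
stream_processors = 120   # RV630 / HD 2600 series (assumed variant)
reported_gflops = 174.0   # from the opening post

implied_clock_ghz = reported_gflops / (stream_processors * 2)
print(f"implied core clock: {implied_clock_ghz * 1000:.0f} MHz")  # 725 MHz
```

A clock around 725 MHz sits between the stock HD 2600 Pro and XT clocks, which supports the thread's conclusion that BOINC was simply reporting the manufacturer's peak formula for ATI cards.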

____________
BOINC blog


Copyright © 2014 University of California