ATI - 6.10.13 - GFLOPS - How accurate?

BigWaveSurfer
Joined: 29 Nov 01
Posts: 186
Credit: 36,311,381
RAC: 141
United States
Message 939894 - Posted: 14 Oct 2009, 17:19:21 UTC

I installed 6.10.13 on my one system with an ATI card (yes, I know SETI does not have an app for ATI cards... yet), but I was surprised by the GFLOPS the card pulled. BOINC says the card (ATI Radeon HD 2600 1GB) is 174 GFLOPS. Is that accurate?! My OC'ed 9600GT is only 41 GFLOPS; that is a huge difference. Just curious, thanks!
ID: 939894
skildude
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 939895 - Posted: 14 Oct 2009, 17:23:19 UTC - in response to Message 939894.  

I was using my 2600XT 1GB on Collatz until it blew up a few weeks ago. It could run a Collatz WU in about 2 hours; my 4770 can do the same work in about 12 minutes using their optimized app 2.05b. It's free credits, so I wouldn't knock how slow that card is. Though I would upgrade the HSF if you intend to work Collatz or Milkyway on it.


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 939895
Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 939905 - Posted: 14 Oct 2009, 18:13:19 UTC - in response to Message 939894.  

I installed 6.10.13 on my one system with an ATI card (yes, I know SETI does not have an app for ATI cards... yet), but I was surprised by the GFLOPS the card pulled. BOINC says the card (ATI Radeon HD 2600 1GB) is 174 GFLOPS. Is that accurate?! My OC'ed 9600GT is only 41 GFLOPS; that is a huge difference. Just curious, thanks!

I think that this is probably yet another example of the difference between "marketing" flops and "working" (BOINC) flops.

According to that GPUGrid chart, a 9600GT is rated by NVidia at 312 GFlops - no allowance for the overclocking. Those are what I call "marketing flops".

Unfortunately, I suspect that BOINC v6.10.13 is reporting "marketing" flops for ATI cards, and "working" flops for NVidia cards.

This is unfair, and is going to cause confusion for a long time to come - until there is a project which can process the same work on either card, and where we have some degree of confidence that the two applications are compiled with the same degree of optimisation. I doubt that will happen until OpenCL compilers are available for both manufacturers, and a project develops an application in OpenCL that can be compiled for both cards from a common codebase. Only then will we have a true comparison.
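
To put a number on that gap, using only figures already quoted in this thread (312 "marketing" GFlops from the GPUGrid chart, 41 "working" GFlops reported by BOINC for the same 9600GT):

    312 / 41 ≈ 7.6

so the two measures disagree by a factor of more than seven for the very same card.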
ID: 939905
skildude
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 939907 - Posted: 14 Oct 2009, 18:38:32 UTC - in response to Message 939905.  

Agreed. The Collatz project uses optimized apps for both the ATI and CUDA cards; Crunch3r has done a great job on them. It appears that he's put a lot of effort into maximizing the ATI cards, though. My 4770 runs Collatz much faster than a CUDA 260 or 275. I imagine they are working on a better app for the CUDA cards as we speak.


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 939907
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 939937 - Posted: 14 Oct 2009, 20:23:14 UTC - in response to Message 939894.  

Funny, I asked Andreas (Gipsel/Cluster P.) about that the other day. His answer to me:

"The numbers for nvidia cards base on some kind of characterization or if you want benchmarking with the SETI application if I remember right.

The ATI numbers are the theoretical single precision peak performance. And yes, in that sense they are correct. So it is basically the same as saying one CPU core running at 3GHz is capable of 6 GFlops in single precision (using the x87 FPU and for a more recent CPUs, a Pentium3 is only capable of 1 flop per cycle) or 24 GFlops using SSE (only Phenom/AthlonII and Core2/Core i7, for older ones it is 12 GFlops).

To sum it up, the nvidia and ATI numbers are not comparable right now. I guess the CUDA runtime is also giving some information about the number of cores in the GPU. So it should be easy to get more comparable numbers to ATI. The peak SP performance is simply: cores * 2 * clock frequency

One could include the apocryphal "missing MUL" of the nvidia cards by using cores * 3 * frequency (and for the GTX2xx line it has some credibility, but prior GPUs didn't have the capability, even when nvidias marketing claimed otherwise), but the next generation will be officially back to 2 flops per core and cycle. In ATIs case one has to deduce the number of units in the GPU from the number of SIMD blocks and the size of those, maybe one has to do something similar for nvidia cards too. In the end, one should arrive for instance for a GTX285 running stock clocks at:

240 cores * 3 flops * 1,476 GHz = 1,063 GFlops

If one wants to include also double precision numbers, these are 1/5 of the SP peak in case of ATI and for nvidia 1/12 of the SP throughput (will be more with the next generation).

So a HD5870 has 2720 GFlops in single precision and 544 GFlops in double precision. The Milkyway app is actually capable of really using about 400 DP GFlops of it, so it is a real figure and not just made up by the marketing department. Such a card is a real monster for those puny WUs ;). A GTX285 has the already mentioned 1063 GFlops in SP and a measly 88.6 GFlops with DP (and can use roughly 50GFlops of it in Milkyway after an nvidia engineer helped the project a bit)."
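
For anyone who wants to play with that arithmetic, here is a minimal Python sketch of the peak-FLOPS formulas described above. The core counts and clocks are the published reference specs for these two cards (my assumptions, not values read out of BOINC):

    def peak_sp_gflops(cores, clock_ghz, flops_per_cycle):
        # Theoretical single-precision peak: cores * flops per core per cycle * clock.
        return cores * flops_per_cycle * clock_ghz

    # GTX 285: 240 cores at a 1.476 GHz shader clock, counting the "missing MUL" (3 flops).
    gtx285_sp = peak_sp_gflops(240, 1.476, 3)   # ~1063 GFlops
    gtx285_dp = gtx285_sp / 12.0                # NVidia DP is 1/12 of SP peak -> ~88.6

    # HD 5870: 1600 stream processors at 0.85 GHz, 2 flops (one MAD) per cycle.
    hd5870_sp = peak_sp_gflops(1600, 0.85, 2)   # 2720 GFlops
    hd5870_dp = hd5870_sp / 5.0                 # ATI DP is 1/5 of SP peak -> 544

    print("GTX 285: %.0f SP / %.1f DP GFlops" % (gtx285_sp, gtx285_dp))
    print("HD 5870: %.0f SP / %.0f DP GFlops" % (hd5870_sp, hd5870_dp))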
ID: 939937
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 939992 - Posted: 14 Oct 2009, 23:37:27 UTC

Very interesting information. Thanks Jord!
ID: 939992
skildude
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 940157 - Posted: 15 Oct 2009, 14:37:00 UTC

Yes, surprising information to say the least.


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 940157
Steve
Joined: 18 May 99
Posts: 94
Credit: 68,888
RAC: 0
United States
Message 940195 - Posted: 15 Oct 2009, 18:35:30 UTC

More info on the ATI 5870 internals, which may matter to those who understand the technical details of what may limit or allow what (src: Beyond3D).

It's the best non-gaming review I've found so far anyhow, though they do cover gaming slightly. I'm not in the "understand the intricate tech details" category, but it was useful to learn what marketing specs vs. actual specs are, once sliced and diced.
ID: 940195
madasczik
Joined: 13 May 09
Posts: 12
Credit: 1,693,704
RAC: 0
United States
Message 940223 - Posted: 15 Oct 2009, 21:42:09 UTC

Cool, I was wondering why only NVIDIA cards were showing up in the computer stats; I felt left out with my dual ATI HD 4870 X2s. It's great to see that it's being worked on. It will be nice having more cores working for the cause... going to give the 6.10.13 build a try. I wonder if it's going to see all 4 GPUs, since the cards are running in quad CrossFire mode... Collatz looks very promising; going to give that a whirl too.
ID: 940223
ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20147
Credit: 7,508,002
RAC: 20
United Kingdom
Message 940226 - Posted: 15 Oct 2009, 21:54:01 UTC - in response to Message 940195.  
Last modified: 15 Oct 2009, 21:54:31 UTC

More info on the ATI 5870 internals, which may matter to those who understand the technical details of what may limit or allow what (src: Beyond3D).

It's the best non-gaming review I've found so far anyhow, though they do cover gaming slightly. I'm not in the "understand the intricate tech details" category, but it was useful to learn what marketing specs vs. actual specs are, once sliced and diced.

At 40nm, over 2 billion transistors, and 188W peak power for a single piece of silicon, that all adds up to an impressive feat of design.

The question there, though, is how well the various bottlenecks balance out. Also, how flexible is that architecture for performing more general OpenCL (CUDA-esque) operations?

One aspect that I noticed is that ATI appears to have more of a dedicated pipeline architecture, whereas the nVidia architecture appears to be nearer to that of a more general-purpose, highly parallel array processor. Any GPU programmers able to comment on the pros/cons of programming them?


Happy fast crunchin',
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 940226
jenesuispasbavard
Volunteer tester
Joined: 13 Sep 05
Posts: 49
Credit: 12,385,974
RAC: 0
United States
Message 941284 - Posted: 19 Oct 2009, 7:53:57 UTC

Doesn't 6.10.13 report nvidia's "marketing" FLOPS as well? The numbers in both cases (ATI and nVidia) are the peak single-precision float performance.
ID: 941284
MarkJ
Volunteer tester
Joined: 17 Feb 08
Posts: 1139
Credit: 80,854,192
RAC: 5
Australia
Message 941307 - Posted: 19 Oct 2009, 10:09:15 UTC - in response to Message 941284.  
Last modified: 19 Oct 2009, 10:13:20 UTC

Doesn't 6.10.13 report nvidia's "marketing" FLOPS as well? The numbers in both cases (ATI and nVidia) are the peak single-precision float performance.


No. The ATI figure was the theoretical peak speed, while the NVidia figure came from BOINC, based upon the speed the "reference" card could do; or, as others referred to them, marketing flops and BOINC flops. This is one reason why the ATI card appears to be faster if you look at just the numbers given at BOINC startup.

It's been changed in 6.10.14. From the change log...

- client/scheduler: standardize the FLOPS estimate between NVIDIA and ATI. Make them both peak FLOPS, according to the formula supplied by the manufacturer.
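
As a rough illustration of that standardization (this is a sketch, not BOINC's actual code), here are the manufacturers' peak formulas applied to the two cards from the opening post. The reference specs (64 cores at a 1.625 GHz shader clock for the 9600GT, 120 stream processors at 0.8 GHz for an HD 2600 XT) are my assumptions:

    def nvidia_peak_gflops(cores, shader_clock_ghz):
        # NVidia counted 3 flops per core per cycle (MAD plus the
        # "missing MUL") for this generation of cards.
        return cores * 3 * shader_clock_ghz

    def ati_peak_gflops(stream_processors, clock_ghz):
        # ATI counts 2 flops (one MAD) per stream processor per cycle.
        return stream_processors * 2 * clock_ghz

    print(nvidia_peak_gflops(64, 1.625))  # 9600GT reference: 312.0 GFlops
    print(ati_peak_gflops(120, 0.8))      # HD 2600 XT reference: 192.0 GFlops

The 312 matches the GPUGrid figure Richard quoted earlier, and the 192 is at least in the same ballpark as the ~174 GFlops BOINC reported for the OP's HD 2600, whose exact clock isn't given in the thread.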

BOINC blog
ID: 941307
