amd gflops: theoretical vs. "real world"

merle van osdol Joined: 23 Oct 02 Posts: 809 Credit: 1,980,117 RAC: 0	Message 1738822 - Posted: 1 Nov 2015, 16:35:24 UTC Last modified: 1 Nov 2015, 16:38:26 UTC Wikipedia has a page for the AMD R9 200 series cards (it includes the R7). In there you will find a comparison of each card's speed according to single and double precision gflops (theoretical). I remember someone last year, perhaps Hal, saying that there is a big difference between theoretical and actual gflops. Where do I find the actual gflops detailed? I mean something like a nice table that lets you make an intelligent decision on what card to get. I suppose it also depends on the mfg., e.g. Sapphire, Gigabyte, etc. Is there such a place? I usually buy Sapphire, but I have one machine (bought used) with a Gigabyte card, and I try to keep each machine all from the same mfg. When I first started crunching I had been buying XFX. The reason I ask is that I have an R7 265 that is just as fast as my R9 270X. I don't recall the mfg. Thanks merle - vote yes for freedom of speech

ML1 Volunteer moderator Volunteer tester Joined: 25 Nov 01 Posts: 20289 Credit: 7,508,002 RAC: 20	Message 1738837 - Posted: 1 Nov 2015, 17:55:59 UTC - in response to Message 1738822. That depends on what applications you wish to compare for that hardware. For example for s@h, look at what RAC others get for your card. For games, compare the games benchmarks (anything above 50fps should be more than fast enough). And if you take a look at some of the primegrid RACs, that is likely as close as you can get to theoretical 100% performance utilisation. Let us know what you find! Happy fast crunchin Martin See new freedom: Mageia Linux Take a look for yourself: Linux Format The Future is what We all make IT (GPLv3)

jason_gee Volunteer developer Volunteer tester Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1738844 - Posted: 1 Nov 2015, 18:17:20 UTC - in response to Message 1738822. Last modified: 1 Nov 2015, 18:19:36 UTC Hi Merle, Yeah, the theoretical peak GFlops ratings apply directly only in pretty controlled laboratory conditions that don't really match practical code. For most of the computation on this project, with the tasks being mostly single precision and, on the biggest/fastest GPUs, somewhat memory bound in large parts of the computation, a reasonable starting ballpark guesstimate is around 5% (1/20th) of the theoretical peak. For a better figure, you can look inside a task file at the wu_rsc_fpops_est figure (or similar, I forget the exact tag name at the moment) and divide that by the elapsed time. It's not a perfect figure for a number of reasons, mainly that the estimate is generic and doesn't specifically cater to the way a given GPU application might process differently than another, and some CPU time will be contaminating the reading, but it should reflect actual performance consistently enough to extract a more precise figure than the ballpark 5% starting figure. There will also be minor variation depending on the data content and running conditions, so gathering enough runs of different tasks to build a picture of average and variance might be a better method than single runs. In general you should see elapsed times of the least polluted runs approach an imaginary best possible time, which you can use for best case comparisons. I have some designs put away for a simpler GUI-based tool and online database submission for generic and specific tests, though that's on the shelf until after a major Cuda multibeam update, and until work lets up a bit. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs."
-- Algorithms to live by: The computer science of human decisions.
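
Editor's sketch: the estimate jason_gee describes (a task's floating-point operation estimate divided by elapsed time, gathered over several runs) might look roughly like the Python below. This is only an illustration under stated assumptions. The function names, the operation estimate, the run times, and the peak rating are all made up for the example; none of them come from a real task file or a real card, and the exact tag name in the task file is left open, as it is in the post.

```python
from statistics import mean, pstdev


def effective_gflops(fpops_est, elapsed_seconds):
    """Estimated floating-point operations divided by elapsed time, in GFLOPS."""
    return fpops_est / elapsed_seconds / 1e9


def summarize_runs(fpops_est, elapsed_times, peak_gflops):
    """Average, spread, and best case over several runs of comparable tasks.

    fpops_est     -- the task's operation estimate (hypothetical value below)
    elapsed_times -- elapsed (wall clock) seconds for each run
    peak_gflops   -- the card's theoretical peak rating
    """
    rates = [effective_gflops(fpops_est, t) for t in elapsed_times]
    # The shortest run stands in for the "least polluted" best-case time.
    best = effective_gflops(fpops_est, min(elapsed_times))
    return {
        "mean_gflops": mean(rates),
        "stdev_gflops": pstdev(rates),
        "best_gflops": best,
        "best_fraction_of_peak": best / peak_gflops,
    }


# All numbers here are invented purely to exercise the functions.
summary = summarize_runs(
    fpops_est=3.0e13,
    elapsed_times=[600.0, 650.0, 700.0],
    peak_gflops=1000.0,
)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

As the post cautions, the operation estimate is generic and some CPU time leaks into the elapsed time, so the result is better read as a relative comparison between cards than as an exact measurement.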

Graham Middleton Joined: 1 Sep 00 Posts: 1520 Credit: 86,815,638 RAC: 0	Message 1738865 - Posted: 1 Nov 2015, 19:25:47 UTC I always view GFLOPS, FLOPS, etc as extensions of the old MIPS acronym for Meaningless Indicator of Processor Speed. :-) Happy Crunching, Graham

merle van osdol Joined: 23 Oct 02 Posts: 809 Credit: 1,980,117 RAC: 0	Message 1739703 - Posted: 4 Nov 2015, 20:56:48 UTC - in response to Message 1738837. Under tasks it says the device peak gflops for devices 250x and 270x are both the same, at 211.73. Makes no sense at all?

ML1 Volunteer moderator Volunteer tester Joined: 25 Nov 01 Posts: 20289 Credit: 7,508,002 RAC: 20	Message 1739769 - Posted: 5 Nov 2015, 0:50:06 UTC - in response to Message 1739703. I'm sure others (such as Jason or Ageless) can comment better than me for Boinc... ;-) If you have two GPU cards in the same host, then Boinc reports only the Boinc number for the first card found. All your other GPUs are assumed to be the same. For a mixed bunch, that could give some confused scheduling numbers! Happy fast crunchin Martin

jason_gee Volunteer developer Volunteer tester Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1739843 - Posted: 5 Nov 2015, 8:35:24 UTC - in response to Message 1739769. Last modified: 5 Nov 2015, 8:39:18 UTC There's that, and I also see some calls on the Boinc developer mailing lists about computing estimated peak flops on generic OpenCL devices that don't support Cuda (which has a 'relatively' consistent API). I don't have a response for them yet, but do have some ideas how it might be achieved (good enough for government work at least). Will have to sort through those ideas down the road and check if they already came up with something workable.

merle van osdol Joined: 23 Oct 02 Posts: 809 Credit: 1,980,117 RAC: 0	Message 1739879 - Posted: 5 Nov 2015, 12:29:02 UTC - in response to Message 1739843. One more reason for me to switch to Nvidia, I guess.

ChrisD Volunteer tester Joined: 25 Sep 99 Posts: 158 Credit: 2,496,342 RAC: 0	Message 1740883 - Posted: 9 Nov 2015, 17:36:46 UTC Last modified: 9 Nov 2015, 17:53:08 UTC Have you tried to locate a machine with just one graphics card, a 270X for example, and checked how fast that machine crunches MB WUs? One of my crunchers has an HD7970 graphics card, and this card turns out MB WUs in 4+ or 9 mins approximately. How many GFlops? I am not sure, but by comparing the throughput of different cards, some kind of performance table should be constructable. Btw, why do you want to buy an NVidia card? What is wrong with your ATI?? :) ChrisD edit: just checked the user next to me in the list. He uses an NVidia card. WU 4507948519 was processed in 23 minutes and was awarded 93 credits. My Tahiti cruncher processed WU 4508938169 in 8 mins 40 secs and was awarded 84 credits. My Hawaii equipped cruncher processed WU 4509138835 in 7 mins 45 secs and was awarded 99 credits. You do the math ;)
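
Editor's sketch: "doing the math" on the three results ChrisD quotes could be as simple as converting each one to credits per hour of run time. The Python below does only that. The function name is made up for the example, the credits and times are the ones given in the post, and one task per card is far too small a sample for a real comparison, since credit per task varies from one work unit to the next.

```python
def credits_per_hour(credits, minutes, seconds=0):
    """Credits awarded per hour of run time for a single task."""
    elapsed_seconds = minutes * 60 + seconds
    return credits * 3600 / elapsed_seconds


# (credits, minutes, seconds) exactly as quoted in the post above.
runs = {
    "NVidia card, WU 4507948519": (93, 23, 0),
    "Tahiti, WU 4508938169": (84, 8, 40),
    "Hawaii, WU 4509138835": (99, 7, 45),
}

for name, (credits, minutes, seconds) in runs.items():
    rate = credits_per_hour(credits, minutes, seconds)
    print(f"{name}: {rate:.0f} credits/hour")
```

On those three tasks this comes to roughly 243, 582, and 766 credits per hour respectively.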

catavalon21 Joined: 2 Nov 01 Posts: 13 Credit: 7,238,152 RAC: 48	Message 1746385 - Posted: 2 Dec 2015, 23:09:37 UTC - in response to Message 1740883. I guess I am surprised at how SETI credits do seem to be less than many other projects, though I understand it depends on what processing actually is taking place. I have a stock HD7850, and on some (likely integer only) apps it gets more than one unit of credit per second (greater than 3600 per hour of run time). For S@H it is significantly less. For this card running S@H V7, recent WUs have taken roughly 698 seconds to complete (run time, not CPU time), and grant 45 or so units of credit. For a GPU that is running at 90% saturation (with practically nothing else running, THOUGH it's in an older Core 2 Duo 6550 @ 2.33 GHz box), it just must not be doing something that SETI likes. On Milkyway@home, which supposedly likes double precision, the same box generates points FAR better (and more points per minute of run time) than the GTX 760 in my wife's box with an E8500. Still stuck on SETI, but ...

betreger Joined: 29 Jun 99 Posts: 11361 Credit: 29,581,041 RAC: 66	Message 1746405 - Posted: 3 Dec 2015, 1:21:00 UTC - in response to Message 1746385. "I guess I am surprised at how SETI credits do seem to be less than many other projects, though I understand it depends on what processing actually is taking place." Seti uses CreditNew; most other projects just award credit per task, and many of them are inflated. If you want max credit, Seti is not the place to be.
