GPU FLOPS: Theory vs Reality

Wiggo
Joined: 24 Jan 00
Posts: 28459
Credit: 261,360,520
RAC: 489
Australia
Message 2021871 - Posted: 5 Dec 2019, 22:47:49 UTC - in response to Message 2021859.  

Hi,

Reality is now that the top 200 are almost all Linux machines running with the special app. GPU FLOPS are of no interest as long as they are wasted.

If someone is willing to run a curiosity check on those machines and make a rough estimate of performance and wattage I'd be pleased.
The wattage can be taken from the manufacturer data or from a questionnaire from actual users running "nvidia-smi -l"

That would guide the Santa to deliver the right present to the best behaving children and adults who do not like to obey M$ or other rules.


Petri
You've done a great job there, Petri. (Y)

Well, when one considers that my 2 little old i5 rigs with dual GTX 1060s are currently in 100th and 102nd spots by RAC (they were in the top 70 to start with, but they're slowly being pushed down by other Linux rigs with better hardware) and that there are only 3 rigs above me running Windows (and they're 40+ core rigs with even more expensive/powerful GPUs), then that must say something. ;-)

Cheers.
ID: 2021871
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2021887 - Posted: 6 Dec 2019, 1:05:49 UTC

only 5 non-special app hosts in the top 100, the fastest being RueiKe's 64-core Epyc + Vega VII system on Linux

the fastest Windows system is Keldon's 4x Quadro P4000 system on Win 10

like 35-40 non-special app hosts in the top 200.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2021887
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13399
Credit: 208,696,464
RAC: 304
Australia
Message 2021913 - Posted: 6 Dec 2019, 6:10:00 UTC - in response to Message 2021859.  

That would guide the Santa to deliver the right present to the best behaving children and adults who do not like to obey M$ or other rules.
Or better yet port your efforts over to Windows to make more use of all that hardware.
Grant
Darwin NT
ID: 2021913
Wiggo
Joined: 24 Jan 00
Posts: 28459
Credit: 261,360,520
RAC: 489
Australia
Message 2021921 - Posted: 6 Dec 2019, 8:57:22 UTC - in response to Message 2021913.  
Last modified: 6 Dec 2019, 8:57:42 UTC

That would guide the Santa to deliver the right present to the best behaving children and adults who do not like to obey M$ or other rules.
Or better yet port your efforts over to Windows to make more use of all that hardware.
Or maybe we should get a fundraiser together to hire a Nvidia Windows Cuda expert to work on it against Petri's Linux efforts. ;-)

Cheers.
ID: 2021921
juan BFP
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2021930 - Posted: 6 Dec 2019, 11:47:01 UTC - in response to Message 2021921.  

That would guide the Santa to deliver the right present to the best behaving children and adults who do not like to obey M$ or other rules.
Or better yet port your efforts over to Windows to make more use of all that hardware.
Or maybe we should get a fundraiser together to hire a Nvidia Windows Cuda expert to work on it against Petri's Linux efforts. ;-)

That port could be done, but AFAIK the optimization will not be as high as with Linux due to the way Windows works.
Anyway the original code is available if anyone wants to play with it and try this porting.
ID: 2021930
Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 2021932 - Posted: 6 Dec 2019, 12:23:00 UTC

Someone who was very disappointed with a GTX 1660 Super offered it for $110 buy-it-now on eBay..... it was Friday, a payday. Crunch....

Tom
A proud member of the OFA (Old Farts Association).
ID: 2021932
Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 2021934 - Posted: 6 Dec 2019, 13:33:17 UTC - in response to Message 2018703.  

It looks as if the RTX 2070 Super hits the sweet spot of very high production and is more efficient than any other RTX card?
A proud member of the OFA (Old Farts Association).
ID: 2021934
Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14532
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2021935 - Posted: 6 Dec 2019, 13:40:13 UTC - in response to Message 2021859.  

If someone is willing to run a curiosity check on those machines and make a rough estimate of performance and wattage I'd be pleased.
The wattage can be taken from the manufacturer data or from a questionnaire from actual users running "nvidia-smi -l"
I haven't done that, but I'll toss in some direct wall-power readings (Kill A Watt) from host 8747061:

Idle (fans, OS, single SSD): 42 watts
CPU (4 integer BOINC tasks): 104 watts
Full (2x GTX 1660 Ti, sauce v10.1): 365 watts peak

So I make the draw for GPU crunching, including fans working a lot harder and -nobs CPU support, to be about 260 watts the pair, or 130 watts each. SETI RAC (GPUs only) is currently 141,032, current position #56.
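
The subtraction above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names are made up for the example, and it assumes the CPU-only reading is a fair baseline to subtract from the full-load reading, so the result still includes the extra fan and CPU-support draw mentioned above.

```python
# Wall-power readings from host 8747061 (watts), as quoted above.
CPU_ONLY_W = 104     # 4 integer BOINC tasks, no GPU work
FULL_LOAD_W = 365    # peak with both GTX 1660 Ti cards crunching
GPU_COUNT = 2
GPU_RAC = 141_032    # SETI RAC, GPUs only


def gpu_draw(full_load_w, cpu_only_w):
    """Watts attributable to GPU crunching: full load minus the CPU-only baseline."""
    return full_load_w - cpu_only_w


def rac_per_watt(rac, watts):
    """Crude efficiency figure (hypothetical metric): RAC per wall watt."""
    return rac / watts


total_w = gpu_draw(FULL_LOAD_W, CPU_ONLY_W)
per_card_w = total_w / GPU_COUNT
print(f"GPU draw: {total_w} W total, {per_card_w:.1f} W per card")
print(f"Efficiency: {rac_per_watt(GPU_RAC, total_w):.0f} RAC per watt")
```

RAC per wall watt is just one possible yardstick; it drifts with RAC itself, so it is only good for rough comparisons between hosts measured the same way.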
ID: 2021935
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13399
Credit: 208,696,464
RAC: 304
Australia
Message 2021968 - Posted: 6 Dec 2019, 21:52:43 UTC - in response to Message 2021934.  
Last modified: 6 Dec 2019, 22:06:23 UTC

It looks as if the RTX 2070 Super hits the sweet spot of very high production and is more efficient than any other RTX card?
Nope- look at Shaggie's graph; that would be the RTX 2060 Super.
The 2070 Super puts out slightly more work (2nd place), but uses a bit more than slightly more power, so its efficiency isn't as good (7th place). I expect it would be the same for the RTX 2080 Super.
The RTX 2060 Super is in 4th place for output (not much separates 2nd to 5th), but it's in 2nd place for efficiency (not much separates 2nd to 4th).

The GTX 1660Ti is in 1st place for efficiency, by a good margin. But it's in 9th place for output, and quite a bit behind the others- I would expect the GTX 1660 Super will displace this card for performance & efficiency.


That makes the RTX 2060 Super the current performance/efficiency king; only just behind the GTX 1660Ti for efficiency but miles ahead of it for output.
Grant
Darwin NT
ID: 2021968
petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 2021981 - Posted: 7 Dec 2019, 0:20:14 UTC - in response to Message 2021930.  

That would guide the Santa to deliver the right present to the best behaving children and adults who do not like to obey M$ or other rules.
Or better yet port your efforts over to Windows to make more use of all that hardware.
Or maybe we should get a fundraiser together to hire a Nvidia Windows Cuda expert to work on it against Petri's Linux efforts. ;-)

That port could be done, but AFAIK the optimization will not be as high as with Linux due to the way Windows works.
Anyway the original code is available if anyone wants to play with it and try this porting.


Hi,

My guess is that you'd get near the same speed-up on Windows too or even ...

a) Autopulse would use half the time it uses now.
b) Pulsefind would benefit greatly.
c) Doing a lot of the signal finding asynchronously would cut a lot of processing time and release the stress from the CPU.
(
d) Keeping or re-engineering Raistmer's great work with SOG might make the Windows version even faster than the current Linux one.
e) Some, if not all, optimizations could be implemented on OpenCL to get a united codebase for 3rd party GPUs too.
)

I highly encourage someone with a big heart, a lot of time, some enthusiasm and knowledge of Windows platform development (and some courage) to step forward. I'll help and explain the mysteries of some of my optimizations when I can.

Petri.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 2021981
Jimbocous
Volunteer tester
Joined: 1 Apr 13
Posts: 1849
Credit: 268,616,081
RAC: 1,349
United States
Message 2021982 - Posted: 7 Dec 2019, 0:23:42 UTC - in response to Message 2021981.  

My guess is that you'd get near the same speed-up on Windows too or even ....
That would be a pretty amazing contribution to the cause. Wish I had the skills ...
ID: 2021982
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13399
Credit: 208,696,464
RAC: 304
Australia
Message 2021987 - Posted: 7 Dec 2019, 0:47:27 UTC - in response to Message 2021981.  

e) Some, if not all, optimizations could be implemented on OpenCL to get a united codebase for 3rd party GPUs too.
Unfortunately support for OpenCL seems to be getting less & less.
Nvidia's current Windows drivers have issues with previously working OpenCL code, and they haven't developed support for anything later than v1.2 (at least nothing mainstream for general release).
AMD's current drivers for all their current Navi hardware have issues supporting OpenCL on multiple platforms.
And Apple is dropping support for it completely.
Grant
Darwin NT
ID: 2021987
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2021990 - Posted: 7 Dec 2019, 1:01:45 UTC - in response to Message 2021968.  

It looks as if the RTX 2070 Super hits the sweet spot of very high production and is more efficient than any other RTX card?
Nope- look at Shaggie's graph; that would be the RTX 2060 Super.
The 2070 Super puts out slightly more work (2nd place), but uses a bit more than slightly more power, so its efficiency isn't as good (7th place). I expect it would be the same for the RTX 2080 Super.
The RTX 2060 Super is in 4th place for output (not much separates 2nd to 5th), but it's in 2nd place for efficiency (not much separates 2nd to 4th).

The GTX 1660Ti is in 1st place for efficiency, by a good margin. But it's in 9th place for output, and quite a bit behind the others- I would expect the GTX 1660 Super will displace this card for performance & efficiency.


That makes the RTX 2060 Super the current performance/efficiency king; only just behind the GTX 1660Ti for efficiency but miles ahead of it for output.


I think some of the efficiency scoring is flawed. there's a huge variance user to user due to the different settings that can be used. also he's using spec TDP values as his estimate for power consumption, which is totally flawed in itself.

looking at the performance and ACTUAL power draw measured from the wall for many different Turing cards (1660, 2070, 2080, 2080ti), the performance per watt is nearly the same for all cards. The Super cards are the same architecture and don't have any efficiency enhancements.

honestly I think the Titan V that petri has is likely to be the most power efficient card out there. looking at his run times, it's around the same crunching speed as a 2080ti, but only uses about 150W. that's incredible.

I think the only Turing card that would challenge that would be the Tesla T4. 70W TDP, and the same number of SMs as a 2070 Super (with slower core and memory clock speeds). Too bad it's elusive and insanely expensive. There is one user running 2 of them here, but only running the stock apps with no optimization and heavily bottlenecked (probably leaves boinc compute at 100% CPU use). Online reviews put performance around the 2070, but it's hard to be sure. I'd love to see that thing on the special app.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2021990
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13399
Credit: 208,696,464
RAC: 304
Australia
Message 2021991 - Posted: 7 Dec 2019, 1:15:59 UTC - in response to Message 2021990.  

I think some of the efficiency scoring is flawed. there's a huge variance user to user due to the different settings that can be used.
Which shows up in the graphs by the difference in widths of the plot, the lowest to highest value.


also he's using spec TDP values as his estimate for power consumption, which is totally flawed in itself.
But it is the same for all of the cards, so it is an accurate and valid representation of the performance of the card for its rated power usage, and so makes the charts a valid & useful indicator of performance & comparison between different hardware.


looking at the performance and ACTUAL power draw measured from the wall for many different Turing cards (1660, 2070, 2080, 2080ti), the performance per watt is nearly the same for all cards. The Super cards are the same architecture and don't have any efficiency enhancements.
And if you look at the position & widths of the plots, and take a rough visual average of them you can see that in the graphs, and also see that the output from increased number of compute units doesn't scale perfectly linearly, and there are differences in performance due to differing clock & memory speeds, as well as differing memory bandwidths between certain models of card.



No, it's not perfect. But it is representative of actual performance & power usage and a damned useful tool.
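
To put a rough number on the TDP-versus-wall question, here is a small sketch for the one card in this thread that has both figures. It assumes NVIDIA's 120 W spec power for the GTX 1660 Ti and reuses the wall readings from host 8747061 (365 W full load, 104 W CPU-only, GPU-only RAC of 141,032 across two cards). RAC stands in for output here, which is not necessarily the metric behind Shaggie's graphs, and the function name is made up for the example.

```python
# Compare efficiency computed from spec TDP with efficiency from measured wall power.
SPEC_TDP_W = 120                    # GTX 1660 Ti spec power (assumption, from NVIDIA's spec sheet)
WALL_W_PER_CARD = (365 - 104) / 2   # host 8747061 readings: 130.5 W per card, fans etc. included
RAC_PER_CARD = 141_032 / 2          # GPU-only RAC split evenly over the two cards


def efficiency(output, watts):
    """Output per watt; 'output' is RAC here, as a stand-in metric."""
    return output / watts


tdp_eff = efficiency(RAC_PER_CARD, SPEC_TDP_W)
wall_eff = efficiency(RAC_PER_CARD, WALL_W_PER_CARD)
overstatement = tdp_eff / wall_eff - 1   # how far the TDP-based figure sits above the wall-based one

print(f"TDP-based:  {tdp_eff:.0f} RAC/W")
print(f"Wall-based: {wall_eff:.0f} RAC/W")
print(f"TDP-based figure is higher by {overstatement:.1%}")
```

If every card's real draw differed from its TDP by the same proportion, the ranking between cards would not change; the ordering only shifts when that ratio differs from model to model, and that would take wall measurements for each card to check.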
Grant
Darwin NT
ID: 2021991
Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 2022096 - Posted: 7 Dec 2019, 13:21:00 UTC - in response to Message 2021982.  

My guess is that you'd get near the same speed-up on Windows too or even ....
That would be a pretty amazing contribution to the cause. Wish I had the skills ...


It's never too late to learn something new. Of course this would be a pretty big "first project" for a newbie software developer.... ;)

Tom
A proud member of the OFA (Old Farts Association).
ID: 2022096
Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 2022097 - Posted: 7 Dec 2019, 13:24:51 UTC - in response to Message 2021968.  

It looks as if the RTX 2070 Super hits the sweet spot of very high production and is more efficient than any other RTX card?
Nope- look at Shaggie's graph; that would be the RTX 2060 Super.


And on eBay at least it costs ~$150 less than the 2070 Super!

Hmmm..... If I wasn't trying to live within a power budget enforced by my limited income and my UPS I might be tempted.

Tom
A proud member of the OFA (Old Farts Association).
ID: 2022097
Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 2022396 - Posted: 8 Dec 2019, 20:55:18 UTC
Last modified: 8 Dec 2019, 20:56:37 UTC

As part of a "two for one" test I have been running my largest/fastest/top performing box with just a GTX 1660 Super so I can more easily pick out the GPU results. I do understand this is not a "fair/random" task test, so don't assume I am presuming that.

The slowest task so far has been 2 minutes and 3 seconds. Most of the tasks are just a bit faster than that. You can tell I am running 1 GPU because the task reports only one. Haven't gotten up to a full page of results yet.
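
As a back-of-the-envelope check, a worst-case task time converts to a daily task count like this. It is only a sketch: it assumes one task at a time on the GPU with no gaps between tasks, which a real host won't quite manage, and the function name is made up for the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tasks_per_day(minutes, seconds):
    """Tasks completed per day if every task took exactly this long, back to back."""
    task_seconds = minutes * 60 + seconds
    return SECONDS_PER_DAY / task_seconds


# Slowest task observed so far: 2 minutes 3 seconds.
slowest = tasks_per_day(2, 3)
print(f"At 2 min 3 s per task: about {slowest:.0f} tasks per day")
```

Since most tasks finish a bit faster than the slowest one, this works as a floor rather than an average.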

The other test was on the Mining Rig I just bought. Results are in the Bitcoin/Mining thread.

Tom
A proud member of the OFA (Old Farts Association).
ID: 2022396
Ragamuffin
Joined: 2 Apr 05
Posts: 6
Credit: 154,983
RAC: 0
Sweden
Message 2022490 - Posted: 9 Dec 2019, 16:28:57 UTC - in response to Message 2018712.  

I think you should keep the old ones for 2 reasons.

First of all it is interesting to see the development over time and secondly, maybe some people still use them and get a reason to update?

Keep up the good work!

Best regards

Roland in Sweden
ID: 2022490
wujj123456
Joined: 5 Sep 04
Posts: 40
Credit: 20,877,975
RAC: 219
China
Message 2022590 - Posted: 10 Dec 2019, 3:26:19 UTC - in response to Message 2022490.  

I think you should keep the old ones for 2 reasons.

First of all it is interesting to see the development over time and secondly, maybe some people still use them and get a reason to update?

Keep up the good work!

Best regards

Roland in Sweden

I feel that people who care enough to check the graph would likely have realized that the power efficiency of the old cards is so bad now that upgrading is probably cheaper than burning that electricity. It's interesting to see how far we've come, though, so maybe just keep one or two representative/popular cards for the older generations, not all of the models?
ID: 2022590
Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 2023613 - Posted: 18 Dec 2019, 22:25:45 UTC - in response to Message 2021932.  

Someone who was very disappointed with a GTX 1660 Super offered it for $110 buy-it-now on eBay..... it was Friday, a payday. Crunch....

Tom


It wasn't a GTX 1660 Super. Just a GTX 1660, but I just confirmed it does boot. After I take a look at Shaggie's list I will cancel my request for a refund (probably).

Tom
A proud member of the OFA (Old Farts Association).
ID: 2023613
 
©2022 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.