GPU FLOPS: Theory vs Reality

juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1970175 - Posted: 13 Dec 2018, 20:11:06 UTC

@Shaggie76

Nice work. We have a new winner in performance per watt, as expected, at least with OpenCL.
But one question is unanswered: what happens if the same analysis is done for the hosts that run the Linux special builds?
Most of the top SETI hosts are actually running these CUDA "special sauce" builds.
ID: 1970175 · Report as offensive
Profile Shaggie76
Avatar

Send message
Joined: 9 Oct 09
Posts: 282
Credit: 271,858,118
RAC: 196
Canada
Message 1970196 - Posted: 13 Dec 2018, 22:18:46 UTC - in response to Message 1970155.  

1. I feel like there is a lot of background behind how you compile this data. For example, are the credit/hour and credit/watt-hour figures calculated all-time, or within a timeframe? Do you have a running list of notes somewhere?
It's all open source on GitHub. The credit/hour is based on data crawled from the SETI task web pages, and the power usage is a coarse estimate based on the published TDP for stock cards (I took the data from the Wikipedia tables).
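Roughly speaking, the per-card arithmetic is just this (a sketch with made-up numbers, not the script's actual code or output):

    # Illustrative only: credit/hour and credit/watt-hour for one card.
    tasks = [
        {"credit": 85.2, "run_seconds": 1290.0},   # one completed GPU task
        {"credit": 120.7, "run_seconds": 1834.0},  # another task from the same card
    ]
    tdp_watts = 185.0  # published TDP for the card, from the Wikipedia tables

    total_credit = sum(t["credit"] for t in tasks)
    total_hours = sum(t["run_seconds"] for t in tasks) / 3600.0

    credit_per_hour = total_credit / total_hours
    credit_per_watt_hour = credit_per_hour / tdp_watts

    print(f"{credit_per_hour:.1f} credit/hr, {credit_per_watt_hour:.3f} credit/watt-hour")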

2. I think I browsed one of your earlier posts, and a different NVIDIA card (the 970?) at one point had a higher credit/watt-hour rating. I'm curious what changed there.
It depends on the mix of short vs. long work units at the time of the scan -- Arecibo work tends to be much faster, but as we shift to more Green Bank data the performance characteristics change.

3. Do we know what CPU was used in tandem with the GPU credits? I know the CPU plays a minor role in the crunching of the WU, but I wonder if there is a significant difference between one CPU and another.
I'm not tracking that data, but the core script tracks the CPU used by the GPU work units if you run it manually. If I recall correctly, AMD OpenCL and NVIDIA CUDA builds use less CPU than NVIDIA OpenCL builds, but this may not matter so much with hyperthreading: even though the GPU task's CPU side is busy polling, another thread can get work done on the same core because the ALU/FPU units aren't very busy.

Personally I prefer 100% GPU where possible because the work/watt is more favorable (and power is the limiting factor for me).
ID: 1970196 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1970204 - Posted: 13 Dec 2018, 22:53:07 UTC

Hi Shaggie76, I just looked through the github repository and believe there is no reason why you couldn't scan for anonymous hosts running the special app like Juan is always requesting. I see the -anon parameter option listed along with referencing a unique host ID.

If you just passed a known anonymous host ID running the special app through to aggregate.pl that would be a proof of concept I would think. Do I have the correct grasp on the program?
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1970204 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13835
Credit: 208,696,464
RAC: 304
Australia
Message 1970265 - Posted: 14 Dec 2018, 6:05:03 UTC - in response to Message 1970134.  

At last I have data:

Looks like I might get myself an RTX 2070 for my birthday in a few months.
GTX 1080 Ti (or better) crunching performance, at a maximum of 185W v 250W.

Interesting to see that the GTX 750Ti is still in the top 10 for efficiency.
Grant
Darwin NT
ID: 1970265 · Report as offensive
Profile Shaggie76
Avatar

Send message
Joined: 9 Oct 09
Posts: 282
Credit: 271,858,118
RAC: 196
Canada
Message 1970295 - Posted: 14 Dec 2018, 13:39:54 UTC - in response to Message 1970204.  
Last modified: 14 Dec 2018, 13:40:23 UTC

Hi Shaggie76, I just looked through the github repository and believe there is no reason why you couldn't scan for anonymous hosts running the special app like Juan is always requesting. I see the -anon parameter option listed along with referencing a unique host ID.

If you just passed a known anonymous host ID running the special app through to aggregate.pl that would be a proof of concept I would think. Do I have the correct grasp on the program?
Sure, you can do whatever you want, but like I've said before I am not interested in ranking this app until it's official (and when it is it'll just show up in my charts as the CUDA app like some of the vintage NVIDIA cards already do).

If it doesn't produce the same results as the official build then using it is prioritizing "winning internet points" over "doing science" and I think that's losing sight of the real purpose of this project.
ID: 1970295 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1970335 - Posted: 14 Dec 2018, 17:54:39 UTC - in response to Message 1970295.  

Hi Shaggie76, I just looked through the github repository and believe there is no reason why you couldn't scan for anonymous hosts running the special app like Juan is always requesting. I see the -anon parameter option listed along with referencing a unique host ID.

If you just passed a known anonymous host ID running the special app through to aggregate.pl that would be a proof of concept I would think. Do I have the correct grasp on the program?
Sure, you can do whatever you want, but like I've said before I am not interested in ranking this app until it's official (and when it is it'll just show up in my charts as the CUDA app like some of the vintage NVIDIA cards already do).

If it doesn't produce the same results as the official build then using it is prioritizing "winning internet points" over "doing science" and I think that's losing sight of the real purpose of this project.

The developer of the app has gone to great lengths to ensure the app DOES in fact produce the same results as any official build. Just like the Lunatics apps, which weren't considered "official" builds initially... until they were promoted to be the stock apps. The "special" apps have also been vetted countless times with the standard benchmark/validation tools to show they produce the same results as the stock apps; I have done so myself, as have many others. We even have a new developer who has created a modernized benchmark tool that is very easy to use and proves that the special apps validate against the stock apps every time. To see it for yourself, just look at any host running the special app: you will see zero or very few invalid tasks, at the same level as any stock app.

So the special app produces just as valid "science" as any stock application.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1970335 · Report as offensive
Profile ML1
Volunteer moderator
Volunteer tester

Send message
Joined: 25 Nov 01
Posts: 20989
Credit: 7,508,002
RAC: 20
United Kingdom
Message 1970346 - Posted: 14 Dec 2018, 19:21:11 UTC - in response to Message 1970335.  

So the special app produces just as valid "science" as any stock application.

Just to add:

The "special apps" or "optimized apps" are just that: Optimized.

They give the same results as far as is possible while making more efficient use of the hardware. An easy example is how the new code takes advantage of more recent CPU features and makes better use of the compute features of GPUs.

(And to do that, there's been some spectacular work done on the maths and testing...)


Happy fast crunchin',
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 1970346 · Report as offensive
Profile Tom M
Volunteer tester

Send message
Joined: 28 Nov 02
Posts: 5126
Credit: 276,046,078
RAC: 462
Message 1973135 - Posted: 3 Jan 2019, 5:28:06 UTC - in response to Message 1963448.  

A good start for Vega would be:

-sbs 2048 -period_iterations_num 1 -spike_fft_thresh 4096 -high_perf -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64


@Mike,
What would be a good start for the GPU side of the AMD Ryzen 5 2400G (integrated CPU/GPU) under Windows 10 (stock SETI apps)?

I have a few tasks done. I think I now have the GPU set to its highest overclock, although it wasn't earlier. It looks like it will run maybe 40 minutes per GPU task; at least, that's what the latest one getting ready to upload suggests.

Here is a proposed command line.

-sbs 192 -spike_fft_thresh 2048 -tune 1 2 1 16 -period_iterations_num 10 -high_perf -high_prec_timer -tt 1600 -hp


The spike/tune values are from the "entry level" examples in the documentation. I can't figure out what a good -sbs value would be for an "entry level" GPU. The rest of the parameters are from my NVIDIA experience.
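(For reference, my understanding is that with the stock apps a command line like this is supplied through an app_config.xml file in the project folder. The sketch below is only illustrative; the plan_class is a placeholder I would replace with whatever the host's application details actually report.)

    <app_config>
      <app_version>
        <app_name>setiathome_v8</app_name>
        <!-- placeholder plan class: substitute the one this host actually uses -->
        <plan_class>opencl_ati5_SoG</plan_class>
        <cmdline>-sbs 192 -spike_fft_thresh 2048 -tune 1 2 1 16 -period_iterations_num 10 -high_perf -high_prec_timer -tt 1600 -hp</cmdline>
      </app_version>
    </app_config>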

Presuming the GPU is still the fastest processor in this setup, I want to raise its productivity as high as possible. I will work on tweaking the CPU production later.

I am not even sure whether "entry level" is the right description for this Vega 11 GPU.

Tom
http://setiathome.berkeley.edu/show_host_detail.php?hostid=8645775
A proud member of the OFA (Old Farts Association).
ID: 1973135 · Report as offensive
MarkJ Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 17 Feb 08
Posts: 1139
Credit: 80,854,192
RAC: 5
Australia
Message 1973164 - Posted: 3 Jan 2019, 8:30:29 UTC - in response to Message 1970265.  

At last I have data:

Looks like I might get myself an RTX 2070 for my birthday in a few months.
GTX 1080 Ti (or better) crunching performance, at a maximum of 185W v 250W.

Interesting to see that the GTX 750Ti is still in the top 10 for efficiency.

The RTX 2060 is rumoured to be announced at CES 2019 on the 7th of January, with availability mid-January. It should be approximately USD $150 cheaper than the RTX 2070. The Einstein apps, however, don't work with RTX cards at the moment.
BOINC blog
ID: 1973164 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13835
Credit: 208,696,464
RAC: 304
Australia
Message 1973167 - Posted: 3 Jan 2019, 8:45:18 UTC - in response to Message 1973164.  
Last modified: 3 Jan 2019, 8:46:43 UTC

The RTX 2060 is rumoured to be announced at CES 2019 on the 7th of January, with availability mid-January. It should be approximately USD $150 cheaper than the RTX 2070. The Einstein apps, however, don't work with RTX cards at the moment.

I'm just hoping the prices start coming down.
$850 is the starting price of RTX 2070s at the moment. The model I'm interested in is $900. I could (probably) go up to $800, but $900?
Sheesh!

Even at $2,000-$2,500 the RTX 2080 Tis are almost permanently out of stock. I'm hoping that over the next few months, as more models come out and production ramps up and demand eases, prices will settle down from ridiculous to merely painful levels.
*fingers crossed*
Grant
Darwin NT
ID: 1973167 · Report as offensive
Profile Shaggie76
Avatar

Send message
Joined: 9 Oct 09
Posts: 282
Credit: 271,858,118
RAC: 196
Canada
Message 1984636 - Posted: 12 Mar 2019, 1:27:02 UTC

New in this scan: NVIDIA GeForce RTX 2060s, with excellent credit/watt and excellent performance.

There were some GTX 1660 Tis in the scan, but not quite enough to qualify. I'll re-scan in a few weeks and maybe there will be enough by then.

ID: 1984636 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13835
Credit: 208,696,464
RAC: 304
Australia
Message 1984689 - Posted: 12 Mar 2019, 6:46:00 UTC

Thanks for the update, greatly appreciated.
Will be interesting to see where the GTX 1660Ti ends up.
Grant
Darwin NT
ID: 1984689 · Report as offensive
Profile Bernie Vine
Volunteer moderator
Volunteer tester
Avatar

Send message
Joined: 26 May 99
Posts: 9956
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1984692 - Posted: 12 Mar 2019, 7:21:17 UTC

Will be interesting to see where the GTX 1660Ti ends up.

Indeed it will; it's the first time I have had a card that is too new to be included ;-)
ID: 1984692 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1984742 - Posted: 12 Mar 2019, 13:20:05 UTC
Last modified: 12 Mar 2019, 13:20:22 UTC

I know I asked before, but is it possible to run the same scan on the hosts that run the Linux builds instead of the OpenCL builds?

Could we expect similar performance, or will the numbers look different due to the way the optimized Linux builds work?

Just curiosity.
ID: 1984742 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1984752 - Posted: 12 Mar 2019, 13:56:09 UTC - in response to Message 1984742.  

One problem with that is that a good majority of the Linux optimized users also run multiple GPUs, which are not shown in the graphs.
ID: 1984752 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1984786 - Posted: 12 Mar 2019, 19:16:57 UTC - in response to Message 1984752.  

One problem with that is that a good majority of the Linux optimized users also run multiple GPUs, which are not shown in the graphs.

In that case we will never really know the real difference on the Linux boxes. Because of the way Petri's highly optimized builds work, they use memory differently from the standard builds, so I imagine the type and speed of the memory could make a huge difference there.
ID: 1984786 · Report as offensive
Ian&Steve C.
Avatar

Send message
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1984788 - Posted: 12 Mar 2019, 19:25:16 UTC - in response to Message 1984752.  

One problem with that is that a good majority of the Linux optimized users also run multiple GPUs, which are not shown in the graphs.


Is he only scanning single-GPU systems?

I can understand the case of having two different GPUs in the system, where you can't just take the average. But in the event that all the GPUs in use were the same type, you could just average them. I understand you won't know whether all the GPUs are the same, or which is which, unless you inspect the stderr.txt file.

Just curious.

It would be nice to see special app comparisons, though.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1984788 · Report as offensive
Profile Kissagogo27 Special Project $75 donor
Avatar

Send message
Joined: 6 Nov 99
Posts: 716
Credit: 8,032,827
RAC: 62
France
Message 1984913 - Posted: 13 Mar 2019, 10:27:31 UTC

I think I remember that he scans only non-anonymous hosts.
ID: 1984913 · Report as offensive
Profile Bill Special Project $75 donor
Volunteer tester
Avatar

Send message
Joined: 30 Nov 05
Posts: 282
Credit: 6,916,194
RAC: 60
United States
Message 1985794 - Posted: 18 Mar 2019, 15:19:47 UTC - in response to Message 1984788.  

Is he only scanning single-GPU systems?
I am only speculating here, but perhaps he's grabbing information from the tasks themselves, which I assume identify which GPU completed the task. I would also speculate that, done that way, he would not know how many GPUs the computer is running. Just a guess, without doing any back-tracking through this thread.

Seeing how the RTX 2060 lines up compared to the RTX 2070 is interesting. In particular, these two GPUs are pretty close on credit/watt, while the RTX 2080/2080 Ti are nowhere near as efficient. I wonder why that is, and I wonder why the low end of the bar is as low as it is for the RTX 2080s. Obviously more data would need to be collected, but I feel like the 2080s are an aberration. I would be curious to see whether the 1660 and 1660 Tis line up more with the 2060/2070 or the 2080/2080 Ti (with respect to credit/watt).
Seti@home classic: 1,456 results, 1.613 years CPU time
ID: 1985794 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1985802 - Posted: 18 Mar 2019, 15:46:28 UTC - in response to Message 1985794.  

Pretty sure he just scrapes the https://setiathome.berkeley.edu/gpu_list.php web page. So he is not looking at any card's stderr.txt output. No way to see how many cards a host is running. Then he just looks up the card's published TDP specification to match power used to credit produced.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1985802 · Report as offensive