GTX is dead long live RTX !!

Message boards : Number crunching : GTX is dead long live RTX !!
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1959931 - Posted: 12 Oct 2018, 18:57:15 UTC - in response to Message 1959855.  

I can't look at a SIV install since going all Linux, but if I remember correctly that info would be in Graphics >> GPU CUDA. Try using the Menu Search function and search on keyword CUDA if I haven't remembered correctly.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1959931 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Send message
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1959932 - Posted: 12 Oct 2018, 19:01:38 UTC - in response to Message 1959855.  

Thanks Bruce, it's OK. The 10598 GFLOPS peak in your logs looks right: both Ray Hinchliffe of SIV and archae86 at Einstein have purchased RTX 2080 cards since I asked the question, and we've compared notes already.
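For what it's worth, that peak figure is consistent with the usual estimate of 2 single-precision FLOP per shader per clock. A minimal C sketch of the arithmetic, assuming the published RTX 2080 FE spec of 2944 CUDA cores and a 1.8 GHz boost clock (my assumption, not from Bruce's logs):

```c
#include <stdio.h>

int main(void) {
    const int    shaders   = 2944;  /* RTX 2080 CUDA cores (published spec)  */
    const double boost_ghz = 1.80;  /* Founders Edition boost clock, in GHz  */
    /* Peak estimate: one FMA counts as 2 single-precision FLOP per clock. */
    printf("peak: %.0f GFLOPS\n", 2.0 * shaders * boost_ghz);  /* ~10598 */
    return 0;
}
```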
ID: 1959932 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 1960209 - Posted: 14 Oct 2018, 1:31:12 UTC

For those wondering what an RTX 2080 is capable of, check out Bruce's runtimes (there were a few issues early on, but it appears those have been resolved).

From the APR value, it looks like it's running 2 WUs at a time, using SoG with period_iterations_num=20.
For comparison, my GTX 1070s are running 1 WU at a time using SoG with aggressive settings, including period_iterations_num=1.

BLC23s are taking me 7 min 22 sec one at a time. On the RTX 2080, running 2 at a time with period_iterations_num=20 (unlike my setting of 1), they take 4 min 10 sec.
So (very roughly) that's 7 min 20 sec vs 2 min 10 sec for a given WU.

Arecibo VLARs are around 8 min 45 sec or more for me, and around 2 min 30 sec for the RTX 2080.
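To spell out the arithmetic behind the per-WU comparison, the RTX 2080 figure is simply the wall-clock time divided by the two WUs it runs concurrently; a trivial C sketch using the numbers above:

```c
#include <stdio.h>

int main(void) {
    double gtx1070_s = 7 * 60 + 22;          /* GTX 1070: one BLC23 at a time       */
    double rtx2080_s = (4 * 60 + 10) / 2.0;  /* RTX 2080: two at once in 4 min 10 s */
    printf("GTX 1070: %.0f s per WU, RTX 2080: ~%.0f s per WU\n",
           gtx1070_s, rtx2080_s);            /* 442 s vs ~125 s */
    return 0;
}
```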
Grant
Darwin NT
ID: 1960209 · Report as offensive
Bruce
Volunteer tester

Send message
Joined: 15 Mar 02
Posts: 123
Credit: 124,955,234
RAC: 11
United States
Message 1960227 - Posted: 14 Oct 2018, 3:42:26 UTC
Last modified: 14 Oct 2018, 3:44:57 UTC

Hi Grant,
I did a slight retune and that cleared up the warning messages. My current run times are now for one WU at a time; I may try that for a while and see how it goes. I might try dropping period_iterations down later on. This is my daily driver, so I can't get super aggressive. Thanks for the info.
Bruce
ID: 1960227 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 1960231 - Posted: 14 Oct 2018, 4:27:36 UTC - in response to Message 1960227.  

Hi Grant,
I did a slight retune and that cleared up the warning messages. My current run times are now for one WU at a time; I may try that for a while and see how it goes. I might try dropping period_iterations down later on. This is my daily driver, so I can't get super aggressive. Thanks for the info.

No worries.
Even so, its output is almost double mine, even with the higher -period_iterations_num setting.
Grant
Darwin NT
ID: 1960231 · Report as offensive
W3Perl Project Donor
Volunteer tester

Send message
Joined: 29 Apr 99
Posts: 251
Credit: 3,696,783,867
RAC: 12,606
France
Message 1960252 - Posted: 14 Oct 2018, 9:37:26 UTC - in response to Message 1960209.  

For those wondering what an RTX 2080 is capable of, check out Bruce's runtimes (there were a few issues early on, but it appears those have been resolved).

From the APR value, it looks like it's running 2 WUs at a time, using SoG with period_iterations_num=20.
For comparison, my GTX 1070s are running 1 WU at a time using SoG with aggressive settings, including period_iterations_num=1.

BLC23s are taking me 7 min 22 sec one at a time. On the RTX 2080, running 2 at a time with period_iterations_num=20 (unlike my setting of 1), they take 4 min 10 sec.
So (very roughly) that's 7 min 20 sec vs 2 min 10 sec for a given WU.

Arecibo VLARs are around 8 min 45 sec or more for me, and around 2 min 30 sec for the RTX 2080.


Not so good ;)
On Linux, running Petri's app with a GTX 1070, BLC23s take 2 min 15...
Does anyone have the BLC23 run time for an RTX 2080 using Petri's app?
ID: 1960252 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 1960253 - Posted: 14 Oct 2018, 9:40:58 UTC - in response to Message 1960252.  

Not so good ;)

For SoG and Windows, it is.
Grant
Darwin NT
ID: 1960253 · Report as offensive
Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1960321 - Posted: 14 Oct 2018, 21:33:30 UTC

Hi,
I have not compiled one for RTX yet; TBar or Gianfranco may have done so.
It may be possible to run the same code on RTX that runs on Titan V, but if it throws errors or does not run, I'll compile one.
Just ask.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1960321 · Report as offensive
Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1960326 - Posted: 14 Oct 2018, 21:44:29 UTC - in response to Message 1960252.  
Last modified: 14 Oct 2018, 21:46:18 UTC

For those wondering what an RTX 2080 is capable of, check out Bruce's runtimes (there were a few issues early on, but it appears those have been resolved).

From the APR value, it looks like it's running 2 WUs at a time, using SoG with period_iterations_num=20.
For comparison, my GTX 1070s are running 1 WU at a time using SoG with aggressive settings, including period_iterations_num=1.

BLC23s are taking me 7 min 22 sec one at a time. On the RTX 2080, running 2 at a time with period_iterations_num=20 (unlike my setting of 1), they take 4 min 10 sec.
So (very roughly) that's 7 min 20 sec vs 2 min 10 sec for a given WU.

Arecibo VLARs are around 8 min 45 sec or more for me, and around 2 min 30 sec for the RTX 2080.


Not so good ;)
On Linux, running Petri's app with a GTX 1070, BLC23s take 2 min 15...
Does anyone have the BLC23 run time for an RTX 2080 using Petri's app?


On a GTX 1080 Ti, a BLC23 takes 67 seconds. The Ti is a power hog and way slower than the Titan V at 49 seconds, with the Volta consistently drawing 90 W less power.

I hope someone with an interest in Linux gets an RTX so that we can get nvidia-smi -l output of the power draw and some facts about the run times. I'll compile an executable for whoever can get an RTX 2080 or RTX 2080 Ti to try it with. EDIT: Corrected another typo from RTX1080 to 2080.
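In the meantime, for anyone who would rather sample the power draw programmatically than watch nvidia-smi -l, here is a minimal C sketch using NVML (it ships with the driver; link with -lnvidia-ml). Device index 0 and the ten-sample loop are just assumptions for illustration:

```c
#include <stdio.h>
#include <unistd.h>
#include <nvml.h>

int main(void) {
    if (nvmlInit() != NVML_SUCCESS) {
        fprintf(stderr, "NVML init failed\n");
        return 1;
    }
    nvmlDevice_t dev;
    nvmlDeviceGetHandleByIndex(0, &dev);            /* first GPU; adjust as needed */
    for (int i = 0; i < 10; ++i) {                  /* ten one-second samples      */
        unsigned int mw = 0;
        if (nvmlDeviceGetPowerUsage(dev, &mw) == NVML_SUCCESS)
            printf("power draw: %.1f W\n", mw / 1000.0);
        sleep(1);
    }
    nvmlShutdown();
    return 0;
}
```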
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1960326 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Avatar

Send message
Joined: 30 Nov 03
Posts: 65735
Credit: 55,293,173
RAC: 49
United States
Message 1960355 - Posted: 15 Oct 2018, 4:25:34 UTC

From what I've seen, Titan Vs are really expensive. A 1080 Ti might be a power hog, but if it uses 225-250 W, that's not much more than my Asus 970 Turbo, which I've read supposedly uses 225 W. Even an RTX 2080 is less than a Titan V (Volta) in purchase price; YMMV on card prices.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1960355 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 1960363 - Posted: 15 Oct 2018, 6:36:16 UTC - in response to Message 1960321.  

It may be possible to run the same code on RTX that runs on Titan V,

From a couple of the articles on the Turing architecture I've looked at, that's most likely to be the case.
Diving straight into the microarchitecture, the new Turing SM looks very different to the Pascal SM, but those who've been keeping track of Volta will notice a lot of similarities to NVIDIA's more recent microarchitecture. In fact, on a high level, the Turing SM is fundamentally the same, with the notable exception of a new IP block: the RT Core.

Like Volta, the Turing SM is partitioned into 4 sub-cores (or processing blocks), with each sub-core having a single warp scheduler and dispatch unit, as opposed to Pascal's 2-partition setup with two dispatch ports per sub-core warp scheduler. There are some fairly major implications with this change; broadly speaking, it means that Volta/Turing loses the capability to issue a second, non-dependent instruction from a thread in a single clock cycle. Turing is presumably identical to Volta in performing instructions over two cycles, but with schedulers that can issue an independent instruction every cycle, so ultimately Turing can maintain 2-way instruction level parallelism (ILP) this way while still having twice the number of schedulers of Pascal.

Like we saw in Volta, these changes go hand-in-hand with the new scheduling/execution model with independent thread scheduling that Turing also has, though differences were not disclosed at this time. Rather than per-warp as in Pascal, Volta and Turing have per-thread scheduling resources, with a program counter and stack per thread to track thread state, as well as a convergence optimizer to intelligently group active same-warp threads together into SIMT units. So all threads are equally concurrent, regardless of warp, and can yield and reconverge.

Source: AnandTech.
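Not from the article, but a minimal CUDA sketch of what the per-thread scheduling means for kernel code: because Volta/Turing warps are no longer guaranteed to run in lockstep, warp-level primitives take an explicit participation mask. The kernel below is illustrative only and assumes blockDim.x is a multiple of 32:

```cuda
__global__ void warp_sum(const float *in, float *out)
{
    int gid  = blockIdx.x * blockDim.x + threadIdx.x;
    int lane = threadIdx.x & 31;            /* lane index within the warp */
    float v  = in[gid];

    /* Divergent branch: with per-thread program counters (Volta/Turing),
       the two paths may be interleaved rather than executed in lockstep. */
    if (lane < 16) v *= 2.0f;
    else           v += 1.0f;

    /* Warp-level reduction: the 0xffffffff mask explicitly names all 32
       lanes as participants, since lockstep execution is not guaranteed. */
    for (int offset = 16; offset > 0; offset >>= 1)
        v += __shfl_down_sync(0xffffffffu, v, offset);

    if (lane == 0)
        out[gid >> 5] = v;                  /* one result per warp */
}
```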
Grant
Darwin NT
ID: 1960363 · Report as offensive
Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1960400 - Posted: 15 Oct 2018, 14:07:47 UTC

Hi,

The same executable may work.
However, here is a link to the source: https://drive.google.com/open?id=14SD3SU6K5iGYY3GM3aVWSb5Y19N8v5wZ
and to an executable that has sm_75 (RTX 2080 & Ti) enabled: https://drive.google.com/open?id=1b8LYfoCmzHjAjbdCHT0ig-wHxXNOYIjZ

My compilation may need certain versions of some system libraries. TBar can compile one that is more compatible.

Petri
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1960400 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1960438 - Posted: 15 Oct 2018, 18:17:53 UTC - in response to Message 1960400.  

TBar already posted a download link to a CUDA10 executable. But that only has the executable, nothing else. There is no way of knowing whether it is compiled for sm_75 or not, since there was no documentation at all. The only way of knowing would be to run it with a Turing card and see what BOINC shows for its CC capability.
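As an aside, what BOINC reports for CC can also be read straight from the CUDA runtime; a minimal sketch that lists each GPU's compute capability (Turing cards report 7.5, Volta 7.0, Pascal 6.x):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("GPU %d: %s, compute capability %d.%d\n",
               i, prop.name, prop.major, prop.minor);
    }
    return 0;
}
```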
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1960438 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Avatar

Send message
Joined: 30 Nov 03
Posts: 65735
Credit: 55,293,173
RAC: 49
United States
Message 1960457 - Posted: 15 Oct 2018, 20:08:17 UTC

ID: 1960457 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1960477 - Posted: 15 Oct 2018, 21:45:21 UTC - in response to Message 1960438.  
Last modified: 15 Oct 2018, 21:50:24 UTC

TBar already posted a download link to a CUDA10 executable. But that only has the executable, nothing else. No way of knowing whether it is compiled for sm_75 or not since their was no documentation at all. The only way of knowing would be to run it with a Turing card and see what BOINC shows for it CC capability.
Hmmmm, a couple of us would like to know what you were doing this weekend. First, Wildcard started running Stock, and never ran the Special App until October. Just look at the first couple of posts in that thread; I was watching him from around day three of his arrival. Second, here is the post from the User Forum:
On my machine I don't see much difference. See if it's any better on the higher end cards.
It has sm_75 code, so it's a little bigger. It will work with the 750Ti & higher in 14.04.1 and higher.
It still uses API 7.11.0 and S_LIMIT 127. You need the driver from here, https://www.nvidia.com/Download/driverResults.aspx/138279/en-us
It's here, http://www.arkayn.us/lunatics/setiathome_x41p_V0.97b2_x86_64-pc-linux-gnu_cuda100.7z
It clearly says it has sm_75, along with stating which cards it works with and the minimum OS. Besides that, all my builds are compiled with code=compute at the end so they will work with any forthcoming cards, similar to how CUDA 32 and 50 are still working in Windows. The first time it runs with a 'new' GPU it will take about 20 seconds to compile and store code from the driver, and it will then work with that 'new' GPU from then on. So basically, even the CUDA 8 zi3v should work with Turing the same way CUDA50 works with Pascal.
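For anyone curious what the code=compute part buys you, here is an illustrative CUDA sketch; the build flags in the comment are generic examples, not TBar's actual build line. Embedding PTX alongside the SASS is what lets the driver JIT-compile for an architecture that didn't exist when the app was built, and the result is cached so only the first run pays the roughly 20-second cost:

```cuda
/* Built with something like (illustrative flags only):
 *   nvcc app.cu -o app \
 *        -gencode arch=compute_60,code=sm_60 \
 *        -gencode arch=compute_60,code=compute_60
 * The second -gencode embeds PTX; on a newer GPU (e.g. Turing, sm_75) the
 * driver JIT-compiles that PTX on the first launch and caches the result
 * (on Linux typically under ~/.nv/ComputeCache).
 */
#include <cstdio>
#include <cuda_runtime.h>

__global__ void probe() { printf("kernel ran on this GPU\n"); }

int main()
{
    probe<<<1, 1>>>();                 /* first run on a new arch triggers the JIT */
    cudaDeviceSynchronize();
    return 0;
}
```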
ID: 1960477 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1960481 - Posted: 15 Oct 2018, 22:53:35 UTC - in response to Message 1960477.  

Thanks for the clarification, TBar. I am lucky if I remember what I had for breakfast this morning, let alone a post from several weeks ago. I just went to the downloaded archive and saw nothing but the executable and no documentation, so I was unable to quickly check whether your new app has sm_75 capabilities. I couldn't find it in the usual places on CA, so I didn't have a clue how I got it. Thanks for the link to the original post. Good to know it is capable of running anything anyone wants to throw at it.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1960481 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1960568 - Posted: 16 Oct 2018, 23:38:30 UTC
Last modified: 17 Oct 2018, 0:12:47 UTC

RTX 2070 reviews are up today. Interesting that the AIB cards are releasing at an MSRP of $499. The PCPer review shows that the EVGA 2070 Black Edition has exactly the same power draw as the GTX 1080, at 170 W. The 2070 should move up towards the top of Shaggie76's credit per watt-hour charts. But for some reason PCPer didn't run any of their normal compute benchmarks on this card. Hope they follow up with that eventually.

[Edit] At least AnandTech did their usual compute benchmarks. But they tested the Nvidia 2070 FE card, which is clocked higher than the AIB cards and costs $100 more.

For the Folding@Home single precision benchmark the 2070 FE matched the results from the 1080Ti FE card.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1960568 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 1960621 - Posted: 17 Oct 2018, 6:54:49 UTC - in response to Message 1960568.  

For the Folding@Home single precision benchmark the 2070 FE matched the results from the 1080Ti FE card.

Yeah, depending on the benchmark it looks like it bounces between GTX 1080 and GTX 1080 Ti performance levels. But given current pricing, you'd be better off with a new GTX 1080.
Grant
Darwin NT
ID: 1960621 · Report as offensive
Profile BilBg
Volunteer tester
Avatar

Send message
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1960625 - Posted: 17 Oct 2018, 7:22:38 UTC - in response to Message 1956906.  

May we ask who has one, so that we can look at the performance of it?
The testing was done by Ray Hinchliffe, as mentioned in message 1956627. He provides BOINC-specific extensions to SIV, but I don't know if he runs SETI.

He was running SETI in the past and this is his account ("red-ray"):
https://setiathome.berkeley.edu/show_user.php?userid=9653891

("Last contact 23 Sep 2018" - got a task or two but didn't finish it)
 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1960625 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1960681 - Posted: 17 Oct 2018, 18:41:57 UTC

We should know shortly how well a 2080Ti does with the CUDA10 app when Vyper gets his card this week.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1960681 · Report as offensive