GTX is dead long live RTX !!

Message boards : Number crunching : GTX is dead long live RTX !!
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1959931 - Posted: 12 Oct 2018, 18:57:15 UTC - in response to Message 1959855.  

I can't look at a SIV install since going all Linux, but if I remember correctly that info would be in Graphics >> GPU CUDA. Try using the Menu Search function and search on keyword CUDA if I haven't remembered correctly.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1959931 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Send message
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1959932 - Posted: 12 Oct 2018, 19:01:38 UTC - in response to Message 1959855.  

Thanks Bruce, it's OK. The 10598 GFLOPS peak in your logs looks right: both Ray Hinchliffe of SIV and archae86 at Einstein have purchased RTX 2080 cards since I asked the question, and we've compared notes already.
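For what it's worth, that peak figure is consistent with the usual estimate of 2 single-precision FLOP per shader per clock. A minimal C sketch of the arithmetic, assuming the published RTX 2080 FE spec of 2944 CUDA cores and a 1.8 GHz boost clock (my assumption, not from Bruce's logs):

```c
#include <stdio.h>

int main(void) {
    const int    shaders   = 2944;  /* RTX 2080 CUDA cores (published spec)  */
    const double boost_ghz = 1.80;  /* Founders Edition boost clock, in GHz  */
    /* Peak estimate: one FMA counts as 2 single-precision FLOP per clock. */
    printf("peak: %.0f GFLOPS\n", 2.0 * shaders * boost_ghz);  /* ~10598 */
    return 0;
}
```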
ID: 1959932 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 1960209 - Posted: 14 Oct 2018, 1:31:12 UTC

For those wondering what an RTX 2080 is capable of, check out Bruce's runtimes (there were a few issues early on, but it appears those have been resolved).

From the APR value, it looks like it's running 2 WUs at a time, using SoG with period_iterations_num=20.
For comparison, my GTX 1070s are running 1 WU at a time using SoG with aggressive settings, including period_iterations_num=1.

BLC23s are taking me 7 min 22 sec one at a time. On the RTX 2080, running 2 at a time with period_iterations_num=20 (unlike my setting of 1), they take 4 min 10 sec.
So (very roughly) that's 7 min 20 sec vs 2 min 10 sec for a given WU.

Arecibo VLARs are around 8 min 45 sec or more for me, and around 2 min 30 sec for the RTX 2080.
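To spell out the arithmetic behind the per-WU comparison, the RTX 2080 figure is simply the wall-clock time divided by the two WUs it runs concurrently; a trivial C sketch using the numbers above:

```c
#include <stdio.h>

int main(void) {
    double gtx1070_s = 7 * 60 + 22;          /* GTX 1070: one BLC23 at a time       */
    double rtx2080_s = (4 * 60 + 10) / 2.0;  /* RTX 2080: two at once in 4 min 10 s */
    printf("GTX 1070: %.0f s per WU, RTX 2080: ~%.0f s per WU\n",
           gtx1070_s, rtx2080_s);            /* 442 s vs ~125 s */
    return 0;
}
```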
Grant
Darwin NT
ID: 1960209 · Report as offensive
Bruce
Volunteer tester

Send message
Joined: 15 Mar 02
Posts: 123
Credit: 124,955,234
RAC: 11
United States
Message 1960227 - Posted: 14 Oct 2018, 3:42:26 UTC
Last modified: 14 Oct 2018, 3:44:57 UTC

Hi Grant,
I did a slight retune and that cleared up the warning messages. My current run times are now for one WU at a time; I may try that for a while and see how it goes. I might try dropping period_iterations down later on. This is my daily driver, so I can't get super aggressive. Thanks for the info.
Bruce
ID: 1960227 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 1960231 - Posted: 14 Oct 2018, 4:27:36 UTC - in response to Message 1960227.  

Hi Grant,
I did a slight retune and that cleared up the warning messages. My current run times are now for one WU at a time; I may try that for a while and see how it goes. I might try dropping period_iterations down later on. This is my daily driver, so I can't get super aggressive. Thanks for the info.

No worries.
Even so, its output is almost double mine, even with the higher -period_iterations_num setting.
Grant
Darwin NT
ID: 1960231 · Report as offensive
W3Perl Project Donor
Volunteer tester

Send message
Joined: 29 Apr 99
Posts: 251
Credit: 3,696,783,867
RAC: 12,606
France
Message 1960252 - Posted: 14 Oct 2018, 9:37:26 UTC - in response to Message 1960209.  

For those wondering what an RTX 2080 is capable of, check out Bruce's runtimes (there were a few issues early on, but it appears those have been resolved).

From the APR value, it looks like it's running 2 WUs at a time, using SoG with period_iterations_num=20.
For comparison, my GTX 1070s are running 1 WU at a time using SoG with aggressive settings, including period_iterations_num=1.

BLC23s are taking me 7 min 22 sec one at a time. On the RTX 2080, running 2 at a time with period_iterations_num=20 (unlike my setting of 1), they take 4 min 10 sec.
So (very roughly) that's 7 min 20 sec vs 2 min 10 sec for a given WU.

Arecibo VLARs are around 8 min 45 sec or more for me, and around 2 min 30 sec for the RTX 2080.


Not so good ;)
On Linux, running Petri's app with a GTX 1070, BLC23s take 2 min 15...
Does anyone have the BLC23 run time for an RTX 2080 using Petri's app?
ID: 1960252 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 1960253 - Posted: 14 Oct 2018, 9:40:58 UTC - in response to Message 1960252.  

Not so good ;)

For SoG and Windows, it is.
Grant
Darwin NT
ID: 1960253 · Report as offensive
Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1960321 - Posted: 14 Oct 2018, 21:33:30 UTC

Hi,
I have not compiled one for RTX yet; TBar or Gianfranco may have done so.
It may be possible to run the same code on RTX that runs on Titan V, but if it throws errors or does not run, I'll compile one.
Just ask.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1960321 · Report as offensive
Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1960326 - Posted: 14 Oct 2018, 21:44:29 UTC - in response to Message 1960252.  
Last modified: 14 Oct 2018, 21:46:18 UTC

For those wondering what an RTX 2080 is capable of, check out Bruce's runtimes (there were a few issues early on, but it appears those have been resolved).

From the APR value, it looks like it's running 2 WUs at a time, using SoG with period_iterations_num=20.
For comparison, my GTX 1070s are running 1 WU at a time using SoG with aggressive settings, including period_iterations_num=1.

BLC23s are taking me 7 min 22 sec one at a time. On the RTX 2080, running 2 at a time with period_iterations_num=20 (unlike my setting of 1), they take 4 min 10 sec.
So (very roughly) that's 7 min 20 sec vs 2 min 10 sec for a given WU.

Arecibo VLARs are around 8 min 45 sec or more for me, and around 2 min 30 sec for the RTX 2080.


Not so good ;)
On Linux, running Petri's app with a GTX 1070, BLC23s take 2 min 15...
Does anyone have the BLC23 run time for an RTX 2080 using Petri's app?


On a GTX 1080 Ti, a BLC23 takes 67 seconds. The Ti is a power hog and way slower than the Titan V at 49 seconds, with the Volta consistently drawing 90 W less power.

I hope someone with an interest in Linux gets an RTX so that we can get nvidia-smi -l output of the power draw and some facts about the run times. I'll compile an executable for whoever can get an RTX 2080 or RTX 2080 Ti to try it with. EDIT: Corrected another typo from RTX1080 to 2080.
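In the meantime, for anyone who would rather sample the power draw programmatically than watch nvidia-smi -l, here is a minimal C sketch using NVML (it ships with the driver; link with -lnvidia-ml). Device index 0 and the ten-sample loop are just assumptions for illustration:

```c
#include <stdio.h>
#include <unistd.h>
#include <nvml.h>

int main(void) {
    if (nvmlInit() != NVML_SUCCESS) {
        fprintf(stderr, "NVML init failed\n");
        return 1;
    }
    nvmlDevice_t dev;
    nvmlDeviceGetHandleByIndex(0, &dev);            /* first GPU; adjust as needed */
    for (int i = 0; i < 10; ++i) {                  /* ten one-second samples      */
        unsigned int mw = 0;
        if (nvmlDeviceGetPowerUsage(dev, &mw) == NVML_SUCCESS)
            printf("power draw: %.1f W\n", mw / 1000.0);
        sleep(1);
    }
    nvmlShutdown();
    return 0;
}
```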
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1960326 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Avatar

Send message
Joined: 30 Nov 03
Posts: 65735
Credit: 55,293,173
RAC: 49
United States
Message 1960355 - Posted: 15 Oct 2018, 4:25:34 UTC

From what I've seen, Titan Vs are really expensive. A 1080 Ti might be a power hog, but if it uses 225-250 W, that's not much more than my Asus 970 Turbo, which I've read supposedly uses 225 W. Even an RTX 2080 is less than a Titan V (Volta) in purchase price; YMMV on card prices.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1960355 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 1960363 - Posted: 15 Oct 2018, 6:36:16 UTC - in response to Message 1960321.  

It may be possible to run the same code on RTX that runs on Titan V,

From a couple of the articles on the Turing architecture I've looked at, that's most likely to be the case.
Diving straight into the microarchitecture, the new Turing SM looks very different to the Pascal SM, but those who've been keeping track of Volta will notice a lot of similarities to NVIDIA's more recent microarchitecture. In fact, on a high level, the Turing SM is fundamentally the same, with the notable exception of a new IP block: the RT Core.

Like Volta, the Turing SM is partitioned into 4 sub-cores (or processing blocks), with each sub-core having a single warp scheduler and dispatch unit, as opposed to Pascal's 2-partition setup with two dispatch ports per sub-core warp scheduler. There are some fairly major implications with this change; broadly speaking, it means that Volta/Turing loses the capability to issue a second, non-dependent instruction from a thread in a single clock cycle. Turing is presumably identical to Volta in performing instructions over two cycles, but with schedulers that can issue an independent instruction every cycle, so ultimately Turing can maintain 2-way instruction level parallelism (ILP) this way while still having twice the number of schedulers of Pascal.

Like we saw in Volta, these changes go hand-in-hand with the new scheduling/execution model with independent thread scheduling that Turing also has, though differences were not disclosed at this time. Rather than per-warp as in Pascal, Volta and Turing have per-thread scheduling resources, with a program counter and stack per thread to track thread state, as well as a convergence optimizer to intelligently group active same-warp threads together into SIMT units. So all threads are equally concurrent, regardless of warp, and can yield and reconverge.

Source: AnandTech.
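Not from the article, but a minimal CUDA sketch of what the per-thread scheduling means for kernel code: because Volta/Turing warps are no longer guaranteed to run in lockstep, warp-level primitives take an explicit participation mask. The kernel below is illustrative only and assumes blockDim.x is a multiple of 32:

```cuda
__global__ void warp_sum(const float *in, float *out)
{
    int gid  = blockIdx.x * blockDim.x + threadIdx.x;
    int lane = threadIdx.x & 31;            /* lane index within the warp */
    float v  = in[gid];

    /* Divergent branch: with per-thread program counters (Volta/Turing),
       the two paths may be interleaved rather than executed in lockstep. */
    if (lane < 16) v *= 2.0f;
    else           v += 1.0f;

    /* Warp-level reduction: the 0xffffffff mask explicitly names all 32
       lanes as participants, since lockstep execution is not guaranteed. */
    for (int offset = 16; offset > 0; offset >>= 1)
        v += __shfl_down_sync(0xffffffffu, v, offset);

    if (lane == 0)
        out[gid >> 5] = v;                  /* one result per warp */
}
```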
Grant
Darwin NT
ID: 1960363 · Report as offensive
Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1960400 - Posted: 15 Oct 2018, 14:07:47 UTC

Hi,

The same executable may work.
However, here is a link to the source: https://drive.google.com/open?id=14SD3SU6K5iGYY3GM3aVWSb5Y19N8v5wZ
and to an executable that has sm_75 (RTX 2080 & Ti) enabled: https://drive.google.com/open?id=1b8LYfoCmzHjAjbdCHT0ig-wHxXNOYIjZ

My compilation may need certain versions of some system libraries. TBar can compile one that is more compatible.

Petri
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1960400 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1960438 - Posted: 15 Oct 2018, 18:17:53 UTC - in response to Message 1960400.  

TBar already posted a download link to a CUDA10 executable. But that only has the executable, nothing else. There is no way of knowing whether it is compiled for sm_75 or not, since there was no documentation at all. The only way of knowing would be to run it with a Turing card and see what BOINC shows for its CC capability.
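As an aside, what BOINC reports for CC can also be read straight from the CUDA runtime; a minimal sketch that lists each GPU's compute capability (Turing cards report 7.5, Volta 7.0, Pascal 6.x):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("GPU %d: %s, compute capability %d.%d\n",
               i, prop.name, prop.major, prop.minor);
    }
    return 0;
}
```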
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1960438 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Avatar

Send message
Joined: 30 Nov 03
Posts: 65735
Credit: 55,293,173
RAC: 49
United States
Message 1960457 - Posted: 15 Oct 2018, 20:08:17 UTC

ID: 1960457 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1960477 - Posted: 15 Oct 2018, 21:45:21 UTC - in response to Message 1960438.  
Last modified: 15 Oct 2018, 21:50:24 UTC

TBar already posted a download link to a CUDA10 executable. But that only has the executable, nothing else. No way of knowing whether it is compiled for sm_75 or not since their was no documentation at all. The only way of knowing would be to run it with a Turing card and see what BOINC shows for it CC capability.
Hmmmm, a couple of us would like to know what you were doing this weekend. First, Wildcard started running Stock, and never ran the Special App until October. Just look at the first couple of posts in that thread; I was watching him from around day three of his arrival. Second, here is the post from the User Forum:
On my machine I don't see much difference. See if it's any better on the higher end cards.
It has sm_75 code, so it's a little bigger. It will work with the 750Ti & higher in 14.04.1 and higher.
It still uses API 7.11.0 and S_LIMIT 127. You need the driver from here, https://www.nvidia.com/Download/driverResults.aspx/138279/en-us
It's here, http://www.arkayn.us/lunatics/setiathome_x41p_V0.97b2_x86_64-pc-linux-gnu_cuda100.7z
It clearly says it has sm_75, along with stating which cards it works with and the minimum OS. Besides that, all my builds are compiled with code=compute at the end so they will work with any forthcoming cards, similar to how CUDA 32 and 50 are still working in Windows. The first time it runs with a 'new' GPU it will take about 20 seconds to compile and store code from the driver, and it will then work with that 'new' GPU from then on. So basically, even the CUDA 8 zi3v should work with Turing the same way CUDA50 works with Pascal.
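For anyone curious what the code=compute part buys you, here is an illustrative CUDA sketch; the build flags in the comment are generic examples, not TBar's actual build line. Embedding PTX alongside the SASS is what lets the driver JIT-compile for an architecture that didn't exist when the app was built, and the result is cached so only the first run pays the roughly 20-second cost:

```cuda
/* Built with something like (illustrative flags only):
 *   nvcc app.cu -o app \
 *        -gencode arch=compute_60,code=sm_60 \
 *        -gencode arch=compute_60,code=compute_60
 * The second -gencode embeds PTX; on a newer GPU (e.g. Turing, sm_75) the
 * driver JIT-compiles that PTX on the first launch and caches the result
 * (on Linux typically under ~/.nv/ComputeCache).
 */
#include <cstdio>
#include <cuda_runtime.h>

__global__ void probe() { printf("kernel ran on this GPU\n"); }

int main()
{
    probe<<<1, 1>>>();                 /* first run on a new arch triggers the JIT */
    cudaDeviceSynchronize();
    return 0;
}
```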
ID: 1960477 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1960481 - Posted: 15 Oct 2018, 22:53:35 UTC - in response to Message 1960477.  

Thanks for the clarification, TBar. I am lucky if I remember what I had for breakfast this morning, let alone a post from several weeks ago. I just went to the downloaded archive and saw nothing but the executable and no documentation, so I was unable to quickly check whether your new app has sm_75 capabilities. I couldn't find it in the usual places on CA, so I didn't have a clue how I got it. Thanks for the link to the original post. Good to know it is capable of running anything anyone wants to throw at it.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1960481 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1960568 - Posted: 16 Oct 2018, 23:38:30 UTC
Last modified: 17 Oct 2018, 0:12:47 UTC

RTX 2070 reviews are up today. Interesting that the AIB cards are releasing at an MSRP of $499. The PCPer review shows that the EVGA 2070 Black Edition has exactly the same power draw as the GTX 1080, at 170 W. The 2070 should move up towards the top of Shaggie76's credit per watt-hour charts. But for some reason PCPer didn't run any of their normal compute benchmarks on this card. Hope they follow up with that eventually.

[Edit] At least AnandTech did their usual compute benchmarks. But they tested the Nvidia 2070 FE card, which is clocked higher than the AIB cards and costs $100 more.

For the Folding@Home single precision benchmark the 2070 FE matched the results from the 1080Ti FE card.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1960568 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 1960621 - Posted: 17 Oct 2018, 6:54:49 UTC - in response to Message 1960568.  

For the Folding@Home single precision benchmark the 2070 FE matched the results from the 1080Ti FE card.

Yeah, depending on the benchmark it looks like it bounces between GTX 1080 and GTX 1080 Ti performance levels. But given current pricing, you'd be better off with a new GTX 1080.
Grant
Darwin NT
ID: 1960621 · Report as offensive
Profile BilBg
Volunteer tester
Avatar

Send message
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1960625 - Posted: 17 Oct 2018, 7:22:38 UTC - in response to Message 1956906.  

May we ask who has one, so that we can look at the performance of it?
The testing was done by Ray Hinchliffe, as mentioned in message 1956627. He provides BOINC-specific extensions to SIV, but I don't know if he runs SETI.

He was running SETI in the past and this is his account ("red-ray"):
https://setiathome.berkeley.edu/show_user.php?userid=9653891

("Last contact 23 Sep 2018" - got a task or two but didn't finish it)
 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1960625 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1960681 - Posted: 17 Oct 2018, 18:41:57 UTC

We should know shortly how well a 2080Ti does with the CUDA10 app when Vyper gets his card this week.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1960681 · Report as offensive