Special App and Kepler Architecture

Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1979007 - Posted: 6 Feb 2019, 22:51:58 UTC - in response to Message 1978990.  

For $120, you could have bought a GTX 1060 3GB, which would use ~100 W on SETI and still be faster than the 690, since it can use the latest special app.


And those prices, or lower, can be found on eBay. :)

I'm also seeing some pretty good prices on GTX 1070 Tis.

Tom
A proud member of the OFA (Old Farts Association).

J3P-0
Joined: 1 Dec 11
Posts: 45
Credit: 25,258,781
RAC: 180
United States
Message 1979010 - Posted: 6 Feb 2019, 22:59:25 UTC - in response to Message 1979006.  

I looked at the 1060s, but they only have 1280 CUDA cores, while the GTX 690 is a dual GPU with 3072 CUDA cores and a 512-bit memory bus.

Some things to keep in mind for future purchases.

It's not just the number of cores, but the type of cores: there have been a lot of architectural improvements since the GTX 600 series. And even with a wider memory bus, dual GPUs on a single card tend to be at a disadvantage in memory bandwidth compared to a single GPU of the same type, even one with a narrower bus. (The memory bus on the GTX 690 is really 256-bit: one 256-bit bus for each GPU = 512-bit in marketing speak.) And once again, there have been considerable improvements in the years since the GTX 600 series came out.
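
To make the bus-width point concrete: peak memory bandwidth is just bus width (in bytes) times the effective transfer rate. A minimal sketch, using commonly published reference specs for these cards (the rates here are assumptions; double-check against your own cards):

    // bandwidth.cpp -- rough peak-bandwidth comparison (illustrative figures).
    // Peak GB/s = (bus width in bits / 8) * effective memory rate in GT/s.
    #include <cstdio>

    double peak_gbs(int bus_bits, double rate_gtps) {
        return bus_bits / 8.0 * rate_gtps;
    }

    int main() {
        // Assumed reference specs: GTX 680/690 run 6.0 GT/s GDDR5 on a
        // 256-bit bus per GPU; the GTX 1060 runs 8.0 GT/s on a 192-bit bus.
        std::printf("GTX 690, per GPU:           %6.1f GB/s\n", peak_gbs(256, 6.0));
        std::printf("GTX 690, 'marketing' total: %6.1f GB/s\n", peak_gbs(512, 6.0));
        std::printf("GTX 1060:                   %6.1f GB/s\n", peak_gbs(192, 8.0));
        return 0;
    }

Each GPU on a 690 only ever sees its own 256-bit bus, so a single task gets roughly 1060-class bandwidth despite the "512-bit" label.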

The GTX 690 is effectively two GTX 680s, but at GTX 670 clock speeds.

Looking at Shaggie's graphs, the GTX 1060 3GB puts out 400+ credits per hour (stock) at 120 W.
A GTX 680 does around 300, so a GTX 690 would be around 600, at 300 W.
That's with Windows on the SoG application. Running Linux and the Special Application, the GTX 1060 3GB would produce 3-4 times as much credit, for 120 W or less.
(The GTX 1050 Ti puts out the same amount of work as a GTX 680, but for less power. For the same power usage as one GTX 690, you could run four GTX 1050 Tis and put out more than double the work.)

Shaggie's graphs show the work produced for the power used; the GTX 680 is one of the poorest performers there (the GTX 690 would rate even lower). The GTX 1060 3GB is #10 in the top ten for efficiency (although it will eventually get bumped lower now that the RTX 2060 has been released).
A more recent card might cost a lot more up front to buy, but it will cost a lot, lot less to run.
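
To put numbers on that efficiency gap, here's a quick sketch that turns the figures quoted above into credits per hour per watt (the 1400-credit row is an assumption: the midpoint of the 3-4x special-app estimate):

    // efficiency.cpp -- credits per hour per watt, from the figures above.
    #include <cstdio>

    int main() {
        struct Card { const char* name; double credits_per_hr; double watts; };
        const Card cards[] = {
            {"GTX 1060 3GB, Windows SoG",            400.0, 120.0},
            {"GTX 690, Windows SoG",                 600.0, 300.0},
            {"GTX 1060 3GB, Linux special (~3.5x)", 1400.0, 120.0}, // assumed midpoint
        };
        for (const Card& c : cards)
            std::printf("%-38s %5.2f credits/hr/W\n",
                        c.name, c.credits_per_hr / c.watts);
        return 0;
    }

The 690 comes out around 2 credits/hr/W against the 1060's 3.3 on Windows, and the gap only widens under the Linux special app.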


Thanks, at first glance the 690 looked promising to me: its 256-bit bus per GPU was wider than the 192-bit bus on the 1060, so on the surface I thought 256-bit with 3072 CUDA cores would do a lot better. I didn't realize the older architecture would be that much of a hindrance. I will have to re-evaluate and devise a new plan :) HA!

J3P-0
Joined: 1 Dec 11
Posts: 45
Credit: 25,258,781
RAC: 180
United States
Message 1979011 - Posted: 6 Feb 2019, 23:01:11 UTC - in response to Message 1979007.  

For $120, you could have bought a GTX 1060 3GB, which would use ~100 W on SETI and still be faster than the 690, since it can use the latest special app.


And those prices, or lower, can be found on eBay. :)

I'm also seeing some pretty good prices on GTX 1070 Tis.

Tom


I looked on eBay, but I also had it in my mind that I wanted dual-GPU cards, lol. Maybe that was a bad plan, since the 690 is so old.

J3P-0
Joined: 1 Dec 11
Posts: 45
Credit: 25,258,781
RAC: 180
United States
Message 1979014 - Posted: 6 Feb 2019, 23:13:54 UTC - in response to Message 1979006.  

[...] Shaggie's graphs show the work produced for the power used; the GTX 680 is one of the poorest performers there (the GTX 690 would rate even lower). The GTX 1060 3GB is #10 in the top ten for efficiency (although it will eventually get bumped lower now that the RTX 2060 has been released).


I noticed the GTX 690 isn't even on the chart, and neither is the Titan Z. Is that because they count WUs on each GPU separately instead of combined?

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1979015 - Posted: 6 Feb 2019, 23:17:36 UTC - in response to Message 1979014.  

I noticed the GTX 690 isn't even on the chart, and neither is the Titan Z. Is that because they count WUs on each GPU separately instead of combined?

More likely there just aren't enough of them around (there has to be a minimum number of individual cards returning valid work for that model to make the charts).
Grant
Darwin NT

Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1979017 - Posted: 6 Feb 2019, 23:18:27 UTC - in response to Message 1979004.  


I gotta ask, Ian: are you really running 63 GPUs?

[63] NVIDIA GeForce GTX 1080 Ti (4095MB) driver: 410.66


No. That system has 7 GPUs (6x 1080 Ti, 1x 1080).
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours


Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1979019 - Posted: 6 Feb 2019, 23:18:35 UTC
Last modified: 6 Feb 2019, 23:21:21 UTC

The biggest problem with dual-GPU cards here is that they have a much shorter lifespan than single-GPU cards do, due to the amount of heat they have to dissipate.

Cheers.

J3P-0
Joined: 1 Dec 11
Posts: 45
Credit: 25,258,781
RAC: 180
United States
Message 1979020 - Posted: 6 Feb 2019, 23:21:41 UTC - in response to Message 1979017.  


I gotta ask, Ian: are you really running 63 GPUs?

[63] NVIDIA GeForce GTX 1080 Ti (4095MB) driver: 410.66


No. That system has 7 GPUs (6x 1080 Ti, 1x 1080).


Weird, under your account it states [63] NVIDIA for coprocessors.

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1979023 - Posted: 6 Feb 2019, 23:24:32 UTC - in response to Message 1979020.  

Weird, under your account it states [63] NVIDIA for coprocessors.

Some people have worked out how to get around the 100-WU server-side limit.
Grant
Darwin NT

J3P-0
Joined: 1 Dec 11
Posts: 45
Credit: 25,258,781
RAC: 180
United States
Message 1979027 - Posted: 6 Feb 2019, 23:40:36 UTC - in response to Message 1979023.  

Weird, under your account it states [63] NVIDIA for coprocessors.

Some people have worked out how to get around the 100-WU server-side limit.


Oh, meaning that they run more than one WU per GPU? Where would one go to figure this out? :)

Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1979029 - Posted: 6 Feb 2019, 23:44:47 UTC - in response to Message 1979027.  
Last modified: 6 Feb 2019, 23:45:36 UTC

Weird, under your account it states [63] NVIDIA for coprocessors.
Some people have worked out how to get around the 100-WU server-side limit.
Oh, meaning that they run more than one WU per GPU? Where would one go to figure this out? :)
No, that's not the reason; it's to make sure that enough work is on hand to get through server outages without running out of GPU work. ;-)

Cheers.

J3P-0
Joined: 1 Dec 11
Posts: 45
Credit: 25,258,781
RAC: 180
United States
Message 1979032 - Posted: 6 Feb 2019, 23:51:53 UTC - in response to Message 1979029.  
Last modified: 6 Feb 2019, 23:53:53 UTC

Weird, under your account it states [63] NVIDIA for coprocessors.
Some people have worked out how to get around the 100-WU server-side limit.
Oh, meaning that they run more than one WU per GPU? Where would one go to figure this out? :)
No, that's not the reason; it's to make sure that enough work is on hand to get through server outages without running out of GPU work. ;-)

Cheers.


Ah, gotcha. On Tuesdays I run out of work on my 1080; I can't imagine how fast 6 or 7 1080s would run out. So tricking the client into reporting more GPUs than you really have allows you to download more WUs to run?

Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1979034 - Posted: 6 Feb 2019, 23:58:27 UTC - in response to Message 1979032.  

[...] So tricking the client into reporting more GPUs than you really have allows you to download more WUs to run?
Exactly. :-)

Cheers.

juan BFP
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1979040 - Posted: 7 Feb 2019, 1:26:13 UTC - in response to Message 1979032.  
Last modified: 7 Feb 2019, 1:26:46 UTC

Ah, gotcha. On Tuesdays I run out of work on my 1080; I can't imagine how fast 6 or 7 1080s would run out.

Actually, the time to run out of work is approximately the same with one GPU or seven, since each GPU adds 100 more WUs to the limit:
a 1-GPU host can download 100 WUs, and a 7-GPU host can download 7x100. But a 1-GPU host crunches 1 WU at a time, and a 7-GPU host crunches 7,
so they empty their caches at approximately the same rate.
The real problem is that the Linux Special Sauce builds are so fast and well optimized that they can crunch a WU on a 1080 Ti in less than 60 seconds.
So 100 WUs hold out for around 100 minutes, and the outages normally take 4-6 hours; just do the math.
Just to clarify.
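
A minimal sketch of that math, assuming the numbers above (100 WUs per GPU, roughly 60 seconds per WU on a 1080 Ti with the special app):

    // cache_math.cpp -- how long a full cache lasts, independent of GPU count.
    #include <cstdio>

    int main() {
        const int wu_per_gpu = 100;      // server-side limit, per GPU
        const double secs_per_wu = 60.0; // ~1 min per WU on a 1080 Ti, special app
        const int configs[] = {1, 7};
        for (int gpus : configs) {
            double cache_wus = gpus * wu_per_gpu;    // total WUs on hand
            double wus_per_sec = gpus / secs_per_wu; // aggregate crunch rate
            std::printf("%d GPU(s): %4.0f WUs cached, lasting %.0f minutes\n",
                        gpus, cache_wus, cache_wus / wus_per_sec / 60.0);
        }
        // Both cases print 100 minutes -- far short of a 4-6 hour outage.
        return 0;
    }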

Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1979042 - Posted: 7 Feb 2019, 1:33:06 UTC - in response to Message 1979040.  

Actually, the time to run out of work is approximately the same with one GPU or seven, since each GPU adds 100 more WUs to the limit. [...]
So 100 WUs hold out for around 100 minutes, and the outages normally take 4-6 hours; just do the math.


Yup, all things being equal (using the same apps), more GPUs won't drain the cache faster, since the cache gets proportionally larger.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours


juan BFP
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1979045 - Posted: 7 Feb 2019, 1:53:25 UTC
Last modified: 7 Feb 2019, 2:04:00 UTC

About the 690.

A few years ago I used to run a fleet of about 8 hosts, several of them with 2, 3, or even 4 690s per host.
At the time they were some of the top SETI crunchers, but things have changed.
The Linux Special Sauce builds changed everything.

If you want to squeeze all you can from your hosts, this is what you could do:

Move the 690s to your Windows hosts, where they can only run the OpenCL builds.
Then be sure to leave 1 CPU core free for each GPU (2 per 690).
Search for the optimized parameters for that GPU (I don't have them anymore), or ask Mike for some help.
If you can't find them, PM me and I'll try to look through my old messages for what I used at the time.

For your Linux boxes, buy the best GPU you can afford with a minimum CUDA compute capability of 5.0.
If you can, something like a 1060 or better is the best choice.
There are good bargains on the top 10x0-series cards on eBay. Look especially at the 1070 (Ti or not); they have some of the best cost-to-power-to-output ratios.
Obviously the RTX 20x0 cards are superior crunchers, but their cost is superior too.
If your host can power a 690 now (which is power hungry), I'm sure it can power any top GPU available on the market, so don't worry about that.
Some will suggest the 750 Ti, but those are relatively old GPUs now; some of the newer builds may not work on them in the coming years.
Install the Linux Special Sauce builds and enjoy their amazing crunching speeds.

My 0.02
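
If you're not sure whether a card you already own clears that compute capability 5.0 floor, here's a minimal sketch using the CUDA runtime API (assumes the CUDA toolkit is installed; build with nvcc):

    // check_cc.cpp -- list each GPU's compute capability and flag anything
    // below the 5.0 minimum mentioned above.
    // Build: nvcc check_cc.cpp -o check_cc
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int n = 0;
        if (cudaGetDeviceCount(&n) != cudaSuccess || n == 0) {
            std::printf("No CUDA-capable GPU detected.\n");
            return 1;
        }
        for (int i = 0; i < n; ++i) {
            cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, i);
            std::printf("GPU %d: %s, compute capability %d.%d -> %s\n",
                        i, prop.name, prop.major, prop.minor,
                        prop.major >= 5 ? "OK for the special app" : "too old");
        }
        return 0;
    }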

Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1979046 - Posted: 7 Feb 2019, 1:54:48 UTC - in response to Message 1979042.  

Based on Vyper's 2080 Ti host and the current mix of work, tasks finish in around 40 seconds without -nobs.

So that's around an hour for 100 tasks on that GPU.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)

J3P-0
Joined: 1 Dec 11
Posts: 45
Credit: 25,258,781
RAC: 180
United States
Message 1979148 - Posted: 7 Feb 2019, 18:19:31 UTC - in response to Message 1979042.  
Last modified: 7 Feb 2019, 18:20:06 UTC


Actually, the time to run out of work is approximately the same with one GPU or seven, since each GPU adds 100 more WUs to the limit. [...]

Yup, all things being equal (using the same apps), more GPUs won't drain the cache faster, since the cache gets proportionally larger.

So if I have 7 GPUs (7x100 WUs), but I'm able to trick the client into thinking I have 65 GPUs (65x100 WUs) while really only having 7, I can download 65x100 instead of 7x100, thus having a bigger cache of WUs to work from ... correct?

Please tell me how to enable this magic sorcery :)

J3P-0
Joined: 1 Dec 11
Posts: 45
Credit: 25,258,781
RAC: 180
United States
Message 1979154 - Posted: 7 Feb 2019, 18:31:16 UTC - in response to Message 1979045.  

About the 690.

A few years ago I used to run a fleet of about 8 hosts, several of them with 2, 3, or even 4 690s per host. [...] Install the Linux Special Sauce builds and enjoy their amazing crunching speeds.

My 0.02


Thanks. Unfortunately I'm going to give up on the 690s and switch to something others have suggested, like the 1060s, since they support the Special App and perform far better. I've really liked the concept of dual-GPU cards ever since I saw an old ATI quad-GPU demo card in the early 2000s, even before SLI and Crossfire came out.

Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1979158 - Posted: 7 Feb 2019, 19:07:03 UTC - in response to Message 1979148.  

Short answer: yes.

But you need to trick the project servers at Berkeley, not the app. Also, there is a maximum GPU count of 64, due to memory allocation issues.

You have to edit the BOINC source code and compile a custom version of the BOINC client to do it.
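
For illustration only, here's a self-contained mock of the idea. The real change is a tiny edit where the client stores the detected GPU count (the count field of BOINC's COPROC structure in lib/coproc.h is what ends up in the scheduler request), but exact names and code paths vary between BOINC versions, so treat this as a sketch rather than a working patch:

    // spoof_mock.cpp -- mock of the concept only; NOT a drop-in BOINC patch.
    // Names below are illustrative, not BOINC's actual code paths.
    #include <cstdio>

    struct Coproc {
        int count; // number of GPUs reported to the scheduler
    };

    int main() {
        Coproc nvidia;
        nvidia.count = 7;  // what GPU detection actually found
        nvidia.count = 64; // the spoof: report the 64-GPU ceiling instead
        std::printf("Reported GPUs: %d -> server allows up to %d GPU tasks\n",
                    nvidia.count, nvidia.count * 100);
        return 0;
    }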
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours
