Oddness with Kepler Vid Cards

Message boards : Number crunching : Oddness with Kepler Vid Cards

Cruncher-American (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1407288 - Posted: 24 Aug 2013, 13:32:45 UTC
Last modified: 24 Aug 2013, 13:33:32 UTC

On my machine with an i7-3820, I have been running an MSI GTX 680 Twin Frozr III 2GB card (factory OC to 1059/1124 boost). According to GPU-Z and EVGA Precision, it has been running at 1149MHz - slightly beyond the boost frequency. That's the first oddity.

Due to some buying and selling on Craigslist, I had enough surplus $$$ to add an ASUS GTX 660 Ti 3GB card (TI-DC2OC-3GD5), which is factory overclocked at 1006/1085 boost. It is running at 1215MHz (!!!!). So that's the second oddity.

More: how can the 660Ti be faster than the 680? (they both use the same GK104 chip)
ID: 1407288
spitfire_mk_2
Joined: 14 Apr 00
Posts: 563
Credit: 27,306,885
RAC: 0
United States
Message 1407301 - Posted: 24 Aug 2013, 14:41:10 UTC

Just off the top, a lot of third-party GPU apps are actually based on GPU-Z. I don't remember if the EVGA app is one of them, but it might be. So the fact that GPU-Z and the EVGA app report the same thing is not a surprise, because you are basically using GPU-Z in both cases.

Another GPU-Z-based app is NVIDIA Inspector, but it has overclocking/underclocking options. I would try it to set the clocks where they are supposed to be.
ID: 1407301
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1407307 - Posted: 24 Aug 2013, 14:46:48 UTC - in response to Message 1407288.  
Last modified: 24 Aug 2013, 14:50:39 UTC

More: how can the 660Ti be faster than the 680? (they both use the same GK104 chip)

I'm not joking: what you suggest is like a compact car running faster than a muscle car. It is not just the chipset or the clock that makes a GPU run faster.
ID: 1407307
Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1407308 - Posted: 24 Aug 2013, 14:47:01 UTC - in response to Message 1407301.  

By setting GPU clocks manually on a Kepler, won't you be defeating the purpose of GPU Boost?
ID: 1407308
Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1407311 - Posted: 24 Aug 2013, 14:50:24 UTC - in response to Message 1407307.  

More: how can the 660Ti be faster than the 680? (they both use the same GK104 chip)

If you discover a way, please share it with us... I'm not joking, what you suggest is making a compact car run as fast as a muscle car; it is just not the chipset that makes the GPU run faster.

I'm sure the rev counter in the compact car will reach higher RPM than the rev counter in a truck - but the truck will still shift more goods.
ID: 1407311
Cruncher-American (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1407320 - Posted: 24 Aug 2013, 15:19:38 UTC - in response to Message 1407311.  

I'm sure the rev counter in the compact car will reach higher RPM than the rev counter in a truck - but the truck will still shift more goods.


In this case, both "vehicles" have the same motor, so I think this comment is off. And the 660Ti also has a faster memory clock, by about 10%.

ID: 1407320
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1407323 - Posted: 24 Aug 2013, 15:34:31 UTC - in response to Message 1407320.  
Last modified: 24 Aug 2013, 15:37:07 UTC

I'm sure the rev counter in the compact car will reach higher RPM than the rev counter in a truck - but the truck will still shift more goods.


In this case, both "vehicles" have the same motor, so I think this comment is off. And the 660Ti also has a faster memory clock, by about 10%.

Not sure what brand of GPU you have... from the EVGA site:

NVIDIA GTX 680
1536 CUDA Cores
1097 MHz Base Clock
1163 MHz Boost Clock
140GT/s Texture Fill Rate

Memory

2048 MB, 256 bit GDDR5
6208 MHz (effective)
198.66 GB/s Memory Bandwidth

NVIDIA GTX 660 Ti
1344 CUDA Cores
915 MHz Base Clock
980 MHz Boost Clock
102.5GT/s Texture Fill Rate

Memory

2048 MB, 192 bit GDDR5
6008 MHz (effective)
144.19 GB/s Memory Bandwidth

The chip (motor) could be the same, but the rest is clearly not.
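As a side note on the specs above, the quoted memory bandwidth figures can be reproduced from the bus width and the effective memory clock. The following is a minimal Python sketch, assuming the usual peak-bandwidth formula (bus width in bytes multiplied by the effective transfer rate); the helper name `memory_bandwidth_gbs` is hypothetical and the inputs are simply the numbers listed above.

```python
def memory_bandwidth_gbs(bus_width_bits: int, effective_clock_mhz: float) -> float:
    """Peak memory bandwidth in GB/s, taking 1 GB as 10**9 bytes."""
    bytes_per_transfer = bus_width_bits / 8          # bus width in bytes
    transfers_per_second = effective_clock_mhz * 1e6  # effective MHz -> transfers/s
    return bytes_per_transfer * transfers_per_second / 1e9


# Figures taken from the spec lists in this post.
gtx680 = memory_bandwidth_gbs(256, 6208)
gtx660ti = memory_bandwidth_gbs(192, 6008)

print(f"GTX 680:    {gtx680:.2f} GB/s")
print(f"GTX 660 Ti: {gtx660ti:.2f} GB/s")
print(f"680 advantage: {100 * (gtx680 / gtx660ti - 1):.0f}%")
```

Under that assumption the narrower 192-bit bus accounts for almost all of the bandwidth gap, since the two memory clocks differ by only about 3%.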
ID: 1407323
Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1407324 - Posted: 24 Aug 2013, 15:39:52 UTC - in response to Message 1407323.  
Last modified: 24 Aug 2013, 15:41:44 UTC

The GTX 680 is a dual card, so 3072 CUDA cores in total.

To continue the rather laboured motoring analogy, it's likely that the 'truck' is fitted with a rev limiter, to avoid it melting (a) itself, and (b) the computer's PSU, with the extra power needed by all those cores.

Edit - argh, slipped a line when reading http://en.wikipedia.org/wiki/Comparison_of_Nvidia_graphics_processing_units#GeForce_600_Series - that was the GTX 690.

Forget I spoke.
ID: 1407324
Cruncher-American (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1407332 - Posted: 24 Aug 2013, 16:08:00 UTC

Juan - the 660Ti is an ASUS card - it was given in the original post.
ID: 1407332
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1407337 - Posted: 24 Aug 2013, 16:22:09 UTC - in response to Message 1407332.  
Last modified: 24 Aug 2013, 16:22:47 UTC

Juan - the 660Ti is an ASUS card - it was given in the original post.

My mistake, but if you look at the ASUS and MSI sites you will see the same specs apply. Even if they have the same main chip, the 680 has more cores (1536 vs 1344), a 256-bit bus and higher memory bandwidth, something very important when you crunch.
ID: 1407337
tbret
Volunteer tester
Joined: 28 May 99
Posts: 3380
Credit: 296,162,071
RAC: 40
United States
Message 1407395 - Posted: 24 Aug 2013, 19:03:20 UTC - in response to Message 1407288.  

On my machine with an i7-3820, I have been running an MSI GTX 680 Twin Frozr III 2GB card (factory OC to 1059/1124 boost). According to GPU-Z and EVGA Precision, it has been running at 1149MHz - slightly beyond the boost frequency. That's the first oddity.

Due to some buying and selling on Craigslist, I had enough surplus $$$ to add an ASUS GTX 660 Ti 3GB card (TI-DC2OC-3GD5), which is factory overclocked at 1006/1085 boost. It is running at 1215MHz (!!!!). So that's the second oddity.

More: how can the 660Ti be faster than the 680? (they both use the same GK104 chip)


I do not have a 680. I have three 670s and four 660Tis.

All of my cards run beyond their stated "boost clock" when I use Precision X and increase the power target to its maximum ***and*** keep them cool. I do not have to fiddle with the core clock or memory speed, just increase the power target.

The memory bus of the GTX 680 is wider than the 660Ti's (256-bit vs 192-bit), so more data is moved per clock on the 680. Perhaps that is the bottleneck which keeps the manufacturer from increasing the clock speed on the 680.

I have several "lesser" cards which have faster clocks than their "big brothers."

My 550Ti has a higher clock than one of my 560s, and I have 460s with higher clocks than 470s. I have 560Tis with higher clocks than my one 560Ti-448, and two of my 670s have the same base clock but "boost" over their specs and differently from each other. (All EVGA cards.)

In other words, I don't think you have a problem.
ID: 1407395
Cruncher-American (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1407415 - Posted: 24 Aug 2013, 19:55:58 UTC - in response to Message 1407395.  
Last modified: 24 Aug 2013, 20:03:14 UTC

tbret - thanks for the info.

I really didn't think of it as a problem, was just noting the apparent discrepancies in the context of the 660Ti being a significantly cheaper card than the 680. I just hadn't expected it to be faster at computing than the 680, which I know is a better card for gaming because of its higher bandwidth.

But crunching isn't as bandwidth-bound as gaming, so it appears the 660Ti is not only better for SETI, but cheaper, as well!

EDIT: I did nothing with EVGA Precision to do any further overclocking; I was just running it to get the info on speeds, loads and temps. That's why the higher-than-expected clocks were surprising to me. /EDIT.
ID: 1407415
MonChrMe
Joined: 9 Jun 13
Posts: 23
Credit: 113,889
RAC: 0
United Kingdom
Message 1407463 - Posted: 24 Aug 2013, 22:08:14 UTC - in response to Message 1407415.  
Last modified: 24 Aug 2013, 22:09:30 UTC

If you load them both up to ~100% the 680 should still be a bit faster than the 660Ti as it has more cores. Not by much mind.

1536 cores * 1149 MHz -> 1.76m
1344 cores * 1215 MHz -> 1.63m

If you're not loading them to ~100% the Ti will win out, but only because cores are going unused.
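The estimate above can be checked with a few lines of Python. This is only a rough sketch of the same cores-times-clock figure, under the assumption that both cards are fully loaded and that throughput scales with that product; it ignores memory bandwidth, scheduling and everything else, and the function name `raw_throughput` is made up for illustration.

```python
def raw_throughput(cores: int, clock_mhz: int) -> int:
    """Crude throughput proxy: CUDA cores multiplied by the observed clock in MHz."""
    return cores * clock_mhz


# Core counts and observed clocks as reported in this thread.
gtx680 = raw_throughput(1536, 1149)
gtx660ti = raw_throughput(1344, 1215)
ratio = gtx680 / gtx660ti

print(f"GTX 680:    {gtx680 / 1e6:.2f}m core-MHz")
print(f"GTX 660 Ti: {gtx660ti / 1e6:.2f}m core-MHz")
print(f"680 / 660 Ti = {ratio:.3f}")
```

By this measure the 680 comes out roughly 8% ahead, which matches "a bit faster, not by much".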

Ti's have always been the best bang-for-the-buck - the 560Ti was pretty close to the 580's performance as well. It helps that the card vendors seem to realise this and put some pretty crazy cooling solutions on the Ti cards.

Speaking of which... I'm going to need an upgrade soon. :(
ID: 1407463
Cruncher-American (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1407484 - Posted: 25 Aug 2013, 0:32:26 UTC - in response to Message 1407463.  

If you load them both up to ~100% the 680 should still be a bit faster than the 660Ti as it has more cores. Not by much mind.

1536 cores * 1149 MHz -> 1.76m
1344 cores * 1215 MHz -> 1.63m

If you're not loading them to ~100% the Ti will win out, but only because cores are going unused.

Ti's have always been the best bang-for-the-buck - the 560Ti was pretty close to the 580's performance as well. It helps that the card vendors seem to realise this and put some pretty crazy cooling solutions on the Ti cards.

Speaking of which... I'm going to need an upgrade soon. :(


I think that's good analysis. Both are running 3 WUs at once, and around 96-99% GPU usage. I don't know how the SETI 7 app uses the cores, though, so overall efficiency is kind of up in the air. But they appear to be (as you show) within a few % of each other. My remarks were just about the surprise of the cheaper card running at a higher MHz.
ID: 1407484
tbret
Volunteer tester
Joined: 28 May 99
Posts: 3380
Credit: 296,162,071
RAC: 40
United States
Message 1407498 - Posted: 25 Aug 2013, 3:38:05 UTC - in response to Message 1407415.  
Last modified: 25 Aug 2013, 3:51:18 UTC



I just hadn't expected it to be faster at computing than the 680



To be clear, the 660Ti has a higher clock, but it is no match for a GTX 680 in crunching. If you find your 660Ti is out-crunching your GTX 680, you need to do something differently.




But crunching isn't as bandwidth-bound as gaming, so it appears the 660Ti is not only better for SETI, but cheaper, as well!



I'm not meaning to repeat myself or be "smart" or argumentative but what you just said is not correct. A GTX 680 will run rings around a 660Ti when crunching.

I've just looked at your valid task list. What is it that has led you to believe the 660Ti is faster than the 680? That's not a challenge to your observation; I'm curious to know what you are looking at.

Oh, I also wonder (again; not a challenge or a doubt; just looking for information) what made you settle on three-at-a-time tasks on your cards? (I still have questions about what the optimum number might be for myself.)

My systems are very different than yours, so what is best on mine and yours may not be the same thing. I tried running two and found it was doing me no good. I wonder if I just didn't go far enough and that if I tried three it might be better. I was running 3 at a time when we were crunching v6.

EDIT: I'm sorry, I'm tired. Running more than one AP at a time did me no good, I run two MB at a time.
ID: 1407498
MonChrMe
Joined: 9 Jun 13
Posts: 23
Credit: 113,889
RAC: 0
United Kingdom
Message 1407499 - Posted: 25 Aug 2013, 3:47:11 UTC - in response to Message 1407484.  
Last modified: 25 Aug 2013, 4:06:01 UTC

My turn to be surprised; you're hitting ~100% with just 3 wu's?

A normal wu runs at 60% on my 560ti, so I run 2 to maximise. I assumed that with all the extra cores on a 660ti (384 vs 1344) that you'd be running 6 or 7 wu's to hit full load.

Guess there's a bigger difference between Fermi and Kepler boards than I realised.



Anyway, the Ti running a similar clock rate sounds about right. Fewer cores and the narrower memory bus mean power consumption (hence heat production) should be a little lower on the Ti. Assuming both coolers have the same capacity (they're both pretty beefy), you'd expect to be able to push the Ti higher before reaching that capacity.
ID: 1407499
tbret
Volunteer tester
Joined: 28 May 99
Posts: 3380
Credit: 296,162,071
RAC: 40
United States
Message 1407501 - Posted: 25 Aug 2013, 4:01:22 UTC - in response to Message 1407499.  



A normal wu runs at 60% on my 560ti, so I run 2 to maximise. I assumed that with all the extra cores on a 660ti (384 vs 1344) that you'd be running 6 or 7 wu's to hit full load.



I'm just talking because you bring-up something interesting.

My 560Tis have always out-produced my 660Tis on SETI.

The 660Tis stomp the 560Tis into the dust at GPUGRID (I believe that's the 2GB vs 1GB of VRAM).

It only takes two at a time (SETI MB) to send my 660Ti cards into 95+% GPU usage.
ID: 1407501
MonChrMe
Joined: 9 Jun 13
Posts: 23
Credit: 113,889
RAC: 0
United Kingdom
Message 1407505 - Posted: 25 Aug 2013, 4:18:14 UTC - in response to Message 1407501.  

Originally posted this as an edit; separated it out for readability.

You can use Nvidia Inspector to monitor the GPU utilisation while running GPU tasks. Stop all CPU tasks, run a GPU WU on its own, and watch your CPU & GPU utilisation while the WU runs.

That lets you figure out how many WUs you can run before hitting 100%. It also lets you see if you need to increase the CPU allocation to feed the GPU tasks properly (I had to; the CPU's an older Phenom II).

For example, if your GPU utilisation is 40% you can run 2 WU on the GPU at full speed, or 3 at a reduced speed. The 3 will take longer to complete individually, but should still let you get more units done in a day by using resources that would be idle otherwise.
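The rule of thumb in that example can be written down as a tiny Python sketch. It assumes, as the example does, that the utilisation of concurrent tasks simply adds up, which is only an approximation; the function names `max_full_speed_tasks` and `demand_pct` are invented for illustration and are not part of any BOINC tool.

```python
import math


def max_full_speed_tasks(single_task_util_pct: float) -> int:
    """Largest number of concurrent tasks whose summed utilisation stays at or below 100%."""
    if not 0 < single_task_util_pct <= 100:
        raise ValueError("utilisation must be in the range (0, 100]")
    return math.floor(100 / single_task_util_pct)


def demand_pct(tasks: int, single_task_util_pct: float) -> float:
    """Total GPU demand if utilisation added up linearly; above 100 means oversubscribed."""
    return tasks * single_task_util_pct


util = 40.0  # measured with one WU running alone
full_speed = max_full_speed_tasks(util)

print(f"{full_speed} WUs fit at full speed ({demand_pct(full_speed, util):.0f}% demand)")
print(f"{full_speed + 1} WUs would ask for {demand_pct(full_speed + 1, util):.0f}%, so each runs slower")
```

With a 40% single-task reading this gives 2 tasks at full speed and 120% demand for 3, which is the "3 at a reduced speed" case: each task slows down, but the otherwise idle 20% gets used.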

That test would also tell you that you need to increase the CPU allocation for each task to keep the GPU fed - otherwise the CPU WUs might slow your GPU down.

In my case I ended up going with 2 WUs on the GPU, with half a CPU core allocated to each, plus 3 CPU tasks. The CPU's an older Phenom II, and runs at 90% overall utilisation configured like that.



Interesting that the 660's show similar utilisation to the 560's despite having 3 times the cores. Perhaps the Seti WU's are being limited by something else on the board? The 560's and 660's have similar ROP counts, for example? The keplers are supposed to have simplified schedulers (some of it's offloaded to the CPU), could that be the reason?
ID: 1407505
Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1407567 - Posted: 25 Aug 2013, 9:21:09 UTC - in response to Message 1407505.  


Interesting that the 660's show similar utilisation to the 560's despite having 3 times the cores. Perhaps the Seti WU's are being limited by something else on the board? The 560's and 660's have similar ROP counts, for example? The keplers are supposed to have simplified schedulers (some of it's offloaded to the CPU), could that be the reason?

I've found with my GTX660s that running just 2 MB workunits created much lower GPU memory controller loads (let alone GPU loads) than on my GTX550Tis/560Ti (the GTX560 Ti no longer crunches here because of the need to constantly monitor and adjust for thermal limitations), but running 3 MB on the 660s produced the same memory load as the 550/560s while using similar GPU loads.

For comparison, my 2500K rig would need an extra GTX550Ti installed (if I could get one new here now without paying a ridiculous price, I would) to keep up with my 3570K's dual GTX660s (if this helps someone, then OK, but remember that YMMV depending on hardware platform, I guess).

But if I had (or could afford) a GTX670/680 I'd be very tempted to try 4 MB's on it (sorry but I only do AP's on CPU here, my GPU's just do MB), but running 2 of those cards x4 MB's I'd likely need to reserve a CPU core to feed them without having a system impact (I had this happen with my old Q6600 @ 3GHz feeding those 2 GTX660's 3 w/u's each so I had to reserve on that CPU).

BTW my GTX660s also run above the boost specs, but then they are reported to be using only 85-88% of their TDP (the 2nd card does run slightly faster than the 1st even though it is driving its own monitor and using the same amount of power as the 1st card, but it also runs cooler, which may account for that difference).

Cheers.
ID: 1407567
Michael W.F. Miles
Joined: 24 Mar 07
Posts: 268
Credit: 34,410,870
RAC: 0
Canada
Message 1408463 - Posted: 27 Aug 2013, 20:18:21 UTC

I have noticed a huge increase in GPU utilization, especially with v7 GPU tasks (x41zc).
I used to have to run 2 at a time on my Zotac GTX 460 to get 90% utilization, but now with the new v7, 1 at a time gives me the same utilization.

AP on the 460 crunching 604 tasks is the same. High GPU utilization.

For the original poster: try using SIV (System Information Viewer) by Red Ray and see what it reports for the speed of your GPU.

It reports what the speeds really are.
I use it exclusively now for all information and GPU overclocking, and it works very well.

http://rh-software.com/


ID: 1408463
