How does OCing affect CUDA?



Message boards : Number crunching : How does OCing affect CUDA?

Author Message
Profile Voyager
Volunteer tester
Joined: 2 Nov 99
Posts: 602
Credit: 3,264,813
RAC: 0
United States
Message 885022 - Posted: 13 Apr 2009, 18:24:45 UTC

Does processor speed affect CUDA processing, other than the time it takes to load the card? Will CUDA be about the same at 3 GHz vs. 2 GHz?

Profile Dywanik
Joined: 16 Mar 02
Posts: 29
Credit: 1,857,577
RAC: 0
Message 885040 - Posted: 13 Apr 2009, 19:02:25 UTC
Last modified: 13 Apr 2009, 19:08:42 UTC

That's a very good question. I don't have a CUDA card to test it, but I believe the answer is: it depends. ;-) It depends on how you're overclocking your CPU. You usually do it by changing the FSB and the multiplier; e.g., you can get 2000 MHz either with a 200 MHz FSB and a multiplier of 10, or with a 100 MHz FSB and a multiplier of 20. And here is the point: when you raise the FSB you're overclocking your whole computer, so CUDA should be more efficient in the first case.
BUT the difference won't be significant unless you're an o/c pro who benchmarks everything. ;-)
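
The FSB/multiplier arithmetic above can be written out as a tiny Python sketch. The helper name `cpu_clock_mhz` is made up for illustration; the only thing it encodes is that the core clock is the bus clock times the multiplier.

```python
def cpu_clock_mhz(fsb_mhz: float, multiplier: float) -> float:
    """Core clock is the bus (FSB) clock times the CPU multiplier."""
    return fsb_mhz * multiplier


# Two ways to reach the same 2000 MHz core clock, as in the post.
high_fsb = cpu_clock_mhz(200, 10)  # the bus, and everything hanging off it, runs faster
low_fsb = cpu_clock_mhz(100, 20)   # same core clock, slower bus
print(high_fsb, low_fsb)
```

Both settings give the same core clock; they differ only in how fast the rest of the system (memory, and possibly the link to the GPU) is driven.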
____________
"Failure is not an option."
Gene Kranz, Apollo 13 Flight Director

"Be the change you want to see in the World"
Mahatma Gandhi

My web-page:
www.dywanik.eu

Profile ignorance is no excuse
Joined: 4 Oct 00
Posts: 9529
Credit: 44,433,274
RAC: 0
Korea, North
Message 885042 - Posted: 13 Apr 2009, 19:03:55 UTC

I'd think that the mobo would be the slow point for the GPU, but I doubt that the difference would be that significant. Most likely the CPU time would be about 1/3 slower, but we are talking less than 120 seconds overall. I doubt that the performance is going to take that big of a hit.
____________
In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope

End terrorism by building a school

Profile [seti.international] Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7018
Credit: 59,157,887
RAC: 20,639
Germany
Message 885055 - Posted: 13 Apr 2009, 19:34:19 UTC
Last modified: 13 Apr 2009, 19:42:45 UTC


When a CUDA task starts, one CPU core is loaded at 100 %.
During this phase an OCed CPU/RAM/mobo would help to reduce the processing time.

[With stock CPU/mobo: ~ 12 sec. with Raistmer's V10 CUDA app, on my rig]
I have now OCed my AMD Phenom II X4 940 BE from stock 3.0 GHz to 3.2 GHz [only by changing the multiplier], and the 100 % CPU-core load phase of a CUDA task is now shorter (by ~ 1 sec. or so).

After this 100 % CPU-core load phase, an OCed CPU/RAM/mobo wouldn't help to reduce the GPU calculation time. [The CPU core supports the GPU with around 24 % load (with ups and downs) during the GPU calculation time, peaking at 48 % - on my rig]
[I'm guessing.. - so far I haven't had time to look at the whole calculation time of the CUDA tasks]

OTOH, it would also help to OC the GPU to reduce the calculation time on the GPU.. :-)

____________
BR



>Das Deutsche Cafe. The German Cafe.<

Profile ignorance is no excuse
Joined: 4 Oct 00
Posts: 9529
Credit: 44,433,274
RAC: 0
Korea, North
Message 885060 - Posted: 13 Apr 2009, 19:42:05 UTC

I'd think OCing the GPU would reduce processing time more. The problem is that the WUs heat the GPU greatly, so OCing might just push some GPUs over the thermal edge.
____________
In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope

End terrorism by building a school

Profile [seti.international] Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7018
Credit: 59,157,887
RAC: 20,639
Germany
Message 885064 - Posted: 13 Apr 2009, 19:50:41 UTC - in response to Message 885060.

I'd think OCing the GPU would reduce processing time more. The problem is that the WUs heat the GPU greatly, so OCing might just push some GPUs over the thermal edge.


I read about.. ~ 100 MHz more [GPU core or shader, it wasn't specified] giving ~ +1,000 RAC..


If the GPU fan is on AUTO.. and at 100 % RPM all the time - I would take a second look at my GPU OC.. ;-)

____________
BR



>Das Deutsche Cafe. The German Cafe.<

slozomby
Joined: 16 Nov 04
Posts: 20
Credit: 187,273
RAC: 0
United States
Message 885124 - Posted: 13 Apr 2009, 22:30:16 UTC
Last modified: 13 Apr 2009, 22:33:09 UTC

Using my low-end card (8400 GS) as a test, since any change there should be apparent:


Changing the clock speed from the default 440 MHz to 500 MHz did not noticeably decrease average processing time.

Moving the 8400 GS between machines, it always comes back to about 2.5 hours per WU.

Shader count appears to be the single biggest factor in WU processing speed.
____________

Zydor
Joined: 4 Oct 03
Posts: 172
Credit: 491,111
RAC: 0
United Kingdom
Message 885189 - Posted: 14 Apr 2009, 1:02:12 UTC - in response to Message 885022.
Last modified: 14 Apr 2009, 1:04:10 UTC

I have a Phenom II 940 BE with 4 GB RAM and an EN9800GTX+ DK 512 MB.

With this setup I have found a significant difference in WU processing when O/C'ing the GPU. Broadly speaking, the speed of processing appeared to increase in line with the increase in Shader clock; the latter is by far the dominant factor in the speed increase.

Currently it runs at:
Engine: 776 MHz
Shader: 1925 MHz
Memory (stock speed): 1100x2 MHz.

Default shader for the card is 1800, so it's set to a 7% speed increase overall, and that was reflected in the improvement over the old crunch time for a WU on the GPU.

The GPU runs at 85% fan speed and settles at a constant 63 degrees on full load. It's in a cramped mid-tower case with a four-disk RAID in there, so it will run slightly hotter than in a full tower, where it would probably run at around 60/61 degrees. The fan is a stock fan. CUDA WU time is around 16.5 mins on average with Raistmer's V11 Opti App, which in real terms is around eight extra WUs a day due to the O/C if used 7x24.

When O/C'ing a GPU it's important to increase both the Engine and the Shader by the same proportion. The speed of the card's memory, as such, is not relevant; keep memory at stock speed. Most GPUs will go to about an 8-9% O/C before heat issues cause compute errors; I've seen much bigger increases than that on a friend's machine with a water-cooled card. A 7-8% increase on air cooling seems to be safe for most cards. Raise the engine and shader step by step (say 1% at a time) until you get the first compute error, then back off 2% and you will be dead safe. The only remaining question will be card temperature, as that varies hugely depending on room ambient temperature, the case used, air flow around cable runs, etc.
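
The step-by-step procedure above can be sketched in a few lines of Python. This is only an illustration: `oc_steps` and `safe_pct` are hypothetical helper names, the 1800 MHz shader clock is the default quoted in this post, and the 725 MHz engine clock is a placeholder rather than a confirmed stock value. The code only computes candidate clocks; it does not talk to any driver or tuning tool.

```python
def oc_steps(engine_mhz, shader_mhz, max_pct=9):
    """Candidate (percent, engine, shader) settings, both clocks raised by the same percentage."""
    return [
        (pct,
         round(engine_mhz * (100 + pct) / 100),
         round(shader_mhz * (100 + pct) / 100))
        for pct in range(1, max_pct + 1)
    ]


def safe_pct(first_error_pct, backoff_pct=2):
    """Back off a couple of percent from the step that gave the first compute error."""
    return max(0, first_error_pct - backoff_pct)


# Placeholder engine clock (assumption) and the default shader clock from the post.
for pct, engine, shader in oc_steps(725, 1800):
    print(f"+{pct}%: engine {engine} MHz, shader {shader} MHz")

# If the first compute error showed up at the +9% step, settle on +7%.
print(safe_pct(9))
```

Each step would still need a real crunching run (and a look at the temperature) before moving on to the next one.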

Anyone wishing to pursue this further should nip over to GPUGRID; there is a long thread there dealing with settings and speeds of different NVIDIA card types. It's a good starter for ten when first looking at the topic.

Regards
Zy
____________

elgar
Joined: 21 May 99
Posts: 69
Credit: 2,687,478
RAC: 0
United States
Message 885200 - Posted: 14 Apr 2009, 1:47:21 UTC - in response to Message 885189.

Overclocking my 260 reduced CUDA time to ~10 minutes from ~13 min (600 MHz to 715 MHz). Overclocking the CPU by 500 MHz didn't have any effect on computation time.

Now if I could just get some CUDA tasks on 6.6.20...

Profile popandbob
Volunteer tester
Joined: 19 Mar 05
Posts: 535
Credit: 1,896,421
RAC: 0
Canada
Message 885348 - Posted: 14 Apr 2009, 20:53:59 UTC - in response to Message 885200.

Overclocking my 260 reduced CUDA time to ~10 minutes from ~13 min (600 MHz to 715 MHz). Overclocking the CPU by 500 MHz didn't have any effect on computation time.

Now if I could just get some CUDA tasks on 6.6.20...


If you were to up the shaders and memory a bit, you would see a lower time yet :)

I was seeing times of 10 min with 700/1550/1050
I couldn't push the core any higher but upping the shaders and memory added an extra boost.
____________


Do you Good Search for Seti@Home? http://www.goodsearch.com/?charityid=888957
Or Good Shop? http://www.goodshop.com/?charityid=888957

Profile Voyager
Volunteer tester
Joined: 2 Nov 99
Posts: 602
Credit: 3,264,813
RAC: 0
United States
Message 885388 - Posted: 14 Apr 2009, 23:08:32 UTC

I tried OCing the GPU. Upped it by 16%; times went from ~18 min. to ~16 min. Cool!
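
To put those times in perspective, here is a rough Python sketch of the arithmetic. It assumes the GPU crunches around the clock and that the quoted per-WU times are representative; the function names are made up for illustration.

```python
MINUTES_PER_DAY = 24 * 60


def speedup(old_minutes, new_minutes):
    """How many times faster a WU finishes after the overclock."""
    return old_minutes / new_minutes


def extra_wus_per_day(old_minutes, new_minutes):
    """Additional WUs per day, assuming the GPU crunches 24 hours a day."""
    return MINUTES_PER_DAY / new_minutes - MINUTES_PER_DAY / old_minutes


print(speedup(18, 16))            # 1.125
print(extra_wus_per_day(18, 16))  # 10.0
```

So a 16% clock increase bought roughly a 12.5% speedup here, which would be about ten more WUs per day under these assumptions.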

Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3368
Credit: 46,055,222
RAC: 32,196
Russia
Message 885493 - Posted: 15 Apr 2009, 8:22:27 UTC - in response to Message 885388.

I tried OCing the GPU. Upped it by 16%; times went from ~18 min. to ~16 min. Cool!


Yes, GPU processing times can be improved by OCing the GPU. But when you do such OCing, please watch the results your GPU returns, that is, whether there are any "inconclusives" among your pending results.
The reason this should be checked (and checked over a few days at fixed GPU freqs):
When OCing a CPU, there is a pretty narrow freq interval where the system will work but give invalid results from time to time. With a GPU the situation is completely different.
One can get a system freeze by OCing the memory or engine freq too much, but the system will continue to work with the shader freq OCed too much.
The shader freq can be raised to a value where almost all SETI tasks fail with various CUDA errors while the system as a whole looks just normal: no screen distortion, no OS hangs. But such an over-OCed GPU is counterproductive for SETI.
And there is a pretty wide freq range where computation proceeds without reported errors but produces invalid results. So one will see no computation-error state in the results (and the corresponding filter on the web site will not show errors), yet many GPU results will still be invalid. One can check for this by using the "pending" filter on the website and looking for "Completed, validation inconclusive" results in the list.

If there are some, it's worth checking which result passes validation once the third result arrives (whether it is your host that produces invalid results or your wingman's host).

I encountered this situation while OCing my own 9600GSO. Too high a shader freq can ruin SETI results without affecting the stability of the system as a whole.
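
As a rough illustration of that check, the Python sketch below counts how many results in a list carry the "Completed, validation inconclusive" status. It assumes you have copied the status column from the "pending" list by hand; the sample data and the helper name `inconclusive_share` are made up, and nothing here queries the project website.

```python
from collections import Counter

INCONCLUSIVE = "Completed, validation inconclusive"


def inconclusive_share(statuses):
    """Fraction of the listed results whose status is 'validation inconclusive'."""
    if not statuses:
        return 0.0
    return Counter(statuses)[INCONCLUSIVE] / len(statuses)


# Made-up sample: status strings copied by hand from the pending list.
pending = [
    "Completed, waiting for validation",
    INCONCLUSIVE,
    "Completed, waiting for validation",
    "Completed, waiting for validation",
]
print(f"{inconclusive_share(pending):.0%} inconclusive")
```

A share that rises after a shader clock bump and stays up over a few days would be a hint to back the clock off, though an inconclusive result can also come from the wingman's host.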

Profile [seti.international] Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7018
Credit: 59,157,887
RAC: 20,639
Germany
Message 885497 - Posted: 15 Apr 2009, 8:35:02 UTC
Last modified: 15 Apr 2009, 8:35:53 UTC


And how is the OCed CPU involved?

I have the feeling that now, after OCing the CPU from 3 to 3.2 GHz [only changing the multi from 15 to 16], the CPU start phase of a CUDA task runs ~ 1 sec. faster.. and the complete CPU support time is ~ 5 sec./WU shorter.
[0.44x AR WUs]

Is this only my feeling, or would OCing the CPU also help to reduce the 'pure' GPU crunching time?

____________
BR



>Das Deutsche Cafe. The German Cafe.<

Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3368
Credit: 46,055,222
RAC: 32,196
Russia
Message 885507 - Posted: 15 Apr 2009, 9:10:31 UTC - in response to Message 885497.
Last modified: 15 Apr 2009, 9:14:13 UTC


And how is the OCed CPU involved?

I have the feeling that now, after OCing the CPU from 3 to 3.2 GHz [only changing the multi from 15 to 16], the CPU start phase of a CUDA task runs ~ 1 sec. faster.. and the complete CPU support time is ~ 5 sec./WU shorter.
[0.44x AR WUs]

Is this only my feeling, or would OCing the CPU also help to reduce the 'pure' GPU crunching time?


The faster the CPU, the shorter the initial load phase can be. But I don't think CPU speed will affect GPU performance much if the CPU is fast enough.
Moreover, from time to time the CUDA app uses a busy-loop for thread sync; at those moments, the faster the CPU, the more power and cycles are lost in vain, especially with slow GPUs.
That is, ultimate host performance tuning depends on some diametrically opposed factors; IMHO no "one size fits all" answer can be given. In general GPU speed should be balanced with CPU speed. If you have many hosts with many different CPUs and GPUs, it is best to rearrange them to pair fast CPU + fast GPU and slow CPU + slow GPU (slow and fast have no absolute meaning; they are relative among the CPUs and GPUs available).
BTW, if one wants to consider such tiny effects as the influence of CPU OCing on GPU speed, one should take into account that the curve of CPU performance versus CPU freq is non-monotonic. A CPU can't wait 1.3 ticks for memory; it can wait 1 tick or 2 ticks (for example). That is, different ratios of CPU freq to memory freq can give a non-monotonic performance curve.

EDIT: BTW, bus OCing could affect GPU performance because of host memory <-> GPU memory transfers. That is, memory OCing rather than CPU OCing could be optimal.
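
The whole-tick effect can be shown with a toy Python model. It assumes a fixed memory latency in nanoseconds (the 10.5 ns figure is invented) and a CPU that must wait a whole number of its own clock cycles; real memory controllers are more complicated, so treat this purely as a sketch of the rounding argument.

```python
import math


def memory_wait_ns(cpu_mhz, mem_latency_ns):
    """Time spent waiting when the CPU can only wait a whole number of cycles."""
    cycles = math.ceil(mem_latency_ns * cpu_mhz / 1000)  # round up to full ticks
    return cycles * 1000 / cpu_mhz                        # back to nanoseconds


# Invented latency of 10.5 ns; sweep the CPU clock in small steps.
for mhz in (3000, 3050, 3100, 3200):
    print(mhz, round(memory_wait_ns(mhz, 10.5), 3))
```

In this model the wait at 3050 MHz comes out longer than at 3000 MHz, because the extra clock speed pushes the latency just over a tick boundary. That is the kind of non-monotonic behaviour described above.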

Profile [seti.international] Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7018
Credit: 59,157,887
RAC: 20,639
Germany
Message 885513 - Posted: 15 Apr 2009, 9:37:35 UTC
Last modified: 15 Apr 2009, 9:40:55 UTC


After changing the multi, HyperPI 1M runs ~ 1.2 sec. faster [~ 5 %].. :-D

AFAIK.. HyperPI is a good indicator for OCing - change something [FSB/RAM or others], run the 1M test and see if it's faster or slower..


What would be a good stress test prog for CPU/RAM/mobo OCing?
I don't have time [RAC ! ;-D] to run a 24-hour test loop.

Is there a prog somewhere with which I could test in ~ 10 min. whether the new OC is stable? :-D
If not in ~ 10 min. ;-) ..which prog would be a good OC test prog for a pure GPU crunching rig? :-)
..because now I'm back to GPU power only.. ;-)


If I finally find a good heatsink that fits my CPU and the small space on my MSI K9A2 Platinum..
Heatsink for MSI K9A2 Platinum ?

I will take time to mod my case.. an 8th opening for the 4th GPU.. and some fan openings.. for good airflow around the GPUs..
Then I will have a pure 4 GPU crunching rig.. :-)

____________
BR



>Das Deutsche Cafe. The German Cafe.<

Profile mimo
Volunteer tester
Joined: 7 Feb 03
Posts: 86
Credit: 9,646,047
RAC: 366
Slovakia
Message 885514 - Posted: 15 Apr 2009, 9:38:02 UTC

On the non-GPU side, only PCI-e overclocking will affect GPU performance, and only on devices with compute capability 1.0, which don't have async memcopy. 1.1 and up have async memcopy, so the PCIe latencies can be hidden. And if you have drivers with device overlap (simultaneous memcpy/kernel execution), that is best.
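
Why overlap hides the transfer cost can be seen with a toy timing model in Python. It assumes the work is split into equal chunks, each needing one host-to-GPU copy followed by one kernel, and it ignores copies back to the host; the millisecond figures are invented and the function names are illustrative, not part of any CUDA API.

```python
def serial_time_ms(n_chunks, copy_ms, kernel_ms):
    """No overlap: every chunk is copied, then computed, one after another."""
    return n_chunks * (copy_ms + kernel_ms)


def overlapped_time_ms(n_chunks, copy_ms, kernel_ms):
    """Overlap: the copy of chunk i+1 runs while the kernel for chunk i executes."""
    if n_chunks == 0:
        return 0
    return copy_ms + (n_chunks - 1) * max(copy_ms, kernel_ms) + kernel_ms


# Invented numbers: 4 chunks, 2 ms per copy, 5 ms per kernel.
print(serial_time_ms(4, 2, 5))      # 28
print(overlapped_time_ms(4, 2, 5))  # 22
```

When the kernel takes longer than the copy, all but the first transfer disappear behind computation in this model, so a faster PCI-e link would change little; without overlap, every transfer adds to the total.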
____________

Profile [seti.international] Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7018
Credit: 59,157,887
RAC: 20,639
Germany
Message 885744 - Posted: 16 Apr 2009, 6:16:52 UTC
Last modified: 16 Apr 2009, 6:17:26 UTC


The currently crunched AR 0.41x WUs [with OCed CPU] now have the same crunching time [with support from the CPU] as the AR 0.44x WUs crunched in the past without OC.

AFAIK: lower AR = longer crunching time

This would mean my OCed CPU also helps to reduce the pure GPU crunching time..


Of course, this isn't really a test.. ;-D
But maybe a reason to guess that it could be so? :-)

____________
BR



>Das Deutsche Cafe. The German Cafe.<

Profile Lint trap
Joined: 30 May 03
Posts: 858
Credit: 25,746,799
RAC: 12,140
United States
Message 886416 - Posted: 19 Apr 2009, 17:22:09 UTC - in response to Message 885513.


After changing the multi, HyperPI 1M runs ~ 1.2 sec. faster [~ 5 %].. :-D

AFAIK.. HyperPI is a good indicator for OCing - change something [FSB/RAM or others], run the 1M test and see if it's faster or slower..


What would be a good stress test prog for CPU/RAM/mobo OCing?
I don't have time [RAC ! ;-D] to run a 24-hour test loop.



For a quick, maximum-stress test, I use IntelBurnTest from http://www.ultimate-filez.com. This test generates more heat than Prime95. A LOT more on my rig!

I recommend using the Custom test first, with 64-128 MB and 8-10 runs. That way you avoid over-stressing an unknown system that the test results show to have a problem.

Don't skip reading the readme file before starting the exe.

Martin

-ShEm-
Volunteer tester
Joined: 25 Feb 00
Posts: 139
Credit: 4,129,448
RAC: 0
Message 886444 - Posted: 19 Apr 2009, 18:26:11 UTC - in response to Message 886416.

OCCT is also good ;)


Copyright © 2014 University of California