How does OCing affect CUDA?

Message boards : Number crunching : How does OCing affect CUDA?
Profile Voyager
Volunteer tester
Joined: 2 Nov 99
Posts: 602
Credit: 3,264,813
RAC: 0
United States
Message 885022 - Posted: 13 Apr 2009, 18:24:45 UTC

Does processor speed affect CUDA processing other than the time it takes to load the card? Will CUDA be about the same at 3 GHz vs 2 GHz?
ID: 885022
Profile Dywanik
Joined: 16 Mar 02
Posts: 29
Credit: 1,913,940
RAC: 0
Message 885040 - Posted: 13 Apr 2009, 19:02:25 UTC
Last modified: 13 Apr 2009, 19:08:42 UTC

That's a very good question. I don't have a CUDA card to test it with, but I believe the answer is: it depends. ;-) It depends on how you're overclocking your CPU. You usually do it by changing the FSB and the multiplier; e.g., you can get 2000 MHz either with a 200 MHz FSB and a multiplier of 10, or with a 100 MHz FSB and a multiplier of 20. And here is the point: with the higher FSB you're overclocking your whole computer, so your CUDA card should be somewhat more efficient in the first case.
BUT that won't be significant unless you're an o/c pr0 who benchmarks everything. ;-)
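
As a quick illustration of the arithmetic above (the FSB and multiplier values are just the hypothetical ones from this post, not measurements):

// Hypothetical numbers only: two ways to reach the same 2000 MHz CPU clock.
#include <cstdio>

int main()
{
    int fsb_a = 200, multi_a = 10;   // case A: high FSB, low multiplier
    int fsb_b = 100, multi_b = 20;   // case B: low FSB, high multiplier

    printf("Case A: %d MHz CPU, platform (FSB/memory) at %d MHz\n",
           fsb_a * multi_a, fsb_a);
    printf("Case B: %d MHz CPU, platform (FSB/memory) at %d MHz\n",
           fsb_b * multi_b, fsb_b);
    // Same CPU clock either way, but case A runs the bus and memory twice as
    // fast, which is the part that could matter for feeding a GPU.
    return 0;
}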
"Failure is not an option."
Gene Kranz, Apollo 13 Flight Director

"Be the change you want to see in the World"
Mahatma Gandhi

My web-page:
www.dywanik.eu
ID: 885040
Profile skildude
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 885042 - Posted: 13 Apr 2009, 19:03:55 UTC

I'd think the mobo would be the slow point for the GPU, but I doubt the difference would be that significant. Most likely the CPU time would be about 1/3 slower, but we're talking less than 120 seconds overall. I doubt the performance is going to take that big of a hit.


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 885042
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 885055 - Posted: 13 Apr 2009, 19:34:19 UTC
Last modified: 13 Apr 2009, 19:42:45 UTC


When a CUDA task starts, there is a phase of 100% load on one CPU core.
During this phase an OCed CPU/RAM/mobo helps to reduce the processing time.

[With a stock CPU/mobo: ~12 sec. with Raistmer's V10 CUDA app, on my rig.]
I have now OCed my AMD Phenom II X4 940 BE from the stock 3.0 GHz to 3.2 GHz [only by changing the multiplier], and the 100% CPU-core load phase of a CUDA task is now shorter (by ~1 sec. or so).

After this 100% CPU-core load phase, an OCed CPU/RAM/mobo wouldn't help to reduce the GPU calculation time. [During the GPU calculation the CPU core supports the GPU at around 24% load, with ups and downs and peaks up to 48%, on my rig.]
[Guessing.. so far I haven't had time to look at the whole calculation time of the CUDA tasks.]

OTOH, it would help to OC the GPU as well, to reduce the calculation time on the GPU.. :-)

ID: 885055
Profile skildude
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 885060 - Posted: 13 Apr 2009, 19:42:05 UTC

I'd think OCing the GPU would reduce processing time more. The problem is that the WUs heat the GPU a lot, so OCing might just push some GPUs over the thermal edge.


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 885060
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 885064 - Posted: 13 Apr 2009, 19:50:41 UTC - in response to Message 885060.  

I'd think OCing the GPU would reduce processing time more. The problem is that the WUs heat the GPU a lot, so OCing might just push some GPUs over the thermal edge.


I've read about ~100 MHz more [GPU core or shader, it wasn't specified] giving around +1,000 RAC..


If the GPU fan is on AUTO.. and at 100% RPM all the time, I would take a second look at my GPU OC.. ;-)

ID: 885064
slozomby

Joined: 16 Nov 04
Posts: 20
Credit: 242,588
RAC: 0
United States
Message 885124 - Posted: 13 Apr 2009, 22:30:16 UTC
Last modified: 13 Apr 2009, 22:33:09 UTC

Using my lowball card (an 8400GS) as a test, since any change there should be apparent:

Changing the core clock from the default 440 MHz to 500 MHz did not noticeably decrease average processing time.

I've moved the 8400GS between machines, and it always comes back to about 2.5 hours per WU.

Shader count appears to be the single biggest factor in WU processing speed.
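
If you want to see the shader count and clocks being discussed here for your own card, a small sketch along these lines (not part of any SETI app) would print them via the CUDA runtime:

#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int count = 0;
    cudaGetDeviceCount(&count);

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp p;
        cudaGetDeviceProperties(&p, dev);
        // Cards of this era (8400GS, 9800GTX+, GTX 260) are compute
        // capability 1.x, with 8 shader processors per multiprocessor.
        int shaders = p.multiProcessorCount * 8;
        printf("%s: %d multiprocessors (~%d shaders), clock %.0f MHz\n",
               p.name, p.multiProcessorCount, shaders, p.clockRate / 1000.0);
    }
    return 0;
}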
ID: 885124
Zydor

Joined: 4 Oct 03
Posts: 172
Credit: 491,111
RAC: 0
United Kingdom
Message 885189 - Posted: 14 Apr 2009, 1:02:12 UTC - in response to Message 885022.  
Last modified: 14 Apr 2009, 1:04:10 UTC

I have a Phenom II 940 BE with 4 GB RAM and an EN9800GTX+ DK 512 MB.

With this setup I have found a significant difference in WU processing when O/C'ing the GPU. Broadly speaking, processing speed appeared to increase in line with the increase in shader clock; the shader is by far the dominant factor in the speed increase.

Currently it runs at:
Engine: 776 MHz
Shader: 1925 MHz
Memory (stock speed): 1100x2 MHz.

The default shader clock for the card is 1800, so it's set to roughly a 7% speed increase overall, and that was reflected in the improvement against the old crunch time for a WU on the GPU.

The GPU runs at 85% fan speed and settles at a constant 63 degrees under full load. It's in a cramped mid-tower case with a four-disk RAID in there, so it runs slightly hotter than it would in a full tower; it would probably sit at around 60-61 degrees in a full tower. The fan is the stock fan. CUDA WU time is around 16.5 minutes on average with Raistmer's V11 opti app, which in real terms means around eight extra WUs a day from the O/C if run 24/7.

When O/C'ing a GPU it's important to increase both the engine and shader clocks by the same proportion. The card's memory speed, as such, is not relevant here; keep the memory at stock speed. Most GPUs will go to about an 8-9% O/C before heat issues cause compute errors; I've seen much bigger increases than that on a friend's machine, with the card water-cooled. A 7-8% increase on air cooling seems to be safe for most cards. When O/C'ing, increase the engine and shader in line with each other by the same proportion, step by step (say 1% at a time), until you get the first compute error, then back off 2% and you will be dead safe. The only remaining question is card temperature, as that varies hugely with the room's ambient temperature, the case used, the airflow around cable runs, etc.
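
A rough sketch of that stepping arithmetic (the 1800 MHz stock shader clock is the value quoted in this post; the stock engine clock here is only an assumed placeholder, so substitute your own card's numbers):

#include <cstdio>

int main()
{
    const double engine_stock = 740.0;   // assumed stock engine clock, MHz (placeholder)
    const double shader_stock = 1800.0;  // stock shader clock quoted above, MHz

    // Raise engine and shader together, 1% at a time. In practice you would
    // test a few WUs at each step and, after the first compute error, back
    // off 2% and stay there.
    for (int step = 1; step <= 9; ++step) {
        double scale = 1.0 + step / 100.0;
        printf("+%d%%: engine %.0f MHz, shader %.0f MHz\n",
               step, engine_stock * scale, shader_stock * scale);
    }
    return 0;
}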

Anyone wishing to pursue this further, nip over to GPUGRID; there is a long thread there dealing with the settings and speeds of different NVIDIA card types. It's a good starter for ten when first looking at the topic.

Regards
Zy
ID: 885189
elgar

Joined: 21 May 99
Posts: 69
Credit: 2,687,478
RAC: 0
United States
Message 885200 - Posted: 14 Apr 2009, 1:47:21 UTC - in response to Message 885189.  

Overclocking my 260 reduced CUDA time from ~13 minutes to ~10 minutes (600 MHz to 715 MHz). Overclocking the CPU by 500 MHz didn't have any effect on computation time.

Now if I could just get some CUDA tasks on 6.6.20...
ID: 885200
Profile popandbob
Volunteer tester

Joined: 19 Mar 05
Posts: 551
Credit: 4,673,015
RAC: 0
Canada
Message 885348 - Posted: 14 Apr 2009, 20:53:59 UTC - in response to Message 885200.  

Overclocking my 260 reduced CUDA time from ~13 minutes to ~10 minutes (600 MHz to 715 MHz). Overclocking the CPU by 500 MHz didn't have any effect on computation time.

Now if I could just get some CUDA tasks on 6.6.20...


If you were to up the shaders and memory a bit, you would see even lower times :)

I was seeing times of 10 minutes with 700/1550/1050.
I couldn't push the core any higher, but upping the shaders and memory added an extra boost.


Do you Good Search for Seti@Home? http://www.goodsearch.com/?charityid=888957
Or Good Shop? http://www.goodshop.com/?charityid=888957
ID: 885348
Profile Voyager
Volunteer tester
Joined: 2 Nov 99
Posts: 602
Credit: 3,264,813
RAC: 0
United States
Message 885388 - Posted: 14 Apr 2009, 23:08:32 UTC

I tried OCing the GPU. Upped it by 16%, and times went from ~18 min. to ~16 min. Cool!
ID: 885388
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 885493 - Posted: 15 Apr 2009, 8:22:27 UTC - in response to Message 885388.  

I tried OCing the GPU. Upped it by 16%, and times went from ~18 min. to ~16 min. Cool!


Yes, GPU processing times can be improved by OCing the GPU. But when you do such OCing, please keep an eye on the results your GPU returns, i.e., whether there are any "inconclusives" among your pending results.
The reason it should be checked (and checked over a few days at fixed GPU frequencies):
When you overclock a CPU there is a pretty narrow frequency interval where the system will still work but will give invalid results from time to time. With a GPU the situation is completely different.
You can get a system freeze if the memory or engine frequency is overclocked too far, but the system will keep working with a shader frequency that is overclocked too far.
The shader frequency can be raised to the point where almost all SETI tasks fail with various CUDA errors while the system as a whole looks perfectly normal: no screen corruption, no OS hangs. But such an over-OCed GPU is counterproductive for SETI.
And there is a pretty wide frequency range where computation runs without reported errors but produces invalid results. So you will see no "computation error" state in your results (and the corresponding filter on the website will show no errors), but many of the GPU results will still be invalid. You can check for this by using the "pending" filter on the website and looking for "Completed, validation inconclusive" results in the list.

If there are some, it's worth checking which result passes validation once a third result arrives (is it your host that produces the invalid results, or your wingman's host?).

I ran into this situation while OCing my own 9600GSO. Too high a shader frequency can ruin SETI results without damaging the stability of the system as a whole.
ID: 885493
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 885497 - Posted: 15 Apr 2009, 8:35:02 UTC
Last modified: 15 Apr 2009, 8:35:53 UTC


And how is the OCed CPU involved?

After OCing the CPU from 3 to 3.2 GHz [only changing the multiplier from 15 to 16], I have the feeling that the CPU part of the CUDA start-up runs ~1 sec. faster, and that the total CPU support time is down by ~5 sec./WU.
[0.44x AR WUs]

Is this only my impression, or does OCing the CPU also help to reduce the 'pure' GPU crunching time?

ID: 885497
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 885507 - Posted: 15 Apr 2009, 9:10:31 UTC - in response to Message 885497.  
Last modified: 15 Apr 2009, 9:14:13 UTC


And how is the OCed CPU involved?

After OCing the CPU from 3 to 3.2 GHz [only changing the multiplier from 15 to 16], I have the feeling that the CPU part of the CUDA start-up runs ~1 sec. faster, and that the total CPU support time is down by ~5 sec./WU.
[0.44x AR WUs]

Is this only my impression, or does OCing the CPU also help to reduce the 'pure' GPU crunching time?


The faster the CPU, the shorter the initial load phase can be. But I don't think CPU speed will affect GPU performance much if the CPU is already fast enough.
Moreover, from time to time the CUDA app uses a busy-loop for thread synchronisation; in those moments, the faster the CPU, the more power and cycles are lost in vain, especially with slow GPUs.
That is, ultimate host performance tuning depends on some diametrically opposed factors; IMHO no one-size-fits-all answer can be given. In general, GPU speed should be balanced against CPU speed. If you have many hosts with many different CPUs and GPUs, it's best to rearrange them so that fast CPUs are paired with fast GPUs and slow CPUs with slow GPUs ("slow" and "fast" have no absolute meaning here; they are relative among the CPUs and GPUs available).
BTW, anyone who wants to consider effects as tiny as the influence of CPU OCing on GPU speed should take into account that the CPU performance curve versus CPU frequency is not monotonic. A CPU can't wait 1.3 ticks for memory; it can wait 1 tick or 2 ticks (for example). That is, different ratios of CPU frequency to memory frequency can give a non-monotonic performance curve.

EDIT: BTW, bus OCing could affect GPU performance because of host memory <-> GPU memory transfers. That is, memory OCing rather than CPU OCing could be what helps.
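
To make the busy-loop point concrete, here is a minimal host-side sketch (not the actual SETI/Raistmer app code) of the choice the CUDA runtime exposes between spin-waiting and a blocking sync that lets the CPU core sleep while the GPU works:

#include <cstdio>
#include <cuda_runtime.h>

__global__ void dummyKernel()
{
    // Stands in for a real search kernel.
}

int main()
{
    // Must be set before the CUDA context is created. The default scheduling
    // lets the runtime spin-wait (burning CPU cycles) while a sync is pending;
    // cudaDeviceScheduleBlockingSync makes the waiting CPU thread block instead.
    cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync);

    dummyKernel<<<1, 1>>>();
    cudaDeviceSynchronize();   // with the flag above, the CPU sleeps here

    printf("status: %s\n", cudaGetErrorString(cudaGetLastError()));
    return 0;
}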
ID: 885507
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 885513 - Posted: 15 Apr 2009, 9:37:35 UTC
Last modified: 15 Apr 2009, 9:40:55 UTC


After changing the multiplier, HyperPi 1M runs ~1.2 sec. faster [~5%].. :-D

AFAIK HyperPi is a good indicator for OCing: change something [FSB/RAM or other settings], run the 1M test and see whether it gets faster or slower..


What would be a good stress-test program for CPU/RAM/mobo OCing?
I don't have time [RAC! ;-D] to run a 24-hour test loop.

Is there a program somewhere with which I could test in ~10 min. whether the new OC is stable? :-D
If not in ~10 min. ;-) ..which program would be a good OC test for a pure GPU crunching rig? :-)
..because now I'm back to GPU power only.. ;-)


If I finally find a good heatsink for my CPU and the small space on my MSI K9A2 Platinum..
Heatsink for MSI K9A2 Platinum ?

..I will take the time to mod my case: an 8th opening for the 4th GPU, and some fan openings for good airflow around the GPUs..
Then I will have a pure 4-GPU crunching rig.. :-)

ID: 885513
Profile mimo
Volunteer tester
Joined: 7 Feb 03
Posts: 92
Credit: 14,957,404
RAC: 0
Slovakia
Message 885514 - Posted: 15 Apr 2009, 9:38:02 UTC

On the non-GPU side, only PCIe overclocking will affect GPU performance, and only on devices with compute capability 1.0, which don't have async memcopy. Compute capability 1.1 and up have async memcopy, so the PCIe latencies can be hidden. And if your card and drivers support device overlap (simultaneous memcopy and kernel execution), that's the best case.
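
A minimal sketch of what that looks like in code (the kernel and buffer names are made up for illustration): check deviceOverlap, use pinned host memory and a stream, and the queued host-to-device copy can be hidden behind kernel work instead of stalling on the PCIe bus.

#include <cstdio>
#include <cuda_runtime.h>

__global__ void processChunk(float* data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;   // placeholder for real work
}

int main()
{
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    printf("deviceOverlap: %d\n", prop.deviceOverlap);  // 1 on compute capability 1.1+ parts

    const int n = 1 << 20;
    float *h_buf, *d_buf;
    cudaMallocHost(&h_buf, n * sizeof(float));  // pinned memory, required for async copies
    cudaMalloc(&d_buf, n * sizeof(float));
    for (int i = 0; i < n; ++i) h_buf[i] = 1.0f;

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // The copy is queued and returns immediately; on a device-overlap capable
    // card it can run concurrently with kernels from other streams, hiding
    // PCIe transfer latency.
    cudaMemcpyAsync(d_buf, h_buf, n * sizeof(float), cudaMemcpyHostToDevice, stream);
    processChunk<<<(n + 255) / 256, 256, 0, stream>>>(d_buf, n);
    cudaStreamSynchronize(stream);

    cudaStreamDestroy(stream);
    cudaFree(d_buf);
    cudaFreeHost(h_buf);
    return 0;
}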

ID: 885514
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 885744 - Posted: 16 Apr 2009, 6:16:52 UTC
Last modified: 16 Apr 2009, 6:17:26 UTC


The AR 0.41x WUs I'm crunching now [with the OCed CPU] have the same crunching time [including the CPU support] as the AR 0.44x WUs I crunched in the past without the OC.

AFAIK: lower AR = longer crunching time.

This would mean my OCed CPU also helps to reduce the pure GPU crunching time..


Of course, this isn't really a proper test.. ;-D
But maybe a reason to guess it could be so? :-)

ID: 885744
Profile Lint trap

Joined: 30 May 03
Posts: 871
Credit: 28,092,319
RAC: 0
United States
Message 886416 - Posted: 19 Apr 2009, 17:22:09 UTC - in response to Message 885513.  


After changing the multiplier, HyperPi 1M runs ~1.2 sec. faster [~5%].. :-D

AFAIK HyperPi is a good indicator for OCing: change something [FSB/RAM or other settings], run the 1M test and see whether it gets faster or slower..


What would be a good stress-test program for CPU/RAM/mobo OCing?
I don't have time [RAC! ;-D] to run a 24-hour test loop.



For a quick, maximum-stress test, I use IntelBurnTest from http://www.ultimate-filez.com. This test generates more heat than Prime95. A LOT more on my rig!

I recommend using the Custom test first, with 64-128 MB and 8-10 runs. That way you avoid over-stressing an unknown system that, once you see the test results, obviously has a problem.

Don't skip reading the readme file before starting the exe.

Martin
ID: 886416
-ShEm-
Volunteer tester

Joined: 25 Feb 00
Posts: 139
Credit: 4,129,448
RAC: 0
Message 886444 - Posted: 19 Apr 2009, 18:26:11 UTC - in response to Message 886416.  

OCCT is also good ;)
ID: 886444
