Question about feeding the GPU



Message boards : Number crunching : Question about feeding the GPU

Author Message
Irok
Send message
Joined: 15 Jun 04
Posts: 20
Credit: 16,936,272
RAC: 0
United States
Message 1227260 - Posted: 4 May 2012, 17:55:40 UTC

I've currently got an i7-2600k running at 4.1 GHz and a slightly overclocked GTX-570. The GPU is running two WUs at a time and each WU has 5% of a CPU core dedicated to feed it (out of 4 physical and 4 hyperthreaded cores). The CPU runs at 100% 24/7 and the GPU runs between 90% and 95%. My question is should I dedicate a larger percentage of the CPU to feed my GPU to maximize output? If so, what would the more experienced crunchers recommend? I'm currently stuck around 37,000 RAC and would like to see if I can do anything to increase my RAC even further.

http://setiathome.berkeley.edu/show_host_detail.php?hostid=6421575

____________
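For background on why a RAC "plateau" like 37,000 happens at all: BOINC's recent average credit is an exponentially decaying average with (per BOINC's documentation) roughly a one-week half-life, so under steady output it simply converges toward your credits per day. A rough continuous-time sketch, under that half-life assumption:

```python
import math

HALF_LIFE_DAYS = 7.0  # BOINC documents roughly a one-week RAC half-life

def rac_after(days, daily_credit, rac0=0.0):
    """Continuous-time approximation of RAC under constant daily output."""
    decay = math.exp(-math.log(2) * days / HALF_LIFE_DAYS)
    return daily_credit + (rac0 - daily_credit) * decay

# Starting from zero, a host earning 37,000 credits/day reaches half
# that RAC after one week, and over 99% of it after about seven weeks.
```

In other words, a stable RAC means the host's daily output is no longer changing, and only a genuine throughput change will move it.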

Profile Fred J. Verster
Volunteer tester
Avatar
Send message
Joined: 21 Apr 04
Posts: 3252
Credit: 31,902,797
RAC: 257
Netherlands
Message 1227287 - Posted: 4 May 2012, 18:49:37 UTC - in response to Message 1227260.
Last modified: 4 May 2012, 18:56:53 UTC

I've currently got an i7-2600k running at 4.1 GHz and a slightly overclocked GTX-570. The GPU is running two WUs at a time and each WU has 5% of a CPU core dedicated to feed it (out of 4 physical and 4 hyperthreaded cores). The CPU runs at 100% 24/7 and the GPU runs between 90% and 95%. My question is should I dedicate a larger percentage of the CPU to feed my GPU to maximize output? If so, what would the more experienced crunchers recommend? I'm currently stuck around 37,000 RAC and would like to see if I can do anything to increase my RAC even further.

http://setiathome.berkeley.edu/show_host_detail.php?hostid=6421575


IMHO, dedicating 1 CPU core to 1 GPU is a waste of resources.
Your GPU will be a bit faster, but you will lose the performance
of 1 CPU core.
Your RAC will certainly not benefit from this; maybe, looking at your GPU, you will gain some, if the GPU can handle a more constant and higher load.
I tried this at Milkyway and found GPUs going into thermal shutdown, because at the almost constant 100% load they gained 10 seconds, 106 instead of 116 seconds, but only for a short time! GPU temps reached 101C! (1 core = 2 threads, 1 for each GPU; EAH5870!)

Even the high-end cards are not made for 100% load, 24 x 7, 365(6) days.
Liquid cooling becomes a necessity!

I would focus on getting enough LOAD on the GPU!
And what is there to gain with an almost unused CPU, when the GPU is loaded to the max?
I've noticed the computer becomes annoyingly unresponsive, especially with screen (GUI) update related issues.
____________

Profile ivan
Volunteer tester
Avatar
Send message
Joined: 5 Mar 01
Posts: 640
Credit: 146,922,210
RAC: 59,275
United Kingdom
Message 1227291 - Posted: 4 May 2012, 18:54:35 UTC - in response to Message 1227260.

I've currently got an i7-2600k running at 4.1 GHz and a slightly overclocked GTX-570. The GPU is running two WUs at a time and each WU has 5% of a CPU core dedicated to feed it (out of 4 physical and 4 hyperthreaded cores). The CPU runs at 100% 24/7 and the GPU runs between 90% and 95%. My question is should I dedicate a larger percentage of the CPU to feed my GPU to maximize output? If so, what would the more experienced crunchers recommend? I'm currently stuck around 37,000 RAC and would like to see if I can do anything to increase my RAC even further.

http://setiathome.berkeley.edu/show_host_detail.php?hostid=6421575

AFAIK that percentage is just indicative -- the programme will use as much (or as little) CPU as it needs to service the GPU -- mine are all set to 4%, but a quick glance at the Task Manager on my home PC shows it currently at 6% CPU for each task. At 90-95% GPU utilisation you might try running a third task on it to see if there's any increase in usage and RAC.

____________

Profile Gundolf Jahn
Send message
Joined: 19 Sep 00
Posts: 3184
Credit: 360,556
RAC: 37
Germany
Message 1227311 - Posted: 4 May 2012, 19:14:59 UTC - in response to Message 1227291.

AFAIK that percentage is just indicative -- the programme will use as much (or as little) CPU as it needs to service the GPU --

Correct. Only when the sum of percentages of all concurrently running GPU tasks reaches 100% will one CPU (core) be reserved for feeding the GPUs.

Gruß,
Gundolf
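The reservation rule Gundolf describes can be sketched like this (a simplified model, not actual BOINC source; treating it as the floor of the summed fractions is the assumption here):

```python
import math

def cpu_cores_reserved(cpu_fractions):
    """Whole CPU cores set aside for GPU feeding, per the
    sum-of-nominal-fractions rule described above."""
    return math.floor(sum(cpu_fractions))

# Irok's case: two GPU tasks at 5% CPU each -> nothing is reserved,
# so raising the nominal percentage a little changes nothing.
print(cpu_cores_reserved([0.05, 0.05]))  # 0: no core reserved
print(cpu_cores_reserved([0.5, 0.5]))    # 1: one core reserved
```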

Irok
Send message
Joined: 15 Jun 04
Posts: 20
Credit: 16,936,272
RAC: 0
United States
Message 1227348 - Posted: 4 May 2012, 20:03:09 UTC - in response to Message 1227311.

AFAIK that percentage is just indicative -- the programme will use as much (or as little) CPU as it needs to service the GPU --

Correct. Only when the sum of percentages of all concurrently running GPU tasks reaches 100% will one CPU (core) be reserved for feeding the GPUs.

Gruß,
Gundolf


So just leave it be and be happy with 37,000 then?
____________

Profile Fred J. Verster
Volunteer tester
Avatar
Send message
Joined: 21 Apr 04
Posts: 3252
Credit: 31,902,797
RAC: 257
Netherlands
Message 1227388 - Posted: 4 May 2012, 21:09:39 UTC - in response to Message 1227348.
Last modified: 4 May 2012, 21:17:34 UTC

AFAIK that percentage is just indicative -- the programme will use as much (or as little) CPU as it needs to service the GPU --

Correct. Only when the sum of percentages of all concurrently running GPU tasks reaches 100% will one CPU (core) be reserved for feeding the GPUs.

Gruß,
Gundolf


So just leave it be and be happy with 37,000 then?


Yeah, what's wrong with that? (I would never OC a GPU!)
I deliberately clock down GPU core and memory speed; it makes little difference to throughput (I hate the word RAC), and I want reliable results, not just fast ones!

It's about scientific calculations, so they have to be exact, very exact; that's where CUDA and OpenCL come in :), speeding this up.
Also the new AVX-supported CPUs, like the Core i3-XXXX / i5-XXXX and i7-2400/3730/50, etc.
____________

archae86
Send message
Joined: 31 Aug 99
Posts: 889
Credit: 1,572,794
RAC: 3
United States
Message 1227417 - Posted: 4 May 2012, 21:55:18 UTC - in response to Message 1227348.

Irok wrote:
So just leave it be and be happy with 37,000 then?

You might find that using the preferences mechanism "On multiprocessors, use at most nn % of the processors" to cut the number of active CPU SETIs down might improve GPU output more than enough to pay for itself.

The opportunity (or lack of it) depends on how much your GPU is waiting around for CPU service.

Over on Einstein, for a rather different system running different applications, on careful comparison I found and posted that running 50% (two CPU tasks) on my i5-2500K, with a GTX 460 running three concurrent tasks, gave more total output than running either more or fewer CPU (or GPU) tasks.

Short of this drastic measure, you might be able to get the CPU job to service the GPU more quickly by using one or another means of modifying CPU task priority.

I'm not suggesting your different hardware running your different set of tasks will have the same curves--it won't. Just illustrating that careful comparison can help you find the best operating point for your situation.
____________
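The comparison archae86 describes reduces to a simple sweep: log total output at each CPU/GPU task-count combination you are willing to try, then pick the maximum. The throughput numbers below are invented purely for illustration (only the method transfers, as he says):

```python
# Hypothetical logged results: (cpu_tasks, gpu_tasks) -> credits/day.
# These values are made up; substitute your own measured averages,
# each taken over enough days for RAC noise to settle.
measurements = {
    (4, 2): 34000,
    (3, 2): 35500,
    (2, 3): 37800,
    (1, 3): 36200,
}

best = max(measurements, key=measurements.get)
print(best, measurements[best])  # (2, 3) 37800 on these example numbers
```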

Irok
Send message
Joined: 15 Jun 04
Posts: 20
Credit: 16,936,272
RAC: 0
United States
Message 1227422 - Posted: 4 May 2012, 22:21:12 UTC - in response to Message 1227388.



Yeah, what's wrong with that? (I would never OC a GPU!)
I deliberately clock down GPU core and memory speed; it makes little difference to throughput (I hate the word RAC), and I want reliable results, not just fast ones!

It's about scientific calculations, so they have to be exact, very exact; that's where CUDA and OpenCL come in :), speeding this up.
Also the new AVX-supported CPUs, like the Core i3-XXXX / i5-XXXX and i7-2400/3730/50, etc.


I only overclocked my GPU to the same settings as the factory overclock (they call it superclocked) model. I'm not pushing it harder than that. I do have errors, but right now it's 3 WU's in the last three weeks for that GPU. I think that's acceptable considering that on average my GPU goes through 400+ WUs per day. I'd definitely scale it back if it was erroring out more often.
____________

Richard HaselgroveProject donor
Volunteer tester
Send message
Joined: 4 Jul 99
Posts: 8764
Credit: 52,716,402
RAC: 17,906
United Kingdom
Message 1227426 - Posted: 4 May 2012, 22:29:45 UTC - in response to Message 1227417.

Irok wrote:
So just leave it be and be happy with 37,000 then?

You might find that using the preferences mechanism "On multiprocessors, use at most nn % of the processors" to cut the number of active CPU SETIs down might improve GPU output more than enough to pay for itself.

The opportunity (or lack of it) depends on how much your GPU is waiting around for CPU service.

Over on Einstein, for a rather different system running different applications, on careful comparison I found and posted that running 50% (two CPU tasks) on my i5-2500K, with a GTX 460 running three concurrent tasks, gave more total output than running either more or fewer CPU (or GPU) tasks.

Short of this drastic measure, you might be able to get the CPU job to service the GPU more quickly by using one or another means of modifying CPU task priority.

I'm not suggesting your different hardware running your different set of tasks will have the same curves--it won't. Just illustrating that careful comparison can help you find the best operating point for your situation.

Absolutely true, but unfortunately no actual data can be inferred from the Einstein experience to guide the likely 'sweet spot' here. Only the process of experimentation and recording can be transferred.

As Ivan said, the CPU %age displayed in BOINC Manager is purely a nominal value which helps BOINC schedule work. The actual demand on the CPU depends on the way the GPU application is programmed.

In Einstein's case, something of the order of 20% of the application's actual work is programmed to be run on the CPU - so keeping a substantial CPU resource free to service the GPU makes sense.

The two SETI applications are very different. The Multibeam x41g application you are running places very little demand on the CPU (although, having said that, your computer is showing substantial CPU timings for completed runs). The NVidia GPU application for Astropulse v6 - being tested at Beta - currently places a much higher demand on the CPU, but with further development that should be brought down to similar levels to the Multibeam application.

bill
Send message
Joined: 16 Jun 99
Posts: 861
Credit: 24,148,117
RAC: 1,872
United States
Message 1227440 - Posted: 4 May 2012, 22:59:51 UTC - in response to Message 1227422.



Yeah, what's wrong with that? (I would never OC a GPU!)
I deliberately clock down GPU core and memory speed; it makes little difference to throughput (I hate the word RAC), and I want reliable results, not just fast ones!

It's about scientific calculations, so they have to be exact, very exact; that's where CUDA and OpenCL come in :), speeding this up.
Also the new AVX-supported CPUs, like the Core i3-XXXX / i5-XXXX and i7-2400/3730/50, etc.


I only overclocked my GPU to the same settings as the factory overclock (they call it superclocked) model. I'm not pushing it harder than that. I do have errors, but right now it's 3 WU's in the last three weeks for that GPU. I think that's acceptable considering that on average my GPU goes through 400+ WUs per day. I'd definitely scale it back if it was erroring out more often.


You do realize that factory overclocked cards use better/more reliable parts than their lower-clocked cards? Factory overclocking doesn't guarantee that the card will be 100% reliable for SETI calculations. They rate their cards for gamers, not scientific calculation.

I have a Palit 460 that will throw errors on SETI units 4 to 5 times a day at factory settings. Underclock the memory from 2000 MHz to 1905 MHz and it's 100% reliable. That's still way faster than reference card specs, but under factory settings. Reliability is better than fast with errors.

Irok
Send message
Joined: 15 Jun 04
Posts: 20
Credit: 16,936,272
RAC: 0
United States
Message 1227513 - Posted: 5 May 2012, 2:17:28 UTC - in response to Message 1227440.


You do realize that factory overclocked cards use better/more reliable parts than their lower-clocked cards? Factory overclocking doesn't guarantee that the card will be 100% reliable for SETI calculations. They rate their cards for gamers, not scientific calculation.

I have a Palit 460 that will throw errors on SETI units 4 to 5 times a day at factory settings. Underclock the memory from 2000 MHz to 1905 MHz and it's 100% reliable. That's still way faster than reference card specs, but under factory settings. Reliability is better than fast with errors.


I do realize that, but my WU error rate is less than 0.1% on the GPU. There isn't much room for improvement there. The two that are currently showing as errors have had multiple people error on them, so I'm pretty sure it's not my GPU that's causing it. The other errors were from WUs that were due 10 minutes after they were sent to me, which I do not believe is a problem on my end. I'm going to work under the assumption that my GPU is working properly at its current speeds.

The GPU is running at 52% fan speed and is at 69 C, with utilization in the low 90s according to GPU-Z. My CPU is running all 8 cores at 4.1 GHz at 100% load, and the temps are around 55 C. I am not planning a higher overclock on my GPU, nor do I plan on OCing my CPU more than it already is.
____________

Josef W. SegurProject donor
Volunteer developer
Volunteer tester
Send message
Joined: 30 Oct 99
Posts: 4335
Credit: 1,113,795
RAC: 779
United States
Message 1227556 - Posted: 5 May 2012, 3:04:00 UTC - in response to Message 1227513.


You do realize that factory overclocked cards use better/more reliable parts than their lower-clocked cards? Factory overclocking doesn't guarantee that the card will be 100% reliable for SETI calculations. They rate their cards for gamers, not scientific calculation.

I have a Palit 460 that will throw errors on SETI units 4 to 5 times a day at factory settings. Underclock the memory from 2000 MHz to 1905 MHz and it's 100% reliable. That's still way faster than reference card specs, but under factory settings. Reliability is better than fast with errors.


I do realize that, but my WU error rate is less than 0.1% on the GPU. There isn't much room for improvement there. The two that are currently showing as errors have had multiple people error on them, so I'm pretty sure it's not my GPU that's causing it. The other errors were from WUs that were due 10 minutes after they were sent to me, which I do not believe is a problem on my end. I'm going to work under the assumption that my GPU is working properly at its current speeds.

Agreed, the two -12 errors are not a GPU fault; that's simply a code limitation in the CUDA applications. By itself, that's likely to give an error rate of 0.1%. The tasks that expired shortly after the servers tried to send them to your CPU are all .vlar tasks, expired because the "Resend lost work" BOINC code isn't prepared to switch resources. Neither of those conditions could be improved by reducing clock rates; they're outside your control.
Joe

Terror Australis
Volunteer tester
Send message
Joined: 14 Feb 04
Posts: 1759
Credit: 206,464,740
RAC: 10,806
Australia
Message 1228165 - Posted: 6 May 2012, 3:43:07 UTC
Last modified: 6 May 2012, 3:53:51 UTC

As you are running Win7 you will never achieve much more than 95% GPU usage even if running multiple tasks.

As Jason_gee has explained in other threads, this is because the Win7 video driver model has more overhead than the XP version.

Regarding your CPU question.
The main bottleneck is the memory. As others have explained the CPU requirements for a GPU unit are quite low. However the data for your GPU units and any CPU units has to pass through the system memory. It isn't about memory quantity, it's memory speed that counts here.

Looking at your results, your system is tuned pretty well. About the only thing that could help further is to fiddle with your overclock to up your memory speed (even if it means a reduced CPU speed) or install faster DIMMs.

T.A.

Edit: There is also some debate as to whether or not using hyperthreading is an advantage. I think it falls into the "your results may vary" category. If your RAC is stable, you could try turning it off while leaving everything else "as is" and seeing if it makes a difference.

Profile ausymark
Send message
Joined: 9 Aug 99
Posts: 70
Credit: 9,411,305
RAC: 43
Australia
Message 1250406 - Posted: 23 Jun 2012, 13:20:40 UTC - in response to Message 1227260.

Hi Irok

My setup is similar to yours (i7-2600K overclocked to 4.5 GHz on air, with an NVIDIA 580), on 64-bit Ubuntu 12.04 Linux.

The i7 is an interesting beast, as it has 8 virtual cores on 4 real ones. This allows i7 users to do something quite unique as far as CPU/GPU computing goes.

The core goal of CPU crunching is to use up to 100% of all cores to crunch (assuming the CPU has appropriate cooling), and ....

The core goal of GPU crunching is to crunch as many GPU work units as the GPU memory can handle, while allowing the PC's user to use the graphics ability of their machine without degradation of the 'user experience'.

On the GPU front, for me that meant running just 2 work units at a time. This is primarily because I play games on the system, as well as using it for normal office/web tasks. (Running 3 work units caused runtime issues with games, as graphics memory became contested, resulting in GPU SETI work unit errors.)

Now we come to the i7. The CPU must keep the data communication with the GPU occurring as quickly as possible, with as much data as required. This is where the i7 works well: I have SETI configured to use 50% of my CPU, that is, 4 cores. But why just 4 when I have 8 virtual? Simply that increasing it past 4 results in longer work unit processing times with very little RAC advantage.

The operating system will schedule each CPU work unit on each of the 4 physical cores. Some will now argue that these 4 cores may not be operating at 100% all the time. Well, great! Your operating system needs some space to do other things besides seti - so the system still remains fluid and responsive to all tasks - including seti.

Now this is also where the virtual cores come in. The operating system thinks there are 8 cores and will assign 'GPU data feeding' tasks for the GPU to one, or more, of the virtual cores not running the seti cpu task. This ensures that fluid scheduling is given to the 'quieter' seti cpu core and allows for the GPU to get the workload communication it requires.

This ends up being the best of all 3 worlds:

1) The real CPU cores being worked very hard
2) The GPU running efficiently - with maximum communication to the CPU
3) The PC itself remaining fluid for user interaction (a core goal of 'running seti in the background')
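As a rough model of how the "use at most N% of the processors" preference Mark mentions maps to a thread count on an 8-thread i7 (the floor-with-minimum-1 rounding here is an assumption, not confirmed BOINC behaviour):

```python
def usable_threads(logical_cpus, pct):
    """Threads BOINC may use under 'use at most pct% of the processors'.
    Rounding down with a minimum of 1 is assumed for illustration."""
    return max(1, int(logical_cpus * pct / 100))

# On an i7-2600K (8 logical threads), 50% gives one CPU task per
# physical core, leaving the virtual cores free to feed the GPU.
print(usable_threads(8, 50))  # 4
```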

If you look at my stats at the moment they don't look great at around just 6,000 RAC, but that's with the PC running for only 3 to 4 hours a day, so if it were run 24/7 it would be around the 35K RAC mark.

Anyway thats just my 2c worth. :)

Cheers

Mark
____________

Profile Khangollo
Avatar
Send message
Joined: 1 Aug 00
Posts: 245
Credit: 36,410,524
RAC: 0
Slovenia
Message 1250463 - Posted: 23 Jun 2012, 16:39:02 UTC
Last modified: 23 Jun 2012, 16:41:52 UTC

It should be quite a bit more with a 2600K and a 580.
Just for comparison, my i7-920 (OC @ 3.2 GHz, HT off) with a GTX 570 (stock) on Linux was doing between 35-37K RAC (not right now, since I'm doing E@H).
____________

Profile ausymark
Send message
Joined: 9 Aug 99
Posts: 70
Credit: 9,411,305
RAC: 43
Australia
Message 1250768 - Posted: 24 Jun 2012, 6:34:29 UTC - in response to Message 1250463.

My guesstimate was the worst-case scenario; it could be as high as 50K :)

Cheers

Mark
____________

Irok
Send message
Joined: 15 Jun 04
Posts: 20
Credit: 16,936,272
RAC: 0
United States
Message 1250927 - Posted: 24 Jun 2012, 16:17:43 UTC - in response to Message 1250768.

Thanks for the info. Unfortunately I installed the 0040 BIOS update on my DZ68BC Intel board and knocked out the overclocking (this BIOS update either does that or bricks the motherboard, so I guess I'm lucky). I usually set the CPU to use 75% of the cores when I game and then switch it back when I'm done. I haven't seen any improvement when I run games with fewer than two CUDA units going at the same time, but I tend to play older games anyway. I'm averaging around 36K per day, and I'm going to see what happens when Intel finally gets off its butt and releases a BIOS update that fixes what it broke. Their only action so far has been to remove that update from their download page.
____________


Copyright © 2014 University of California