Question about feeding the GPU

Author	Message
Irok Send message Joined: 15 Jun 04 Posts: 20 Credit: 16,936,272 RAC: 0	Message 1227260 - Posted: 4 May 2012, 17:55:40 UTC I've currently got an i7-2600k running at 4.1 GHz and a slightly overclocked GTX-570. The GPU is running two WUs at a time and each WU has 5% of a CPU core dedicated to feed it (out of 4 physical and 4 hyperthreaded cores). The CPU runs at 100% 24/7 and the GPU runs between 90% and 95%. My question is should I dedicate a larger percentage of the CPU to feed my GPU to maximize output? If so, what would the more experienced crunchers recommend? I'm currently stuck around 37,000 RAC and would like to see if I can do anything to increase my RAC even further. http://setiathome.berkeley.edu/show_host_detail.php?hostid=6421575 ID: 1227260 ·

Fred J. Verster Volunteer tester Send message Joined: 21 Apr 04 Posts: 3252 Credit: 31,903,643 RAC: 0	Message 1227287 - Posted: 4 May 2012, 18:49:37 UTC - in response to Message 1227260. Last modified: 4 May 2012, 18:56:53 UTC I've currently got an i7-2600k running at 4.1 GHz and a slightly overclocked GTX-570. The GPU is running two WUs at a time and each WU has 5% of a CPU core dedicated to feed it (out of 4 physical and 4 hyperthreaded cores). The CPU runs at 100% 24/7 and the GPU runs between 90% and 95%. My question is should I dedicate a larger percentage of the CPU to feed my GPU to maximize output? If so, what would the more experienced crunchers recommend? I'm currently stuck around 37,000 RAC and would like to see if I can do anything to increase my RAC even further. http://setiathome.berkeley.edu/show_host_detail.php?hostid=6421575 IMHO, dedicating 1 CPU core to 1 GPU is a waste of resources. Your GPU will be a bit faster, but you will lose the performance of 1 CPU core. Your RAC will certainly not benefit from this; looking at the GPU alone you may gain some, if the GPU can handle a more constant and higher load. I tried this at Milkyway and found the GPUs going into thermal shutdown: at an almost constant 100% load I gained 10 seconds (106 instead of 116 seconds), but only for a short time! GPU temps reached 101 C! (1 core = 2 threads, 1 for each GPU; EAH5870!) Even the high-end cards are not made for 100% load, 24 x 7, 365(6) days. Liquid cooling becomes a necessity! I would focus on getting enough LOAD on the GPU! And what is there to gain from an almost unused CPU when the GPU is loaded to the max? I've noticed the computer becomes annoyingly unresponsive, especially with screen (GUI) update related issues. ID: 1227287 ·

ivan Volunteer tester Send message Joined: 5 Mar 01 Posts: 783 Credit: 348,560,338 RAC: 223	Message 1227291 - Posted: 4 May 2012, 18:54:35 UTC - in response to Message 1227260. I've currently got an i7-2600k running at 4.1 GHz and a slightly overclocked GTX-570. The GPU is running two WUs at a time and each WU has 5% of a CPU core dedicated to feed it (out of 4 physical and 4 hyperthreaded cores). The CPU runs at 100% 24/7 and the GPU runs between 90% and 95%. My question is should I dedicate a larger percentage of the CPU to feed my GPU to maximize output? If so, what would the more experienced crunchers recommend? I'm currently stuck around 37,000 RAC and would like to see if I can do anything to increase my RAC even further. http://setiathome.berkeley.edu/show_host_detail.php?hostid=6421575 AFAIK that percentage is just indicative -- the programme will use as much (or as little) CPU as it needs to service the GPU -- mine are all set to 4%, but a quick glance at the Task Manager on my home PC shows it currently at 6% CPU for each task. At 90-95% GPU utilisation you might try running a third task on it to see if there's any increase in usage and RAC. ID: 1227291 ·

Gundolf Jahn Send message Joined: 19 Sep 00 Posts: 3184 Credit: 446,358 RAC: 0	Message 1227311 - Posted: 4 May 2012, 19:14:59 UTC - in response to Message 1227291. AFAIK that percentage is just indicative -- the programme will use as much (or as little) CPU as it needs to service the GPU -- Correct. Only when the sum of percentages of all concurrently running GPU tasks reaches 100% will one CPU (core) be reserved for feeding the GPUs. Gruß, Gundolf ID: 1227311 ·
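The rule Gundolf describes can be sketched in a few lines. This is only an illustration of the rule as stated in the post, not BOINC's actual scheduler code; the function name `reserved_cpu_cores` is made up for the example, and percentages are kept as whole numbers to avoid floating-point rounding.

```python
def reserved_cpu_cores(gpu_task_cpu_percents):
    """Whole CPU cores set aside for feeding the GPU.

    Each entry is the nominal CPU percentage of one concurrently
    running GPU task. A core is only reserved once the sum of those
    percentages reaches 100 (the rule as stated in the thread).
    """
    return sum(gpu_task_cpu_percents) // 100


# The original poster's setup: two GPU tasks at 5% each.
print(reserved_cpu_cores([5, 5]))      # sum is 10%, so no core is reserved
# A hypothetical setup where the percentages add up to exactly 100%.
print(reserved_cpu_cores([5] * 20))    # sum is 100%, so one core is reserved
```

With two tasks at 5% each the sum is far below 100%, which is why all eight logical CPUs keep running CPU work units on that host.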

Irok Send message Joined: 15 Jun 04 Posts: 20 Credit: 16,936,272 RAC: 0	Message 1227348 - Posted: 4 May 2012, 20:03:09 UTC - in response to Message 1227311. AFAIK that percentage is just indicative -- the programme will use as much (or as little) CPU as it needs to service the GPU -- Correct. Only when the sum of percentages of all concurrently running GPU tasks reaches 100% will one CPU (core) be reserved for feeding the GPUs. Gruß, Gundolf So just leave it be and be happy with 37,000 then? ID: 1227348 ·

Fred J. Verster Volunteer tester Send message Joined: 21 Apr 04 Posts: 3252 Credit: 31,903,643 RAC: 0	Message 1227388 - Posted: 4 May 2012, 21:09:39 UTC - in response to Message 1227348. Last modified: 4 May 2012, 21:17:34 UTC AFAIK that percentage is just indicative -- the programme will use as much (or as little) CPU as it needs to service the GPU -- Correct. Only when the sum of percentages of all concurrently running GPU tasks reaches 100% will one CPU (core) be reserved for feeding the GPUs. Gruß, Gundolf So just leave it be and be happy with 37,000 then? Yeah, what's wrong with that? (I would never OC a GPU!) I deliberately clock down GPU core and memory speed; it makes little difference to throughput (I hate the word RAC), and I want reliable results, not just fast ones! It is about scientific calculations, so they have to be exact, very exact; that's where CUDA and OpenCL come in :), speeding this up. Also the new AVX-supported CPUs, like Core i3 (XXXX)/i5-XXXX and i7-2400/3730/50 etc. ID: 1227388 ·

archae86 Send message Joined: 31 Aug 99 Posts: 909 Credit: 1,582,816 RAC: 0	Message 1227417 - Posted: 4 May 2012, 21:55:18 UTC - in response to Message 1227348. Gundolf wrote: So just leave it be and be happy with 37,000 then? You might find that using the preferences mechanism "On multiprocessors, use at most nn % of the processors" to cut the number of active CPU SETIs down might improve GPU output more than enough to pay for itself. The opportunity (or lack of it) depends on how much your GPU is waiting around for CPU service. Over on Einstein, for a rather different system running different applications, on careful comparison I found and posted that running 50% (two CPU tasks) on my i5-2500K with a GTX 460 running three concurrent tasks gave more total output than running either more or fewer CPU (or GPU) tasks. Short of this drastic measure, you might be able to get the CPU job to service the GPU more quickly by using one or another means of modifying CPU task priority. I'm not suggesting your different hardware running your different set of tasks will have the same curves--it won't. Just illustrating that careful comparison can help you find the best operating point for your situation. ID: 1227417 ·
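As a rough illustration of what the "use at most nn % of the processors" preference means in task counts, here is a small sketch. The rounding rule (round down, never below one) is an assumption made for the example, and `cpus_in_use` is a hypothetical helper, not a BOINC function.

```python
def cpus_in_use(logical_cpus, max_percent):
    """Number of logical CPUs BOINC would be allowed to use.

    Assumption for this sketch: the product is rounded down and
    at least one CPU is always left available to BOINC.
    """
    return max(1, (logical_cpus * max_percent) // 100)


# archae86's i5-2500K (4 cores, no hyperthreading) at 50%: two CPU tasks.
print(cpus_in_use(4, 50))
# The original poster's i7-2600K (8 logical CPUs) at a few candidate settings.
for percent in (100, 75, 50):
    print(percent, cpus_in_use(8, percent))
```

Stepping through a handful of settings like this, and recording output for each over a long enough period, is the kind of careful comparison the post recommends.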

Irok Send message Joined: 15 Jun 04 Posts: 20 Credit: 16,936,272 RAC: 0	Message 1227422 - Posted: 4 May 2012, 22:21:12 UTC - in response to Message 1227388. Yeah, what's wrong with that? (I would never OC a GPU!) I deliberately clock down GPU core and memory speed; it makes little difference to throughput (I hate the word RAC), and I want reliable results, not just fast ones! It is about scientific calculations, so they have to be exact, very exact; that's where CUDA and OpenCL come in :), speeding this up. Also the new AVX-supported CPUs, like Core i3 (XXXX)/i5-XXXX and i7-2400/3730/50 etc. I only overclocked my GPU to the same settings as the factory overclock (they call it superclocked) model. I'm not pushing it harder than that. I do have errors, but right now it's 3 WUs in the last three weeks for that GPU. I think that's acceptable considering that on average my GPU goes through 400+ WUs per day. I'd definitely scale it back if it was erroring out more often. ID: 1227422 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1227426 - Posted: 4 May 2012, 22:29:45 UTC - in response to Message 1227417. Gundolf wrote: So just leave it be and be happy with 37,000 then? You might find that using the preferences mechanism "On multiprocessors, use at most nn % of the processors" to cut the number of active CPU SETIs down might improve GPU output more than enough to pay for itself. The opportunity (or lack of it) depends on how much your GPU is waiting around for CPU service. Over on Einstein, for a rather different system running different applications, on careful comparison I found and posted that running 50% (two CPU tasks) on my i5-2500K with a GTX 460 running three concurrent tasks gave more total output than running either more or fewer CPU (or GPU) tasks. Short of this drastic measure, you might be able to get the CPU job to service the GPU more quickly by using one or another means of modifying CPU task priority. I'm not suggesting your different hardware running your different set of tasks will have the same curves--it won't. Just illustrating that careful comparison can help you find the best operating point for your situation. Absolutely true, but unfortunately no actual data can be inferred from the Einstein experience to guide the likely 'sweet spot' here. Only the process of experimentation and recording can be transferred. As Ivan said, the CPU %age displayed in BOINC Manager is purely a nominal value which helps BOINC schedule work. The actual demand on the CPU depends on the way the GPU application is programmed. In Einstein's case, something of the order of 20% of the application's actual work is programmed to be run on the CPU - so keeping a substantial CPU resource free to service the GPU makes sense. The two SETI applications are very different. 
The Multibeam x41g application you are running places very little demand on the CPU (although, having said that, your computer is showing substantial CPU timings for completed runs). The NVidia GPU application for Astropulse v6 - being tested at Beta - currently places a much higher demand on the CPU, but with further development that should be brought down to similar levels to the Multibeam application. ID: 1227426 ·

bill Send message Joined: 16 Jun 99 Posts: 861 Credit: 29,352,955 RAC: 0	Message 1227440 - Posted: 4 May 2012, 22:59:51 UTC - in response to Message 1227422. Yeah, what's wrong with that? (I would never OC a GPU!) I deliberately clock down GPU core and memory speed; it makes little difference to throughput (I hate the word RAC), and I want reliable results, not just fast ones! It is about scientific calculations, so they have to be exact, very exact; that's where CUDA and OpenCL come in :), speeding this up. Also the new AVX-supported CPUs, like Core i3 (XXXX)/i5-XXXX and i7-2400/3730/50 etc. I only overclocked my GPU to the same settings as the factory overclock (they call it superclocked) model. I'm not pushing it harder than that. I do have errors, but right now it's 3 WUs in the last three weeks for that GPU. I think that's acceptable considering that on average my GPU goes through 400+ WUs per day. I'd definitely scale it back if it was erroring out more often. You do realize that factory overclocked cards use better/more reliable parts than their lesser clocked cards? Factory overclocking doesn't guarantee that the card will be 100% reliable for Seti calculations. They run their cards for gamers, not scientific calculation. I have a Palit 460 that will throw errors on Seti units 4 to 5 times a day at factory settings. Underclock the memory from 2000 MHz to 1905 MHz and it's 100% reliable. That's still way faster than reference card specs but under factory settings. Reliability is better than fast with errors. ID: 1227440 ·

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1227462 - Posted: 5 May 2012, 0:05:16 UTC Last modified: 5 May 2012, 0:06:45 UTC I would try idling 1 core and see what happens (set BOINC to use 75% of CPU). I run only 3 cores on all of my quad rigs and leave 1 open to feed the GPUs more efficiently. Since the GPU is so much faster than the CPU, it does make sense to optimize its support. The question is whether the speedup in loading the GPU offsets the 4th core not crunching any WUs. You can't lose much by trying it for a couple of weeks and seeing if you can detect any trend in your RAC. Just remember that RAC can drift up and down a bit depending on the variety of WUs being processed. So, I would not halt the experiment if after a few days you see your RAC drop just a bit. You would have to give it 2-3 weeks to be sure. "Freedom is just Chaos, with better lighting." Alan Dean Foster ID: 1227462 ·
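Why 2-3 weeks? RAC is an exponentially decaying average, so it only gradually reflects a change in daily credit. The sketch below assumes the one-week half-life that BOINC documents for RAC and a clean step change in daily credit; the 39,000 target is a purely hypothetical figure, and `rac_after` is a name invented for the example.

```python
HALF_LIFE_DAYS = 7.0  # assumption: RAC half-life of one week


def rac_after(days, old_rate, new_rate):
    """Approximate RAC some days after daily credit steps from old_rate to new_rate.

    Simplified continuous model: the gap between the old and new
    steady-state values halves every HALF_LIFE_DAYS.
    """
    remaining_gap = 0.5 ** (days / HALF_LIFE_DAYS)
    return new_rate + (old_rate - new_rate) * remaining_gap


# Hypothetical: a change that would eventually lift RAC from 37,000 to 39,000.
for days in (3, 7, 14, 21):
    print(days, round(rac_after(days, 37_000, 39_000)))
```

Under these assumptions only half of any real gain is visible after one week and about seven eighths after three, on top of the normal drift from the mix of WUs, so a few days of data say very little.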

Irok Send message Joined: 15 Jun 04 Posts: 20 Credit: 16,936,272 RAC: 0	Message 1227513 - Posted: 5 May 2012, 2:17:28 UTC - in response to Message 1227440. You do realize that factory overclocked cards use better/more reliable parts than their lesser clocked cards? Factory overclocking doesn't guarantee that the card will be 100% reliable for Seti calculations. They run their cards for gamers, not scientific calculation. I have a Palit 460 that will throw errors on Seti units 4 to 5 times a day at factory settings. Underclock the memory from 2000 MHz to 1905 MHz and it's 100% reliable. That's still way faster than reference card specs but under factory settings. Reliability is better than fast with errors. I do realize that, but my WU error rate is less than 0.1% on the GPU. There isn't much room for improvement there. The two that are currently showing as errors have had multiple people error on them, so I'm pretty sure it's not my GPU that's causing it. The other errors were from WUs that were due 10 minutes after they were sent to me, which I do not believe is a problem on my end. I'm going to work under the assumption that my GPU is working at the current speeds. The GPU is running at 52% fan speed and is at 69 C with utilization in the low 90% range according to GPU-Z. My CPU is running all 8 cores at 4.1 GHz at 100% and the temps are around 55 C. I am not planning on a higher overclock on my GPU, nor do I plan on OCing my CPU more than it already is. ID: 1227513 ·
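The "less than 0.1%" figure can be checked against the numbers Irok gave earlier in the thread (3 errored WUs in three weeks, at 400+ WUs per day). A quick sketch, treating 400 per day as a flat rate, which is a simplification:

```python
errored_wus = 3        # errors reported over the period
days = 21              # "the last three weeks"
wus_per_day = 400      # lower bound quoted in the thread ("400+")

total_wus = days * wus_per_day
error_rate = errored_wus / total_wus

print(total_wus)               # 8400
print(f"{error_rate:.4%}")     # 0.0357%
```

Because 400 per day is a lower bound, the real rate would be at or below this value, comfortably under 0.1%.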

Josef W. Segur Volunteer developer Volunteer tester Send message Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0	Message 1227556 - Posted: 5 May 2012, 3:04:00 UTC - in response to Message 1227513. You do realize that factory overclocked cards use better/more reliable parts than their lesser clocked cards? Factory overclocking doesn't guarantee that the card will be 100% reliable for Seti calculations. They run their cards for gamers, not scientific calculation. I have a Palit 460 that will throw errors on Seti units 4 to 5 times a day at factory settings. Underclock the memory from 2000 MHz to 1905 MHz and it's 100% reliable. That's still way faster than reference card specs but under factory settings. Reliability is better than fast with errors. I do realize that, but my WU error rate is less than 0.1% on the GPU. There isn't much room for improvement there. The two that are currently showing as errors have had multiple people error on them, so I'm pretty sure it's not my GPU that's causing it. The other errors were from WUs that were due 10 minutes after they were sent to me, which I do not believe is a problem on my end. I'm going to work under the assumption that my GPU is working at the current speeds. Agreed, the two -12 errors are not a GPU fault; that's simply a code limitation in the CUDA applications. By itself, that's likely to give an error rate of 0.1%. The tasks that expired shortly after the servers tried to send them to your CPU are all .vlar tasks, expired because the "Resend lost work" BOINC code isn't prepared to switch resources. Neither of those conditions could be improved by reducing clock rates; they're outside your control. Joe ID: 1227556 ·

Terror Australis Volunteer tester Send message Joined: 14 Feb 04 Posts: 1817 Credit: 262,693,308 RAC: 44	Message 1228165 - Posted: 6 May 2012, 3:43:07 UTC Last modified: 6 May 2012, 3:53:51 UTC As you are running Win7 you will never achieve much more than 95% GPU usage even if running multiple tasks. As Jason_gee has explained in other threads, this is because the Win7 video driver model has more overhead than the XP version. Regarding your CPU question: the main bottleneck is the memory. As others have explained, the CPU requirements for a GPU unit are quite low. However, the data for your GPU units and any CPU units has to pass through the system memory. It isn't about memory quantity; it's memory speed that counts here. Looking at your results, your system is tuned pretty well. About the only thing that could help further is to adjust your overclock to raise your memory speed (even if it means a reduced CPU speed) or to install faster DIMMs. T.A. Edit: There is also some debate as to whether or not using hyperthreading is an advantage. I think it falls into the "Your results may vary" category. If your RAC is stable, you could try turning it off while leaving everything else "as is" and seeing if it makes a difference. ID: 1228165 ·

ausymark Send message Joined: 9 Aug 99 Posts: 95 Credit: 10,175,128 RAC: 0	Message 1250406 - Posted: 23 Jun 2012, 13:20:40 UTC - in response to Message 1227260. Hi Irok. My setup is similar to yours (i7 2600K overclocked to 4.5 GHz on air with an Nvidia 580), on 64-bit Ubuntu 12.04 Linux. The i7 is an interesting beast as it has 8 virtual cores, 4 real ones. This allows i7 users to do something quite unique as far as CPU/GPU computing goes. The core goal of CPU crunching is to use up to 100% of all cores to crunch (assuming the CPU has appropriate cooling), and the core goal of GPU crunching is to crunch as many GPU work units as the GPU memory can handle while still letting the user use the graphics ability of their machine without degradation to the 'user experience'. On the GPU front, for me that meant running just 2 work units on it at a time. This is primarily because I play games on the system, as well as use it for normal office/web tasks. (Running 3 work units caused runtime issues with games, as graphics memory became contested, resulting in GPU Seti work unit errors.) Now we come to the i7. The CPU must keep the data communication with the GPU occurring as quickly as possible, with as much data as required. This is where the i7 works well. I have Seti configured to use 50% of my CPU, that is, 4 cores. But why just 4 when I have 8 virtual? Simply because increasing it past 4 results in longer work unit processing times with very little RAC advantage. The operating system will schedule each CPU work unit on each of the 4 physical cores. Some will now argue that these 4 cores may not be operating at 100% all the time. Well, great! Your operating system needs some space to do other things besides Seti, so the system still remains fluid and responsive to all tasks, including Seti. Now this is also where the virtual cores come in. 
The operating system thinks there are 8 cores and will assign 'GPU data feeding' tasks for the GPU to one, or more, of the virtual cores not running a Seti CPU task. This ensures that fluid scheduling is given to the 'quieter' Seti CPU core and allows the GPU to get the workload communication it requires. This ends up being the best of all 3 worlds: 1) the real CPU cores being worked very hard; 2) the GPU running efficiently, with maximum communication to the CPU; 3) the PC itself remaining fluid for user interaction (a core goal of 'running Seti in the background'). If you look at my stats at the moment they don't look great at around just 6000 RAC, but that's with the PC running for only 3 to 4 hours a day at the moment, so if it was run 24/7 it would be around the 35K RAC mark. Anyway, that's just my 2c worth. :) Cheers Mark ID: 1250406 ·
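The 24/7 estimate at the end of the post is a simple linear extrapolation. A minimal sketch, assuming RAC scales in direct proportion to hours of uptime (which ignores RAC drift and any difference in work unit mix); `full_time_rac` is a name made up for the example:

```python
def full_time_rac(observed_rac, hours_per_day):
    """Scale an observed RAC to 24 hours/day, assuming linear scaling with uptime."""
    return observed_rac * 24 / hours_per_day


# Figures from the post: about 6000 RAC at 3 to 4 hours of running per day.
low_estimate = full_time_rac(6000, 4)    # machine on 4 h/day
high_estimate = full_time_rac(6000, 3)   # machine on 3 h/day
print(low_estimate, high_estimate)       # 36000.0 48000.0
```

That range of roughly 36K to 48K lines up with the "around 35K" figure quoted here as the low end.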

Khangollo Send message Joined: 1 Aug 00 Posts: 245 Credit: 36,410,524 RAC: 0	Message 1250463 - Posted: 23 Jun 2012, 16:39:02 UTC Last modified: 23 Jun 2012, 16:41:52 UTC Should be quite a bit more with 2600K and 580. Just for comparison, my i7 920 (oc @ 3.2GHz, HT off), GTX 570 (stock), Linux, was doing between 35-37K RAC (not right now, since I'm doing E@H). ID: 1250463 ·

ausymark Send message Joined: 9 Aug 99 Posts: 95 Credit: 10,175,128 RAC: 0	Message 1250768 - Posted: 24 Jun 2012, 6:34:29 UTC - in response to Message 1250463. My guesstimate was a worst-case scenario; it could be as high as 50K :) Cheers Mark ID: 1250768 ·

Irok Send message Joined: 15 Jun 04 Posts: 20 Credit: 16,936,272 RAC: 0	Message 1250927 - Posted: 24 Jun 2012, 16:17:43 UTC - in response to Message 1250768. Thanks for the info. Unfortunately I installed the 0040 BIOS update on my DZ68BC Intel board and knocked out the overclocking (this BIOS update either does that or bricks the motherboard, so I guess I'm lucky). I usually set the CPU to use 75% of the cores when I game and then switch it back when I'm done. I haven't seen any improvement when I run games with fewer than two CUDA units going at the same time, but I tend to play older games anyway. I'm averaging around 36k per day and I'm going to see what happens when Intel finally gets off its butt and releases a BIOS update that fixes what it broke. Their only action so far is to remove that update from their download page. ID: 1250927 ·
