Running One CPU Unit with Multiple GPU Units on a Multi-Processor CPU

Message boards : Number crunching : Running One CPU Unit with Multiple GPU Units on a Multi-Processor CPU

Cherokee150
Joined: 11 Nov 99
Posts: 192
Credit: 58,513,758
RAC: 74
United States
Message 1982233 - Posted: 26 Feb 2019, 0:17:11 UTC
Last modified: 26 Feb 2019, 0:25:00 UTC

Over the last three months I have extensively tested SETI GPU Units on my Core2Quad (6156281) computer with its NVIDIA GeForce GTX 1070 GPU. When running only GPU units and no CPU units it completes the highest number of SETI units per day when configured to simultaneously run four GPU units using SoG. However, I found that, while running four GPU units at a time with no CPU units running, as the CPU cycle demand from the GPU units varies, my CPU utilization will often drop from 100% to 75%, sometimes even 50%, and occasionally 25%.

When I allow BOINC to also run CPU units while running any combination of GPU units, BOINC always starts four CPU units (one per CPU). This takes too many CPU cycles from the GPU units, which reduces the overall daily SETI unit total.

It would appear that, if I could process only one or two CPU units simultaneously with the GPU units, the computer might be able to achieve a higher daily throughput. I would like to test this theory that running one or two CPU units simultaneously with the GPU units would increase overall SETI throughput. I have not, however, been able to determine how I can set up BOINC and SETI to run fewer than four CPU units concurrently while running four concurrent GPU units.

Is there a way to set this up?

If there is no current way to do this in the BOINC software, it might be in the best interest of the SETI project, and other projects that use GPU processing, to add this functionality.
ID: 1982233
TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1982234 - Posted: 26 Feb 2019, 0:31:57 UTC
Last modified: 26 Feb 2019, 0:40:36 UTC

BOINC Manager: Options ---> Computing Preferences --> "Computing" Tab

Set: "Use at Most" ---> 50% of CPUs
"Use at Most" ---> 50% of CPU Time


This will reduce CPU Usage by 1/2. IF your System is Quad Core AND has HyperThreading, this means that out of a Total of 8 Threads that 4 Threads will be used - 'Maximum' for CPU Crunching; the rest are freed for the System and your GPUs to use. IF NO HyperThreading, then 2 CPU Cores will be used for CPU Crunching, and 2 Cores will be freed for System and GPU Usage.
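
For anyone who wants to check the numbers, the arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `cpus_allowed` is made up for this example, and the round-down-with-a-minimum-of-one rule is an assumption based on what posters in this thread report, not something taken from the BOINC source.

```python
def cpus_allowed(logical_cpus: int, use_at_most_pct: int) -> int:
    """CPUs available for CPU crunching (assumed: round down, never below 1)."""
    return max(1, (logical_cpus * use_at_most_pct) // 100)


# Quad core WITH HyperThreading: 8 threads at 50% leaves 4 for CPU crunching.
print(cpus_allowed(8, 50))
# Quad core WITHOUT HyperThreading: 4 cores at 50% leaves 2.
print(cpus_allowed(4, 50))
# Quad core at 25% leaves a single CPU task running.
print(cpus_allowed(4, 25))
```

Integer arithmetic is used on purpose so that odd percentages do not pick up floating-point rounding surprises.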

To see and get to the above selections, make sure your BOINC Manager "View" is in "Advanced View".

[EDIT:]

Also, after making the changes noted above, go to BOINC Manager ---> Options: "Read Local Prefs" <<--- Select this.

This will make BOINC use your Local Preferences over the BOINC Web Preferences, and implement the above changes.

[EDIT 2:]

NOTE: I am on BOINC 7.8.6 on MacOS High Sierra 10.13.6 - 17G5019. I believe that these BOINC Manager options are available from 7.6.22 and Up.


TL
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1982234
Cherokee150
Joined: 11 Nov 99
Posts: 192
Credit: 58,513,758
RAC: 74
United States
Message 1982235 - Posted: 26 Feb 2019, 0:42:46 UTC - in response to Message 1982234.  

Thank you, TL. I will try those functions as you suggested and report back shortly.
ID: 1982235
TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1982237 - Posted: 26 Feb 2019, 1:08:33 UTC
Last modified: 26 Feb 2019, 1:15:38 UTC

This may be overkill, but you may also, (after making the changes and reading the Local Prefs into use), want to go to BOINC Manager ---> Projects' Tab, Select SETI@Home, and hit Update.

Again, this may be overkill and redundant, but will also guarantee that our changes are locked in and Updated.


TL

[EDIT:]

AND: BOINC Manager ---> Options ---> Computing Preferences

In the "When to Suspend" Section: Uncheck EVERYTHING!

If this was NOT done initially, then after making this change - Reread the Preferences AGAIN.

(AND, hitting Update again wouldn't hurt, either.)


TL
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1982237
Cherokee150
Joined: 11 Nov 99
Posts: 192
Credit: 58,513,758
RAC: 74
United States
Message 1982244 - Posted: 26 Feb 2019, 1:59:16 UTC - in response to Message 1982237.  

Thank you, TL. I have changed the "Use at most _% of the CPUs" from 100% to 25%. It works exactly as you described. That is what I was looking for!
Also, TL, thank you for the additional information. I am actually very familiar with almost all of the Computer Preferences options, both online and local. Unfortunately for me, I was not familiar with the "Use at most _% of the CPUs" option. I apparently misread the description a long time ago and, as we often do, never went back to double-check the description.

As for the "When to suspend" options, I have been devoted to SETI since I started back in 1999, so I have never set any "suspend" options. ;-)

I will now begin extensive performance testing with one, two and three CPU units running with the GPU units to determine which combination might provide greater daily throughput than the four GPU units alone.

Thank you again, TL!
ID: 1982244
bloodrain
Volunteer tester
Joined: 8 Dec 08
Posts: 231
Credit: 28,112,547
RAC: 1
Antarctica
Message 1982247 - Posted: 26 Feb 2019, 2:41:08 UTC

how would you set this up for a 16 or 32 core system?
ID: 1982247
TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1982253 - Posted: 26 Feb 2019, 3:18:42 UTC - in response to Message 1982247.  

how would you set this up for a 16 or 32 core system?

The same way... Set: "Use at Most" 50% CPUs
"Use at Most" 50% CPU Time

The "Use at Most: 50% CPUs" cuts the Total Threads/Cores in half to be used - 'Maximum'. So, in your case 16 Cores WITH HyperThreading = 32 Threads; SO, 16 Threads would be used for CPU Crunching.

The "Use at Most: 50% of CPU Time" reduces the Load on the System giving more back to System Response - smoother mouse use, quick response in switching Tabs in FireFox/Browser of Choice... If you want 'more aggressive' BOINC Usage, increase "Use at Most: x% of CPU Time" where x can be from 50% to 100%. The higher the %-age the more power you give to BOINC. I lock in at 50% to make my System more 'stable' and responsive. BOINC is a "Secondary" Usage Item for me. I don't Crunch 24/7 and when I Crunch, I still expect to be able to use my System effectively as I see fit.
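
A rough way to see how the two settings combine is to multiply them out. The sketch below is a simplification, not a description of how BOINC actually throttles: it treats "% of CPU Time" as a plain duty cycle, and the function name `core_equivalents` is hypothetical.

```python
def core_equivalents(logical_cpus: int, pct_cpus: int, pct_cpu_time: int) -> float:
    """Approximate full-time cores given to CPU crunching.

    Assumes the CPU count rounds down (minimum 1) and that the CPU-time
    percentage acts as a simple duty cycle on each of those CPUs.
    """
    usable = max(1, (logical_cpus * pct_cpus) // 100)
    return usable * pct_cpu_time / 100


# 32 threads, 50% of CPUs, 50% of CPU time: 16 threads, each busy about half the time.
print(core_equivalents(32, 50, 50))
# Same host with CPU time raised to 100%: 16 threads running flat out.
print(core_equivalents(32, 50, 100))
```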


TL
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1982253
Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1982257 - Posted: 26 Feb 2019, 4:29:36 UTC - in response to Message 1982247.  
Last modified: 26 Feb 2019, 4:34:02 UTC

how would you set this up for a 16 or 32 core system?


It depends on how many GPUs you need to feed. If you set up the "app_config.xml" file so that one core is reserved for each GPU, BOINC will devote however many cores are needed to your GPUs and let the rest crunch.
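
As a minimal sketch of what such a file might contain, the Python below writes out an app_config.xml body using the `gpu_usage` and `cpu_usage` elements. The app name `setiathome_v8` is an assumption here (check the app names listed in your own client_state.xml), and the helper `build_app_config` is made up for this example.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, tasks_per_gpu: int, cpus_per_gpu_task: float) -> str:
    """Return app_config.xml text that reserves CPU for each GPU task."""
    gpu_usage = 1 / tasks_per_gpu  # fraction of one GPU claimed by each task
    return (
        "<app_config>\n"
        "  <app>\n"
        f"    <name>{app_name}</name>\n"
        "    <gpu_versions>\n"
        f"      <gpu_usage>{gpu_usage:.2f}</gpu_usage>\n"
        f"      <cpu_usage>{cpus_per_gpu_task:.2f}</cpu_usage>\n"
        "    </gpu_versions>\n"
        "  </app>\n"
        "</app_config>\n"
    )


# One task per GPU, one full core reserved for each GPU task.
xml_text = build_app_config("setiathome_v8", tasks_per_gpu=1, cpus_per_gpu_task=1.0)
ET.fromstring(xml_text)  # raises ParseError if the text is not well-formed XML
print(xml_text)
```

Passing `tasks_per_gpu=4` would give a `gpu_usage` of 0.25, which is the usual way to express four tasks sharing one card.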

The other part of the issue depends on your setup. With Windows and the stock SoG GPU application, a high-end GPU card can run up to 3-4 tasks at once and still have excellent production.

Under Linux/CUDA9x-10 you can run at most 1 GPU task at a time, but it runs at least twice as fast as a Windows SoG task. An example would be 15 minutes for Windows/GTX 750 Ti and 5-7 minutes for Linux/CUDA91. This is not a guess. I have experienced both.

A good rule of thumb, however, is to never let your CPUs run at 100%. If you set your total available cores to 90% of your actual cores, it appears to improve production. (You can set the website defaults for your computer to 90% or do that with the local configuration option.)

HTH,
Tom
A proud member of the OFA (Old Farts Association).
ID: 1982257
Cherokee150
Joined: 11 Nov 99
Posts: 192
Credit: 58,513,758
RAC: 74
United States
Message 1982271 - Posted: 26 Feb 2019, 8:10:20 UTC - in response to Message 1982257.  
Last modified: 26 Feb 2019, 8:14:15 UTC

Thank you, Tom M.

A very likely explanation for why 90% CPU provides better production than 100% is CPU core temperature. Some of the high-end Intel processors run at very high temperatures. Even with good air cooling via a high-performance case and multiple fans, they often run up to 100 C or higher when run continually at 100%. This is higher than their rated temperatures. (I won't get into the question as to why Intel sells CPUs that are clocked so high from the factory that they run over their maximum temperature ratings. I think we can all guess at the reason.)

As a safety feature, Intel has their CPUs underclock automatically when any of the cores overheat. As an Intel CPU begins overheating, I have seen them initially underclock about 25% around 98 C - 100 C, progressing to as much as a 55% clock reduction at 103 C - 104 C. Reducing the CPU percentage lowers a critical core temperature, thus allowing the CPU to return to its rated clock speed.

It would be a good idea to use one of the free monitoring programs out there, like OCCT, to check your core temperatures. If your CPU is running too hot, you will find that you can quickly fine-tune your processing percentage to obtain the fully rated clock speed, not to mention greatly increasing the life of your CPU.

There are other ways you can reduce your CPU temperature. One is by cooling the computer's room. I find that a 2 C to 3 C drop in room temperature will do wonders to cool the CPU. Another way, if you wish, is to go the route of getting a CPU cooler. That, of course, is a bit more complicated and expensive. In any event, getting your CPU back to a temperature that allows full clock speed will greatly increase its performance.
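
To put rough numbers on that trade-off, here is a back-of-the-envelope comparison in Python. It assumes throughput scales linearly with both core count and clock speed, which is a simplification, and the 8-core host is hypothetical; only the 25% underclock figure comes from the observations above.

```python
def relative_throughput(cores_in_use: int, clock_fraction: float) -> float:
    """Throughput in 'full-speed core' units, assuming linear scaling."""
    return cores_in_use * clock_fraction


total_cores = 8  # hypothetical host

# All cores busy, but the CPU has underclocked by 25% because of heat.
throttled = relative_throughput(total_cores, 1 - 0.25)

# 90% of the cores busy (rounded down to 7), running at the full rated clock.
limited = relative_throughput((total_cores * 90) // 100, 1.0)

print(throttled, limited)
```

Under these assumptions the seven cool cores outproduce the eight throttled ones, which fits the idea that backing off from 100% can raise production.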

I hope this is helpful.
ID: 1982271
Zalster
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1982314 - Posted: 26 Feb 2019, 15:26:55 UTC - in response to Message 1982257.  


A 1080Ti running 3 SoGs at a time will finish them in anywhere from 9 to 12 minutes; the longest was 15 minutes, depending on where the work unit came from. That means an average of 3-4 minutes per SoG, with the longest at 5 minutes. Linux CUDA 91 will run a task in anywhere from 52 to 68 seconds.

I disagree with the use of "use at most". I believe you have a misunderstanding of how it actually works. If you plan on running a high volume of work units, then it's better to use an app_config and <max_concurrent> to limit the number of simultaneous work units to allow for better throughput, while having "use at most" set to 100%.

The only time to use a "Use at most" is if you are running a low thread count CPU and need to restrict it for the OS to have access to a thread for other things.
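
The run times quoted above can be turned into tasks per day with a short calculation. This is just arithmetic on the figures in this post; the function name `tasks_per_day` is made up, and it assumes the card is kept fed around the clock with no gaps between tasks.

```python
SECONDS_PER_DAY = 86_400


def tasks_per_day(concurrent: int, wall_seconds: float) -> float:
    """Tasks finished per day when `concurrent` tasks each take `wall_seconds`."""
    return concurrent * SECONDS_PER_DAY / wall_seconds


# Windows SoG, 3 tasks at a time, 9 to 12 minutes per batch.
print(round(tasks_per_day(3, 12 * 60)), "to", round(tasks_per_day(3, 9 * 60)))
# Linux CUDA 91, 1 task at a time, 52 to 68 seconds each.
print(round(tasks_per_day(1, 68)), "to", round(tasks_per_day(1, 52)))
```

That works out to roughly 360 to 480 tasks a day for SoG against roughly 1271 to 1662 for the CUDA app.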
ID: 1982314
Zalster
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1982317 - Posted: 26 Feb 2019, 15:30:48 UTC - in response to Message 1982271.  
Last modified: 26 Feb 2019, 15:31:59 UTC


Anyone running a high thread count CPU should not be dependent on air cooling to keep their CPU temps down. If you are paying that much for a CPU chip, then you'd best spend a little more and go for an aftermarket AIO to cool that CPU. I just invested in an upgrade from a Corsair 280mm AIO to an Alphacool 360mm AIO to keep the CPU temps down. If you go all in for a top-of-the-line CPU, then you might as well build yourself a fully water-cooled system. Ask Keith, Steve, Mark or any of the others that are running high thread count machines.
ID: 1982317
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1982363 - Posted: 27 Feb 2019, 0:50:31 UTC - in response to Message 1982317.  

I dunno, my 2x 10-core stay around 45C with only an air cooler and at 80-90% load.

It depends on your setup and which air cooler you’re using. At least with an air cooler there’s no pump to fail.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1982363
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1982370 - Posted: 27 Feb 2019, 1:21:33 UTC - in response to Message 1982314.  

...I disagree with the use of "use at most". I believe you have a misunderstanding of how it actually works. If you plan on running a high volume of work units, then it's better to use an app_config and <max_concurrent> to limit the number of simultaneous work units to allow for better throughput, while having "use at most" set to 100%.

The only time to use a "Use at most" is if you are running a low thread count CPU and need to restrict it for the OS to have access to a thread for other things.

It's pretty clear what it says it does. It's also easy to test whether it has any control over the GPU apps:
Computing
Usage limits
Use at most N % of the CPUs: Keeps some CPUs free for other applications. Example: 75% means use 6 cores on an 8-core CPU.
'Other applications' includes GPU applications, among other types.
The only apps that setting affects are the CPU apps. What else do you think it controls?
ID: 1982370
Zalster
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1982371 - Posted: 27 Feb 2019, 1:28:16 UTC - in response to Message 1982370.  


What it says and what it does are two different things.
I've tested it, just as others have, and it doesn't do what it says it does. It's never worked the way it was supposed to.
ID: 1982371
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1982375 - Posted: 27 Feb 2019, 1:42:55 UTC - in response to Message 1982371.  

It works as explained for me. Give an example where it doesn't control the number of CPUs used by the CPU applications, or where it controls some other application it's not supposed to. I've tested it extensively; it has absolutely no control over how much CPU the GPU apps use. The only settings that control how much CPU the GPUs use are application-specific controls such as -nobs and -use_sleep. There isn't any BOINC setting that controls how much CPU the GPUs use, but you can set the BOINC CPU control to 100% and starve the GPUs, as the BOINC default setting does.
ID: 1982375
bloodrain
Volunteer tester
Joined: 8 Dec 08
Posts: 231
Credit: 28,112,547
RAC: 1
Antarctica
Message 1982388 - Posted: 27 Feb 2019, 2:50:10 UTC - in response to Message 1982317.  

Like the other person said, it depends on the cooler and also the case. I don't run BOINC all day long on my PC.
ID: 1982388
mmonnin
Volunteer tester
Joined: 8 Jun 17
Posts: 58
Credit: 10,176,849
RAC: 0
United States
Message 1982389 - Posted: 27 Feb 2019, 3:21:07 UTC

The 'Use X % of CPUs' setting only affects how many tasks are running, GPU tasks included. As a test, I just set my E@H app_config to use 3.0 CPUs per task, and I am running two tasks in parallel. Setting it to anything less than 50% of CPUs stops one of the GPU tasks. I'm not sure why it's 50% though, as 3 CPU threads per task means 6 are required out of 8 threads on a 3770K. Anything less than 75% should have stopped the 2nd task. The only other thing running on that client is WUProp (NCI and 0.025 CPUs).
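
The expectation described above (second task stopping below 75%) corresponds to a simple budget model, sketched here in Python. This is the naive model only, not BOINC's real scheduling logic, and the function name `gpu_tasks_that_fit` is hypothetical. The observed cutoff was 50%, so the client evidently does something this model does not capture.

```python
def gpu_tasks_that_fit(logical_cpus: int, use_at_most_pct: int,
                       cpus_per_gpu_task: float, gpu_tasks_wanted: int) -> int:
    """Naive model: GPU tasks run only while their CPU reservations fit the budget."""
    allowed = max(1, (logical_cpus * use_at_most_pct) // 100)
    return min(gpu_tasks_wanted, int(allowed // cpus_per_gpu_task))


# 8 threads (3770K), 3.0 CPUs reserved per GPU task, 2 GPU tasks wanted.
for pct in (100, 75, 74, 50, 49):
    print(pct, gpu_tasks_that_fit(8, pct, 3.0, 2))
```

In this model both tasks run at 75% and above, and only one runs from 74% downward, which is the behaviour that was expected but not seen.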

How much actual CPU is used by the GPU app's exe is totally up to the app, whether it's CUDA vs OpenCL or whether it has other params like -nobs or SWAN_SYNC.
ID: 1982389
Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1982390 - Posted: 27 Feb 2019, 3:36:18 UTC - in response to Message 1982389.  

+1
What mmonnin said. The CPU core count is rounded to a whole integer.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1982390
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1982394 - Posted: 27 Feb 2019, 3:54:47 UTC - in response to Message 1982389.  
Last modified: 27 Feb 2019, 4:12:03 UTC

This isn't the E@H app board; please show that in SETI.
Apparently different projects use different settings. Test it in SETI.
Note the CPU use;

ID: 1982394
Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1982396 - Posted: 27 Feb 2019, 4:06:25 UTC - in response to Message 1982394.  
Last modified: 27 Feb 2019, 4:09:49 UTC

Since project_max_concurrent and max_concurrent can no longer be used together with a gpu_exclude, I had to resort to the % of CPUs setting to get my desired 11 tasks in total running on my Ryzen hosts: 8 CPU tasks + 3 GPU tasks. I also use affinity to control where the CPU and GPU tasks run on the logical cores.

So I set the Ryzen 2700X hosts with 3 gpus each to 70% of 100% of 16 cores = 11.2 cores. Rounded down to whole integer. I have 8 cpu tasks running on even processor cores and 3 gpu tasks supported on 3 odd logical cpu cores. Total 11 tasks.

For the Threadripper 2920X host with 4 gpus, I have cpu % set to 68% of 100% of 24 cores = 16.32 cores. Rounded down to whole integer. I have 12 cpu tasks running on even processor cores and 4 gpu tasks supported on 4 odd logical cpu cores. Total 16 tasks.

That is the way it actually works, not the way the description says it's supposed to work.
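
The two host setups above can be reproduced with a short Python sketch. It follows the behaviour described in this post (the percentage caps the total task count, rounded down, with GPU tasks counted in that total); the function name `plan_host` and the specific odd cores chosen are illustrative assumptions, and the sketch only plans the split rather than setting any affinity.

```python
def plan_host(logical_cpus: int, use_at_most_pct: int, gpus: int):
    """Split the allowed task count into CPU and GPU tasks and assign cores."""
    total_tasks = (logical_cpus * use_at_most_pct) // 100  # rounded down
    cpu_tasks = total_tasks - gpus
    even_cores = list(range(0, logical_cpus, 2))[:cpu_tasks]  # CPU tasks
    odd_cores = list(range(1, logical_cpus, 2))[:gpus]        # GPU support
    return total_tasks, cpu_tasks, even_cores, odd_cores


# Ryzen 2700X: 16 logical cores, 70%, 3 GPUs.
print(plan_host(16, 70, 3))
# Threadripper 2920X: 24 logical cores, 68%, 4 GPUs.
print(plan_host(24, 68, 4))
```

The first call gives 11 tasks in total (8 CPU + 3 GPU) and the second gives 16 (12 CPU + 4 GPU), matching the figures above.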
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1982396

 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.