Reserve CPU core for GPU Tasks

Questions and Answers : Preferences : Reserve CPU core for GPU Tasks
Team Bill

Joined: 8 Dec 05
Posts: 3
Credit: 11,419,898
RAC: 73
United States
Message 2028444 - Posted: 19 Jan 2020, 4:44:52 UTC

Problem

I'm currently seeing a lot of opencl_nvidia_SoG tasks that come to my GPU, run for a few minutes, and then fail. I suspect it's because I'm running my CPU at 100%, which is causing computational errors in the WU. I want to reserve 1 core for the GPU (Nvidia/Intel GPU) jobs that run on my machine and let the other 5 cores (10 threads) run CPU jobs for Seti.

Currently, my system is running 12 CPU jobs, 1 GPU job on the Nvidia card, and 1 GPU job on the integrated Intel GPU (14 jobs running simultaneously).

I know I need to configure an app_config.xml file and have tried playing around with it, but I don't understand how to get the settings right. Can someone help me out? I've detailed my CPU/GPU below. I've got a massive CPU cooler and have no issues running at 100% (I've been doing it for years), but I would welcome some suggestions on tuning.

Ideal State

I'd like to run 10 concurrent CPU jobs, 1 GPU job for the Nvidia card, and 1 GPU job for the Intel GPU (12 concurrent jobs in total), but I'm open to suggestions.

Thanks for your help!

Current Setup
    Intel i7 8700k (6 core/12 thread with graphics enabled for intel GPU compute)
    Nvidia 2080


Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 2028449 - Posted: 19 Jan 2020, 5:16:07 UTC

Your problem is your video driver: anything later than v436.60 will result in errors on some Arecibo WUs. Go back to v431.60 or earlier & everything will be good. Since you are running Win10 Pro, you can make use of the Group Policy Editor to stop Windows from changing the driver whenever it gets the urge to do so.

Reserving a CPU core to support the GPU will improve your CPU & GPU processing times. And not using the iGPU (Intel GPU) will improve your CPU processing times and boost your system output by a big amount.
One of your CPU WU Runtimes-
Run time 3 hours 55 min 58 sec
CPU time 3 hours 14 min 6 sec

There shouldn't be more than a few minutes' difference between the CPU time and the Run time. Stop processing on the iGPU and reserve a core to support the GPU; not only will that difference in processing times shrink, but the actual CPU time should be a lot less. With that CPU & that application, I would expect it to be closer to 2 hours.
Another way to get really poor CPU performance is if your memory modules aren't in the correct slots (if not all slots are in use).
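
To put a number on the Run time versus CPU time comparison above, here is a minimal Python sketch. The helper name `to_seconds` is made up for illustration, and it only assumes the "N hours N min N sec" wording shown on the task page.

```python
import re

def to_seconds(text: str) -> int:
    """Convert a string like '3 hours 55 min 58 sec' to whole seconds."""
    units = {"hour": 3600, "min": 60, "sec": 1}
    total = 0
    # Each match is a (number, unit) pair, e.g. ('55', 'min').
    for value, unit in re.findall(r"(\d+)\s*(hour|min|sec)", text):
        total += int(value) * units[unit]
    return total

run_time = to_seconds("3 hours 55 min 58 sec")
cpu_time = to_seconds("3 hours 14 min 6 sec")

gap = run_time - cpu_time        # wall-clock time the task was not on a CPU
cpu_share = cpu_time / run_time  # fraction of the run spent computing

print(f"gap: {gap // 60} min {gap % 60} sec")
print(f"CPU share of run time: {cpu_share:.0%}")
```

For the task quoted above the gap works out to roughly 42 minutes, which is far beyond the "few minutes" you would want to see.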


You can use the project web settings to free up a CPU core, but I prefer to use all of my cores & threads, so I use a file (located in the C:\ProgramData\BOINC\projects\setiathome.berkeley.edu folder) named
app_config.xml
It's easiest just to open a new file using Notepad, copy the settings below, and then Save as app_config.xml (make sure no .txt on the end). Then using the BOINC Manager, just select "Options, Read config files" & it will use those settings.

<app_config>
 <app>
  <name>setiathome_v8</name>
  <gpu_versions>
   <gpu_usage>1.0</gpu_usage>
   <cpu_usage>1.0</cpu_usage>
  </gpu_versions>
 </app>
 <app>
  <name>astropulse_v7</name>
  <gpu_versions>
   <gpu_usage>1.0</gpu_usage>
   <cpu_usage>1.0</cpu_usage>
  </gpu_versions>
 </app>
</app_config>
To go back to using the default configuration, just delete that file, exit & then restart BOINC.
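
If the file doesn't seem to take effect, it can help to confirm that it is well-formed before asking BOINC to read it. The following is only a sketch: it uses Python's standard XML parser rather than anything from BOINC, the function name `summarize` is made up for illustration, and the sample text is the config shown above pasted into a string.

```python
import xml.etree.ElementTree as ET

# The app_config.xml contents from this post, pasted in as a string.
SAMPLE = """<app_config>
 <app>
  <name>setiathome_v8</name>
  <gpu_versions>
   <gpu_usage>1.0</gpu_usage>
   <cpu_usage>1.0</cpu_usage>
  </gpu_versions>
 </app>
 <app>
  <name>astropulse_v7</name>
  <gpu_versions>
   <gpu_usage>1.0</gpu_usage>
   <cpu_usage>1.0</cpu_usage>
  </gpu_versions>
 </app>
</app_config>"""

def summarize(xml_text: str) -> dict:
    """Return {app name: (gpu_usage, cpu_usage)} for each <app> entry."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError if a tag is unclosed
    if root.tag != "app_config":
        raise ValueError("root element should be <app_config>")
    summary = {}
    for app in root.findall("app"):
        name = app.findtext("name")
        gpu_usage = app.findtext("gpu_versions/gpu_usage")
        cpu_usage = app.findtext("gpu_versions/cpu_usage")
        if name is None or gpu_usage is None or cpu_usage is None:
            raise ValueError("each <app> needs name, gpu_usage and cpu_usage")
        summary[name] = (float(gpu_usage), float(cpu_usage))
    return summary

for app_name, (gpu, cpu) in summarize(SAMPLE).items():
    print(f"{app_name}: gpu_usage={gpu}, cpu_usage={cpu}")
```

A parse error here usually means a mistyped or unclosed tag, which is the first thing to rule out.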


You could also make use of some command line settings to get a lot more work out of that RTX 2080, but first I would sort out the driver issue, disable the iGPU processing, and reserve a core to support the GPU. Once all that is done, you can look at the command line settings.
Grant
Darwin NT
Team Bill

Joined: 8 Dec 05
Posts: 3
Credit: 11,419,898
RAC: 73
United States
Message 2028586 - Posted: 20 Jan 2020, 1:18:12 UTC - in response to Message 2028449.  

Beautiful! All seems to be working as expected. I'm going to let it burn in like this for a bit and then look at specific tuning for the 2080. Giving it a full CPU seems to have improved performance already.

I had a heck of a time downgrading my driver, but some googling got me what I needed to know. I'm not typically a Windows guy, but sometimes I use this computer for other things and haven't quite been able to escape its grasp. Thanks for the help with the app_config.xml file. I had been fiddling with it before with no luck; you gave me exactly what I needed to know to get it implemented.

I'm now running 11 CPU tasks and 1 NVIDIA task (with 1 CPU reserved for it) at the same time. The CPU jobs seem to take about 3-4 hours and the opencl_nvidia_SoG jobs 2-3 minutes. I could probably play around with optimizing my CPU/MB configuration. I typically don't like to push the voltages on it since I usually run the rig at 100% for either SETI or video encoding.
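
The 11 + 1 split follows from simple arithmetic on the thread count. Here is a minimal Python sketch of that arithmetic; it is a simplified model and not BOINC's actual scheduler, and the function name `cpu_task_slots` is made up for illustration.

```python
import math

def cpu_task_slots(threads: int, gpu_tasks: int, cpu_per_gpu_task: float) -> int:
    """Threads left for CPU tasks after reserving CPU for running GPU tasks.

    Simplified model: every GPU task reserves `cpu_per_gpu_task` threads
    (the cpu_usage value from app_config.xml) and every CPU task uses one.
    """
    reserved = gpu_tasks * cpu_per_gpu_task
    return max(0, math.floor(threads - reserved))

# i7 8700k: 12 threads, 1 NVIDIA task with cpu_usage 1.0
print(cpu_task_slots(12, 1, 1.0))  # 11
# The "ideal state" from the first post: NVIDIA + Intel GPU, one thread each
print(cpu_task_slots(12, 2, 1.0))  # 10
```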

Thanks again for your help and detailed response! It was much appreciated!
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 2028621 - Posted: 20 Jan 2020, 7:54:09 UTC - in response to Message 2028586.  

"I typically don't like to push the voltages on it since I usually run the rig at 100% either for SETI or video encoding."
No need to muck around with voltages unless you want to overclock.
I personally don't see any value in it these days as all modern CPUs effectively overclock themselves (turbo boost etc).
Getting much more speed out of the CPU results in much higher power consumption, for what is only a modest boost in performance.

Unfortunately, the present server issues mean that when you look at your task list for details on how things are going, it's almost a day behind.
Ideally, there should only be a few minutes' difference between CPU time & Run time for your CPU WUs (check out the tasks on my Windows system, although the present server issues may make that impossible). And your Run time (with the stock applications you are using) should probably be around 2 hours for Arecibo VLARs, and an hour and a half for most GBT WUs (blc**_2bit_guppi_ etc.).
Unless, in your Computing preferences under Usage Limits, you have set "Use at most 100 % of CPU time" to something less than 100%?
Grant
Darwin NT
Team Bill

Joined: 8 Dec 05
Posts: 3
Credit: 11,419,898
RAC: 73
United States
Message 2028701 - Posted: 21 Jan 2020, 2:16:04 UTC - in response to Message 2028621.  

Well, I think I found the problem with my CPU. Digging around in the Task Manager, I found a process called "LightingService" that was eating up 3-7% of the CPU. A quick Google search showed it was the ASUS application that runs the RGB lights over the VRMs on my motherboard. Apparently it's a known issue with the ASUS driver, so I uninstalled it. I don't really care about lights when they're blocked by my CPU cooler.

I've been pushing tasks up but haven't been able to look at them to see if their CPU/Run times are closer. Based on what I am seeing for run times, though, it seems to have helped a lot. I'll take a look when things catch up on the server and let you know. Look at that, I've got some results. Seems to be a lot closer.

Run time (sec) 	CPU time (sec) 	Credit 	Application
4,126.94 	4,036.66 	54.70 	SETI@home v8 v8.00    windows_intelx86
4,182.28 	4,075.47 	77.53 	SETI@home v8 v8.00    windows_intelx86

I'll take another look through my process table and see if I can disable some services and keep other things from starting. I bet I can tighten those times up even more.

Thanks again for your help!
Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2028714 - Posted: 21 Jan 2020, 4:23:00 UTC - in response to Message 2028701.  
Last modified: 21 Jan 2020, 4:26:21 UTC

I wouldn't worry about it. The run_time versus the cpu_time is within 90 seconds. Yes, you could reduce the cpu loading by dropping one more cpu task, but then your cpu RAC production would drop.

I shoot for a 3-4 minute differential in run_times versus cpu_times as that is the most efficient loading and still keeps the cpu temps in check.

What you want to avoid is where the run_time exceeds the cpu_time by 15 minutes or more, as you are missing out on reporting more tasks per hour.
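
As a rough illustration of that rule of thumb, here is a minimal Python sketch that applies the 15 minute limit to the two results posted earlier in the thread. The function name `classify_gap` and the "ok"/"investigate" labels are made up for illustration.

```python
def classify_gap(run_time_s: float, cpu_time_s: float, limit_s: float = 15 * 60):
    """Return (gap in seconds, verdict) for one CPU task."""
    gap = run_time_s - cpu_time_s
    verdict = "ok" if gap < limit_s else "investigate"
    return gap, verdict

# (run time, CPU time) in seconds, taken from the results posted above
tasks = [(4126.94, 4036.66), (4182.28, 4075.47)]

for run_time_s, cpu_time_s in tasks:
    gap, verdict = classify_gap(run_time_s, cpu_time_s)
    print(f"gap {gap:.2f} s -> {verdict}")
```

Both tasks come out well under two minutes, so nothing needs chasing there.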

Also look into using a more efficient cpu application that is provided in the Lunatics installer. The applications are as much as 200% faster than what the stock cpu applications offer.
https://setiathome.berkeley.edu/forum_thread.php?id=85086&postid=2028438
Lunatics installer
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 2028723 - Posted: 21 Jan 2020, 6:25:48 UTC - in response to Message 2028701.  
Last modified: 21 Jan 2020, 6:28:26 UTC

"I'll take another look through my process table and see if I can disable some services and keep other things from starting. I bet I can tighten those times up even more."
As Keith said, I wouldn't worry about it.
If you've reserved a core and you're still getting a difference of 10 minutes or more between Run time & CPU time, then use Task Manager (or better yet, Process Explorer) to see what else is making use of your CPU besides Seti.
Windows services only make use of CPU time when they need it; 3rd party add-ons tend to be the worst offenders for sucking up CPU resources.
Grant
Darwin NT

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.