NVIDIA fan controls (Linux MINT 18, Driver Version 367.57, two NVidia GTX 950s)

Profile ralphw
Volunteer tester

Joined: 7 May 99
Posts: 78
Credit: 18,032,718
RAC: 38
United States
Message 1831446 - Posted: 19 Nov 2016, 17:25:18 UTC
Last modified: 19 Nov 2016, 17:27:05 UTC

I'm running SETI@home 8.0 applications on Linux Mint 18 (based on Ubuntu 16.04).
GPU temps are extremely high, and I feel I need better speed control for the GPU fans.

>nvidia-smi
>Sat Nov 19 12:00:51 2016
>+-----------------------------------------------------------------------------+
>| NVIDIA-SMI 367.57                 Driver Version: 367.57                    |
>|-------------------------------+----------------------+----------------------+
>| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
>| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
>|===============================+======================+======================|
>|   0  GeForce GTX 950     Off  | 0000:01:00.0      On |                  N/A |
>|  0%   40C    P0    23W / 125W |    214MiB /  1996MiB |      0%      Default |
>+-------------------------------+----------------------+----------------------+
>|   1  GeForce GTX 950     Off  | 0000:02:00.0     Off |                  N/A |
>|  0%   25C    P8     6W / 125W |      1MiB /  1996MiB |      0%      Default |
>+-------------------------------+----------------------+----------------------+

....


Despite experimenting with the "Coolbits" setting in the xorg.conf file, the controls for setting fan speed never become active.
It also seems like the setting should go under the "Device" section, but the nvidia-settings CLI option puts it in the "Screen" section.
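
To see which section each "Coolbits" option actually ended up in, something like the following sketch may help. The function name `coolbits_sections` is made up for illustration, and the default path /etc/X11/xorg.conf is an assumption about where the active config lives. It only reads the file and does not say which section is the right one.

```shell
# Hypothetical helper: list each Coolbits option with its enclosing section.
# Usage: coolbits_sections [path-to-xorg.conf]
coolbits_sections() {
    awk '
        # Remember the name of the section we are currently inside.
        /^[ \t]*Section[ \t]/ { gsub(/"/, "", $2); section = $2 }
        /^[ \t]*EndSection/   { section = "" }
        # Report every Coolbits option (the X server ignores case in option names).
        /^[ \t]*Option[ \t]+"[Cc]ool[Bb]its"/ {
            gsub(/"/, "", $3)
            print section ": Coolbits=" $3
        }
    ' "${1:-/etc/X11/xorg.conf}"
}
```

Running `coolbits_sections` with no argument prints one line per option, for example a "Device: ..." line and a "Screen: ..." line if both sections carry the setting.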

I have two NVidia GTX 950s in this system.

When running SETI@home, GPU temps peak briefly at 90 degrees C, which is about 40 degrees too hot.
So I'm looking for a way to run the fans faster, or continuously, while GPU work units are being processed.
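
A rough sketch of what such a control could look like, assuming manual fan control is already unlocked (Coolbits value 4 or a sum that includes it) and an X server is running on the card. The temperature thresholds and the function names `fan_for_temp` and `apply_fan_curve` are invented for illustration. `GPUFanControlState` and `GPUTargetFanSpeed` are the nvidia-settings attributes used for manual fan speed on drivers of this generation, but whether they are writable depends on the setup.

```shell
# Hypothetical fan curve: map a GPU temperature (deg C) to a fan speed (%).
# The thresholds are illustrative guesses, not NVIDIA defaults.
fan_for_temp() {
    ft_temp=$1
    if   [ "$ft_temp" -ge 80 ]; then echo 100
    elif [ "$ft_temp" -ge 70 ]; then echo 85
    elif [ "$ft_temp" -ge 60 ]; then echo 70
    elif [ "$ft_temp" -ge 50 ]; then echo 55
    else                             echo 40
    fi
}

# Read the temperature of GPU 0 once and set its fan accordingly.
# Needs a running X server and manual fan control enabled via Coolbits.
apply_fan_curve() {
    ft_now=$(nvidia-smi --id=0 --query-gpu=temperature.gpu --format=csv,noheader,nounits)
    ft_speed=$(fan_for_temp "$ft_now")
    nvidia-settings -a "[gpu:0]/GPUFanControlState=1" \
                    -a "[fan:0]/GPUTargetFanSpeed=$ft_speed"
}
```

Calling `apply_fan_curve` from a loop with a `sleep` between iterations would give a crude adaptive controller; the floor of 40% keeps the fan spinning even when the card is idle.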
ID: 1831446
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1831456 - Posted: 19 Nov 2016, 18:55:03 UTC - in response to Message 1831446.  

ID: 1831456
Profile ralphw
Volunteer tester

Joined: 7 May 99
Posts: 78
Credit: 18,032,718
RAC: 38
United States
Message 1831536 - Posted: 20 Nov 2016, 3:44:39 UTC - in response to Message 1831456.  
Last modified: 20 Nov 2016, 3:46:34 UTC

Those look interesting, but the NVIDIA driver controls are supposed to be able to regulate this. Aside from googling (which I'm trying), how can I be sure the driver version is "proper" and supports the fan speed setting capability?
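
One way to check, sketched under the assumption that nvidia-smi and nvidia-settings are both installed and that X is running: read the driver version from the nvidia-smi banner, then ask the driver directly whether it exposes the fan control attribute. The function names `driver_version` and `fan_control_state` are hypothetical.

```shell
# Hypothetical helper: extract the driver version from nvidia-smi banner
# text supplied on stdin.
driver_version() {
    sed -n 's/.*Driver Version: *\([0-9.]*\).*/\1/p' | head -n 1
}

# Ask the driver for the manual fan control state of one GPU (default: 0).
# Requires a running X server. If the query errors out or the value cannot
# be changed, Coolbits is probably not in effect for that GPU.
fan_control_state() {
    nvidia-settings -q "[gpu:${1:-0}]/GPUFanControlState"
}
```

Typical use would be `nvidia-smi | driver_version` followed by `fan_control_state 0` and `fan_control_state 1`, one call per card.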
ID: 1831536
Profile ralphw
Volunteer tester

Joined: 7 May 99
Posts: 78
Credit: 18,032,718
RAC: 38
United States
Message 1831561 - Posted: 20 Nov 2016, 13:16:02 UTC - in response to Message 1831536.  
Last modified: 20 Nov 2016, 13:32:22 UTC

So I'm thinking there is a driver compatibility issue with Linux Mint 18 that might be preventing things from working. This guide shows the PPA install method on Linux Mint 18:
https://johners.tech/2016/07/installing-the-latest-nvidia-graphics-drivers-on-linux-mint-18/

I was using the PPA for graphics drivers, but am going to try switching to the .run installation method (downloading the driver directly from NVidia).
(This post talks about CUDA support, but recommends NOT using the PPA method - https://devtalk.nvidia.com/default/topic/955464/cuda-setup-and-installation/gtx-1080-cuda-8-0-on-linux-mint-18-problems-setting-up-/#reply)

Not sure if this will work any better, but I need to try something to combat the frustration of (works with 14.04 / doesn't work with MINT 18) syndrome.

Fan controls in NVIDIA seem important when I'm running GPU apps on SETI.
90 degree GPU temps seem too high.
ID: 1831561
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1831580 - Posted: 20 Nov 2016, 14:55:06 UTC - in response to Message 1831561.  
Last modified: 20 Nov 2016, 14:57:52 UTC

Can't you use Nvidia's own 375.20? I even wondered whether it might be possible to run Speedfan through Wine. (My girlfriend looked that up, and it seems Speedfan is garbage under Wine, so don't try that!)

90 degrees seems a tad high. Are all your case fans spinning a good storm?
ID: 1831580
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1831582 - Posted: 20 Nov 2016, 15:06:20 UTC

Okay, one option is http://gkrellm.srcbox.net/
SMP CPU monitor that can chart individual CPUs and/or a composite CPU.
Temperature, fan, and voltage sensor monitors if supported by the kernel and the mainboard hardware. Linux requires lm_sensors modules, sysfs sensors for kernels >= 2.6.0 or a running mbmon daemon. Sensors can also be read from mbmon on FreeBSD. On Linux, you can also monitor disk temperatures from the hddtemp daemon and nvidia GPU temperatures if nvidia-settings is installed.
Each sensor monitor has a configurable alarm and warning.


and https://wiki.archlinux.org/index.php/conky
Conky is a system monitor software for the X Window System. It is available for GNU/Linux and FreeBSD. It is free software released under the terms of the GPL license. Conky is able to monitor many system variables including CPU, memory, swap, disk space, temperature, top, upload, download, system messages, and much more. It is extremely configurable, however, the configuration can be a little hard to understand. Conky is a fork of torsmo.
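
For a quick check without installing either monitor, a small filter over nvidia-smi's CSV output can flag hot cards. This is only a sketch; the function name `hot_gpus` and the default limit of 80 are arbitrary choices, not values from either tool.

```shell
# Hypothetical helper: read "index, temperature" CSV lines on stdin and
# print the GPUs at or above a temperature limit (default 80 deg C).
hot_gpus() {
    awk -F', *' -v limit="${1:-80}" \
        '$2 + 0 >= limit + 0 { print "GPU " $1 ": " $2 "C" }'
}

# Typical use (needs the NVIDIA driver installed):
#   nvidia-smi --query-gpu=index,temperature.gpu \
#              --format=csv,noheader,nounits | hot_gpus 80
```

Run from cron or a `watch` loop, this gives a bare-bones version of the alarm feature described above.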
ID: 1831582
Profile ralphw
Volunteer tester

Joined: 7 May 99
Posts: 78
Credit: 18,032,718
RAC: 38
United States
Message 1831601 - Posted: 20 Nov 2016, 17:09:15 UTC - in response to Message 1831580.  

The problem seems to be that manual fan control is enabled by the "Coolbits" option in xorg.conf.

Problem 1 (solved): xorg.conf gets overwritten with new values upon reboot. Enabling nogpumanager in GRUB doesn't fix this in Linux Mint 18, so I had to resort to running chattr +i on the /etc/X11/xorg.conf file.

Problem 2 (working on): even though the line "Coolbits" "12" is now preserved for both device entries, only one NVidia device allows for manual control of fan speed. Working on a solution using this Folding@home page as a guide:

https://foldingforum.org/viewtopic.php?p=267165
It requires setting up multiple X Servers, one per GPU.
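
Once every GPU has its own X screen, the per-card commands might look like the output of the sketch below. It is a dry run that only prints the nvidia-settings command line. The function name `fan_command` is hypothetical, and it assumes one fan per card with fan index N belonging to GPU N, which is worth confirming with `nvidia-settings -q fans`.

```shell
# Hypothetical helper: print (not run) the nvidia-settings command that
# switches COUNT GPUs to manual fan control at SPEED percent.
# Assumes fan N belongs to GPU N, which may not hold on every system.
fan_command() {
    fc_count=$1
    fc_speed=$2
    fc_i=0
    fc_cmd="nvidia-settings"
    while [ "$fc_i" -lt "$fc_count" ]; do
        fc_cmd="$fc_cmd -a \"[gpu:$fc_i]/GPUFanControlState=1\""
        fc_cmd="$fc_cmd -a \"[fan:$fc_i]/GPUTargetFanSpeed=$fc_speed\""
        fc_i=$((fc_i + 1))
    done
    echo "$fc_cmd"
}
```

`fan_command 2 70` prints the command for both GTX 950s at 70%. Printing first makes it easy to review the command before pasting it into a terminal inside the X session.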

I'd prefer easier configuration options for multiple GPUs and multiple monitors, along with a more aggressive adaptive fan control system. I'm not overclocking, but SETI@home pushes GPUs hard.
ID: 1831601
