Setting up Linux to crunch CUDA90 and above for Windows users

Message boards : Number crunching : Setting up Linux to crunch CUDA90 and above for Windows users

tazzduke
Volunteer tester
Joined: 15 Sep 07
Posts: 190
Credit: 28,269,068
RAC: 5
Australia
Message 1866384 - Posted: 9 May 2017, 8:48:37 UTC - in response to Message 1866376.  

Greetings Stephen

Well we meet again lol, yeah I got it bad this time lol.

This time I am determined to stay on Linux on the two pcs already converted. lol

At least I got the GTX 980 and the GTX 950 up on the special app.

Regards
ID: 1866384

Stephen "Heretic" · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1866408 - Posted: 10 May 2017, 1:24:03 UTC - in response to Message 1866384.  

> Greetings Stephen
>
> Well we meet again lol, yeah I got it bad this time lol.
>
> This time I am determined to stay on Linux on the two pcs already converted. lol
>
> At least I got the GTX 980 and the GTX 950 up on the special app.
>
> Regards


. . I am running with the special sauce on 2 machines, one with 2 x 1060s and the other with 1 x 1050ti, both doing well. A third machine with 2 x 970s is new and just settling in with Windows and SoG. When I am happy with it I will swing that over too. The new version from Petri, 41p-zi3t2b, is doing very nicely on the 1060s. I didn't bother running TBar's version there because I am not concerned about the machine being wholly consumed with crunching; that is what that machine is there for :)

. . I ran TBar's version on the 1050ti and it runs well. Surprisingly, it gets better results on GBT WUs, so perhaps they like it better when -bs is turned on. But it is slower on Arecibo tasks, so I have moved over to Petri's release and I am running it with -bs off. It is definitely quicker on Arecibo work, but sadly just a little slower on Guppis. TBar's version gets run times of 6.9 to 7.1 mins on BLC02s (I changed over before the BLC03s started), while Petri's takes 7.1 to 7.4 mins, though that includes BLC03 tasks as well, which take a few seconds longer than BLC02s. I think overall it is slightly better off on balance. Of course, if Arecibo work were to cease then I would probably swing it back to the TBar version. Those 10 to 15 secs per task add up over a day :)

. . I will be interested to see how the 970s perform when that machine gets some salsa, err special sauce. I am curious whether Pascal with its much higher clock speed (the 1060s) can outperform a GPU with more CUs and a lot more CUDA cores. The 970s are doing quite well on GBT tasks under SoG, so the results could be very interesting.

. . Have fun :)

Stephen

:)
ID: 1866408

JohnDK · Crowdfunding Project Donor · Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1866593 - Posted: 10 May 2017, 19:06:10 UTC - in response to Message 1866344.  

> hmm it should show up. Check PSensor preferences.
>
> Also might try GKrellM, I like it better for CPU usage. Also shows numerical GPU temps/fans.
> Also in terminal ... nvidia-smi -l
>
> This is from memory, so might be wrong ... In terminal
> sudo sensors ... to show info
> sudo sensors-detect ... check system for additional sensors, i.e. not just the defaults.

Thanks. The only thing working is using nvidia-smi -l, guess that will have to do :)
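
For anyone else stuck with only nvidia-smi, the looping output can be trimmed down to just temperature and fan speed. This is a rough sketch, assuming the installed driver's nvidia-smi accepts the --query-gpu fields index, temperature.gpu and fan.speed. The function names format_gpu_line and watch_gpus are made up for illustration.

```bash
#!/bin/bash

# Turn one CSV line from nvidia-smi ("index, temp, fan") into a readable line.
# format_gpu_line is a hypothetical helper name, not part of nvidia-smi.
format_gpu_line() {
    local index temp fan
    IFS=', ' read -r index temp fan <<< "$1"
    printf 'GPU %s: %s C, fan %s%%\n' "$index" "$temp" "$fan"
}

# Poll every N seconds (default 5) and print one line per GPU per sample.
# Assumes nvidia-smi is on the PATH and the fields below are supported.
watch_gpus() {
    local interval="${1:-5}"
    nvidia-smi --query-gpu=index,temperature.gpu,fan.speed \
               --format=csv,noheader,nounits -l "$interval" |
    while IFS= read -r line; do
        format_gpu_line "$line"
    done
}
```

After sourcing the file, running watch_gpus 10 in a terminal would sample every 10 seconds until interrupted with Ctrl+C.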
ID: 1866593

Stephen "Heretic" · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1869714 - Posted: 27 May 2017, 3:47:55 UTC

. . Hello everyone,

. . Just checking that everyone is alive :)

Stephen
ID: 1869714

Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1869728 - Posted: 27 May 2017, 5:39:17 UTC - in response to Message 1869714.  

> . . Hello everyone,
>
> . . Just checking that everyone is alive :)
>
> Stephen

Only those who haven't committed suicide yet. :-D


Sorry, only joking. ;-)

Cheers.
ID: 1869728

Stephen "Heretic" · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1869780 - Posted: 27 May 2017, 13:44:40 UTC - in response to Message 1869728.  


> > . . Just checking that everyone is alive :)
>
> Only those who haven't committed suicide yet. :-D
> Sorry, only joking. ;-)

. . So no Win10 users left then? :)

Stephen

:)
ID: 1869780

scocam
Joined: 28 Feb 17
Posts: 27
Credit: 15,120,999
RAC: 0
United States
Message 1869800 - Posted: 27 May 2017, 16:26:40 UTC
Last modified: 27 May 2017, 16:27:29 UTC

I'm losing RAC like mad. Not sure what changed but I'm out of town and it's driving me crazy. Looks like I've been losing RAC since Thursday. Anyone else?
ID: 1869800

Brent Norman · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1869816 - Posted: 27 May 2017, 18:14:45 UTC - in response to Message 1869800.  

Your output/day graph seems very similar in shape to mine, so I don't think there is a problem. Must just be the mixture of tasks coming out.
I looked and all 4 cards have been validating with similar times, so don't think anything is haywire.

Also PM'ing you
ID: 1869816

JohnDK · Crowdfunding Project Donor · Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1869817 - Posted: 27 May 2017, 18:26:32 UTC

Guess it's all those guppies we got recently that are pulling RAC down.
ID: 1869817

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1869843 - Posted: 27 May 2017, 21:51:21 UTC - in response to Message 1869800.  

> I'm losing RAC like mad. Not sure what changed but I'm out of town and it's driving me crazy. Looks like I've been losing RAC since Thursday. Anyone else?

Some GBT splitters were stuck on one file for one or several weeks. That's been fixed, so instead of almost all Arecibo work, it's now mostly GBT work. RAC for everyone will now fall.
Grant
Darwin NT
ID: 1869843

Stephen "Heretic" · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1869870 - Posted: 28 May 2017, 0:32:32 UTC - in response to Message 1869800.  

> I'm losing RAC like mad. Not sure what changed but I'm out of town and it's driving me crazy. Looks like I've been losing RAC since Thursday. Anyone else?

. . Me too! I have had a nasty ol' M$ "Creator's Edition" update on one machine, but that would only account for a drop in the day's returns of about 3 or 4 thousand. It has dropped way more than that, so something is happening elsewhere. Credit Screw is probably having one of its indigestion fits.

Stephen

??
ID: 1869870

Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1869885 - Posted: 28 May 2017, 3:13:47 UTC - in response to Message 1861922.  

Stephen,

>
>. . I didn't even hear about the -poll switch until I was well and truly into SoG and trying to learn its tuning options. If I had still been doing CUDA50 I would have given that a try.
>

I have a Windows box running stock SETI. It is also munching mostly Mcuda50 right now :(
If this -poll switch works on a Windows version of Mcuda50, which file does it go in, and where?

Thanks,
Tom
A proud member of the OFA (Old Farts Association).
ID: 1869885

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1869892 - Posted: 28 May 2017, 3:40:23 UTC - in response to Message 1869885.  

> I have a Windows box running stock SETI. It is also munching mostly Mcuda50 right now :(
> If this -poll switch works on a Windows version of Mcuda50, what file/where does it go?

It goes in the app_info.xml and if you get it wrong, you lose all your work.
It will require 1 CPU thread for each WU being crunched, and that thread doesn't get released for other (i.e. system) work. It does boost CUDA50 output, but nowhere near as much as just running SoG with a few settings as posted in other threads.
Grant
Darwin NT
ID: 1869892

Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1869943 - Posted: 28 May 2017, 10:55:18 UTC - in response to Message 1869892.  

> > I have a Windows box running stock SETI. It is also munching mostly Mcuda50 right now :(
> > If this -poll switch works on a Windows version of Mcuda50, what file/where does it go?
>
> It goes in the app_info.xml and if you get it wrong, you lose all your work.
> It will require 1 CPU thread for each WU being crunched, and that thread doesn't get released for other (i.e. system) work. It does boost CUDA50 output, but nowhere near as much as just running SoG with a few settings as posted in other threads.

Ahhhh, TY Grant. Sounds like it's not worth the risk. Be less trouble to install the Lunatics upgrade instead.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1869943

Stephen "Heretic" · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1869961 - Posted: 28 May 2017, 14:48:35 UTC - in response to Message 1869943.  


> > It goes in the app_info.xml and if you get it wrong, you lose all your work.
> > It will require 1 CPU thread for each WU being crunched, and that thread doesn't get released for other (i.e. system) work. It does boost CUDA50 output, but nowhere near as much as just running SoG with a few settings as posted in other threads.
>
> Ahhhh, TY Grant. Sounds like it's not worth the risk. Be less trouble to install the Lunatics upgrade instead.
> Tom


. . Lunatics 4.45 Beta6 is highly recommended ...

Stephen

:)
ID: 1869961

Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1870021 - Posted: 28 May 2017, 23:40:11 UTC - in response to Message 1869961.  

> . . Lunatics 4.45 Beta6 is highly recommended ...
>
> Stephen
>
> :)

;)

Yupper, and it is doing a super job of speeding up CPU processing on an older-generation 4 core/4 HT Xeon Windows box I have been banging around on....

Tom
A proud member of the OFA (Old Farts Association).
ID: 1870021

Jeff Buck · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1870253 - Posted: 31 May 2017, 3:40:52 UTC
Last modified: 31 May 2017, 4:06:05 UTC

As a longtime user of Precision X in Windows to control my NVIDIA GPU fan speeds/temperatures, I've been somewhat bummed that there didn't seem to be a similar tool in Linux. I did apply the "Coolbits" tweak, which allowed me to set a fixed fan speed, but that doesn't have a "fan curve" capability. Also, every time I rebooted, I had to manually go in and set my fan speeds for each GPU all over again. With 4 GPUs on one of my new Linux hosts and 3 on the other, that tended to get rather tedious.

So.......this afternoon, I finally decided to do some experimenting with a fan control script for Linux. Starting with a script called AutoFan by Murcio Filho and then melding it with bits and pieces that I learned from other fan control recommendations, I seem to have actually stumbled into something that works (at least for Ubuntu 14.04). Since others have expressed interest in a similar utility, here's what it looks like for a system with 2 GPUs.

#!/bin/bash

# Take manual control of each GPU's fan (requires the Coolbits tweak).
nvidia-settings -a "[gpu:0]/GPUFanControlState=1"
nvidia-settings -a "[gpu:1]/GPUFanControlState=1"

# Seconds between temperature samples.
interval=10

while true; do
	# GPU 0: read the core temperature (Celsius) and set a fan speed (percent).
	# The attribute strings are quoted so the shell cannot treat [..] as a glob.
	current_temp=$(nvidia-settings -t -q "[gpu:0]/GPUCoreTemp")
	if (("$current_temp" < 61)); then
		nvidia-settings -a "[fan:0]/GPUTargetFanSpeed=20"
	elif (("$current_temp" > 60)) && (("$current_temp" < 66)); then
		nvidia-settings -a "[fan:0]/GPUTargetFanSpeed=40"
	elif (("$current_temp" > 65)) && (("$current_temp" < 71)); then
		nvidia-settings -a "[fan:0]/GPUTargetFanSpeed=50"
	elif (("$current_temp" > 70)) && (("$current_temp" < 73)); then
		nvidia-settings -a "[fan:0]/GPUTargetFanSpeed=60"
	elif (("$current_temp" > 72)) && (("$current_temp" < 76)); then
		nvidia-settings -a "[fan:0]/GPUTargetFanSpeed=70"
	elif (("$current_temp" > 75)) && (("$current_temp" < 78)); then
		nvidia-settings -a "[fan:0]/GPUTargetFanSpeed=80"
	elif (("$current_temp" > 77)); then
		nvidia-settings -a "[fan:0]/GPUTargetFanSpeed=100"
	fi

	# GPU 1: same fan curve.
	current_temp=$(nvidia-settings -t -q "[gpu:1]/GPUCoreTemp")
	if (("$current_temp" < 61)); then
		nvidia-settings -a "[fan:1]/GPUTargetFanSpeed=20"
	elif (("$current_temp" > 60)) && (("$current_temp" < 66)); then
		nvidia-settings -a "[fan:1]/GPUTargetFanSpeed=40"
	elif (("$current_temp" > 65)) && (("$current_temp" < 71)); then
		nvidia-settings -a "[fan:1]/GPUTargetFanSpeed=50"
	elif (("$current_temp" > 70)) && (("$current_temp" < 73)); then
		nvidia-settings -a "[fan:1]/GPUTargetFanSpeed=60"
	elif (("$current_temp" > 72)) && (("$current_temp" < 76)); then
		nvidia-settings -a "[fan:1]/GPUTargetFanSpeed=70"
	elif (("$current_temp" > 75)) && (("$current_temp" < 78)); then
		nvidia-settings -a "[fan:1]/GPUTargetFanSpeed=80"
	elif (("$current_temp" > 77)); then
		nvidia-settings -a "[fan:1]/GPUTargetFanSpeed=100"
	fi

	sleep "$interval"
done

The script should be saved with a ".sh" extension and marked as Executable in the file permissions. I've got mine saved to the Desktop as "gpufanspeed.sh". Then, if you add that file to your Startup Applications, it should run continuously as long as you're logged on.
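
The same two steps can probably be done from a terminal instead of the GUI. This is only a sketch, assuming a desktop that honors autostart entries in ~/.config/autostart, which is where Ubuntu's Startup Applications tool keeps its entries. The function name install_fan_autostart and the entry name gpufanspeed.desktop are made up for illustration.

```bash
#!/bin/bash

# Mark the fan script executable and register it to start at login.
# install_fan_autostart is a hypothetical helper name.
#   $1 = full path to the script (a path containing spaces would need
#        quoting in the Exec line, which this sketch does not handle)
#   $2 = autostart directory (defaults to ~/.config/autostart)
install_fan_autostart() {
    local script="$1"
    local autostart_dir="${2:-$HOME/.config/autostart}"

    chmod +x "$script" || return 1
    mkdir -p "$autostart_dir" || return 1

    cat > "$autostart_dir/gpufanspeed.desktop" <<EOF
[Desktop Entry]
Type=Application
Name=GPU fan speed
Exec=$script
X-GNOME-Autostart-enabled=true
EOF
}
```

With the script on the Desktop, the call would be something like install_fan_autostart "$HOME/Desktop/gpufanspeed.sh". The entry should then show up in Startup Applications, where it can be checked or removed.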

ADDITIONAL NOTES:
1. You must have already applied the Coolbits tweak for fan control and verified that the slider has been successfully added to the NVIDIA Settings screen for the GPU(s) you want to control. (See Message 1856811 and earlier posts.)

2. The "interval" value is given in seconds. I'm using 10, as that seems to be a sufficiently frequent temperature sampling to me, but use whatever makes you happy. Temperatures are in Celsius. Fan speeds are a percentage, 0 to 100.

3. The fan curve "steps" are what I'm using on both my boxes to start with. They can be customized to suit, adjusting values and adding or deleting steps to fine tune them as desired.

4. This script handles 2 GPUs. For a single GPU system, simply eliminate the second block of code for [gpu:1] and [fan:1]. For additional GPUs (my boxes have 3 on one and 4 on the other), just duplicate that block as many times as necessary using [gpu:2], [gpu:3] and [fan:2], [fan:3] etc. This also applies to the lines setting the "GPUFanControlState" at the start of the script. There must be one for each GPU you want to control.

5. The "GPUTargetFanSpeed" attribute works for me on my box with NVIDIA driver 375.66 and, from what I've read online, seems to be the preferred choice for newer drivers. For some reason, however, on my other Linux box with driver 375.39, that attribute is unknown. If you run into that problem, substitute "GPUCurrentFanSpeed" wherever you see "GPUTargetFanSpeed".

6. I'm a Linux newbie, generally limited to "Monkey see, monkey [su]do". The above script is my first crack at scripting in a Linux environment and is offered without warranty of any kind. Use at your own risk (though I think that should be minimal if you monitor your results fairly closely at first).

EDIT: Expanded Note 4 to include reference to the "GPUFanControlState" lines at the top of the script.
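
For boxes with more GPUs, duplicating the block per Note 4 gets long. One possible alternative is to put the fan curve in a function and loop over the GPU indices. This is a sketch rather than a tested replacement. It assumes fan N belongs to GPU N, as in the script above, and the names fan_speed_for_temp, control_fans_once and FAN_ATTR are made up for illustration.

```bash
#!/bin/bash

# Attribute used to set the fan speed. Per Note 5, some drivers may need
# GPUCurrentFanSpeed instead; FAN_ATTR is a hypothetical variable name.
FAN_ATTR="${FAN_ATTR:-GPUTargetFanSpeed}"

# Map a temperature in Celsius to a fan speed percentage.
# The thresholds mirror the fan curve steps in the script above.
fan_speed_for_temp() {
    local temp="$1"
    if   (( temp <= 60 )); then echo 20
    elif (( temp <= 65 )); then echo 40
    elif (( temp <= 70 )); then echo 50
    elif (( temp <= 72 )); then echo 60
    elif (( temp <= 75 )); then echo 70
    elif (( temp <= 77 )); then echo 80
    else                        echo 100
    fi
}

# One pass over GPUs 0..count-1: read each temperature, set each fan.
control_fans_once() {
    local count="$1" gpu temp
    for (( gpu = 0; gpu < count; gpu++ )); do
        temp=$(nvidia-settings -t -q "[gpu:$gpu]/GPUCoreTemp") || continue
        # Skip this GPU if the query did not return a plain number.
        [[ "$temp" =~ ^[0-9]+$ ]] || continue
        nvidia-settings -a "[fan:$gpu]/$FAN_ATTR=$(fan_speed_for_temp "$temp")"
    done
}
```

In the script above, the body of the while loop would then shrink to control_fans_once 2 followed by the sleep, with 2 replaced by the number of GPUs. The GPUFanControlState lines at the top are still needed, one per GPU.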
ID: 1870253

Stephen "Heretic" · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1870256 - Posted: 31 May 2017, 4:48:27 UTC - in response to Message 1870253.  
Last modified: 31 May 2017, 5:02:30 UTC

> As a longtime user of Precision X in Windows to control my NVIDIA GPU fan speeds/temperatures, I've been somewhat bummed that there didn't seem to be a similar tool in Linux. I did apply the "Coolbits" tweak, which allowed me to set a fixed fan speed, but that doesn't have a "fan curve" capability. Also, every time I rebooted, I had to manually go in and set my fan speeds for each GPU all over again. With 4 GPUs on one of my new Linux hosts and 3 on the other, that tended to get rather tedious.
>
> So.......this afternoon, I finally decided to do some experimenting with a fan control script for Linux. Since others have expressed interest in a similar utility, here's what it looks like for a system with 2 GPUs.

. . Boy that is so much needed. Like yourself I find it tedious to have to reset the fan control every time I have to restart Linux. In Windows I use Afterburner which is a great tool that I have come to love and depend upon. I miss it so much in Linux. I will be giving this a try for sure.


> The script should be saved with a ".sh" extension and marked as Executable in the file permissions. I've got mine saved to the Desktop as "gpufanspeed.sh". Then, if you add that file to your Startup Applications, it should run continuously as long as you're logged on.
>
> ADDITIONAL NOTES:
> 1. You must have already applied the Coolbits tweak for fan control and verified that the slider has been successfully added to the NVIDIA Settings screen for the GPU(s) you want to control. (See Message 1856811 and earlier posts.)

. . If I understand this correctly, as long as the fan control tweak has been saved to the Nvidia X-server configuration, you do not need to manually start the X-server interface each time you log in? As long as this script auto-runs at startup you have nicely automated fan control in the background?

. . This is exactly what I started this thread for. This will be a great help, thanks.

Stephen

:)
ID: 1870256

Jeff Buck · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1870330 - Posted: 31 May 2017, 15:49:36 UTC - in response to Message 1870256.  

> . . If I understand this correctly, as long as the fan control tweak has been saved to the Nvidia X-server configuration, you do not need to manually start the X-server interface each time you log in? As long as this script auto-runs at startup you have nicely automated fan control in the background?

Yes, that's correct.
ID: 1870330

Stephen "Heretic" · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1870374 - Posted: 31 May 2017, 20:29:46 UTC - in response to Message 1870330.  

. . Wonderful! :)

Stephen

:)
ID: 1870374
 