Setting up Linux to crunch CUDA90 and above for Windows users
Author | Message |
---|---|
Bruce Send message Joined: 15 Mar 02 Posts: 123 Credit: 124,955,234 RAC: 11 |
I just run: Hi Stephen, All GPUs are enabled. Each one is defined in the Device section. If you compare my 20-nvidia.conf device section with the device section from the xconfig file that has multiple devices defined, you will see that they are almost identical. (The spell checker changed Coolbits to Cool-bits in my post; my file actually says Coolbits.) The file works. Coolbits sort of works. The question is: why does it work on the first GPU but not the other three? It should work on all of them, but it doesn't. As far as running nvidia-xconfig, there is probably a good reason that it did not want to run on Mint 18.2. It would be the way to go on an older version of Linux though. There have probably been significant changes in some of the newer versions of Linux. Very puzzling. Let's keep those ideas coming. Bruce |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Some people have had trouble with the 384 driver with kernel 4.10. There doesn't appear to be any trouble with the Ubuntu repository version of 375.66 with kernel 4.10. My systems are using the 375.66 driver with the xorg.conf in the /etc/X11 location. They are using the same format they have used for months: each device has a screen section, and the Coolbits option is in each screen section. My device sections don't have a Coolbits option. The fan control works on all my GPUs: Section "Screen" Identifier "Screen0" Device "Device0" Monitor "Monitor0" DefaultDepth 24 Option "Coolbits" "28" Option "Stereo" "0" Option "nvidiaXineramaInfoOrder" "DFP-0" Option "metamodes" "nvidia-auto-select +0+0" Option "SLI" "Off" Option "MultiGPU" "Off" Option "BaseMosaic" "off" SubSection "Display" Depth 24 EndSubSection EndSection Section "Screen" Identifier "Screen1" Device "Device1" Monitor "Monitor1" DefaultDepth 24 Option "Coolbits" "28" Option "Stereo" "0" Option "nvidiaXineramaInfoOrder" "CRT-0" Option "metamodes" "nvidia-auto-select +0+0" Option "SLI" "Off" Option "MultiGPU" "Off" Option "BaseMosaic" "off" SubSection "Display" Depth 24 EndSubSection EndSection Your mileage may vary. |
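For anyone who would rather generate the per-GPU sections than type them, here is a minimal Python sketch of the layout described in the post above: one Monitor, Device and Screen section per GPU, with Coolbits in the Screen section only. The function name `build_sections` and the bus IDs in the example call are assumptions made up for illustration; take the real bus IDs from your own system.

```python
# Sketch only: emits matching Monitor, Device and Screen sections per GPU.
# build_sections() and the bus IDs passed in below are illustrative
# assumptions, not something taken from the thread.

def build_sections(bus_ids, coolbits="28"):
    """Return xorg.conf text with one Monitor/Device/Screen trio per bus ID."""
    blocks = []
    for i, bus_id in enumerate(bus_ids):
        blocks.append(
            f'Section "Monitor"\n'
            f'    Identifier "Monitor{i}"\n'
            f'EndSection\n'
        )
        blocks.append(
            f'Section "Device"\n'
            f'    Identifier "Device{i}"\n'
            f'    Driver "nvidia"\n'
            f'    BusID "{bus_id}"\n'
            f'EndSection\n'
        )
        # Coolbits goes in the Screen section, not the Device section.
        blocks.append(
            f'Section "Screen"\n'
            f'    Identifier "Screen{i}"\n'
            f'    Device "Device{i}"\n'
            f'    Monitor "Monitor{i}"\n'
            f'    DefaultDepth 24\n'
            f'    Option "Coolbits" "{coolbits}"\n'
            f'    SubSection "Display"\n'
            f'        Depth 24\n'
            f'    EndSubSection\n'
            f'EndSection\n'
        )
    return "\n".join(blocks)


print(build_sections(["PCI:1:0:0", "PCI:2:0:0"]))
```

The output still needs a ServerLayout section and whatever input-device sections your distribution expects, so treat it as a starting fragment rather than a complete xorg.conf.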
Jeff Buck Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
Regarding Section "Device": I believe the "Coolbits" (no hyphen) entries should be in the "Screen" sections, not the "Device" sections. At least that's the way it's set up on my Ubuntu 14.04 machines. EDIT: Which I now see is pretty much what TBar said in his reply. |
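Since the thread uses both "4" and "28" for Coolbits, a small Python sketch of how the value breaks down may help. Coolbits is a bitmask; the three bit meanings below are my reading of the NVIDIA driver README and should be checked against the README for your driver version. Bits other than 4, 8 and 16 are deliberately left out, and the names `COOLBITS_FLAGS` and `decode_coolbits` are made up for this example.

```python
# Sketch: split a Coolbits value into the feature bits it enables.
# The descriptions are a summary of the NVIDIA README as I understand it,
# not an authoritative list; only bits 4, 8 and 16 are covered here.

COOLBITS_FLAGS = {
    4: "manual fan control",
    8: "clock offsets",
    16: "overvoltage",
}


def decode_coolbits(value):
    """Return the descriptions of the known bits set in a Coolbits value."""
    return [desc for bit, desc in sorted(COOLBITS_FLAGS.items()) if value & bit]


for value in (4, 28):
    print(value, decode_coolbits(value))
```

So "4" asks only for fan control, while "28" (4 + 8 + 16) asks for fan control plus the two overclocking-related features.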
Bruce Send message Joined: 15 Mar 02 Posts: 123 Credit: 124,955,234 RAC: 11 |
Hi TBar, I switched back to the Linux Nouveau video driver, rebooted, then removed and purged the nvidia-384 driver. Did a few updates to the OS, rebooted a couple of times. I then installed the nVidia 378 driver and rebooted again. Moved my 20-nvidia.conf file to /etc/X11 and renamed it to xorg.conf, rebooted. It did the same as before: fan control on gpu0 but not on the rest. I added Screen sections and moved Coolbits to them. xorg.conf: --------------------------------------------------------------------------------- ##Start Section "Device" Identifier "Device0" Driver "nvidia" VendorName "NVIDIA Corporation" BoardName "GeForce GTX TITAN Z" BusID "PCI:3:0:0" EndSection Section "Device" Identifier "Device1" Driver "nvidia" VendorName "NVIDIA Corporation" BoardName "GeForce GTX TITAN Z" BusID "PCI:4:0:0" EndSection Section "Device" Identifier "Device2" Driver "nvidia" VendorName "NVIDIA Corporation" BoardName "GeForce GTX TITAN Z" BusID "PCI:9:0:0" EndSection Section "Device" Identifier "Device3" Driver "nvidia" VendorName "NVIDIA Corporation" BoardName "GeForce GTX TITAN Z" BusID "PCI:10:0:0" EndSection Section "Screen" Identifier "Screen0" Device "Device0" Monitor "Monitor1" DefaultDepth 24 Option "Coolbits" "4" Subsection "Display" Depth 24 EndSubSection EndSection Section "Screen" Identifier "Screen1" Device "Device1" Monitor "Monitor1" DefaultDepth 24 Option "Coolbits" "4" Subsection "Display" Depth 24 EndSubSection EndSection Section "Screen" Identifier "Screen2" Device "Device2" Monitor "Monitor2" DefaultDepth 24 Option "Coolbits" "4" Subsection "Display" Depth 24 EndSubSection EndSection Section "Screen" Identifier "Screen3" Device "Device3" Monitor "Monitor3" DefaultDepth 24 Option "Coolbits" "4" Subsection "Display" Depth 24 EndSubSection EndSection ##End ------------------------------------------------------------------------------- Rebooted. It still only shows up on the first GPU and works fine there. Enable the fan control, move to 100%, then apply. 
It will just not show up on the other three. This is a bit frustrating. I really think it has something to do with Mint 18.2 Cinnamon. Getting kind of tired, I think I'll let this go till tomorrow and see what other comments might show up, then try the nVidia 375 driver. Bruce |
Bruce Send message Joined: 15 Mar 02 Posts: 123 Credit: 124,955,234 RAC: 11 |
Hi Guys, I have switched to the nVidia 375 driver from the repository today. Coolbits is still acting up. It starts up but still only shows on gpu0; gpu1, 2 and 3 are still not showing it. All devices have the Coolbits option. I even tried Option "SLI" "off" in the hope that it might make a difference. No joy. Any more suggestions? Thanks. Bruce |
Jeff Buck Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
I don't know that I have any specific suggestions, but just for comparison, here's what the complete xorg.conf file looks like on one of my Linux boxes that has never had any problems with Coolbits. Perhaps you can spot something significant. # nvidia-xconfig: X configuration file generated by nvidia-xconfig # nvidia-xconfig: version 375.39 (buildmeister@swio-display-x86-rhel47-09) Tue Jan 31 20:47:44 PST 2017 Section "ServerLayout" Identifier "Layout0" Screen 0 "Screen0" Screen 1 "Screen1" RightOf "Screen0" Screen 2 "Screen2" RightOf "Screen1" InputDevice "Keyboard0" "CoreKeyboard" InputDevice "Mouse0" "CorePointer" EndSection Section "Files" EndSection Section "InputDevice" # generated from default Identifier "Mouse0" Driver "mouse" Option "Protocol" "auto" Option "Device" "/dev/psaux" Option "Emulate3Buttons" "no" Option "ZAxisMapping" "4 5" EndSection Section "InputDevice" # generated from default Identifier "Keyboard0" Driver "kbd" EndSection Section "Monitor" Identifier "Monitor0" VendorName "Unknown" ModelName "Unknown" HorizSync 28.0 - 33.0 VertRefresh 43.0 - 72.0 Option "DPMS" EndSection Section "Monitor" Identifier "Monitor1" VendorName "Unknown" ModelName "Unknown" HorizSync 28.0 - 33.0 VertRefresh 43.0 - 72.0 Option "DPMS" EndSection Section "Monitor" Identifier "Monitor2" VendorName "Unknown" ModelName "Unknown" HorizSync 28.0 - 33.0 VertRefresh 43.0 - 72.0 Option "DPMS" EndSection Section "Device" Identifier "Device0" Driver "nvidia" VendorName "NVIDIA Corporation" BoardName "GeForce GTX 780" BusID "PCI:1:0:0" EndSection Section "Device" Identifier "Device1" Driver "nvidia" VendorName "NVIDIA Corporation" BoardName "GeForce GTX 670" BusID "PCI:2:0:0" EndSection Section "Device" Identifier "Device2" Driver "nvidia" VendorName "NVIDIA Corporation" BoardName "GeForce GTX 960" BusID "PCI:6:0:0" EndSection Section "Screen" Identifier "Screen0" Device "Device0" Monitor "Monitor0" DefaultDepth 24 Option "ThermalConfigurationCheck" "True" Option "Coolbits" 
"28" SubSection "Display" Depth 24 EndSubSection EndSection Section "Screen" Identifier "Screen1" Device "Device1" Monitor "Monitor1" DefaultDepth 24 Option "ThermalConfigurationCheck" "True" Option "Coolbits" "28" SubSection "Display" Depth 24 EndSubSection EndSection Section "Screen" Identifier "Screen2" Device "Device2" Monitor "Monitor2" DefaultDepth 24 Option "ThermalConfigurationCheck" "True" Option "Coolbits" "28" SubSection "Display" Depth 24 EndSubSection EndSection I did notice one thing, probably a typo, in your Screen0 section, where you have Monitor1 specified instead of Monitor0. Maybe that's significant, or maybe not. EDIT: Apparently, the system isn't picky about the BoardNames. I just noticed that the file still shows a GTX 780 and GTX 670, both of which were replaced by GTX 980s several weeks ago, without any effect on the saved configuration. |
tullio Send message Joined: 9 Apr 04 Posts: 8797 Credit: 2,930,782 RAC: 1 |
On my SuSE Linux Leap 42.2 box I am using nVidia driver 384.59 provided by SuSE and it works perfectly on my GTX 750 Ti board. GPU temperature is 50 C. Tullio |
Bruce Send message Joined: 15 Mar 02 Posts: 123 Credit: 124,955,234 RAC: 11 |
Hi Guys, Thanks Jeff. Your xconfig file helped a lot. I am making progress of sorts. I had to tweak it a little bit: add a fourth device, change a few settings. Here is my new file: --------------------------------------------------------------- Section "InputDevice" # generated from default Identifier "Mouse0" Driver "mouse" Option "Protocol" "auto" Option "Device" "/dev/psaux" Option "Emulate3Buttons" "no" Option "ZAxisMapping" "4 5" EndSection Section "InputDevice" # generated from default Identifier "Keyboard0" Driver "kbd" EndSection Section "Monitor" Identifier "Monitor0" VendorName "Unknown" ModelName "Unknown" HorizSync 28.0 - 33.0 VertRefresh 43.0 - 72.0 Option "DPMS" EndSection Section "Device" Identifier "Device0" Driver "nvidia" VendorName "NVIDIA Corporation" BoardName "GeForce GTX TITAN Z" BusID "PCI:3:0:0" EndSection Section "Device" Identifier "Device1" Driver "nvidia" VendorName "NVIDIA Corporation" BoardName "GeForce GTX TITAN Z" BusID "PCI:4:0:0" EndSection Section "Device" Identifier "Device2" Driver "nvidia" VendorName "NVIDIA Corporation" BoardName "GeForce GTX TITAN Z" BusID "PCI:9:0:0" EndSection Section "Device" Identifier "Device3" Driver "nvidia" VendorName "NVIDIA Corporation" BoardName "GeForce GTX TITAN Z" BusID "PCI:10:0:0" EndSection Section "Screen" Identifier "Screen0" Device "Device0" Monitor "Monitor0" DefaultDepth 24 Option "Coolbits" "4" Option "SLI" "off" SubSection "Display" Depth 24 EndSubSection EndSection Section "Screen" Identifier "Screen1" Device "Device1" Monitor "Monitor1" DefaultDepth 24 Option "Coolbits" "4" Option "SLI" "off" SubSection "Display" Depth 24 EndSubSection EndSection Section "Screen" Identifier "Screen2" Device "Device2" Monitor "Monitor2" DefaultDepth 24 Option "Coolbits" "4" Option "SLI" "off" SubSection "Display" Depth 24 EndSubSection EndSection Section "Screen" Identifier "Screen3" Device "Device3" Monitor "Monitor3" DefaultDepth 24 Option "Coolbits" "4" Option "SLI" "off" SubSection "Display" 
Depth 24 EndSubSection EndSection ----------------------------------------------------------------------- This file as it sits still does not show Coolbits on gpu1, 2 and 3. If I add this: ------------------------------------------------------------------------ Section "ServerLayout" Identifier "Layout0" Screen 0 "Screen0" 0 0 Screen 1 "Screen1" RightOf "Screen0" Screen 2 "Screen2" RightOf "Screen1" Screen 3 "Screen3" RightOf "Screen2" InputDevice "Keyboard0" "CoreKeyboard" InputDevice "Mouse0" "CorePointer" EndSection ---------------------------------------------------------------------------- to the top of the file, it will show Coolbits on all four GPUs. However, with that section added it crashes the Cinnamon desktop, which goes into fallback mode. Don't know why. Fix one problem and create another one (sounds like M$). Anybody have any idea what might be causing the crash? Thanks Bruce |
Jeff Buck Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
You may need to include 3 more Monitor sections, since the Screen sections reference 4 monitors. That's the way mine was generated, even though I only have the one monitor. I'm also wondering about the "0 0" at the end of your Screen 0 "Screen0" 0 0 line. Were those added intentionally, or just some sort of artifact? (Please note that I don't actually have much expertise in this area. I'm just making observations from what catches my eye.) |
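The kind of mismatch described above (Screen sections naming monitors that have no Monitor section) is easy to miss by eye. Below is a rough Python sketch that lists any Screen section whose Device or Monitor entry does not match a defined section. It is a simple regex pass written for this thread, not a real xorg.conf parser, and the helper names and the sample text are invented for the example. Whether a dangling reference actually causes trouble depends on the X server and driver, so treat the output as a list of things to look at.

```python
import re

# Sketch: flag Screen sections that reference an undefined Device or Monitor.
# Regex-based and deliberately simple; helper names are illustrative only.

SECTION_RE = re.compile(r'\bSection\s+"(\w+)"(.*?)\bEndSection\b', re.I | re.S)


def _entry(body, keyword):
    """Return the quoted value of the first `keyword "value"` line, or None."""
    match = re.search(rf'^\s*{keyword}\s+"([^"]+)"', body, re.I | re.M)
    return match.group(1) if match else None


def find_dangling_references(conf_text):
    """Return messages for Screen entries that point at undefined sections."""
    defined = {"device": set(), "monitor": set()}
    screens = []
    for kind, body in SECTION_RE.findall(conf_text):
        kind = kind.lower()
        if kind in defined:
            defined[kind].add(_entry(body, "Identifier"))
        elif kind == "screen":
            screens.append(body)
    problems = []
    for body in screens:
        name = _entry(body, "Identifier")
        for kind in ("device", "monitor"):
            ref = _entry(body, kind)
            if ref is not None and ref not in defined[kind]:
                problems.append(f'{name}: {kind} "{ref}" is not defined')
    return problems


SAMPLE = '''
Section "Monitor"
    Identifier "Monitor0"
EndSection
Section "Device"
    Identifier "Device0"
    Driver "nvidia"
EndSection
Section "Device"
    Identifier "Device1"
    Driver "nvidia"
EndSection
Section "Screen"
    Identifier "Screen0"
    Device "Device0"
    Monitor "Monitor0"
    SubSection "Display"
        Depth 24
    EndSubSection
EndSection
Section "Screen"
    Identifier "Screen1"
    Device "Device1"
    Monitor "Monitor1"
EndSection
'''

print(find_dangling_references(SAMPLE))
```

To check a real file, read it with `open(path).read()` and pass the text to `find_dangling_references`.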
Bruce Send message Joined: 15 Mar 02 Posts: 123 Credit: 124,955,234 RAC: 11 |
OK Jeff, I added three more Monitor sections to the file and rebooted, just to make sure it ran right. It booted fine, I added the Layout section to the file and rebooted again. The Cinnamon desktop crashed, but the GPUs all showed Coolbits. I tried with and without those extra 0s, same result - crashed desktop. Maybe there is just enough difference in Mint 18.2 that Coolbits might not work. Still willing to try other suggestions. Thanks Bruce |
Brent Norman Send message Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
This is a working xorg.conf from Mint 18.2 with 2 - 750Ti's # nvidia-xconfig: X configuration file generated by nvidia-xconfig |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"This is a working xorg.conf from Mint 18.2 with 2 - 750Ti's" . . Gets out magnifying glass ... Stephen :) |
Bruce Send message Joined: 15 Mar 02 Posts: 123 Credit: 124,955,234 RAC: 11 |
Thanks Brent, your file helped me. Did have to borrow Stephen's magnifying glass. My Layout section was the cause of my problems. This is what I started with: ----------------------------------------------------------- Section "ServerLayout" Identifier "Layout0" Screen 0 "Screen0" 0 0 Screen 1 "Screen1" RightOf "Screen0" Screen 2 "Screen2" RightOf "Screen1" Screen 3 "Screen3" RightOf "Screen2" InputDevice "Keyboard0" "CoreKeyboard" InputDevice "Mouse0" "CorePointer" EndSection ----------------------------------------------------------------- The nVidia driver is evidently expecting individual video cards with a single GPU as far as Coolbits is concerned. I have a pair of Titan Z cards: two cards, two fans, four GPUs. I deleted the double 0 on Screen0, and also deleted Screen1 and Screen3. Rebooted and could control the fan on both cards. Yaaaaaaa! I want to thank everybody who helped out with this. GOOD GOING GUYS!!!! Bruce |
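As a rough illustration of the fix described above, this Python sketch builds a ServerLayout section that lists only the screens you choose, for example one screen per physical card. The post does not show the final section, so the renumbering of the Screen entries and the RightOf chaining here are assumptions, as is the function name `build_server_layout`.

```python
# Sketch: build a ServerLayout that lists only the chosen screens.
# How the entries were renumbered in the working file is not shown in the
# thread; the numbering and RightOf chaining below are assumptions.

def build_server_layout(screen_names, layout_id="Layout0"):
    """Return ServerLayout text placing each screen right of the previous one."""
    lines = ['Section "ServerLayout"', f'    Identifier "{layout_id}"']
    previous = None
    for index, name in enumerate(screen_names):
        if previous is None:
            lines.append(f'    Screen {index} "{name}"')
        else:
            lines.append(f'    Screen {index} "{name}" RightOf "{previous}"')
        previous = name
    lines.append('    InputDevice "Keyboard0" "CoreKeyboard"')
    lines.append('    InputDevice "Mouse0" "CorePointer"')
    lines.append('EndSection')
    return "\n".join(lines)


# One screen per dual-GPU card, keeping Screen0 and Screen2 as in the post.
print(build_server_layout(["Screen0", "Screen2"]))
```

The InputDevice lines assume Keyboard0 and Mouse0 sections exist elsewhere in the file, as they do in the configs posted earlier in the thread.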
Jeff Buck Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
Ah....a dual GPU card. That didn't even occur to me. Perhaps because I've never breathed the rarified atmosphere up there in Titan land. ;^) Glad to hear you got it working! |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Wow, I didn't realize the Titan-Z was a dual GPU card either. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Bruce Send message Joined: 15 Mar 02 Posts: 123 Credit: 124,955,234 RAC: 11 |
Hi Guys, Got Boinc installed and it is now running. Everything was running normally until GPU-3 stopped and I received this notice from the server: Notice from BOINC NVIDIA GPU 3: GeForce GTX TITAN Z cannot be used for CUDA or OpenCL computation with CUDA driver 6.5 or later Thu 17 Aug 2017 08:06:49 PM EDT. That statement is just not true. The other three are working just fine with driver 375. Is there anything that I can do to convince the Seti Server that it is OK? It started out running fine until the notice. Or should I drive to Berkeley with a sledge hammer? ¯\_(ツ)_/¯ -------------------------------------------- This is weird: rebooted and the server kicked out GPU2 now. CLInfo shows everything looking normal. Any ideas? Bruce |
Bruce Send message Joined: 15 Mar 02 Posts: 123 Credit: 124,955,234 RAC: 11 |
OK. Panic attack over for now. Went to our friend the Internet and found a couple of things to try. Turns out I needed to go to the Package Manager and install boinc-client-nvidia-cuda. Rebooted and it's all back to normal. On to the next challenge. And there are a few. Bruce |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
"OK. Panic attack over for now." I don't know what version of the special application you're using, as the current one automatically tunes for best performance, and I'd expect much better crunching times from your cards than you're presently getting. If you try pfp = 13 as a configuration setting, you should get much better output from them. Grant Darwin NT |
Bruce Send message Joined: 15 Mar 02 Posts: 123 Credit: 124,955,234 RAC: 11 |
Hi Grant, For now I am just running the stock apps, just to get a baseline on things. I'll probably let it run this way for a week or so, then switch over to Petri's special app and see what improvements it makes. I wasn't too happy about the way Boinc installed, with no chance to specify where to place the data dir. That's my next goal: to move it off my SSD onto my big HDD storage drive. Any advice on that would be much appreciated. It's all a learning experience. Bruce |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
"Any advice on that would be much appreciated." I'm only good with Windows. From what I've seen in these threads, anything other than the default directories can result in all sorts of permissions issues. If you're good with Linux, no problem to sort out. If you're not, prepare for a headache. Give the "Linux CUDA 'Special' App finally available, featuring Low CPU use" thread a look, particularly towards the end. TBar has set up some downloads that are meant to make installing the special application pretty much straightforward. Grant Darwin NT |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.