Setting up Linux to crunch CUDA90 and above for Windows users

Bruce
Volunteer tester
Joined: 15 Mar 02
Posts: 123
Credit: 124,955,234
RAC: 11
United States
Message 1883022 - Posted: 10 Aug 2017, 19:19:26 UTC - in response to Message 1882981.  

I just run:
sudo nvidia-xconfig --thermal-configuration-check --cool-bits=28 --enable-all-gpus
That would be 4 for you if you just want fan control.

It puts the appropriate sections in the xconfig file, and you're done; no need for extra config files.
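
For reference, the Coolbits value is a bitmask, which is why 28 (4 + 8 + 16) and 4 both turn on fan control. The sketch below decodes a value into its flags. The bit descriptions are paraphrased from the NVIDIA Linux driver README and may differ between driver versions, so treat them as an assumption to verify; `decode_coolbits` is a made-up helper name, not part of any NVIDIA tool.

```python
# Bit meanings paraphrased from the NVIDIA Linux driver README.
# Assumption: verify against the README shipped with your driver version.
COOLBITS_FLAGS = {
    1: "clock control on older (pre-Fermi) GPUs",
    2: "attempt SLI across GPUs with different memory sizes",
    4: "manual fan speed control",
    8: "clock offsets on the PowerMizer page",
    16: "overvoltage through the nvidia-settings command line",
}


def decode_coolbits(value):
    """Return the descriptions of every flag set in a Coolbits value."""
    return [desc for bit, desc in sorted(COOLBITS_FLAGS.items()) if value & bit]


for value in (4, 28):
    print(value, decode_coolbits(value))
```

So a value of 4 requests only fan control, while 28 adds the clock-offset and overvoltage flags on top of it.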


. . I think that is the crucial part for his problem ...

Stephen

:)


Hi Stephen,

All GPUs are enabled. Each one is defined in the Device section. If you compare my 20-nvidia.conf device section with the device section from the xconfig file that has multiple devices defined, you will see that they are almost identical. The one difference, Cool-bits, is the spell checker's doing; my file actually says Coolbits.
The file works. Coolbits sort of works. The question is why does it work on the first gpu but not the other three?
It should work on all of them, but it doesn't.

As far as running nvidia-xconfig, there is probably a good reason that it did not want to run on Mint 18.2. It would be the way to go on an older version of Linux though. There have probably been significant changes in some of the newer versions of Linux.

Very puzzling. Let's keep those ideas coming.
Bruce
ID: 1883022
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1883025 - Posted: 10 Aug 2017, 19:54:54 UTC - in response to Message 1883022.  
Last modified: 10 Aug 2017, 19:56:11 UTC

Some people have had trouble with the 384 driver with kernel 4.10. There doesn't appear to be any trouble with the Ubuntu repository version of 375.66 with kernel 4.10. My systems are using the 375.66 driver with the xorg.conf in the /etc/X11 location. They are using the same format they have used for months: each device has a Screen section, and the Coolbits option is in each Screen section. My Device sections don't have a Coolbits option. The fan control works on all my GPUs:
Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "Coolbits" "28"
    Option         "Stereo" "0"
    Option         "nvidiaXineramaInfoOrder" "DFP-0"
    Option         "metamodes" "nvidia-auto-select +0+0"
    Option         "SLI" "Off"
    Option         "MultiGPU" "Off"
    Option         "BaseMosaic" "off"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

Section "Screen"
    Identifier     "Screen1"
    Device         "Device1"
    Monitor        "Monitor1"
    DefaultDepth    24
    Option         "Coolbits" "28"
    Option         "Stereo" "0"
    Option         "nvidiaXineramaInfoOrder" "CRT-0"
    Option         "metamodes" "nvidia-auto-select +0+0"
    Option         "SLI" "Off"
    Option         "MultiGPU" "Off"
    Option         "BaseMosaic" "off"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection
Your mileage may vary.
ID: 1883025
Jeff Buck
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1883035 - Posted: 10 Aug 2017, 20:28:48 UTC - in response to Message 1882978.  
Last modified: 10 Aug 2017, 20:31:08 UTC

Section "Device"
Identifier "Device0"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BoardName "GeForce GTX TITAN Z"
BusID "PCI:3:0:0"
Option "Cool-bits" "4"
EndSection
I believe the "Coolbits" (no hyphen) entries should be in the "Screen" sections, not the "Device" sections. At least that's the way it's set up on my Ubuntu 14.04 machines.

EDIT: Which I now see is pretty much what TBar said in his reply.
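
A quick way to see where the Coolbits options sit in a given xorg.conf is to scan it section by section. The following is a minimal sketch, assuming the file uses the usual `Section "Name" ... EndSection` layout; `coolbits_locations` and the `SAMPLE` text are invented for illustration. It only reports placement. Whether a Device-section Coolbits entry is actually ignored by the driver is not established in this thread.

```python
import re

# One top-level Section ... EndSection block. SubSection / EndSubSection
# lines do not match because the keyword must start the line.
SECTION_RE = re.compile(
    r'^[ \t]*Section[ \t]+"(\w+)"(.*?)^[ \t]*EndSection\b',
    re.IGNORECASE | re.DOTALL | re.MULTILINE,
)
OPTION_RE = re.compile(
    r'^[ \t]*Option[ \t]+"Coolbits"[ \t]+"(\d+)"', re.IGNORECASE | re.MULTILINE
)
IDENT_RE = re.compile(
    r'^[ \t]*Identifier[ \t]+"([^"]+)"', re.IGNORECASE | re.MULTILINE
)


def coolbits_locations(conf_text):
    """Return (section type, identifier, value) for every Coolbits option."""
    hits = []
    for kind, body in SECTION_RE.findall(conf_text):
        option = OPTION_RE.search(body)
        ident = IDENT_RE.search(body)
        if option:
            name = ident.group(1) if ident else None
            hits.append((kind, name, int(option.group(1))))
    return hits


# Hypothetical input: one Coolbits entry in a Device section, one in a Screen.
SAMPLE = '''
Section "Device"
    Identifier "Device0"
    Driver "nvidia"
    Option "Coolbits" "4"
EndSection

Section "Screen"
    Identifier "Screen0"
    Device "Device0"
    Option "Coolbits" "4"
    SubSection "Display"
        Depth 24
    EndSubSection
EndSection
'''

for kind, ident, value in coolbits_locations(SAMPLE):
    note = "" if kind.lower() == "screen" else "  <- not in a Screen section"
    print(f"{kind} {ident}: Coolbits {value}{note}")
```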
ID: 1883035
Bruce
Volunteer tester
Joined: 15 Mar 02
Posts: 123
Credit: 124,955,234
RAC: 11
United States
Message 1883066 - Posted: 10 Aug 2017, 23:26:35 UTC - in response to Message 1883025.  

Hi TBar

I switched back to the Linux Nouveau video driver, rebooted, then removed and purged the nvidia-384 driver. Did a few updates to the OS and rebooted a couple of times.
I then installed the nVidia 378 driver and rebooted again.
Moved my 20-nvidia.conf file to /etc/X11 and renamed it to xorg.conf, rebooted.
It did the same as before, fan control on gpu0 but not on the rest.
I added a Screen section and moved Coolbits to that section.
xorg.conf:
---------------------------------------------------------------------------------
##Start
Section "Device"
Identifier "Device0"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BoardName "GeForce GTX TITAN Z"
BusID "PCI:3:0:0"
EndSection

Section "Device"
Identifier "Device1"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BoardName "GeForce GTX TITAN Z"
BusID "PCI:4:0:0"
EndSection

Section "Device"
Identifier "Device2"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BoardName "GeForce GTX TITAN Z"
BusID "PCI:9:0:0"
EndSection

Section "Device"
Identifier "Device3"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BoardName "GeForce GTX TITAN Z"
BusID "PCI:10:0:0"
EndSection

Section "Screen"
Identifier "Screen0"
Device "Device0"
Monitor "Monitor1"
DefaultDepth 24
Option "Coolbits" "4"
Subsection "Display"
Depth 24
EndSubSection
EndSection

Section "Screen"
Identifier "Screen1"
Device "Device1"
Monitor "Monitor1"
DefaultDepth 24
Option "Coolbits" "4"
Subsection "Display"
Depth 24
EndSubSection
EndSection

Section "Screen"
Identifier "Screen2"
Device "Device2"
Monitor "Monitor2"
DefaultDepth 24
Option "Coolbits" "4"
Subsection "Display"
Depth 24
EndSubSection
EndSection

Section "Screen"
Identifier "Screen3"
Device "Device3"
Monitor "Monitor3"
DefaultDepth 24
Option "Coolbits" "4"
Subsection "Display"
Depth 24
EndSubSection
EndSection

##End
-------------------------------------------------------------------------------
Rebooted. It still only shows up on the first gpu and works fine there. Enable the fan control, move to 100% then apply.
It will just not show up on the other three.
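
The same steps (enable fan control, set 100%, apply) can also be attempted from the command line with `nvidia-settings`, which helps show whether the missing controls are only a GUI problem. Below is a minimal sketch that builds the command. It assumes the `GPUFanControlState` and `GPUTargetFanSpeed` attributes offered by recent drivers (older drivers used a different fan attribute name), and it assumes you know which fan index belongs to which GPU, which is not guaranteed to be one-to-one. `fan_command` is a hypothetical helper.

```python
import shlex


def fan_command(gpu_index, fan_index, percent):
    """Build an nvidia-settings command that enables manual fan control
    on one GPU and sets one fan to the given percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return [
        "nvidia-settings",
        "-a", f"[gpu:{gpu_index}]/GPUFanControlState=1",
        "-a", f"[fan:{fan_index}]/GPUTargetFanSpeed={percent}",
    ]


# Print the command instead of running it, so it can be reviewed first.
print(shlex.join(fan_command(0, 0, 100)))
```

The printed line can be pasted into a terminal inside the running X session; the attributes are only accepted when Coolbits enables fan control for that GPU.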

This is a bit frustrating. I really think it has something to do with Mint 18.2 Cinnamon.
Getting kind of tired, I think I'll let this go till tomorrow and see what other comments might show up, then try the nVidia 375 driver.
Bruce
ID: 1883066
Bruce
Volunteer tester
Joined: 15 Mar 02
Posts: 123
Credit: 124,955,234
RAC: 11
United States
Message 1883263 - Posted: 11 Aug 2017, 21:42:12 UTC

Hi Guys,

I have switched to the nVidia 375 driver from the repository today. Coolbits is still acting up. It starts up but is still only showing on gpu0; gpu1, 2 and 3 are still not showing it. All devices have the Coolbits option. I even tried the Option "SLI" "off" in the hope that it might make a difference, no joy.

Any more suggestions?

Thanks.
Bruce
ID: 1883263
Jeff Buck
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1883274 - Posted: 11 Aug 2017, 22:37:29 UTC - in response to Message 1883263.  
Last modified: 11 Aug 2017, 22:47:05 UTC

I don't know that I have any specific suggestions, but just for comparison, here's what the complete xorg.conf file looks like on one of my Linux boxes that has never had any problems with Coolbits. Perhaps you can spot something significant.

# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig:  version 375.39  (buildmeister@swio-display-x86-rhel47-09)  Tue Jan 31 20:47:44 PST 2017


Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0"
    Screen      1  "Screen1" RightOf "Screen0"
    Screen      2  "Screen2" RightOf "Screen1"
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
EndSection

Section "Files"
EndSection

Section "InputDevice"

    # generated from default
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/psaux"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"

    # generated from default
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Monitor"
    Identifier     "Monitor1"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Monitor"
    Identifier     "Monitor2"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "GeForce GTX 780"
    BusID          "PCI:1:0:0"
EndSection

Section "Device"
    Identifier     "Device1"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "GeForce GTX 670"
    BusID          "PCI:2:0:0"
EndSection

Section "Device"
    Identifier     "Device2"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "GeForce GTX 960"
    BusID          "PCI:6:0:0"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "ThermalConfigurationCheck" "True"
    Option         "Coolbits" "28"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

Section "Screen"
    Identifier     "Screen1"
    Device         "Device1"
    Monitor        "Monitor1"
    DefaultDepth    24
    Option         "ThermalConfigurationCheck" "True"
    Option         "Coolbits" "28"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

Section "Screen"
    Identifier     "Screen2"
    Device         "Device2"
    Monitor        "Monitor2"
    DefaultDepth    24
    Option         "ThermalConfigurationCheck" "True"
    Option         "Coolbits" "28"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

I did notice one thing, probably a typo, in your Screen0 section, where you have Monitor1 specified instead of Monitor0. Maybe that's significant, or maybe not.

EDIT: Apparently, the system isn't picky about the BoardNames. I just noticed that the file still shows a GTX 780 and GTX 670, both of which were replaced by GTX 980s several weeks ago, without any effect on the saved configuration.
ID: 1883274
tullio
Volunteer tester
Joined: 9 Apr 04
Posts: 8797
Credit: 2,930,782
RAC: 1
Italy
Message 1883295 - Posted: 11 Aug 2017, 23:41:57 UTC

On my SuSE Linux Leap 42.2 box I am using nVidia driver 384.59 provided by SuSE and it works perfectly on my GTX 750 Ti board. GPU temperature is 50 C.
Tullio
ID: 1883295
Bruce
Volunteer tester
Joined: 15 Mar 02
Posts: 123
Credit: 124,955,234
RAC: 11
United States
Message 1883352 - Posted: 12 Aug 2017, 6:28:14 UTC

Hi Guys,

Thanks Jeff. Your xconfig file helped a lot. I am making progress of sorts.
I had to tweak it a little bit: add a fourth device, change a few settings.
Here is my new file:
---------------------------------------------------------------
Section "InputDevice"
# generated from default
Identifier "Mouse0"
Driver "mouse"
Option "Protocol" "auto"
Option "Device" "/dev/psaux"
Option "Emulate3Buttons" "no"
Option "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"

# generated from default
Identifier "Keyboard0"
Driver "kbd"
EndSection

Section "Monitor"
Identifier "Monitor0"
VendorName "Unknown"
ModelName "Unknown"
HorizSync 28.0 - 33.0
VertRefresh 43.0 - 72.0
Option "DPMS"
EndSection

Section "Device"
Identifier "Device0"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BoardName "GeForce GTX TITAN Z"
BusID "PCI:3:0:0"
EndSection

Section "Device"
Identifier "Device1"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BoardName "GeForce GTX TITAN Z"
BusID "PCI:4:0:0"
EndSection

Section "Device"
Identifier "Device2"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BoardName "GeForce GTX TITAN Z"
BusID "PCI:9:0:0"
EndSection

Section "Device"
Identifier "Device3"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BoardName "GeForce GTX TITAN Z"
BusID "PCI:10:0:0"
EndSection

Section "Screen"
Identifier "Screen0"
Device "Device0"
Monitor "Monitor0"
DefaultDepth 24
Option "Coolbits" "4"
Option "SLI" "off"
SubSection "Display"
Depth 24
EndSubSection
EndSection

Section "Screen"
Identifier "Screen1"
Device "Device1"
Monitor "Monitor1"
DefaultDepth 24
Option "Coolbits" "4"
Option "SLI" "off"
SubSection "Display"
Depth 24
EndSubSection
EndSection

Section "Screen"
Identifier "Screen2"
Device "Device2"
Monitor "Monitor2"
DefaultDepth 24
Option "Coolbits" "4"
Option "SLI" "off"
SubSection "Display"
Depth 24
EndSubSection
EndSection

Section "Screen"
Identifier "Screen3"
Device "Device3"
Monitor "Monitor3"
DefaultDepth 24
Option "Coolbits" "4"
Option "SLI" "off"
SubSection "Display"
Depth 24
EndSubSection
EndSection
-----------------------------------------------------------------------

This file as it sits still does not show Coolbits on gpu1,2,3. If I add this:
------------------------------------------------------------------------
Section "ServerLayout"
Identifier "Layout0"
Screen 0 "Screen0" 0 0
Screen 1 "Screen1" RightOf "Screen0"
Screen 2 "Screen2" RightOf "Screen1"
Screen 3 "Screen3" RightOf "Screen2"
InputDevice "Keyboard0" "CoreKeyboard"
InputDevice "Mouse0" "CorePointer"
EndSection
----------------------------------------------------------------------------
to the top of the file, it will show Coolbits on all four GPUs. However, with that section added, the Cinnamon desktop crashes and goes into fallback mode. Don't know why. Fix one problem and create another one (sounds like M$).

Anybody have any idea what might be causing the crash?

Thanks
Bruce
ID: 1883352
Jeff Buck
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1883382 - Posted: 12 Aug 2017, 15:48:48 UTC - in response to Message 1883352.  

You may need to include 3 more Monitor sections, since the Screen sections reference 4 monitors. That's the way mine was generated, even though I only have the one monitor. I'm also wondering about the "0 0" at the end of your Screen 0 "Screen0" 0 0 line. Were those added intentionally, or just some sort of artifact?

(Please note that I don't actually have much expertise in this area. I'm just making observations from what catches my eye.)
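
The mismatch described above (Screen sections naming monitors that have no Monitor section) can be spotted mechanically. Here is a minimal sketch, assuming the usual `Section "Name" ... EndSection` layout; `dangling_references` and the `SAMPLE` text are made up for illustration, and this is not how the X server itself validates the file.

```python
import re

# One top-level Section ... EndSection block; SubSection lines do not match.
SECTION_RE = re.compile(
    r'^[ \t]*Section[ \t]+"(\w+)"(.*?)^[ \t]*EndSection\b',
    re.IGNORECASE | re.DOTALL | re.MULTILINE,
)


def keyword(body, name):
    """Return the quoted value of a `name "value"` line in a section body."""
    match = re.search(
        rf'^[ \t]*{name}[ \t]+"([^"]+)"', body, re.IGNORECASE | re.MULTILINE
    )
    return match.group(1) if match else None


def dangling_references(conf_text):
    """List (screen, kind, target) where a Screen section names a Device
    or Monitor identifier that no section defines."""
    defined = {"monitor": set(), "device": set()}
    screens = []
    for kind, body in SECTION_RE.findall(conf_text):
        kind = kind.lower()
        if kind in defined:
            defined[kind].add(keyword(body, "Identifier"))
        elif kind == "screen":
            screens.append(body)
    problems = []
    for body in screens:
        for kind in ("device", "monitor"):
            target = keyword(body, kind)
            if target is not None and target not in defined[kind]:
                problems.append((keyword(body, "Identifier"), kind, target))
    return problems


# Hypothetical input: two screens, but only Monitor0 is defined.
SAMPLE = '''
Section "Monitor"
    Identifier "Monitor0"
EndSection

Section "Device"
    Identifier "Device0"
    Driver "nvidia"
EndSection

Section "Device"
    Identifier "Device1"
    Driver "nvidia"
EndSection

Section "Screen"
    Identifier "Screen0"
    Device "Device0"
    Monitor "Monitor0"
EndSection

Section "Screen"
    Identifier "Screen1"
    Device "Device1"
    Monitor "Monitor1"
EndSection
'''

for screen, kind, target in dangling_references(SAMPLE):
    print(f"{screen} references undefined {kind} {target}")
```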
ID: 1883382
Bruce
Volunteer tester
Joined: 15 Mar 02
Posts: 123
Credit: 124,955,234
RAC: 11
United States
Message 1883402 - Posted: 12 Aug 2017, 18:07:23 UTC

OK Jeff, I added three more Monitor sections to the file and rebooted, just to make sure it ran right.
It booted fine, I added the Layout section to the file and rebooted again.
The Cinnamon desktop crashed, but the GPUs all showed Coolbits. I tried with and without those extra 0s, same result - crashed desktop.

Maybe there is just enough difference in Mint 18.2 that Coolbits might not work.

Still willing to try other suggestions.

Thanks
Bruce
ID: 1883402
Brent Norman
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1883403 - Posted: 12 Aug 2017, 18:26:25 UTC - in response to Message 1883402.  

This is a working xorg.conf from Mint 18.2 with 2 - 750Ti's
# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig: version 375.66 (buildmeister@swio-display-x86-rhel47-06) Mon May 1 15:45:32 PDT 2017

Section "ServerLayout"
Identifier "Layout0"
Screen 0 "Screen0"
Screen 1 "Screen1" RightOf "Screen0"
InputDevice "Keyboard0" "CoreKeyboard"
InputDevice "Mouse0" "CorePointer"
EndSection

Section "Files"
EndSection

Section "InputDevice"
# generated from default
Identifier "Mouse0"
Driver "mouse"
Option "Protocol" "auto"
Option "Device" "/dev/psaux"
Option "Emulate3Buttons" "no"
Option "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
# generated from default
Identifier "Keyboard0"
Driver "kbd"
EndSection

Section "Monitor"
Identifier "Monitor0"
VendorName "Unknown"
ModelName "Unknown"
HorizSync 28.0 - 33.0
VertRefresh 43.0 - 72.0
Option "DPMS"
EndSection

Section "Monitor"
Identifier "Monitor1"
VendorName "Unknown"
ModelName "Unknown"
HorizSync 28.0 - 33.0
VertRefresh 43.0 - 72.0
Option "DPMS"
EndSection

Section "Device"
Identifier "Device0"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BoardName "GeForce GTX 750 Ti"
BusID "PCI:2:0:0"
EndSection

Section "Device"
Identifier "Device1"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BoardName "GeForce GTX 750 Ti"
BusID "PCI:4:0:0"
EndSection

Section "Screen"
Identifier "Screen0"
Device "Device0"
Monitor "Monitor0"
DefaultDepth 24
Option "ThermalConfigurationCheck" "True"
Option "Coolbits" "28"
SubSection "Display"
Depth 24
EndSubSection
EndSection

Section "Screen"
Identifier "Screen1"
Device "Device1"
Monitor "Monitor1"
DefaultDepth 24
Option "ThermalConfigurationCheck" "True"
Option "Coolbits" "28"
SubSection "Display"
Depth 24
EndSubSection
EndSection

ID: 1883403
Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1883420 - Posted: 12 Aug 2017, 21:18:47 UTC - in response to Message 1883403.  

This is a working xorg.conf from Mint 18.2 with 2 - 750Ti's
# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig: version 375.66 (buildmeister@swio-display-x86-rhel47-06) Mon May 1 15:45:32 PDT 2017

Section "ServerLayout"
Identifier "Layout0"
Screen 0 "Screen0"
Screen 1 "Screen1" RightOf "Screen0"
InputDevice "Keyboard0" "CoreKeyboard"
InputDevice "Mouse0" "CorePointer"
EndSection

Section "Files"
EndSection

Section "InputDevice"
# generated from default
Identifier "Mouse0"
Driver "mouse"
Option "Protocol" "auto"
Option "Device" "/dev/psaux"
Option "Emulate3Buttons" "no"
Option "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
# generated from default
Identifier "Keyboard0"
Driver "kbd"
EndSection





. . Gets out magnifying glass ...

Stephen

:)
ID: 1883420
Bruce
Volunteer tester
Joined: 15 Mar 02
Posts: 123
Credit: 124,955,234
RAC: 11
United States
Message 1883458 - Posted: 12 Aug 2017, 23:28:48 UTC

Thanks Brent, your file helped me. Did have to borrow Stephen's magnifying glass.

My Layout section was the cause of my problems. This is what I started with:
-----------------------------------------------------------
Section "ServerLayout"
Identifier "Layout0"
Screen 0 "Screen0" 0 0
Screen 1 "Screen1" RightOf "Screen0"
Screen 2 "Screen2" RightOf "Screen1"
Screen 3 "Screen3" RightOf "Screen2"
InputDevice "Keyboard0" "CoreKeyboard"
InputDevice "Mouse0" "CorePointer"
EndSection
-----------------------------------------------------------------
The nVidia driver is evidently expecting individual video cards with a single gpu as far as Coolbits is concerned.
I have a pair of Titan Z cards. Two Cards, two fans, four gpus.
I deleted the double 0 on Screen0, also deleted Screen1 and Screen3.
Rebooted and could control the fan on both cards. Yaaaaaaa!
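
For anyone with a similar dual-GPU card, the layout change can be sketched as a small generator that writes a ServerLayout with one Screen entry per physical card. This is only an illustration under assumptions: the post does not say what the remaining `RightOf` references looked like, so chaining each kept screen to the previously kept one is a guess, and `server_layout` is a made-up helper. Which screen numbers correspond to which card has to be found by trial on the actual machine.

```python
def server_layout(screen_ids, identifier="Layout0"):
    """Build a ServerLayout section listing only the given screen numbers.

    Assumption: each screen is placed RightOf the previously kept screen.
    """
    lines = ['Section "ServerLayout"', f'    Identifier     "{identifier}"']
    previous = None
    for screen_id in screen_ids:
        entry = f'    Screen      {screen_id}  "Screen{screen_id}"'
        if previous is not None:
            entry += f' RightOf "Screen{previous}"'
        lines.append(entry)
        previous = screen_id
    lines += [
        '    InputDevice    "Keyboard0" "CoreKeyboard"',
        '    InputDevice    "Mouse0" "CorePointer"',
        "EndSection",
    ]
    return "\n".join(lines)


# Keep Screen0 and Screen2, as in the post above (one screen per card).
print(server_layout([0, 2]))
```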

I want to thank everybody who helped out with this.

GOOD GOING GUYS!!!!
Bruce
ID: 1883458
Jeff Buck
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1883468 - Posted: 12 Aug 2017, 23:46:21 UTC - in response to Message 1883458.  

Ah....a dual GPU card. That didn't even occur to me. Perhaps because I've never breathed the rarified atmosphere up there in Titan land. ;^)

Glad to hear you got it working!
ID: 1883468
Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1883470 - Posted: 12 Aug 2017, 23:55:13 UTC - in response to Message 1883468.  

Wow, I didn't realize the Titan-Z was a dual GPU card either.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1883470
Bruce
Volunteer tester
Joined: 15 Mar 02
Posts: 123
Credit: 124,955,234
RAC: 11
United States
Message 1884556 - Posted: 17 Aug 2017, 20:28:13 UTC
Last modified: 17 Aug 2017, 21:12:39 UTC

Hi Guys,

Got Boinc installed and it is now running.

Everything was running normal, until the GPU-3 stopped and I received this notice from the server:

Notice from BOINC
NVIDIA GPU 3: GeForce GTX TITAN Z cannot be used for CUDA or OpenCL computation with CUDA driver 6.5 or later
Thu 17 Aug 2017 08:06:49 PM EDT.

That statement is just not true. The other three are working just fine with driver 375. Is there anything that I can do to convince the Seti server that it is OK? It started out running fine until the notice. Or should I drive to Berkeley with a sledgehammer? ¯\_(ツ)_/¯

--------------------------------------------

This is weird, rebooted and the server kicked out GPU2 now. CLInfo shows everything looking normal.

Any ideas?
Bruce
ID: 1884556
Bruce
Volunteer tester
Joined: 15 Mar 02
Posts: 123
Credit: 124,955,234
RAC: 11
United States
Message 1884580 - Posted: 17 Aug 2017, 22:12:31 UTC

OK. Panic attack over for now.

Went to our friend the Internet and found a couple of things to try.
Turns out I needed to go to the Package Manager and install boinc-client-nvidia-cuda.
Rebooted and it's all back to normal.

On to the next challenge. And there are a few.
Bruce
ID: 1884580
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1884648 - Posted: 18 Aug 2017, 5:19:28 UTC - in response to Message 1884580.  

OK. Panic attack over for now.

Went to our friend the Internet and found a couple of things to try.
Turns out I needed to go to the Package Manager and install boinc-client-nvidia-cuda.
Rebooted and it's all back to normal.

On to the next challenge. And there are a few.

I don't know what version of the special application you're using, as the current one automatically tunes for best performance, and I'd expect much better crunching times from your cards than you're presently getting.
If you try pfp = 13 as a configuration setting, you should get much better output from them.
Grant
Darwin NT
ID: 1884648
Bruce
Volunteer tester
Joined: 15 Mar 02
Posts: 123
Credit: 124,955,234
RAC: 11
United States
Message 1884664 - Posted: 18 Aug 2017, 6:22:15 UTC - in response to Message 1884648.  

Hi Grant,

For now I am just running the stock apps, just to get a baseline on things.
I'll probably let it run this way for a week or so, then switch over to Petri's special app and see what improvements it makes.

I wasn't too happy about the way Boinc installed: no chance to specify where to place the data dir.
That's my next goal, to move it off my SSD onto my Big HDD storage drive.
Any advice on that would be much appreciated.

It's all a learning experience.
Bruce
ID: 1884664
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1884665 - Posted: 18 Aug 2017, 6:43:15 UTC - in response to Message 1884664.  

Any advice on that would be much appreciated.

I'm only good with Windows.
What I've seen in these threads is that anything other than the default directories can result in all sorts of permissions issues. If you're good with Linux, no problem to sort out. If you're not, prepare for a headache.

Give the "Linux CUDA 'Special' App finally available, featuring Low CPU use" thread a look, particularly towards the end.
TBar has set up some downloads that are meant to make installing the special application pretty much straightforward.
Grant
Darwin NT
ID: 1884665


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.