Setting up Linux to crunch CUDA90 and above for Windows users

TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2002958 - Posted: 17 Jul 2019, 3:52:09 UTC - in response to Message 2002951.  

Your system has been borked for a while, I'd still suggest Upgrading to a newer version.
. . Actually it has been working quite well until I try to change something. So if I can have this sort of problem changing a video driver how much damage can I cause trying to change the OS itself?
I thought that was your system that couldn't update the kernel past 4.4.0-96 or the video driver wouldn't work? I believe that's back in this thread somewhere. It now shows 4.4.0-148-generic...and the video driver doesn't work... A clean install would probably fix that.

A bit of advice already in this thread, Tell People Which Driver You Are Using. The Uninstall Commands are Different depending on which Driver you are using.
For the Driver Downloaded from nVidia the command is a simple sudo nvidia-uninstall. For the Repository & PPA driver you use the purge command.
The nvidia-uninstall command Will remove All nVidia components installed by their driver. Do Not Try to remove the driver from nVidia by purging; as you see, it doesn't work, and it kills the simple nvidia-uninstall command, so you are borked. I've gone from 418 to 410 a couple of times using the driver from nVidia in Ubuntu 18.04 and 19.04, no problems here.
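
As a rough sketch of that rule (not a tested procedure), the choice of removal command could be written down like this. The function names and the labels `runfile` and `repo` are made up for illustration, and the guess in `guess_source` assumes that only the installer downloaded from nVidia puts `nvidia-uninstall` on the PATH. The purge form is the one used elsewhere in this thread, quoted so the shell does not expand the pattern itself.

```shell
#!/bin/sh
# removal_command SOURCE
# Prints (does not run) the removal command that matches how the driver
# was installed. The labels are invented for this sketch:
#   runfile = the installer downloaded from nVidia
#   repo    = packages from the Ubuntu repository or the PPA
removal_command() {
    case "$1" in
        runfile) echo "sudo nvidia-uninstall" ;;
        repo)    echo "sudo apt-get purge 'nvidia-*'" ;;
        *)       echo "unknown driver source: $1" >&2; return 1 ;;
    esac
}

# guess_source: a guess only, based on whether nvidia-uninstall is on the PATH.
guess_source() {
    if command -v nvidia-uninstall >/dev/null 2>&1; then
        echo runfile
    else
        echo repo
    fi
}

# Example (prints the suggestion, changes nothing):
#   removal_command "$(guess_source)"
```

Printing the command instead of running it leaves the final decision with whoever knows how the driver was really installed.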

. . Sorry for not making that clear, this all ensued from your notice that the latest drivers are now in the repository. I have avoided using nvidia's own drivers ever since the first 'bouncing logon screen' problem. I was running nvidia-384 previously and decided to make 0.98b-101 the new platform for my Linux rigs so I upgraded to nvidia-430. All from the repository/ppa. :(
The way I understand the latest driver thing is it only works in 18.04. I have 19.04 and it still won't list the 430 driver. I just looked at 16.04 and it won't show any repository driver higher than 384, same with 14.04. I also thought the PPA only works with 18.04, so, how did you install 430, 418, and 410 in Ubuntu 14.04 without using the downloaded nVidia driver? Why does it only list 430 in my 18.04 system and none of the others?

I think you'll find the problems with the early versions 18.04 and networking was related to running autoremove. That problem seems to be fixed with later versions. One way to tell if there is a problem with an installed version of 18.04 is to boot into recovery mode and try enabling networking. If enabling networking fails in recovery mode, then there is a problem with your system.
ID: 2002958 · Report as offensive     Reply Quote
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2002961 - Posted: 17 Jul 2019, 4:10:45 UTC

Yes correct. For now only 18.04 LTS has the 430 drivers in the main repo. But from the articles at itsfoss and omgubuntu, it's coming for earlier LTS versions too.

https://itsfoss.com/ubuntu-lts-latest-nvidia-drivers/

For now, Ubuntu 18.04 LTS supports this out of the box. It will soon be available for Ubuntu 16.04 LTS (and later LTS versions will follow).

This will also benefit other distributions that are based on Ubuntu LTS releases. Zorin OS, Linux Mint are a few such examples:

Already announced that 20.04 LTS will have Nvidia drivers installed out of the box. Part of the development schedule already set in stone.

I can find the 430 drivers available in 19.04 by using the developer "bionic-proposed" toggle in the Software & Updates application which enables the SRU or stable updates. That is how I was able to pull down the 5.0.0-21 kernel you apprised me of earlier. I like the 5.0 kernel for its better developed thread scheduler, especially for AMD cpus. Thanks for that tip BTW. Appreciated.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2002961 · Report as offensive     Reply Quote
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2002962 - Posted: 17 Jul 2019, 4:36:10 UTC - in response to Message 2002961.  

I have the Developer Options Pre-released updates checked on all my systems, including two 19.04 systems on different machines, only the 18.04 system shows driver 430 available.
ID: 2002962 · Report as offensive     Reply Quote
Gene Project Donor

Send message
Joined: 26 Apr 99
Posts: 150
Credit: 48,393,279
RAC: 118
United States
Message 2002963 - Posted: 17 Jul 2019, 4:59:20 UTC

@Heretic
. . Is there a Linux command to locate a file anywhere on a system?


yes. "locate string"
no need for * wild cards, etc., locate will display all instances of file paths in which the string is found. It IS case sensitive.
You will need, of course, to have installed the "locate" package, and it will bring in its dependency "findutils".

After installing, and periodically to keep things current, you'll need to do "updatedb" which builds the file path database, which is where the search is performed - instead of needing to traverse the directory trees every time.
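
A small sketch of the same idea, in case the database is missing or stale. The helper name `search_paths` is invented here; it walks the directory tree with `find` instead of reading the locate database, so it is slower but needs no `updatedb` run. The database-backed commands are shown as comments.

```shell
#!/bin/sh
# search_paths DIR STRING
# Lists every path under DIR whose full path contains STRING.
# Case sensitive, like plain "locate STRING", but it traverses the
# directory tree each time instead of consulting the locate database.
search_paths() {
    find "$1" -path "*$2*" 2>/dev/null
}

# Database-backed equivalents (need the locate package installed):
#   sudo updatedb          # rebuild the file path database
#   locate lib32gcc        # every path containing "lib32gcc"
#   locate -i lib32gcc     # same search, ignoring case
#
# Tree-walking fallback:
#   search_paths /var/lib lib32gcc
```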
ID: 2002963 · Report as offensive     Reply Quote
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2002965 - Posted: 17 Jul 2019, 5:09:49 UTC - in response to Message 2002962.  

I have the Developer Options Pre-released updates checked on all my systems, including two 19.04 systems on different machines, only the 18.04 system shows driver 430 available.

Hmm, I wonder if I have it because I had the ppa active before I activated the developer SRU. I have toggled off the ppa in the S&U application but I have not removed the ppa.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2002965 · Report as offensive     Reply Quote
Profile tazzduke
Volunteer tester

Send message
Joined: 15 Sep 07
Posts: 190
Credit: 28,269,068
RAC: 5
Australia
Message 2002968 - Posted: 17 Jul 2019, 5:53:54 UTC

Greetings All

Just recently I added a GTX 1060 to my Core2Quad Linux machine -

https://setiathome.berkeley.edu/show_host_detail.php?hostid=8716959

I did a clean install of Linux Mint 19.1, and then sourced my drivers from the NVIDIA PPA repository.

I did note that I was given the option of, 430, 418, 410 and 384 (I think it was 384)

I ran with the 418 drivers and machine is running sweet. Of note this is simply a cruncher PC only.

Just my observations

Regards
Mark
ID: 2002968 · Report as offensive     Reply Quote
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2002971 - Posted: 17 Jul 2019, 7:23:25 UTC - in response to Message 2002955.  

I’m Ian. Steve was my father who started this account. I took it over for him a few years ago and leave his name for legacy.

If your BIOS is set for Legacy mode, then you may need to boot the Legacy installer. My point was that the configurations need to match. My systems are set up for UEFI and so I boot the UEFI installer. I think in both cases I have the SATA mode set to AHCI. A few of my systems are running 2 SSDs in RAID 1 and hence have the SATA mode set to RAID. But I remember having to do a back and forth dance finding the right combination of boot settings in the BIOS to be able to boot it and have the installer recognize the RAID array. But with a single drive it should be simpler.


. . OK Ian, thanks, pleased to meet you :) ...

. . Once I sort out this problem with the video drivers I will revisit the issue of the SSD on both the C2D and the Ryzen rigs.

Stephen
ID: 2002971 · Report as offensive     Reply Quote
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2002972 - Posted: 17 Jul 2019, 7:26:27 UTC - in response to Message 2002957.  

My notice is that the Nvidia drivers are now part of the default Debian repos. No need to install the ppa repository. The 430 drivers are now automatically installed in the original OS installation without user intervention. They should just work out of the box from a clean install.

There is a difference between the closed proprietary Nvidia 430 drivers provided in the main distro now and the open source 430 drivers provided by the ppa. They ARE DIFFERENT.


. . I used the drivers from the ppa not from an install distro.

. . But exactly what is the difference, are the distro drivers actually the specifically Nvidia drivers?

Stephen

? ?
ID: 2002972 · Report as offensive     Reply Quote
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2002973 - Posted: 17 Jul 2019, 7:42:07 UTC - in response to Message 2002958.  

I thought that was your system that couldn't update the kernel past 4.4.0-96 or the video driver wouldn't work? I believe that's back in this thread somewhere. It now shows 4.4.0-148-generic...and the video driver doesn't work... A clean install would probably fix that.

. . Not quite but close, that system is running on a flashdrive and would not boot with releases above 96 so it stayed there for a long while, but every now and then I would try a new release to see if the problem had been solved, when I tried 148 it was so that is where it is at.

The way I understand the latest driver thing is it only works in 18.04. I have 19.04 and it still won't list the 430 driver. I just looked at 16.04 and it won't show any repository driver higher than 384, same with 14.04. I also thought the PPA only works with 18.04, so, how did you install 430, 418, and 410 in Ubuntu 14.04 without using the downloaded nVidia driver? Why does it only list 430 in my 18.04 system and none of the others

. . Well I guess someone is right, I must be very special, on this system the 'additional drivers' tab shows them all. Just don't ask me to explain why ..

I think you'll find the problems with the early versions 18.04 and networking was related to running autoremove. That problem seems to be fixed with later versions. One way to tell if there is a problem with an installed version of 18.04 is to boot into recovery mode and try enabling networking. If enabling networking fails in recovery mode, then there is a problem with your system.

. . OK that is encouraging, I will revisit the Linux installation on the Ryzen, I really want that thing running Special sauce.

Stephen

:)
ID: 2002973 · Report as offensive     Reply Quote
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2002974 - Posted: 17 Jul 2019, 7:45:46 UTC - in response to Message 2002963.  

@Heretic
. . Is there a Linux command to locate a file anywhere on a system?


yes. "locate string"
no need for * wild cards, etc., locate will display all instances of file paths in which the string is found. It IS case sensitive.
You will need, of course, to have installed the "locate" package, and it will bring in its dependency "findutils".

After installing, and periodically to keep things current, you'll need to do "updatedb" which builds the file path database, which is where the search is performed - instead of needing to traverse the directory trees every time.


. . Thanks Gene I will investigate that and see if I can solve this annoying issue.

Stephen

. .
ID: 2002974 · Report as offensive     Reply Quote
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2002975 - Posted: 17 Jul 2019, 7:49:14 UTC - in response to Message 2002972.  

My notice is that the Nvidia drivers are now part of the default Debian repos. No need to install the ppa repository. The 430 drivers are now automatically installed in the original OS installation without user intervention. They should just work out of the box from a clean install.

There is a difference between the closed proprietary Nvidia 430 drivers provided in the main distro now and the open source 430 drivers provided by the ppa. They ARE DIFFERENT.


. . I used the drivers from the ppa not from an install distro.

. . But exactly what is the difference, are the distro drivers actually the specifically Nvidia drivers?

Stephen

? ?

The distro drivers are the officially released proprietary closed-source drivers from Nvidia. Notice how they are identified in the distro description. That means no Linux or Ubuntu developer saw them. They are just delivered to the distro packagers as is.

The ppa drivers are identified by being labelled open source drivers. That means the ppa maintainers take the source code from Nvidia and compile it themselves. Compiling source code leaves it up to you exactly what target you build for, what parameters you compile with etc. etc. IOW, since the ppa maintainers are not privy to any of the ways that Nvidia compiled the source code for their own driver sets, they have to make best guesses on their own. Notice the ppa only supports a few distros and doesn't cover every distro in circulation.

So there likely are ppa driver packages that are slightly different than Nvidia's driver packages. The uninstall method is different as explained earlier.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2002975 · Report as offensive     Reply Quote
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2002976 - Posted: 17 Jul 2019, 8:19:39 UTC - in response to Message 2002973.  

I thought that was your system that couldn't update the kernel past 4.4.0-96 or the video driver wouldn't work? I believe that's back in this thread somewhere. It now shows 4.4.0-148-generic...and the video driver doesn't work... A clean install would probably fix that.

. . Not quite but close, that system is running on a flashdrive and would not boot with releases above 96 so it stayed there for a long while, but every now and then I would try a new release to see if the problem had been solved, when I tried 148 it was so that is where it is at.

The way I understand the latest driver thing is it only works in 18.04. I have 19.04 and it still won't list the 430 driver. I just looked at 16.04 and it won't show any repository driver higher than 384, same with 14.04. I also thought the PPA only works with 18.04, so, how did you install 430, 418, and 410 in Ubuntu 14.04 without using the downloaded nVidia driver? Why does it only list 430 in my 18.04 system and none of the others

. . Well I guess someone is right, I must be very special, on this system the 'additional drivers' tab shows them all. Just don't ask me to explain why ..

I think you'll find the problems with the early versions 18.04 and networking was related to running autoremove. That problem seems to be fixed with later versions. One way to tell if there is a problem with an installed version of 18.04 is to boot into recovery mode and try enabling networking. If enabling networking fails in recovery mode, then there is a problem with your system.

. . OK that is encouraging, I will revisit the Linux installation on the Ryzen, I really want that thing running Special sauce.

Stephen

:)
OK, I think I've got it now. They changed the PPA so it would "work" with the older systems. You used it on 14.04... and now your drivers don't work. But, was it the PPA drivers, or the Kernel problem you thought was "fixed" with 4.4.0-148? In either case, a Clean install will surely fix both problems and do so much faster than any of your feeble attempts are likely to.
ID: 2002976 · Report as offensive     Reply Quote
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2002991 - Posted: 17 Jul 2019, 12:58:19 UTC - in response to Message 2002976.  

OK, I think I've got it now. They changed the PPA so it would "work" with the older systems. You used it on 14.04... and now your drivers don't work. But, was it the PPA drivers, or the Kernel problem you thought was "fixed" with 4.4.0-148? In either case, a Clean install will surely fix both problems and do so much faster than any of your feeble attempts are likely to.


. . I think the driver worked but doesn't like 0.97, or the other way around, and it trashed my cache. Deciding to revert to a lower release driver I attempted to remove the 430 driver but something to do with the lib32gcc4 module stayed on. The error message when I try to install refers to 'broken packages being held' preventing the fresh installation from working. All other utes/apps that I have running were happy with the 430 drivers but not BOINC.

. . I see no reason to presume there is a kernel problem with my machine though I doubt I can prove it is not a possibility. But I will try Ian's suggestion to see if I can remove the problem module and see if that allows the installation to go through.

Stephen

. .
ID: 2002991 · Report as offensive     Reply Quote
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2002993 - Posted: 17 Jul 2019, 13:20:53 UTC - in response to Message 2002947.  

You may have to play around with the settings to get it to load up properly depending on your exact hardware. I really can’t remember if C2D stuff was compatible with UEFI or not.


. . I have installed 'locate' and it finds locate OK so it works, but while it finds two listings for lib32gcc1, the module that fails to install, as follows:-

/var/lib/dpkg/info/lib32gcc1.list and
/var/lib/dpkg/info/lib32gcc1.postrm

... it fails to find any reference to lib32gcc4. So it seems there is data corruption in some file/index making it think that module is installed somewhere. Seems TBar is right, I am stuck at an impasse ... :(

Stephen

<shrug>
ID: 2002993 · Report as offensive     Reply Quote
Ian&Steve C.
Avatar

Send message
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2002997 - Posted: 17 Jul 2019, 13:34:24 UTC - in response to Message 2002993.  
Last modified: 17 Jul 2019, 14:04:15 UTC

Try
sudo dpkg --configure -a


Then reboot and try to install the drivers again.
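
Before or after that, it may help to see what state dpkg thinks the package is in. The sketch below assumes the `${Status}` field printed by `dpkg-query` (for example `install ok installed` or `deinstall ok config-files`); the function name and the wording of the descriptions are mine. One possibility worth checking is that leftover `.list` and `.postrm` files simply mean the package was removed but not purged.

```shell
#!/bin/sh
# describe_dpkg_status STATUS
# Turns a dpkg status string into a short description.
# STATUS is what this prints for a package:
#   dpkg-query -W -f='${Status}' PACKAGE
describe_dpkg_status() {
    case "$1" in
        "")                 echo "not known to dpkg" ;;
        hold\ *)            echo "on hold" ;;
        *" installed")      echo "installed" ;;
        *" config-files")   echo "removed, config files left behind" ;;
        *" not-installed")  echo "not installed" ;;
        *half-*|*" unpacked"|*triggers-*)
                            echo "partially installed or configured" ;;
        *)                  echo "unrecognised: $1" ;;
    esac
}

# Example:
#   describe_dpkg_status "$(dpkg-query -W -f='${Status}' lib32gcc1 2>/dev/null)"
```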
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2002997 · Report as offensive     Reply Quote
Profile Tom M
Volunteer tester

Send message
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 2003076 - Posted: 17 Jul 2019, 22:58:00 UTC

I just managed to start running all 16 threads on an AMD 2700 (8c/16t) cpu.

And it's NOT pulling 100% on the task manager. Instead it is running 78-80%.

The trick(s). Assign 1 cpu thread per gpu. And drop the "-nobs" from the command line in app_info.xml
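
For anyone who wants to try the second part without hand editing, here is a minimal sketch. It assumes the flag sits inside a `<cmdline>` element of app_info.xml; the function name `drop_nobs` and the `-other` flag in the example are made up for illustration. It only filters text, so review the result before swapping it in, and do that with BOINC stopped.

```shell
#!/bin/sh
# drop_nobs: reads text on stdin and writes it back out with the
# -nobs flag removed, leaving any other flags in place.
drop_nobs() {
    sed -e 's/-nobs  *//' -e 's/ *-nobs//'
}

# Example (writes a new file, leaves the original untouched):
#   drop_nobs < app_info.xml > app_info.xml.new
#   diff app_info.xml app_info.xml.new
```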

The task manager is showing cpu processing loads for most of the cpus to be the same as at 90% of available cores/threads so there is a chance that this actually will raise the RAC.

Notice I did NOT say a good chance :)

I will be throwing more gpus back onto that box (it's running 6 right now) later.

Tom
A proud member of the OFA (Old Farts Association).
ID: 2003076 · Report as offensive     Reply Quote
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2003089 - Posted: 18 Jul 2019, 1:15:25 UTC - in response to Message 2002997.  
Last modified: 18 Jul 2019, 1:17:02 UTC

Try
sudo dpkg --configure -a


Then reboot and try to install the drivers again.


. . OK, tried that twice but still at the same impasse.

Response when attempting to install video drivers:-

reading package list done
building dependency tree done
reading state information done

Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 nvidia-410 : Depends: lib32gcc1 but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
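
The useful part of output like this is the "Depends" line. As a sketch only, a small filter (the name `unmet_deps` is invented) can pull out which package is blocked and by what, so the blocker can be looked at on its own with `apt-cache policy`, and `apt-mark showhold` can list anything on hold.

```shell
#!/bin/sh
# unmet_deps: reads apt-get output on stdin and prints one line per
# unmet dependency in the form "PACKAGE MISSING-DEPENDENCY".
unmet_deps() {
    sed -n 's/^ *\([^ :]*\) *: *Depends *: *\([^ ]*\).*/\1 \2/p'
}

# Example:
#   sudo apt-get install nvidia-410 2>&1 | unmet_deps
# Then inspect the blocker and any held packages:
#   apt-cache policy lib32gcc1
#   apt-mark showhold
```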


. . I guess Linux does not permit the repair of such a situation ... ain't computers wonderful ... <sigh>

. . {afterthought} Does Package manager have a function to detect (and hopefully repair) broken packages?

Stephen

:(
ID: 2003089 · Report as offensive     Reply Quote
Profile Mr. Kevvy Crowdfunding Project Donor*Special Project $250 donor
Volunteer moderator
Volunteer tester
Avatar

Send message
Joined: 15 May 99
Posts: 3776
Credit: 1,114,826,392
RAC: 3,319
Canada
Message 2003090 - Posted: 18 Jul 2019, 1:25:23 UTC
Last modified: 18 Jul 2019, 1:29:32 UTC

If you haven't already done this, this has always worked for me when video drivers go wrong.

sudo apt-get purge 'nvidia-*'


This will wipe all NVidia drivers. Reboot afterwards, of course. Warning: if your card is an RTX 20xx you may boot into the Nouveau drivers, which may give you an unstable desktop or none at all. If so, you will have to temporarily install another card that Nouveau supports, e.g. a 1070.

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update


Open Driver Manager (Mint) or the Additional Drivers tab of Software & Updates (Ubuntu). You should see current drivers available to install. I suggest 418, others may suggest 430. Choose either, click Apply, wait a minute until they are installed, reboot again.
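
If there is no desktop to click through, the same step can be done from a terminal. The sketch below rests on my recollection that the packages are named nvidia-NNN on releases before 18.04 and nvidia-driver-NNN from 18.04 on, so treat that as an assumption and confirm with `apt-cache search nvidia` first. The function name is invented.

```shell
#!/bin/sh
# driver_package_name RELEASE SERIES
# Prints the assumed package name for a driver series on a release.
#   RELEASE = release number such as 14.04 or 18.04
#   SERIES  = driver series such as 418 or 430
driver_package_name() {
    release_major=${1%%.*}
    if [ "$release_major" -ge 18 ]; then
        echo "nvidia-driver-$2"
    else
        echo "nvidia-$2"
    fi
}

# Example:
#   ubuntu-drivers devices      # list the drivers offered for this card
#   sudo apt-get install "$(driver_package_name "$(lsb_release -rs)" 418)"
```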

Good luck. :^)
ID: 2003090 · Report as offensive     Reply Quote
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2003091 - Posted: 18 Jul 2019, 1:31:39 UTC - in response to Message 2003089.  

. . {afterthought} Does Package manager have a function to detect (and hopefully repair) broken packages?


If you are referring to Synaptic Package Manager . . . . then the answer is YES.

From the menu Edit >> Fix broken packages
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2003091 · Report as offensive     Reply Quote
Ian&Steve C.
Avatar

Send message
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2003093 - Posted: 18 Jul 2019, 2:02:04 UTC - in response to Message 2003090.  
Last modified: 18 Jul 2019, 2:07:34 UTC

If you haven't already done this, this has always worked for me when video drivers go wrong.

sudo apt-get purge nvidia-*

I already had him try apt purge *nvidia*, which is a bit more complete than just nvidia-*.

nvidia-* will match those beginning with "nvidia-" followed by zero or more of any character.
*nvidia* will match any name containing "nvidia", with zero or more of any character before and after it

for example, your command would not purge all nvidia items, since some do not begin with "nvidia-" but my commands would capture anything containing the string "nvidia" no matter the placement in the name. you could test that by running your command, then running mine to see what was missed the first time.
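
to see the difference without touching any packages, here is a small sketch using the shell's own pattern matching. apt does its own matching, so this only illustrates the glob rules, and the package names in the loop are just examples. the function names are made up.

```shell
#!/bin/sh
# glob_matches PATTERN NAME
# Succeeds when NAME matches the shell glob PATTERN.
glob_matches() {
    case "$2" in
        $1) return 0 ;;   # unquoted, so $1 is treated as a pattern
        *)  return 1 ;;
    esac
}

# show_matches PATTERN: prints which of the example names the pattern catches.
show_matches() {
    for name in nvidia-settings nvidia-430 libnvidia-compute-430 \
                xserver-xorg-video-nvidia-430; do
        if glob_matches "$1" "$name"; then
            echo "$1 matches $name"
        fi
    done
}

# Example:
#   show_matches 'nvidia-*'
#   show_matches '*nvidia*'
```

the first pattern only catches the names that start with "nvidia-", the second catches all four.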

although I did have him try just apt purge, and not apt-get purge, thinking he was on Ubuntu 16.04 or later. I'm not sure if apt got backported to Ubuntu 14.04 or not in the updates, or if he had the foresight to run apt-get. If not, he should try re-running that with apt-get.

as for fixing the broken packages, you can do as Keith says. you can run commands from the terminal also.

sudo apt-get update --fix-missing

sudo apt-get install -f
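
if you want to see the whole sequence before committing to it, a dry-run wrapper is one way to do it. this is only a sketch: the names `run` and `repair_packages` and the DRY_RUN variable are made up, and it defaults to printing the commands rather than running them.

```shell
#!/bin/sh
# run CMD...: prints the command when DRY_RUN is 1 (the default here),
# otherwise executes it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# repair_packages: the repair commands from this thread, in order.
# Stops at the first command that fails.
repair_packages() {
    run sudo dpkg --configure -a &&
    run sudo apt-get update --fix-missing &&
    run sudo apt-get install -f
}

# Example:
#   repair_packages              # just prints the three commands
#   DRY_RUN=0 repair_packages    # really runs them
```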

Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2003093 · Report as offensive     Reply Quote


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.