Setting up Linux to crunch CUDA90 and above for Windows users

juan BFP
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2005029 - Posted: 31 Jul 2019, 15:34:43 UTC - in response to Message 2005022.  

How do I mod BOINC to get work on Tuesdays when SETI goes down? I get no work. Some users are showing 64 GPUs.


. . They fiddle the books. Unless you are good at rewriting the BOINC code and then recompiling it, you are probably not in the running to do that, which is why I cannot. Though at the rate modern machines can chew through WUs, the current levels of work allowed are inadequate even with moderate outages. This week's outage lasted almost 7 hours, which is longer than the norm of late, but 3 of my machines ran through the allocated 100 tasks per GPU well before the outage was over. The minimum increase in reported number of GPUs seems to be a multiplier of 4, but I would be happy even with a multiplier of just 2, though I would probably settle for the 4 :)

Stephen

. .


How do you multiply by 4?

Look in the spoofing thread for any info.

https://setiathome.berkeley.edu/forum_thread.php?id=84441

Dave Lewis
Joined: 12 Apr 99
Posts: 34
Credit: 53,432,603
RAC: 108
United States
Message 2005046 - Posted: 31 Jul 2019, 20:01:55 UTC
Last modified: 31 Jul 2019, 20:07:04 UTC

I want to offer a heads-up to anyone who may have made the same mistake or hit the same error that I did. I'm using Kubuntu 19.04. There have been a number of updates to the KDE Plasma Desktop over the past several months. I had been using the proprietary Nvidia driver 418 with my GTX 1080 GPU, crunching seti@home successfully for approximately 2.5 to 3 months.

This morning I noticed that system updates were available and, as I normally do, I looked through the list of programs and files to be updated. For the most part it was a series of updates to the KDE Plasma Desktop. Either I completely missed an update of the Nvidia driver from the proprietary version 418 to 430.40, or it wasn't listed at all. After I rebooted, all of the seti@home GPU workunits errored out. Edited to add that the error message was "no CUDA-capable device is detected".
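
A "no CUDA-capable device is detected" error usually means the driver is not talking to the card, and that can be checked before BOINC burns through a cache of tasks. The following is only a minimal sketch, assuming nvidia-smi was installed along with the driver; count_gpus is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# Sketch: count the GPUs the NVIDIA driver reports via `nvidia-smi -L`.
# count_gpus is a hypothetical helper, named here for illustration only.

# Count lines of the form "GPU 0: <name> (UUID: ...)" read from stdin.
count_gpus() {
    # grep -c prints 0 and returns non-zero when nothing matches,
    # so "|| true" keeps the helper from reporting failure.
    grep -c '^GPU [0-9][0-9]*:' || true
}

if command -v nvidia-smi >/dev/null 2>&1; then
    n=$(nvidia-smi -L 2>/dev/null | count_gpus)
    echo "nvidia-smi reports $n GPU(s)"
else
    echo "nvidia-smi not found: the NVIDIA driver is probably not installed"
fi
```

A count of 0 after a reboot would suggest the same situation described above, with the driver package present but not actually in use.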

The crucial change that I saw was that the proprietary Nvidia ver. 418 driver was no longer listed on the Software & Updates "Additional Drivers" tab; instead only the 390, 415 and 430 open source drivers were listed. It turns out that the current release of the KDE Plasma Desktop now incorporates the proprietary Nvidia drivers in the "open source" driver packaging. On the Additional Drivers tab of the KDE Software & Updates program the following driver was selected:

"Using NVIDIA driver metapackage from nvidia-driver-430 (open source)"

There were similar unchecked lines for the 390 and 415 versions as well. There was the following line near the bottom of the window:

"No proprietary drivers are in use."

I went back to basics, started re-reading the beginning of this thread, and saw Stephen's post (linked below), which said the following:

". . Once Linux is running there are some 'must haves' you should install as well.

1) The right nVidia drivers e.g NVIDIA-Linux-x86_64-375.39.run
2) Two libraries, libcudart.so.8.0 and libcufft.so.8.0"

The Nvidia driver was supposedly already installed, so I searched for "libcudart" and "libcufft" in the Synaptic Package Manager. Those libraries were listed as available but were not installed, even though I had installed them previously when I was using the proprietary Nvidia display driver. Apparently this morning's update uninstalled them. I got a single hit for libcudart (a newer version than the one in Stephen's post) and two hits for libcufft, so I installed all 3. I restarted my system, and after a few minutes I started getting workunits and processing began as usual. I've not encountered any errors or invalids with the version 430.40 Nvidia driver so far (about 10-15 workunits validated).
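
The same lookup can be done from the Terminal instead of Synaptic. This is a rough sketch, assuming a Debian-style system with dpkg; the package name patterns are guesses based on the library names above, and installed_pkgs is a hypothetical helper. A reply further down says the CUDA libraries are built statically into the science application, so treat this as a diagnostic only.

```shell
#!/bin/sh
# Sketch: list which libcudart/libcufft packages are fully installed.
# installed_pkgs is a hypothetical helper; the patterns are assumptions.

# Print the names of packages in the "ii" (installed) state
# from `dpkg -l` output read on stdin.
installed_pkgs() {
    awk '$1 == "ii" { print $2 }'
}

if command -v dpkg >/dev/null 2>&1; then
    # Quote the patterns so dpkg, not the shell, expands them.
    dpkg -l 'libcudart*' 'libcufft*' 2>/dev/null | installed_pkgs
fi
```

No output means none of the matching packages is currently installed.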

I wanted to pass this on in case anyone else encountered this situation. If anyone else has encountered this and provided a remedy I apologize as I tried searching for a remedy and didn't find any.

https://setiathome.berkeley.edu/forum_thread.php?id=81271&postid=1860293

Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2005052 - Posted: 31 Jul 2019, 20:24:41 UTC - in response to Message 2005046.  

This caught me out too when the open source Nvidia driver got pulled into the distro. The update to the 430 drivers did not install the drivers into the kernel. Nor did it have the correct packaging to have the Nvidia drivers added to the DKMS sources for future kernel compilations.

Ignore the information in the thread you referenced. It is useful only as a general outline, as the details are very outdated now. You don't need to worry about the CUDA libraries as they are statically incorporated into the science application.

To prevent this type of event in the future, it is always best to check that the CUDA and OpenCL drivers are installed BEFORE you start BOINC. This is simple to do in the Terminal with clinfo. Clinfo prints all the OpenCL platforms and devices it detects, which verifies that you have the correct proprietary drivers installed for BOINC.
sudo apt install clinfo
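
Once clinfo is installed, the pre-flight check can be scripted. This is a minimal sketch, assuming clinfo prints a "Platform Name" line for each OpenCL platform; has_nvidia_platform is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: confirm an NVIDIA OpenCL platform is visible before starting BOINC.
# has_nvidia_platform is a hypothetical helper, not part of clinfo.

# Succeed if the clinfo output on stdin has a "Platform Name" line
# that mentions NVIDIA.
has_nvidia_platform() {
    grep -i 'Platform Name' | grep -qi 'nvidia'
}

if command -v clinfo >/dev/null 2>&1; then
    if clinfo 2>/dev/null | has_nvidia_platform; then
        echo "NVIDIA OpenCL platform found: OK to start BOINC"
    else
        echo "No NVIDIA OpenCL platform found: fix the driver before starting BOINC"
    fi
else
    echo "clinfo is not installed"
fi
```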


What you need to do is use Additional Drivers to temporarily install the 415 drivers, which removes the incorrectly installed 430 drivers. The system won't let you reinstall the 430 drivers because it already detects them. Alternatively, install the drivers from the Terminal, where you can watch the progress of the installation, something you can't do when installing from the Additional Drivers tab.
sudo apt install nvidia-driver-418

That would be sufficient for all Nvidia cards up to the introduction of the Supers. The 418 drivers are necessary for the CUDA101 application. If you have any of the Supers, you will have to use the Nvidia .run installer to get the 430.34 driver to handle those cards.
When you use the Terminal to install the drivers, pay attention to the end of the output, where the driver is compiled into the kernel and added as a DKMS module. If you see that, you are good to go for future kernel updates.
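
If the end of the install output scrolled past, the DKMS registration can be checked afterwards. This is a sketch only, assuming the dkms package is installed and that `dkms status` prints one line per module beginning with the module name and ending in its state; nvidia_in_dkms is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: check that an nvidia module is registered with DKMS and installed.
# nvidia_in_dkms is a hypothetical helper; the line format is an assumption.

# Succeed if the `dkms status` output on stdin has a line that starts
# with "nvidia" and reports the state "installed".
nvidia_in_dkms() {
    grep -i '^nvidia' | grep -q 'installed'
}

if command -v dkms >/dev/null 2>&1; then
    if dkms status 2>/dev/null | nvidia_in_dkms; then
        echo "nvidia module is registered with DKMS"
    else
        echo "nvidia module is NOT in DKMS: it may be lost on the next kernel update"
    fi
else
    echo "dkms is not installed"
fi
```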

Once again, after you have rebooted the system onto the Nvidia drivers, check with clinfo to be sure you are on the Nvidia driver and not still on the Nouveau driver. If you are still stuck on the default drivers, repeat the process with another version. Or if necessary, sudo apt purge '*nvidia*' (quoted so the shell does not expand the pattern) to get rid of all remnants of the Nvidia drivers so you can start with a blank slate.
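
A second way to tell Nvidia from Nouveau after the reboot is to look at the loaded kernel modules. This is a minimal sketch, assuming lsmod is available and that the relevant modules are named nvidia and nouveau; gpu_driver_from_lsmod is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: report whether the nvidia or nouveau kernel module is loaded.
# gpu_driver_from_lsmod is a hypothetical helper, named for illustration.

# Print "nvidia" or "nouveau" for each such module found in the first
# column of `lsmod` output read on stdin.
gpu_driver_from_lsmod() {
    awk '$1 == "nvidia" || $1 == "nouveau" { print $1 }'
}

if command -v lsmod >/dev/null 2>&1; then
    drv=$(lsmod | gpu_driver_from_lsmod)
    echo "Loaded GPU kernel module: ${drv:-none found}"
fi
```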
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)

Dave Lewis
Message 2005090 - Posted: 31 Jul 2019, 23:40:42 UTC - in response to Message 2005052.  
Last modified: 31 Jul 2019, 23:41:49 UTC

Thanks for the info, Keith. You and many others have always been very helpful here.

I knew that the info in the one link I mentioned was outdated and I took it simply as general instructions. I checked whether those two libraries were installed after trying to install the version 418 drivers again. It was then that I saw that the library files weren't installed, so I figured I'd give installing them a shot, as I could always uninstall them if that didn't work. But installing them did work for my KDE/Kubuntu installation. And now I'm a bit gun-shy about removing the 430.40 driver setup, as it seems to be working after adding those 3 libraries. If the ver. 430.40 drivers are known to be unreliable then I will definitely make the effort to re-install version 418, as it had worked perfectly well for me up until today's experience.

I'm certain that I used clinfo before (as you wrote the commands) when I was setting up my linux system about 2.5 to 3 months ago. With my system unchanged since I wrote the prior post I get the following result with the clinfo command line which says that the series of version 418 driver files are no longer needed and describes how to remove them. I'm guessing the updates that were done today prevented the previously installed version 418 driver from being displayed on the "Additional Drivers" tab.

https://drive.google.com/open?id=1u1YLtrFTpwlkKFTlHovSSxDftlgXMl_5

I did try "sudo apt install nvidia-driver-418" earlier today before I posted my original message, and the driver would not install. Unfortunately I don't remember the wording of the message that was displayed, but it was something to the effect of the package not being found in the repository, even though the correct PPA for the proprietary nVidia drivers had already been added. And I do see nvidia-graphics-drivers-418 for 418.56-0ubuntu0~gpu19.04.3 listed on the launchpad.net webpage, so it is available for installation. I'm guessing that some configuration option in the update that I got must be blocking the older driver from installing.

Keith Myers
Message 2005093 - Posted: 1 Aug 2019, 0:14:23 UTC - in response to Message 2005090.  

You had already installed the 430 drivers which had removed the 418 drivers. I still see the 418 drivers listed for 18.04 on the ppa website. They updated the 430 drivers today. I just got updated on this daily driver from 430.26 to 430.40. The 430.40 drivers are the new ones to handle the new Super cards. So at least one does not have to resort to the Nvidia .run installer now for Super support. Glad the ppa got caught up with current drivers.

I had no problem backleveling from the 430.26 drivers to the 418 drivers yesterday on another machine.

https://launchpad.net/~graphics-drivers/+archive/ubuntu/ppa

Dave Lewis
Message 2005113 - Posted: 1 Aug 2019, 4:07:43 UTC - in response to Message 2005093.  
Last modified: 1 Aug 2019, 4:08:56 UTC

I just surpassed 330 consecutive valid tasks with ver. 430.40 of the nVidia driver, so I'm going to leave it in place. I'll check more often (2-3 times a day, instead of once a day or so as I have been doing) to see whether any other workunits error out or are invalid, which may indicate a problem with the driver. Thanks again for your thoughts and guidance, Keith.

Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2005807 - Posted: 5 Aug 2019, 3:13:37 UTC - in response to Message 2005113.  

I just surpassed 330 consecutive valid tasks with ver. 430.40 of the nVidia driver, so I'm going to leave it in place. I'll check more often (2-3 times a day, instead of once a day or so as I have been doing) to see whether any other workunits error out or are invalid, which may indicate a problem with the driver. Thanks again for your thoughts and guidance, Keith.


. . I take the lack of further messages to indicate everything is now a source of joy?

Stephen

Keith Myers
Message 2005810 - Posted: 5 Aug 2019, 3:31:12 UTC - in response to Message 2005807.  

I think all the drama came from the transition from PPA packages to distro packages.

Dave Lewis
Message 2005882 - Posted: 5 Aug 2019, 18:17:04 UTC - in response to Message 2005807.  
Last modified: 5 Aug 2019, 18:27:00 UTC



. . I take the lack of further messages to indicate everything is now a source of joy?

Stephen

? ?


For me it is, Stephen. A short while ago I passed 3800 consecutive valid tasks with nVidia version 430.40 and the 3 files that I added in my OP on this issue ("libcudart" and "libcufft"). A few days ago I noticed an update to the ver. 430.40 nVidia drivers, so I held my breath and performed that update. I've experienced no problems since that update either.

Edited to add: Keith may be completely correct, and those 3 library files that I added may not be needed since that last nVidia driver update. I'm hesitant at the moment to delete them from my system and potentially cause more errors.

juan BFP
Message 2005888 - Posted: 5 Aug 2019, 18:29:06 UTC

My host has already crunched thousands of WUs with 430.40 with no problems at all.

I have not run any tests on crunching speed.

Stephen "Heretic"
Message 2006740 - Posted: 11 Aug 2019, 1:58:19 UTC
Last modified: 11 Aug 2019, 2:13:44 UTC

. . Panic over ...

. . I extracted fresh copies of the files from the archive and copied them over the problem ones. It is working now.

Stephen

elec999
Joined: 24 Nov 02
Posts: 375
Credit: 416,969,548
RAC: 141
Canada
Message 2007488 - Posted: 15 Aug 2019, 13:26:32 UTC

Has anyone managed to get better results, or to optimize these better?

Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2007496 - Posted: 15 Aug 2019, 13:55:56 UTC - in response to Message 2007488.  

Has anyone managed to get better results, or to optimize these better?


can you elaborate? better result in comparison to what? optimize what better?

you'll need to be more specific with your questions.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

elec999
Message 2007698 - Posted: 16 Aug 2019, 16:11:58 UTC - in response to Message 2007496.  

Has anyone managed to get better results, or to optimize these better?


can you elaborate? better result in comparison to what? optimize what better?

you'll need to be more specific with your questions.



Well, I wanted to see if there was any way to squeeze some extra performance out of the GPUs.

rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22160
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2007700 - Posted: 16 Aug 2019, 16:19:59 UTC

Given you have about a dozen active computers with GPUs, you need to say which ones in particular. That way you can be pointed in the right direction for each of them, as there will be different answers for AMD and nVidia, and within each of the families.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

elec999
Message 2007706 - Posted: 16 Aug 2019, 17:04:44 UTC - in response to Message 2007700.  

Given you have about a dozen active computers with GPUs, you need to say which ones in particular. That way you can be pointed in the right direction for each of them, as there will be different answers for AMD and nVidia, and within each of the families.



Sorry, the 1070s, 1070 Ti, 1080 and 2070. All high-end Nvidia ones.

Keith Myers
Message 2007713 - Posted: 16 Aug 2019, 17:58:55 UTC - in response to Message 2007706.  

What more do you expect? You are already running Linux with the special CUDA101 app with -nobs. The cards don't go any faster.

Ian&Steve C.
Message 2007717 - Posted: 16 Aug 2019, 18:32:41 UTC - in response to Message 2007713.  

What more do you expect? You are already running Linux with the special CUDA101 app with -nobs. The cards don't go any faster.


well... not yet. until petri pulls another magic algorithm out of his hat :)

but yeah, for now, it's as fast as it goes. the only very small gains you could get are overclocking the hardware to run faster. but you will get small gains for large increases in heat/power consumption and instability. just leave it be.

rob smith
Message 2008051 - Posted: 18 Aug 2019, 14:58:44 UTC

OK - quick question (and I'm sure it's been asked already, but I can't find the answer.)
Will a GTX 980 benefit from running the CUDA 9/10 special brew, or should I stick with the older CUDA 6/8 version?

Ian&Steve C.
Message 2008054 - Posted: 18 Aug 2019, 15:13:56 UTC - in response to Message 2008051.  

Yes.

GTX 980 has CC (compute capability) 5.2, so it's good to use the CUDA 90 app. The current app requires a minimum of CC 5.0.
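
The compute capability rule is a major.minor comparison, which can be sketched as a small shell function. This is only an illustration: cc_at_least is a hypothetical helper, and the card's CC value has to be looked up and typed in by hand.

```shell
#!/bin/sh
# Sketch: compare a card's compute capability (CC) against a minimum.
# cc_at_least is a hypothetical helper; CC values are given as major.minor.

# Usage: cc_at_least HAVE MIN   (for example: cc_at_least 5.2 5.0)
# Succeeds when HAVE is at least MIN, comparing the major number first
# and the minor number only when the majors are equal.
cc_at_least() {
    have_major=${1%%.*}; have_minor=${1#*.}
    min_major=${2%%.*};  min_minor=${2#*.}
    [ "$have_major" -gt "$min_major" ] ||
        { [ "$have_major" -eq "$min_major" ] && [ "$have_minor" -ge "$min_minor" ]; }
}

if cc_at_least 5.2 5.0; then
    echo "GTX 980 (CC 5.2) meets the CC 5.0 minimum"
fi
```

Comparing the two parts separately avoids treating the value as a decimal fraction, which would go wrong if a minor number ever reached two digits.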
 