Setting up Linux to crunch CUDA90 and above for Windows users

juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1952911 - Posted: 30 Aug 2018, 18:12:56 UTC - in response to Message 1952899.  

I updated my host from Ubuntu 16.04 to Ubuntu 18.04. Apart from a small switch of the video source (it changed from one GPU to another), everything is working fine. It took no more than half an hour to complete the task.

Running CUDA 9.2 Pascal directly with no problem.

One important point: I have almost no knowledge of Linux at all. I just followed the questions the OS update procedure asks.


. . Thank you Juan, but did it just update the installation with the later kernel or did it wipe everything and install from scratch?

Stephen

??

Just click on the update and follow the instructions.
ID: 1952911
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1952914 - Posted: 30 Aug 2018, 18:54:23 UTC
Last modified: 30 Aug 2018, 19:01:05 UTC

Stephen, I agree, at this point you should just reinstall everything fresh. I'd also recommend just using the standard Ubuntu distro. And unless you need 7.4.44 to stack up extra WUs, you might find it easier to install the repository version of BOINC, which pulls in all needed dependencies during install rather than you needing to do it manually. Just keep in mind that doing it this way, you won't get the benefit of some of the improvements that TBar has made to the BOINC client under Linux, but it will still work.

1. Get the Ubuntu 18.04.1 LTS image and burn to a disc, or create a bootable USB stick using Rufus on a Windows machine.
2. Install Ubuntu 18.04.1 LTS on whatever system.

3. After a fresh install, it's a good measure to get any updates:
sudo apt-get update
sudo apt-get upgrade


4. Install the NVIDIA proprietary drivers:
This is a good resource, and explains both ways of doing it, via GUI or via command line:
https://www.linuxbabe.com/ubuntu/install-nvidia-driver-ubuntu-18-04

I like doing it via command line personally, so I'll put the commands here.
First add the PPA repository (only needed if you want the latest 396 drivers with CUDA 9.2):
sudo add-apt-repository ppa:graphics-drivers/ppa


Now install the actual driver:
sudo apt-get install nvidia-driver-396

*If you want the normal 390 driver, just change the "396" to "390".

**On my systems, this installed the 396.54 driver but failed to install OpenCL (meaning your AstroPulse tasks won't run). I don't know why, but it's what happened on all my systems. The 390 package should include it as normal though.

To install OpenCL:
sudo apt-get install ocl-icd-libopencl1


Reboot the system to apply the NVIDIA drivers.
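
After the reboot it may be worth confirming that the OpenCL pieces are actually visible before AstroPulse work arrives. The following is only a sketch: the function names are made up for illustration, and it assumes that ocl-icd-libopencl1 provides libOpenCL.so.1 and that vendor drivers register themselves under /etc/OpenCL/vendors, which is the usual layout but may differ on your system.

```shell
#!/bin/sh
# Sketch only. Both function names are hypothetical helpers, not part of any package.

# Reads a library listing on stdin and succeeds if the named library appears.
# Intended use: ldconfig -p | lib_in_cache libOpenCL.so.1
lib_in_cache() {
    grep -qF -- "$1"
}

# Succeeds if at least one *.icd registration file exists in the given
# directory (default: /etc/OpenCL/vendors, assumed to be the ICD directory).
has_opencl_icd() {
    dir=${1:-/etc/OpenCL/vendors}
    for f in "$dir"/*.icd; do
        [ -f "$f" ] && return 0
    done
    return 1
}
```

For example, `ldconfig -p | lib_in_cache libOpenCL.so.1 && has_opencl_icd && echo "OpenCL looks installed"` prints the message only when both checks pass. It says nothing about whether the driver itself loaded correctly.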

5. Install BOINC:
sudo apt-get install boinc-client boinc-manager


BOINC Manager will now be launchable from the applications menu at the bottom left of your desktop window. This way BOINC is installed as a service and runs automatically at boot, but starting and stopping boinc-client will be different than before with TBar's version.

Start BOINC client
sudo /etc/init.d/boinc-client start

Stop BOINC client
sudo /etc/init.d/boinc-client stop

Restart BOINC client
sudo /etc/init.d/boinc-client restart


The BOINC data directory will also be in a different place, located at:
/var/lib/boinc-client/


From this point you should be able to set up your projects and log in as normal via the BOINC Manager. After doing that and adding SETI, just suspend and/or exit BOINC and stop the client.

6. Copy the CUDA app package contents into the SETI project folder, now at:
/var/lib/boinc-client/projects/setiathome.berkeley.edu/

Make any other changes to your app_config.xml, cc_config.xml, or app_info.xml that you might want to make (optional).
Make sure that the apps are all set to allow executing the file as a program, found via right click -> Properties, Permissions tab.
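
The copy and the Permissions-tab step can also be done from a terminal. This is a rough sketch, not the package's own installer: the function name, the source directory, and the file-name pattern for the app binaries are all assumptions, so substitute whatever your app package actually contains.

```shell
#!/bin/sh
# Sketch only: copy an unpacked app package into a BOINC project folder and
# mark the application binaries as executable.
# install_app_package is a hypothetical helper name.
install_app_package() {
    src=$1      # unpacked package directory (your own download location)
    dest=$2     # project folder, e.g. the setiathome.berkeley.edu folder
    pattern=$3  # glob matching the app binaries (assumed naming, check your package)

    [ -d "$src" ] || { echo "no such package directory: $src" >&2; return 1; }
    [ -d "$dest" ] || { echo "no such project folder: $dest" >&2; return 1; }

    # Copy everything, including app_info.xml and any other support files.
    cp -R "$src"/. "$dest"/ || return 1

    # Same effect as ticking "allow executing file as program" for each match.
    find "$dest" -maxdepth 1 -type f -name "$pattern" -exec chmod +x {} +
}
```

With the repository install the project folder under /var/lib/boinc-client/ is normally not writable by your own user, so in practice the `cp` and `chmod` steps would likely need `sudo`.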

Start the boinc-client with the command above, then launch BOINC Manager again, and you should be good to go. I don't think I've missed anything, but I could have, as I just typed this all up from my memory of installing things this way on about 10 different systems in the last couple of weeks.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1952914
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1952926 - Posted: 30 Aug 2018, 19:45:21 UTC - in response to Message 1952859.  

Well, Libcurl3 deals with networking. boinc will not function without it working correctly. I'm not surprised to see you are having other networking troubles as well. The Enable Networking in Recovery Mode not working should have been a clue. I would have reinstalled the System yesterday if it were me. The next time just install the WebKit and leave the Libcurl3 as it is, see if that works. If you ever do get it working, the CUDA 9.1 App will work with the 390 driver, that's partly why I chose to use it. Anyone running the Repository 390 driver can use the new CUDA App.

BTW, the Newer Systems use GDM as the display manager, that's why stop lightdm doesn't work on that system. I think the new cmd is something like sudo service GDM stop. That's partly why I changed to using the Recovery mode instead of the console, you don't have to stop anything in Recovery mode.
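
If it is unclear which display manager a given install uses, Debian-based systems such as Ubuntu usually record it in /etc/X11/default-display-manager. A small sketch for reading it follows; the function name is invented here, and the file location is an assumption that may not hold on every distro.

```shell
#!/bin/sh
# Sketch only: print the name of the configured display manager
# (for example gdm3 or lightdm) from the Debian-style marker file.
# current_display_manager is a hypothetical helper name.
current_display_manager() {
    file=${1:-/etc/X11/default-display-manager}
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
    # The file holds a full path such as /usr/sbin/gdm3; keep the last component.
    basename "$(head -n 1 "$file")"
}
```

The printed name is what you would then pass to whichever stop command you use for the display manager.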


I use init 3.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1952926
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1952928 - Posted: 30 Aug 2018, 19:49:30 UTC
Last modified: 30 Aug 2018, 19:50:08 UTC

Yes, for some reason this time, the 396.54 drivers don't pick up the OpenCL drivers. The ppa drivers in earlier versions always installed them on my systems. The newest contest build I did in the late stages ran into this problem. I just backleveled to 390 to get the OpenCL drivers and then updated to the 396.54 drivers. Ian's sudo apt-get install ocl-icd-libopencl1 would have worked just as well.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1952928
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1952980 - Posted: 31 Aug 2018, 1:49:55 UTC - in response to Message 1952496.  

The only thing "different" about Linux is that it can still be used and controlled by the command line. Something that a long time DOS/Windows 3.1 user should remember.


From my POV to use Boinc with Lubuntu/Linux there is a lot more monkeying around.

Assuming you have good gpu drivers.

With Windows you download 1 app. Run install. May need to start the app the first time. Download/install TThrottle.

Add projects.

Done.

The last "install" steps I read in this string were a bit more complicated than that.

I have no wish to start a Windows vs. Linux "religious" war. So please do NOT take this as a "slam" of Linux, just a Windows POV on simplicity. If running BOINC under Linux was really simple, we wouldn't be spending so much time on this thread troubleshooting installs, upgrades etc.

Respectfully,
Tom
A proud member of the OFA (Old Farts Association).
ID: 1952980
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1952988 - Posted: 31 Aug 2018, 2:30:57 UTC - in response to Message 1952980.  

Now there's a replacement for zi3v, I will be updating the All-In-One Linux install.
The biggest change is the CUDA Libraries are built-in to the App, so, no more manually installing the Libraries. That was the biggest item for the User.
The New Package will work by Downloading the Zipped File, Unzipping it to Home, installing the One or two dependencies, then double clicking on boincmgr.
It will have the CUDA Special App preinstalled.
I think it will be difficult to match that level of simplicity.
The BOINC folder is in your Home folder with this Linux install, so you don't have to go looking for Hidden folders like you do in Windows or the Repository version. Also everything relating to BOINC is in One folder with the Berkeley BOINC install. This makes it simple if you for some reason want more than One Host folder on your machine. It's as simple as having 3 folders named BOINC-1, BOINC-2, and BOINC-3. I usually name them with the Host ID, such as BOINC-6726, BOINC-7052, etc. It also makes it easy if you want to move BOINC-7052 to another machine, just zip the folder, move it to the other machine and unzip it. Everything pertaining to Host ....7052 is in that One folder, even the BOINC Apps.
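
Moving a self-contained host folder as described could look roughly like this. The sketch uses tar instead of zip, purely as a substitution, and the function name is invented for illustration. It assumes BOINC has been stopped first so nothing in the folder changes mid-copy.

```shell
#!/bin/sh
# Sketch only: archive one self-contained BOINC host folder so it can be
# moved to another machine. pack_boinc_dir is a hypothetical helper name.
pack_boinc_dir() {
    parent=$1   # directory that contains the host folder, e.g. "$HOME"
    name=$2     # folder name, e.g. BOINC-7052
    out=$3      # archive to create, e.g. /tmp/BOINC-7052.tar.gz

    [ -d "$parent/$name" ] || { echo "no such folder: $parent/$name" >&2; return 1; }

    # -C keeps the archive paths relative, so it unpacks as "$name/..."
    tar -czf "$out" -C "$parent" "$name"
}
```

On the destination machine, `tar -xzf BOINC-7052.tar.gz -C "$HOME"` would recreate the folder under Home.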

Oh, BTW. Looks like I'm missing another Cluster by using the Driver from nVidia. I installed 396.54 from Recovery Mode and it wasn't missing the OpenCL files, and the installer didn't touch my perfectly working xorg.conf. Worked just like any other driver install from nVidia.
ID: 1952988
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1953009 - Posted: 31 Aug 2018, 6:51:11 UTC
Last modified: 31 Aug 2018, 6:59:00 UTC

Here it is, the Easiest SETI Special App Install Ever.
Expand the folder to Home
Open BOINC and double click boincmgr.
You Do need to install the Dependencies in the Post, just follow the instructions.

BOINC All-In-One

BTW, there is a compressed copy of the Five BOINC 7.4.44 files in the Docs folder, and a README 7.4.44 file.
ID: 1953009
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1953011 - Posted: 31 Aug 2018, 7:40:41 UTC - in response to Message 1953009.  

Thanks for the new All-in-One TBar.

Oh, BTW, just to clarify since you keep bringing it up, the 396 drivers from the ppa don't mess with the xorg.conf. The drivers install perfectly fine but the latest seem to not install the OpenCL component. The only program that messes with the xorg.conf is nvidia-xconfig and that is only when you run it to install the cool-bits tweak. If you don't run that program, your xorg.conf never gets touched.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1953011
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1953012 - Posted: 31 Aug 2018, 7:56:36 UTC - in response to Message 1953011.  

Don't really know myself, I've never had the problem you guys keep having. Sorta like the problems Stephen keeps having, I don't have any of those either. So, I'll take your word for it, until I experience it, or something different. Speaking of problems, so far I've only had One person say they are having problems with the New CUDA Special App, and that One person refused my offer of help, so, I suppose everything is going well in CUDA land.
ID: 1953012
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1953015 - Posted: 31 Aug 2018, 8:18:14 UTC - in response to Message 1953012.  

You would never have had the problem if you never used nvidia-xconfig. So if you never have any need to change your xorg.conf file with nvidia-xconfig, you will never run into the problem.
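
For anyone who does need to run nvidia-xconfig, keeping a dated copy of a working xorg.conf beforehand makes it easy to go back. A minimal sketch, with a made-up function name; the usual location /etc/X11/xorg.conf is assumed and would need `sudo` to copy next to the original.

```shell
#!/bin/sh
# Sketch only: make a timestamped copy of a config file and print the
# name of the copy. backup_file is a hypothetical helper name.
backup_file() {
    src=$1
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    dest="$src.$(date +%Y%m%d-%H%M%S).bak"
    # -p keeps the original mode and timestamps on the copy.
    cp -p "$src" "$dest" && printf '%s\n' "$dest"
}
```

Restoring is then just a matter of copying the .bak file back over xorg.conf and restarting the display manager.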

I have the .run file and I can run it from recovery console if I ever need to. Emergency preparedness done.

I have not had any issues with the 0.97 app or any of your variants. All work very well. Thank you.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1953015
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1953034 - Posted: 31 Aug 2018, 12:34:54 UTC - in response to Message 1953012.  

Don't really know myself, I've never had the problem you guys keep having. Sorta like the problems Stephen keeps having, I don't have any of those either. So, I'll take your word for it, until I experience it, or something different. Speaking of problems, so far I've only had One person say they are having problems with the New CUDA Special App, and that One person refused my offer of help, so, I suppose everything is going well in CUDA land.


I only had problems with one version of it (v0.97b1 multi) and only on one system. And as you postulated, only because it’s a special case as no one else here runs 6+ GPUs on a single system which is why no one else experienced it.

This was remedied with b2pascal so no further action was necessary.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1953034
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1953044 - Posted: 31 Aug 2018, 14:07:56 UTC - in response to Message 1953034.  

I see. Yet a few days ago you couldn't resist shouting out on the WOW Forum that I had just posted a 'Bad' App and people shouldn't use it, even though no one else had reported any problems and you yourself say it was only on one machine. I believe most people would agree that wasn't a very nice thing to do. Downright Dirty some might say. Especially since I had given you the possible fix and you refused to try it. Read this post from a while ago: That's done by passing a piece of information known as the "API version" in - in this case - the app_info.xml file.

So why don't I put that line in My app_infos? Purely because No One Else does, and I don't want to answer hundreds of people asking why I'm the only one that uses that line. Apparently most people don't need it anyway. After the API line the next consensus possible solution would be to increase the Memory allocation in the Boinc Manager Preferences, and from My experience, you are close to the edge with 6 GPUs and 8GBs of RAM. Yes, some Apps use different amounts of Shared Memory than others. You could easily have found that info using Google, but you instead decided to badger Me. Again, not very nice IMO.
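
To get a rough idea of how close a host is to the edge on memory, the kernel's own estimate can be read from /proc/meminfo. This is only a sketch with an invented function name; it reports what the system thinks is available right now and does not read or change any BOINC preference.

```shell
#!/bin/sh
# Sketch only: print the kernel's MemAvailable estimate in MiB.
# mem_available_mib is a hypothetical helper name; the optional argument
# lets you point it at a saved copy of /proc/meminfo.
mem_available_mib() {
    awk '/^MemAvailable:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}
```

Running it a few times while all GPUs are crunching gives a feel for how much headroom is left on an 8GB machine.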

Let's just say I'm still not a very happy camper.
ID: 1953044
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1953046 - Posted: 31 Aug 2018, 14:42:18 UTC
Last modified: 31 Aug 2018, 14:51:17 UTC

I never once said or implied that you had a bad app. If you got that impression then I apologize. I said I had a problem with the latest version on one system that was (at the time) fixed by reverting to the older version.

When I asked you for some more details about your proposed fix, instead of helping me understand, you said I was too stupid to understand it anyway.

And it’s not that I refused Your fix. I just, one, wanted more info about it, and two, wanted to wait until after the contest since I was getting burned out constantly changing the settings, I just wanted to relax. I had every intention of revisiting the issue afterwards. But you released a new version before that time came, so it’s all rather moot.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1953046
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1953048 - Posted: 31 Aug 2018, 14:59:10 UTC - in response to Message 1953046.  

You know, those Shouts are still at WOW, not to mention the few around there that have memory spans.
Apparently you are too stupid to use Google, or were just more interested in badgering than seeking an answer.
If you did use google you would probably find the problem was with your System, not the App. Can't have that can you.
ID: 1953048
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1953050 - Posted: 31 Aug 2018, 15:18:11 UTC

I was originally on v0.97 Pascal from Petri. It worked fine.

You then released v0.97b1multi and said it would supersede v0.97 with little or no impact vs the Pascal only version. (If I recall correctly you did not release a v0.97b1 Pascal version. If I missed it then sorry). This wasn’t the case on the system in question. I reverted back, reported my results here and that’s where we started the discussion.

Since the versions before and after v0.97b1multi worked fine on my system, and no system changes were made across all 3 versions, I can conclude that v0.97b1 multi in the default config did not work well for me.

You’re welcome to search the archived messages or any of my forum posts. I never once said you had a bad app. I said that one version wasn’t working for me.

The post from Richard you linked (and other Google searching on my own recently) has been helpful. I’m still a little confused on the actual value used for the api version and what impact that has vs running another value. Or what values are even possible to put in there. Richard recommended to another person to try 6.1.0 and also stated that the value didn’t matter as long as it was at least 6.0. You told me to run 7.11.0. That was my only confusion. I only wanted more info on what the api version value actually represented. What would happen if I used Richard's value instead? Or another value? Is there a document to outline the different api versions? And every time I asked I was met with hostility.
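
To see which value, if any, an existing app_info.xml already declares, something like the following could be used. It is a sketch under assumptions: the function name is invented, and it expects the value to sit in an `<api_version>` element on a single line, which is how the discussion here describes it. It does not explain what the different values mean.

```shell
#!/bin/sh
# Sketch only: print every api_version value found in an app_info.xml.
# get_api_version is a hypothetical helper name. Prints nothing if the
# file declares no api_version at all.
get_api_version() {
    sed -n 's:.*<api_version>\([^<]*\)</api_version>.*:\1:p' "$1"
}
```

An empty result simply means the line is absent, which per the discussion above is the common case.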
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1953050
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1953066 - Posted: 31 Aug 2018, 16:58:03 UTC - in response to Message 1953050.  

All I can say is your system must be right on the edge if a slight difference in S_LIMIT settings will trigger a memory shortage. The only difference in All the Apps I've posted, besides the version changes, were slight changes in that setting. I've found that with cards up to the 1060 a lower S_LIMIT setting is faster. In the case of the App you say gave you trouble, it was exactly 11 seconds slower than the second one I posted, according to the benchmark App. The first App I posted was named Maxwell+, because it worked on Maxwells and Pascals. The name was changed because some people wanted to know if it also worked with Pascal in mixed systems. Since Petri's Maxwell App didn't work in 14.04, it was an easy choice to have just One Maxwell App.

Have you had any trouble with that slow app in any other system? No one else has reported any trouble, except it was a little slower than the other. I find it interesting you thought nothing about Shouting out that there were problems with the App on a Public forum. Did you even think of the consequences? You didn't think that would anger anyone?

I posted what was used to compile the App, if I would have posted anything else, someone would have probably questioned it later.
It's right at the top of the log,
This file contains any messages produced by compilers while running configure, to aid debugging if configure makes a mistake.
It was created by BOINC configure 7.11.0, which was generated by GNU Autoconf 2.69....
You are welcome to use whatever you wish, I think Petri is still using 7.5.0. I decided to try the newer version. Why would you be confused about the API number the person that compiled the App gave you? It just sounds as though you are trying to spread more FUD to me. In fact, most of your posts seem to be targeted at spreading FUD. Similar to Shouting out FUD on a Forum. Yes, targeting someone with FUD usually generates hostility. If you don't know what FUD means, use Google.
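
The version string quoted from the log can be pulled out mechanically. A sketch, assuming the log is an autoconf-style config.log with the header line shown above; the function name is invented.

```shell
#!/bin/sh
# Sketch only: print the BOINC configure version named in the header of
# an autoconf config.log. boinc_configure_version is a hypothetical name.
boinc_configure_version() {
    sed -n 's/.*created by BOINC configure \([0-9][0-9.]*\),.*/\1/p' "$1" | head -n 1
}
```

Against the header quoted in this post it would print 7.11.0.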
ID: 1953066
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1953069 - Posted: 31 Aug 2018, 17:15:26 UTC - in response to Message 1953066.  

All I can say is your system must be right on the edge if a slight difference in S_LIMIT settings will trigger a memory shortage. The only difference in All the Apps I've posted, besides the version changes, were slight changes in that setting. I've found that with cards up to the 1060 a lower S_LIMIT setting is faster. In the case of the App you say gave you trouble, it was exactly 11 seconds slower than the second one I posted, according to the benchmark App. The first App I posted was named Maxwell+, because it worked on Maxwells and Pascals. The name was changed because some people wanted to know if it also worked with Pascal in mixed systems. Since Petri's Maxwell App didn't work in 14.04, it was an easy choice to have just One Maxwell App.

Have you had any trouble with that slow app in any other system? No one else has reported any trouble, except it was a little slower than the other. I find it interesting you thought nothing about Shouting out that there were problems with the App on a Public forum. Did you even think of the consequences? You didn't think that would anger anyone?

I posted what was used to compile the App, if I would have posted anything else, someone would have probably questioned it later.
It's right at the top of the log,
This file contains any messages produced by compilers while running configure, to aid debugging if configure makes a mistake.
It was created by BOINC configure 7.11.0, which was generated by GNU Autoconf 2.69....
You are welcome to use whatever you wish, I think Petri is still using 7.5.0. I decided to try the newer version. Why would you be confused about the API number the person that compiled the App gave you? It just sounds as though you are trying to spread more FUD to me. In fact, most of your posts seem to be targeted at spreading FUD. Similar to Shouting out FUD on a Forum. Yes, targeting someone with FUD usually generates hostility. If you don't know what FUD means, use Google.


I'm not trying to create FUD. You posted the app and specifically asked for feedback, which is what I gave you. I saw a problem, I reported it. Were you only looking for positive feedback?

All I ever asked for was a document, or SOMETHING, explaining the different api versions available for use and how each api version differs. All you ever gave me was a link to the BOINC wiki that told you the formatting of how to apply it in app_info. I didn't misunderstand what the api_version was FOR, I just wanted to know about the VALUE itself and how each different value would change things. Purely for my own pursuit of understanding, nothing more.

It's not that I didn't trust you, of course I trusted you. I just wanted to know more about it and you interpreted that as skepticism from me, which is not what it was.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1953069
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1953098 - Posted: 31 Aug 2018, 21:13:08 UTC - in response to Message 1953069.  
Last modified: 31 Aug 2018, 21:13:40 UTC

Gentlemen.

Let us be quiet for a while. You two are on the front edge of a "flame war" if not in one. I want everyone to step away for a couple of days. Please.

And if possible, get the offending posts hidden and/or deleted on WOW's website.

TBar is doing us a great service getting these CUDA 8.0 (and then some) apps set up for use.

Ian&Steve are asking questions.

And both of you are running into the single biggest problem with this type of communication. It loses all the context and body cues.

Please stop and give it a rest.

Tom (on my knees, even though they hurt)
A proud member of the OFA (Old Farts Association).
ID: 1953098
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1953104 - Posted: 31 Aug 2018, 22:19:59 UTC - in response to Message 1953098.  

Gentlemen.

Let us be quiet for a while. You two are on the front edge of a "flame war" if not in one. I want everyone to step away for a couple of days. Please.

And if possible, get the offending posts hidden and/or deleted on WOW's website.

TBar is doing us a great service getting these CUDA 8.0 (and then some) apps set up for use.

Ian&Steve are asking questions.

And both of you are running into the single biggest problem with this type of communication. It loses all the context and body cues.

Please stop and give it a rest.

Tom (on my knees, even though they hurt)


Thanks Tom!

I was about to say something in a LOUD VOICE. Not in anger, not in haste. Just something to make "two kids on a block to stop". Period. "."
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1953104
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1953109 - Posted: 31 Aug 2018, 22:45:26 UTC - in response to Message 1953104.  
Last modified: 31 Aug 2018, 22:47:29 UTC

I'm 63, but there does appear to be a child around. I've been kinda busy the last few weeks and missed a bit.
Here's a Definite FUD attempt concerning the missing pulse fix I worked on for weeks, https://setiathome.berkeley.edu/forum_thread.php?id=81271&postid=1952367#1952367
However, it was quickly pointed out the child didn't know what he was doing and he had to scurry away.

I might point out that Stephen knows all about the Repository version of BOINC, that's why he is trying to install the Berkeley version.
Who knows he might succeed this time.
ID: 1953109


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.