Setting up Linux to crunch CUDA90 and above for Windows users

Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1985692 - Posted: 17 Mar 2019, 23:03:12 UTC - in response to Message 1985683.  

The Nvidia X Server is reporting full 16x connectivity on individual riser card setups. And on the 1 to 4 riser/expander card too.
You will get the actual readings if you go to the PowerMizer Tab and then select Preferred Mode: Prefer Max Performance setting. Then look at the Current readings, that will give you the correct values. My 1 to 4 Switch will only show PCIe2 speed at x1. I don't use that Switch when possible, because it gives the same Hung GPUs you are getting.

I also recommend you Don't buy a more expensive Switch, but instead use the money for a more capable motherboard. You can buy a new board cheaper than some of those expensive switches.


You are right, once I got pointed to the right menu.

I am getting Gen3 on most of my slots. But with the exception of the one GPU I have plugged directly into the MB, every other card is showing x1 (one lane).

In spite of that, the two GTX 1070 Ti's seem to be processing just as fast as the one that is directly connected to the MB.
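
For anyone who would rather check the link from a terminal than from the PowerMizer tab, here is a minimal Python sketch that parses nvidia-smi's CSV query output. The query fields pcie.link.gen.current and pcie.link.width.current are standard nvidia-smi fields; the function names (parse_link_report, read_link_report) are made up for illustration. As with the PowerMizer readings, values taken while a card is idle may be lower than what it negotiates under load.

```python
import csv
import io
import subprocess

# Standard nvidia-smi query fields: GPU index, name, current PCIe generation and lane width.
QUERY_FIELDS = "index,name,pcie.link.gen.current,pcie.link.width.current"


def _to_int(text):
    """Return an int for purely numeric text, else None (e.g. for '[N/A]')."""
    return int(text) if text.isdigit() else None


def parse_link_report(csv_text):
    """Parse header-less, unit-less nvidia-smi CSV text into a list of dicts."""
    rows = []
    for record in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(record) != 4:
            continue  # skip blank or malformed lines
        index, name, gen, width = (field.strip() for field in record)
        rows.append({
            "index": _to_int(index),
            "name": name,
            "gen": _to_int(gen),
            "width": _to_int(width),
        })
    return rows


def read_link_report():
    """Run nvidia-smi (must be installed and on PATH) and parse its output."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_link_report(result.stdout)
```

Calling read_link_report() on the crunching box and printing each entry's "gen" and "width" should show which cards sit on an x1 riser and which have a full x16 link.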

If it shows up, I have a possible MB upgrade. It will at least let me run 7 GPUs with full channels if I get ambitious and buy the full-sized ribbon cables that Ian&SteveC are using. Doing that would probably require moving the MB onto a miner's rack and setting the cards up right above the MB.
I might be able to shoehorn all three GTX 1070 Ti's onto the MB and use two riser expanders to handle the GTX 1060's as a transitional step.

Right now, it is "all working" so to speak. If I wanted to limit myself to 8 gpus, I could tuck another 1070Ti back inside. That covers one of the slots I was using to get to 9 gpus.

I foresee a busy Tuesday if all the parts show up.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1985692

Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1985706 - Posted: 18 Mar 2019, 1:13:11 UTC - in response to Message 1985692.  

what's the model of the new board?
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1985706

Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1985713 - Posted: 18 Mar 2019, 2:33:32 UTC - in response to Message 1985706.  
Last modified: 18 Mar 2019, 2:44:04 UTC

It is a
Supermicro Server Motherboard MBD-X9DRH-IF REV 1.02 Dual LGA2011 DDR3 Intel Xeon

https://www.ebay.com/itm/Supermicro-Server-Motherboard-MBD-X9DRH-IF-REV-1-02-Dual-LGA2011-DDR3-Intel-Xeon/292998574881?ssPageName=STRK%3AMEBIDX%3AIT&_trksid=p2057872.m2749.l2649#viTabs_0

It is a Supermicro and it has 7 PCIe slots. Since it was about $130 and most were priced at $230, I jumped on it like the proverbial starving cat. I am aware that it is "reconditioned" and that it may crap out on me. On the other hand, it is a server MB and they did boot it into Linux.

I have seen a number of single CPU boards like the above with 7 slots which looked like good possible choices. My reluctance was because I would be down to "only" 20 threads :)

On the other hand, an E5-2670 (2.6 GHz, 8c/16t) is only $64, so setting up a 7-slot GPU MB from scratch wouldn't be that expensive. Anyway, the breadboard version will run 2 E5-2670's till I see how many GPUs I can hang off it. I probably won't be able to test it thoroughly without pretty much putting it into production in place of my ASRock motherboard.

I guess my tipping point is someplace around $300. Which is why I haven't ordered the same MB you have Ian&SteveC.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1985713

Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1985721 - Posted: 18 Mar 2019, 3:39:46 UTC - in response to Message 1985713.  

make sure you get a narrow ILM compatible CPU cooler.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1985721

Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1985773 - Posted: 18 Mar 2019, 13:52:15 UTC - in response to Message 1985721.  

Missed that again. Oops.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1985773

Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1985820 - Posted: 18 Mar 2019, 17:05:48 UTC - in response to Message 1985773.  
Last modified: 18 Mar 2019, 17:11:17 UTC

I know we discussed this before. So I went looking:

It may do 7. Only way to know will be to try. But it’s an X9 board so with a BIOS update, it should have the Above 4G decoding option.

Also pay attention to the CPU mounting. That board also uses a Narrow-ILM Mount. And will require a heatsink/cooler compatible with it. Most coolers are made for Square ILM only.



https://www.servethehome.com/narrow-ilm-square-ilm-lga2011-heatsink-differences/


https://www.amazon.com/Supermicro-Heatsink-Cooling-Systems-SNK-P0050AP4/dp/B007UXPIX4/ref=sr_1_1?_encoding=UTF8&camp=1789&creative=390957&keywords=Supermicro+SNK-P0050AP4&linkCode=ur2&qid=1552928853&s=gateway&sr=8-1&tag=servecom-20
A proud member of the OFA (Old Farts Association).
ID: 1985820

Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1985826 - Posted: 18 Mar 2019, 17:40:14 UTC - in response to Message 1985820.  

yeah those are the same CPU coolers i have. they work very well.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1985826

Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1988791 - Posted: 4 Apr 2019, 20:33:51 UTC

Darn. The OS on my Ryzen 7 2700 box started throwing errors whenever I ran Seti. I was able to copy my BOINC all-in-one folder, run through a re-install, and restart where I left off.

After letting it install one set of post-install security updates, I disabled the updates facility. It no longer checks for updates at all, and none of the categories (Security, etc.) are checked.
So I am hoping to stop having those "oops" moments.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1988791

Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1988797 - Posted: 4 Apr 2019, 21:21:30 UTC - in response to Message 1988791.  

. . Hi Tom,

. . If the problem was caused by an update that was incompatible with something you are running, are you not aware that when rebooting there is an option in Ubuntu (and in Lubuntu too I am sure) to boot from the previous version that was working prior to the update? This will not only get the system running but allow you to completely remove the errant update that caused the problem by using synaptic package manager.

Stephen

. .
ID: 1988797

Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1988798 - Posted: 4 Apr 2019, 21:24:15 UTC - in response to Message 1988797.  

I have had no luck using the above trick. The only thing that works reliably is a clean install. I suspect it is not necessarily a bad update which is why the above doesn't seem to work. Something gets broken and the only fix has been a clean install.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1988798

Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1988812 - Posted: 4 Apr 2019, 22:45:21 UTC - in response to Message 1988798.  

Tom, it seems you are a good candidate for the Timeshift application. It is a utility that takes a "snapshot" of your system files on a schedule, which can be set as often as every hour. So if an update causes things to go tits up, you can at least roll back to a known good configuration that is only an hour old.

http://www.linuxandubuntu.com/home/timeshift-a-system-restore-utility-tool-review

https://www.dedoimedo.com/computers/timeshift.html
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1988812

Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1988848 - Posted: 5 Apr 2019, 4:00:33 UTC - in response to Message 1988823.  

So it looks like I shot myself in the foot. Again. Ouch :)

Since I have disabled all the updates for the AMD 2700 I probably won't have that issue. Again.

Anyone who wants to run an AM4 socket motherboard with 6, possibly 7, GPUs should look at the Biostar TB350-BTC. It doesn't have very many system fan headers, and it doesn't have a power-up button on the motherboard. But it somehow runs 6 GPUs... and it doesn't cost $500 (I paid $64?).

Tom
A proud member of the OFA (Old Farts Association).
ID: 1988848

Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1989980 - Posted: 14 Apr 2019, 5:07:58 UTC

Just noticed my Linux AMD box has been throwing some invalids. I had to reinstall the OS and thought I had backed up and restored the BOINC folder, so I just replaced the CPU app with the one in the AMD folder. They had different sizes/dates, so I was either running the Intel CPU app or I missed something else.

I hope that will make a difference. If it doesn't, I am going to have to throttle something down so the CPU is happier.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1989980

Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1989984 - Posted: 14 Apr 2019, 5:50:10 UTC

. . @ Petri

. . Apologies for the pm, in my eagerness I forgot about your policy. My bad! But I am still interested in learning the process ...

. . Also, anytime you want a guinea pig for the new special sauce extra picante, my hand is raised :)

Stephen

:)
ID: 1989984

petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1989989 - Posted: 14 Apr 2019, 7:47:35 UTC - in response to Message 1989984.  

Hi Stephen,

Looking at the results and the good validation rate of the latest tests, I think that 0.98 can be released for Linux as soon as TBar is ready, and for Mac when a stable driver/compiler version combination is found.
Oddbjornik is working on running "one and a sixteenth of WUs at a time" so that some initialization and cleanup can be done on the CPU for one task while another task is doing the heavy processing on the GPU. Let's see if that brings an additional performance gain to total output and if we can add that to the new executable too.
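
To illustrate the general idea of overlapping one task's CPU-side setup and cleanup with another task's GPU phase, here is a minimal Python sketch. It is not Oddbjornik's implementation and not how the special app works internally; it only assumes that the GPU phase must be exclusive while the CPU phases need not be. All names (run_pipeline, the phase labels) are hypothetical, and the "work" is just log entries.

```python
import queue
import threading


def run_pipeline(task_ids, workers=2):
    """Run tasks so that only the GPU phase is serialized.

    Each worker takes a task, does its CPU-side init, waits for exclusive
    access to the (simulated) GPU, then does CPU-side cleanup. While one
    worker holds the GPU, another can already be in its init or cleanup.
    Returns the ordered list of (task_id, phase) events.
    """
    gpu_lock = threading.Lock()   # stands in for exclusive use of the GPU
    log_lock = threading.Lock()   # protects the shared event list
    events = []
    pending = queue.Queue()
    for task_id in task_ids:
        pending.put(task_id)

    def log(task_id, phase):
        with log_lock:
            events.append((task_id, phase))

    def worker():
        while True:
            try:
                task_id = pending.get_nowait()
            except queue.Empty:
                return  # no work left
            log(task_id, "cpu_init")        # may overlap with another task's GPU phase
            with gpu_lock:
                log(task_id, "gpu_start")
                # the heavy GPU processing would happen here
                log(task_id, "gpu_end")
            log(task_id, "cpu_cleanup")     # GPU is already free for the next task

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return events
```

With two workers, the GPU never has to sit idle waiting for a task's startup or teardown, which is the kind of gain being tested; whether it pays off in practice depends on how long those CPU phases really are.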

Petri
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1989989

Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1990012 - Posted: 14 Apr 2019, 12:38:53 UTC - in response to Message 1989989.  

. . Thanks to you guys we live in exciting times ..........

Stephen

:)
ID: 1990012

Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1990360 - Posted: 17 Apr 2019, 11:41:47 UTC

. . The latest from Petri/TBar ...

http://www.arkayn.us/lunatics/BOINC.7z

Stephen
. .
ID: 1990360

Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1990690 - Posted: 19 Apr 2019, 18:45:23 UTC

Just was reading through the release notes for Disco Dingo, the new Ubuntu 19.04 distro release and saw this interesting tidbit.

Safe Graphics Mode. A new option is added to the Grub menu which will boot with "NOMODESET" on. This may help you resolve issues on certain graphics cards and allow you to boot and install any proprietary drivers needed by your system.

This will be very welcome as the lack of default nomodeset being used in installation always seems to trip up Linux newbies. I've always thought nomodeset should be the standard configuration.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1990690

TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1990695 - Posted: 19 Apr 2019, 19:30:27 UTC - in response to Message 1990690.  

Have they said anything about their plans for all the Apps they broke by only supporting OpenSSL 1.1 in their recent Beta? This is a Libcurl3 repeat. There are probably just as many, if not more, Apps that depend on some 1.0.x version of OpenSSL as there were Libcurl3. As far as I can tell, if your App depends on OpenSSL 1.0.2 it won't work in their last 19.04 Beta.
It seems they have some recent drivers in the Additional Drivers panel, perhaps they have decided to support something higher than CUDA 9.0 in this next release.
They really need to fix the OpenSSL problem, I can see a large number of unhappy people, again, if the 19.04 release is the same as the Beta.
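
One way to check whether a given app is affected is to look at what ldd reports for its binary. The following is a minimal Python sketch that scans ldd output for unresolved libraries and for references to the OpenSSL 1.0 sonames (libssl.so.1.0.x, libcrypto.so.1.0.x). The function names are made up for illustration, and the binary path in the usage note is only a placeholder.

```python
import re

# Matches ldd lines of the form "libname.so.N => target", where target is
# either a resolved path (plus load address) or the text "not found".
LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(.*\S)\s*$")

# Sonames used by the OpenSSL 1.0.x series.
LEGACY_OPENSSL_PREFIXES = ("libssl.so.1.0", "libcrypto.so.1.0")


def referenced_libraries(ldd_output):
    """Return (name, target) pairs for every 'name => target' line."""
    pairs = []
    for line in ldd_output.splitlines():
        match = LDD_LINE.match(line)
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


def missing_libraries(ldd_output):
    """Return the names of libraries ldd could not resolve."""
    return [name for name, target in referenced_libraries(ldd_output)
            if target == "not found"]


def needs_legacy_openssl(ldd_output):
    """True if the binary links against an OpenSSL 1.0.x soname at all."""
    return any(name.startswith(LEGACY_OPENSSL_PREFIXES)
               for name, _ in referenced_libraries(ldd_output))
```

To use it, capture the text printed by running ldd on the app's executable (for example with subprocess.run(["ldd", "/path/to/app"], capture_output=True, text=True).stdout) and pass that text to missing_libraries or needs_legacy_openssl.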
ID: 1990695

Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1990701 - Posted: 19 Apr 2019, 20:09:18 UTC - in response to Message 1990695.  

Anybody depending on OpenSSL 1.0 is going to be disappointed and unhappy:

removal of OpenSSL 1.0 with intending to only ship OpenSSL 1.1.1 LTS

This is being done because of the security concerns with the old 1.0.
I haven't found any issues yet with OpenSSL 1.1.1b that I have installed. What issues did you find with it?
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1990701


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.