Ubuntu Linux GPU woes nearing end

Message boards : Number crunching : Ubuntu Linux GPU woes nearing end
Profile David Anderson (not *that* DA) Project Donor
Joined: 5 Dec 09
Posts: 215
Credit: 74,008,558
RAC: 74
United States
Message 1665441 - Posted: 15 Apr 2015, 0:16:36 UTC

Starting in late 2014, Ubuntu users had trouble getting a working NVIDIA GPU driver, which is crucial for GPU crunching.

Thousands of machines were listed on Ubuntu bug number 1268257.

The problem was finally understood, as reported in bug 1431753.

Most were able to work around this, but it took a
few commands after every new kernel update and reboot.

It's fixed in the most recent Ubuntu and hopefully soon in 14.04,
the current LTS release.
The problem was that the nvidia driver setup (for at least driver 331)
comes in two parts, and the module rebuild that runs when a minor kernel
update is installed is not prepared for that (a two-part driver build
violates the rules for DKMS builds). So, with some randomness
depending on timing during a given new-kernel install, GPU module builds could fail,
with various consequences (rarely dire, but sometimes...).
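
For anyone who wants to check after a kernel update whether the driver modules were actually built, here is a minimal sketch that reads the output of `dkms status`. It assumes the older comma-separated output format (`module, version, kernel, arch: state`); the sample text, the module names `nvidia-331` and `nvidia-331-uvm`, the version `331.113`, and the `-generic` kernel suffix are illustrative assumptions, not values taken from the bug reports.

```python
import platform
import subprocess

# Illustrative sample only: module names and versions are assumptions.
SAMPLE = """\
nvidia-331, 331.113, 3.13.0-48-generic, x86_64: installed
nvidia-331-uvm, 331.113, 3.13.0-48-generic, x86_64: installed
"""


def parse_dkms_status(text):
    """Return (module, version, kernel, state) tuples from `dkms status` text."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if ":" not in line:
            continue
        left, state = line.rsplit(":", 1)
        fields = [f.strip() for f in left.split(",")]
        if len(fields) < 3:
            continue  # module registered but not built for any kernel
        entries.append((fields[0], fields[1], fields[2], state.strip()))
    return entries


def missing_for_kernel(entries, prefix, kernel):
    """List modules starting with `prefix` that are not installed for `kernel`."""
    wanted = {m for m, _, _, _ in entries if m.startswith(prefix)}
    built = {m for m, _, k, s in entries
             if m.startswith(prefix) and k == kernel and s.startswith("installed")}
    return sorted(wanted - built)


if __name__ == "__main__":
    try:
        out = subprocess.run(["dkms", "status"], capture_output=True,
                             text=True).stdout
    except FileNotFoundError:
        out = SAMPLE  # dkms not available; fall back to the sample text
    kernel = platform.release()
    print("Not built for", kernel, ":",
          missing_for_kernel(parse_dkms_status(out), "nvidia", kernel))
```

An empty list means every nvidia module that DKMS knows about is installed for the running kernel; anything else is a hint that the rebuild failed for that part of the driver.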

I'm no kernel expert, but this affected me.
Today's new Linux kernel 3.13.0-49 #83 installed for me
with none of the scary messages (which is not proof
that the problem is actually squashed yet, but still it's nice...).

I just wanted to wrap up this issue so anyone else
on 14.04 would know there is light at the end of the tunnel.
ID: 1665441
Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11361
Credit: 29,581,041
RAC: 66
United States
Message 1665448 - Posted: 15 Apr 2015, 0:26:43 UTC - in response to Message 1665441.  

Such joy, but on the good side it exercises the muscle between the ears.
ID: 1665448
spitfire_mk_2
Joined: 14 Apr 00
Posts: 563
Credit: 27,306,885
RAC: 0
United States
Message 1665460 - Posted: 15 Apr 2015, 1:06:25 UTC

In time for dancing in the streets.
https://www.youtube.com/watch?v=sQdWK041Vuo
ID: 1665460
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1665648 - Posted: 15 Apr 2015, 15:08:58 UTC - in response to Message 1665441.  

Does anyone know if the ATI problem has been fixed in the latest version? Ever since 14.04 the display settings are not being saved. The same problem exists on three different machines. It works fine up to 13.10; since then you can't configure a machine with more than one card. Every time you reboot the machine the settings go back to default. Even with just one card, every time you try to check the temps after a reboot you have to run aticonfig --initial before aticonfig will work.
You'd think after a year it would have been fixed.
ID: 1665648
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1669699 - Posted: 25 Apr 2015, 6:28:39 UTC
Last modified: 25 Apr 2015, 6:30:19 UTC

Well, that was a waste of time. Seems it's still going downhill...sorta like SETI ;-)

Ubuntu 15.04 has downgraded to the point that it will not even make it to the Desktop after installing the AMD proprietary driver. Worse yet, trying to stop or restart lightdm from the console will not even work...gives an error about 'can't connect to Upstart' or something similar.
At least with 14.10 you can stop and start lightdm from the console...even if it still won't make it to the Desktop after installing the proprietary driver.
So, they broke it with 14.04 and it's still going downhill. At least you can get 14.04 to work with a single AMD card. I didn't try 14.10 or 15.04 with a single card, but I doubt it will make a difference seeing how it hangs at the Ubuntu screen. It doesn't matter if you use the repository driver or the one from AMD.

I'm thinking about going back to 12.04 as it appears there's a memory leak in the 13.10 Xorg. At least they are still issuing updates for 12.04...

:-(
ID: 1669699
Profile ML1
Volunteer moderator
Volunteer tester

Send message
Joined: 25 Nov 01
Posts: 20265
Credit: 7,508,002
RAC: 20
United Kingdom
Message 1669758 - Posted: 25 Apr 2015, 13:09:27 UTC - in response to Message 1669699.  
Last modified: 25 Apr 2015, 13:11:38 UTC

Oh dear... A few woes there...

Ubuntu is based on the Debian distro, and both have recently moved to using systemd to control how the system components start up. On Ubuntu that replaces upstart, and there is some controversy over the change.
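
If the "can't connect to Upstart" error comes from using upstart-style commands on a systemd-booted system (a guess on my part), a script can pick the right command for stopping or restarting lightdm. A minimal sketch: it assumes the usual systemd convention that the directory /run/systemd/system exists only when systemd is the running init, and the function names are made up for illustration.

```python
import os
import tempfile

# Assumption: this directory exists only when the system was booted with systemd.
SYSTEMD_RUNTIME_DIR = "/run/systemd/system"


def booted_with_systemd(runtime_dir=SYSTEMD_RUNTIME_DIR):
    """Return True if the systemd runtime directory is present."""
    return os.path.isdir(runtime_dir)


def lightdm_command(action, runtime_dir=SYSTEMD_RUNTIME_DIR):
    """Build the command list to start, stop, or restart lightdm."""
    if action not in ("start", "stop", "restart"):
        raise ValueError("unsupported action: " + action)
    if booted_with_systemd(runtime_dir):
        return ["systemctl", action, "lightdm"]
    # Older init systems (upstart, sysvinit) go through the service wrapper.
    return ["service", "lightdm", action]


def demo():
    """Show both branches using a temporary directory as a stand-in."""
    with tempfile.TemporaryDirectory() as fake_systemd:
        with_systemd = lightdm_command("restart", fake_systemd)
    # The temporary directory is gone now, so this path does not exist.
    without = lightdm_command("restart", os.path.join(fake_systemd, "missing"))
    return with_systemd, without


if __name__ == "__main__":
    print(" ".join(lightdm_command("restart")))
```

The command is only printed here; run it with sudo from a text console yourself once it looks right.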



Waffling on a bit... ;-)

Complete conjecture on my part: Could well be that AMD hasn't caught up yet with their proprietary drivers...

However, I would expect the Ubuntu repository distro supported drivers to work fine. Check back with those?

Also take care if you have multiple versions of whatever driver installed...


Did a clean install work ok? Are the Ubuntu standard distro drivers recent enough?


If this is only for running Boinc:

An alternative to try that should be on the leading edge is CentOS, although I strongly dislike the default menu style. That distro is the root of its own distro tree, distinct from Debian.

Or even one of the small-footprint distros...


Good luck and let us know how you do,

Happy crunchin',
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 1669758
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1669841 - Posted: 25 Apr 2015, 16:32:58 UTC - in response to Message 1669758.  
Last modified: 25 Apr 2015, 17:03:11 UTC

...It doesn't matter if you use the repository driver or the one from AMD.

Ubuntu 12.04 was working just fine once I found an AMD driver that worked well with my 2 cards. That is, until I replaced one card with a 6970, connected the main monitor to the 6970, and then managed to download an AP or two. I suspect 12.04 will work well with the 6970 as long as you connect the main monitor to the other card. That's the way it works in 13.10 anyway, don't connect the main monitor to the 6970 and run an AP....if you can find an AP.

I'm not sure what's behind the 13.10 Xorg memory use, but I don't remember Xorg going from around 8% after boot to ~60% after a day or so in 12.04. I suppose I'll just have to revert to Ubuntu 12.04 and check it out. There doesn't seem to be a problem in Ubuntu 14.04 running just one card; after 5 days Xorg still shows ~6% memory use.
ID: 1669841
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1670411 - Posted: 26 Apr 2015, 17:07:18 UTC

So, I appear to have fixed my Ubuntu GPU woes. Installing a fresh Ubuntu 12.04.3 from my older DVD along with AMD driver 14.12 seems to have worked.

One symptom was apparent: when running the older Ubuntu installers, such as 13.10 and 12.04.3, both monitors are detected and lit up by the installer. This is not the case with Ubuntu installers 14.04.2 and above. Seeing as how the proprietary video drivers are not involved with the Ubuntu installer, I'm going to keep blaming Ubuntu for these GPU woes.

Just as I'm going to blame BOINC for looking in the wrong location for the AMD OpenCL library. The AMD driver places the libOpenCL.so.1 file in /usr/lib and /usr/lib32. BOINC looks for libOpenCL.so in /usr/lib/x86_64-linux-gnu and claims the system doesn't have OpenCL when it doesn't find it. To have BOINC see OpenCL, you have to make a link to libOpenCL.so.1, place it in /usr/lib/x86_64-linux-gnu, and name it libOpenCL.so. At least that's the way it works with BOINC 7.2.33 and Ubuntu 12.04.4.
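
The link step can be scripted. This is a minimal sketch using the paths described in this post as defaults; the function name and the choice to leave an existing link untouched are assumptions for illustration. The real call needs root, so the demo below exercises it in a temporary directory instead.

```python
import os
import tempfile


def link_opencl(search_dirs=("/usr/lib",),
                target_dir="/usr/lib/x86_64-linux-gnu",
                src_name="libOpenCL.so.1",
                link_name="libOpenCL.so"):
    """Create target_dir/link_name as a symlink to the first src_name found.

    Returns the link path, or None if no source library was found.
    An existing file or link at the target is left untouched.
    """
    link_path = os.path.join(target_dir, link_name)
    if os.path.lexists(link_path):
        return link_path
    for directory in search_dirs:
        candidate = os.path.join(directory, src_name)
        if os.path.exists(candidate):
            os.symlink(candidate, link_path)
            return link_path
    return None


def demo():
    """Exercise link_opencl in a throwaway directory tree."""
    with tempfile.TemporaryDirectory() as root:
        lib_dir = os.path.join(root, "lib")
        gnu_dir = os.path.join(root, "x86_64-linux-gnu")
        os.makedirs(lib_dir)
        os.makedirs(gnu_dir)
        source = os.path.join(lib_dir, "libOpenCL.so.1")
        open(source, "w").close()  # empty stand-in for the real library
        link = link_opencl((lib_dir,), gnu_dir)
        return (os.path.basename(link),
                os.path.islink(link),
                os.path.realpath(link) == os.path.realpath(source))


if __name__ == "__main__":
    print(link_opencl())  # needs root to write under /usr/lib
```

After the link is in place, restart the BOINC client so it detects OpenCL again.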

As for the Xorg memory leak, this comment sums it up: "There are leaks all over the place..." So far, after 8 hours, my new system is still showing Xorg using 8.8% (175 MB) of memory, which is much better than it was with Ubuntu 13.10 and AMD driver 14.6.
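
To watch a leak like this over time without keeping top open, the same percentage can be computed from /proc. A minimal sketch, assuming the standard Linux `VmRSS` field in /proc/<pid>/status and `MemTotal` in /proc/meminfo; the sample numbers are made up to match the 175 MB figure above, and the process name "Xorg" is assumed.

```python
import os

# Made-up sample text: 179200 kB is 175 MB; MemTotal is chosen to give ~8.8%.
SAMPLE_STATUS = "Name:\tXorg\nVmRSS:\t  179200 kB\n"
SAMPLE_MEMINFO = "MemTotal:        2036364 kB\nMemFree:          512000 kB\n"


def field_kb(text, key):
    """Return the integer kB value of a 'Key:   123 kB' line, or None."""
    for line in text.splitlines():
        if line.startswith(key + ":"):
            return int(line.split()[1])
    return None


def rss_percent(status_text, meminfo_text):
    """Resident memory of a process as a percentage of total RAM."""
    rss = field_kb(status_text, "VmRSS")
    total = field_kb(meminfo_text, "MemTotal")
    if rss is None or not total:
        return None
    return 100.0 * rss / total


def find_pid(name="Xorg"):
    """Return the first PID whose /proc/<pid>/comm matches name, or None."""
    for entry in os.listdir("/proc"):
        if entry.isdigit():
            try:
                with open("/proc/%s/comm" % entry) as handle:
                    if handle.read().strip() == name:
                        return int(entry)
            except OSError:
                continue  # process exited while we were looking
    return None


if __name__ == "__main__":
    pid = find_pid()
    if pid is None:
        print("Xorg not running")
    else:
        with open("/proc/%d/status" % pid) as s, open("/proc/meminfo") as m:
            print("Xorg uses %.1f%% of RAM" % rss_percent(s.read(), m.read()))
```

Run it from cron or a loop and log the result; a figure that climbs steadily over a day, like the 8% to ~60% mentioned earlier in the thread, points to a leak.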

12.04 forever!
ID: 1670411
