Ubuntu Linux GPU woes nearing end
Author | Message |
---|---|
David Anderson (not *that* DA) Send message Joined: 5 Dec 09 Posts: 215 Credit: 74,008,558 RAC: 74 |
Starting in late 2014, Ubuntu users had trouble getting a working NVIDIA GPU driver, which is crucial for GPU crunching. Thousands of machines were listed on Ubuntu bug number 1268257. The problem was finally understood, as reported in bug 1431753. Most people were able to work around it, but it took a few commands after every new kernel update and reboot. It's fixed in the most recent Ubuntu and hopefully will be soon in 14.04, the current LTS release. The problem was that the NVIDIA driver setup (for at least driver 331) came in two parts, and the minor kernel build that incorporates it is not prepared for that (a two-part driver build violates the rules for DKMS builds). So, with some randomness depending on timing during a given new-kernel install, GPU module builds could fail with various consequences (rarely dire, but sometimes...). I'm no kernel expert, but this affected me. Today's new Linux kernel 3.13.0-49 #83 installed for me with none of the scary messages, which is not proof that the problem is actually squashed yet, but it's still nice. I just wanted to wrap up this issue so anyone else on 14.04 would know there is light at the end of the tunnel. |
betreger Send message Joined: 29 Jun 99 Posts: 11361 Credit: 29,581,041 RAC: 66 |
Such joy, but on the good side it exercises the muscle between the ears. |
spitfire_mk_2 Send message Joined: 14 Apr 00 Posts: 563 Credit: 27,306,885 RAC: 0 |
|
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Starting in late 2014 Ubuntu users were encountering difficulties getting a working nvidia GPU driver. This was crucial for GPU crunching. Does anyone know if the ATI problem has been fixed in the latest version? Ever since 14.04 the display settings are not being saved. The same problem exists on three different machines. It works fine up to 13.10; since then you can't configure a machine with more than one card. Every time you reboot the machine the settings go back to default. Even with just one card, every time you try to check the temps after a reboot you have to run aticonfig --initial before aticonfig will work. You'd think after a year it would have been fixed. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Well, that was a waste of time. Seems it's still going downhill...sorta like SETI ;-) Ubuntu 15.04 has degraded to the point that it will not even make it to the Desktop after installing the AMD proprietary driver. Worse yet, trying to stop or restart lightdm from the console will not even work; it gives an error about 'can't connect to Upstart' or something similar. At least with 14.10 you can stop and start lightdm from the console, even if it still won't make it to the Desktop after installing the proprietary driver. So, they broke it with 14.04 and it's still going downhill. At least you can get 14.04 to work with a single AMD card. I didn't try 14.10 or 15.04 with a single card, but I doubt it will make a difference seeing how it hangs at the Ubuntu screen. It doesn't matter if you use the repository driver or the one from AMD. I'm thinking about going back to 12.04, as it appears there's a memory leak in the 13.10 Xorg. At least they are still issuing updates for 12.04... :-( |
ML1 Send message Joined: 25 Nov 01 Posts: 20265 Credit: 7,508,002 RAC: 20 |
Oh dear... A few woes there... Ubuntu is based on the Debian distro and both have recently moved to using systemd to control how the system components start up. That replaces upstart and there is some controversy over the changes. Waffling on a bit... ;-) Complete conjecture on my part: Could well be that AMD hasn't caught up yet with their proprietary drivers... However, I would expect the drivers supported in the Ubuntu repository to work fine. Check back with those? Also take care if you have multiple versions of whatever driver installed... Did a clean install work ok? Are the Ubuntu standard distro drivers recent enough? If this is only for running BOINC: An alternative to try that should be on the leading edge is CentOS, although I strongly dislike the default menu style. That distro is the root of its own distro tree distinct from Debian. Or even one of the small-footprint distros... Good luck and let us know how you do, Happy crunchin', Martin See new freedom: Mageia Linux Take a look for yourself: Linux Format The Future is what We all make IT (GPLv3) |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
...It doesn't matter if you use the repository driver or the one from AMD. Ubuntu 12.04 was working just fine once I found an AMD driver that worked well with my 2 cards. That is, until I replaced one card with a 6970, connected the main monitor to the 6970, and then managed to download an AP or two. I suspect 12.04 will work well with the 6970 as long as you connect the main monitor to the other card. That's the way it works in 13.10 anyway, don't connect the main monitor to the 6970 and run an AP....if you can find an AP. I'm not sure about the 13.10 Xorg memory use, but I don't remember it going from around 8% after boot to ~60% after a day or so in 12.04. I suppose I'll just have to revert to Ubuntu 12.04 and check it out. There doesn't seem to be a problem in Ubuntu 14.04 running just one card, after 5 days Xorg still shows ~6% memory use. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
So, I appear to have fixed my Ubuntu GPU Woes. Installing a fresh Ubuntu 12.04.3 from my older DVD, plus AMD driver 14.12, seems to have worked. One symptom stood out: with the older Ubuntu Installers, such as 13.10 and 12.04.3, the 2 monitors are detected and lit up by the Ubuntu Installer. This is Not the case with Ubuntu Installers 14.04.2 and above. Seeing as how the proprietary video drivers are Not involved with the Ubuntu Installer, I'm going to keep blaming Ubuntu for these GPU Woes. Just as I'm going to blame BOINC for looking in the Wrong location for the AMD OpenCL Library. The AMD driver places the libOpenCL.so.1 file in /usr/lib and /usr/lib32. BOINC looks for libOpenCL.so in /usr/lib/x86_64-linux-gnu and claims the system doesn't have OpenCL when it doesn't find it. To have BOINC see OpenCL, you have to make a link to libOpenCL.so.1, place it in /usr/lib/x86_64-linux-gnu, and name it libOpenCL.so. At least that's the way it works with BOINC 7.2.33 and Ubuntu 12.04.4. As for the Xorg memory leak, this comment sums it up: "There are leaks all over the place..." So far, after 8 hours, my new system is still showing Xorg using 8.8% (175 MB) of memory, which is much better than it was with Ubuntu 13.10 and AMD driver 14.6. 12.04 forever! |
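
For anyone who wants to script the libOpenCL link described in the last post, here is a minimal Python sketch. The file names and directories are the ones TBar reports for BOINC 7.2.33 on Ubuntu 12.04.4; the function name `ensure_opencl_link` and its parameters are hypothetical, made up for illustration, and other BOINC or driver versions may look somewhere else entirely. It never overwrites an existing file, and writing to the real system directories would need root.

```python
from pathlib import Path


def ensure_opencl_link(driver_dir, search_dir,
                       versioned="libOpenCL.so.1",
                       unversioned="libOpenCL.so"):
    """Create search_dir/unversioned as a symlink to driver_dir/versioned.

    Hypothetical helper: the default names come from the forum post,
    not from any BOINC or AMD documentation.
    """
    target = Path(driver_dir) / versioned
    link = Path(search_dir) / unversioned

    # Refuse to create a dangling link if the driver library is missing.
    if not target.exists():
        raise FileNotFoundError(f"{target} not found")

    # Leave any existing file or symlink alone rather than overwrite it.
    if link.exists() or link.is_symlink():
        return link

    link.symlink_to(target)
    return link


# Example call with the paths from the post (assumption: run as root):
# ensure_opencl_link("/usr/lib", "/usr/lib/x86_64-linux-gnu")
```

Calling it a second time is harmless, since it returns the existing link unchanged. If BOINC still reports no OpenCL afterwards, the library location assumed here is probably wrong for that setup.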
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.