NVIDIA GPU blues (750Ti and 250)
Author | Message |
---|---|
ralphw Joined: 7 May 99 Posts: 78 Credit: 18,032,718 RAC: 38 |
Finally upgraded my video cards, going from a single NVIDIA GeForce GTS 250 to two faster cards (GeForce GTX 750 Ti).
First problem: Ubuntu 12.04 kernel PANIC when both cards are installed. I temporarily addressed this by removing the second (identical) card. lspci | grep VGA shows this hardware:
02:00.0 VGA compatible controller: NVIDIA Corporation Device 1380 (rev a2)
Second problem: No GPU workloads are running. Here's the log for BOINC/SETI:
Mon 19 Oct 2015 09:22:07 PM EDT | | Starting BOINC client version 7.2.33 for x86_64-pc-linux-gnu
Mon 19 Oct 2015 09:22:07 PM EDT | | log flags: file_xfer, sched_ops, task
Mon 19 Oct 2015 09:22:07 PM EDT | | Libraries: libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3
Mon 19 Oct 2015 09:22:07 PM EDT | | Data directory: /var/lib/boinc-client
Mon 19 Oct 2015 09:22:07 PM EDT | | CUDA: NVIDIA GPU 0: GeForce GTX 750 Ti (driver version unknown, CUDA version 6.5, compute capability 5.0, 2047MB, 1871MB available, 2409 GFLOPS peak)
Mon 19 Oct 2015 09:22:07 PM EDT | | App version needs OpenCL but GPU doesn't support it
Mon 19 Oct 2015 09:22:07 PM EDT | Milkyway@Home | Application uses missing NVIDIA GPU
Mon 19 Oct 2015 09:22:07 PM EDT | | App version needs OpenCL but GPU doesn't support it
Suggestions are welcome. I've seen other suggestions that say my updated drivers (340. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
The GTS 250 uses a different driver than the GTX 750, so they won't work together. I've tried it myself. Which driver are you using, the one from Additional Drivers or a manually installed one? Depending on which driver you installed, you can usually fix the OpenCL problem by placing a link to libOpenCL.so in /usr/lib. |
ralphw Joined: 7 May 99 Posts: 78 Credit: 18,032,718 RAC: 38 |
Yes, I grabbed the 340.93 driver from the NVIDIA site (64-bit Linux). I recently read something about drivers up to 341.<something> not working, so I grabbed a 352.55 driver to try next, and I will try the link to libOpenCL.so. The GTS 250 had SLI support, and the 750 has something new called GSYNC. Hopefully, once I resolve the kernel panic issue, I can get the second card going. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
I've had good success with this one, http://www.nvidia.com/download/driverResults.aspx/83686/en-us, with Ubuntu 14.04. I'm not sure how it will work with the stock setiathome MBv7 App, but it works great on the APs and the CUDA 6.0 App. After dropping into the console and stopping lightdm, you might want to run sudo apt-get remove --purge nvidia-* or something similar to remove the existing driver before installing the new one. With that driver on my machine OpenCL worked, but I had to make links to libcuda.so in /usr/lib to get CUDA to work. |
rob smith Joined: 7 Mar 03 Posts: 22202 Credit: 416,307,556 RAC: 380 |
SLI & GSYNC are ways of linking graphics cards - very useful in video processing, but totally unnecessary for SETI@Home. As TBar says, it is almost certain that you will never be able to get the '250 and '750s to work in the same machine - they are so different. Indeed, I would say that the only use for the '250 is heating the room; they are very power hungry compared to the much faster '750. Experience says that a '250 is, in reality, about 10% as fast as a '750. To get the '750 working you will need to have OpenCL running; TBar has already described how you will need to set a link to the appropriate library. (Note this is OpenCL, not OpenGL; they are very different beasts.)
Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
ralphw Joined: 7 May 99 Posts: 78 Credit: 18,032,718 RAC: 38 |
Thanks for the tips. I've created the CUDA link from /usr/lib/libcuda.so to the appropriate spot. /usr/lib/libOpenCL.so is another matter: it runs through /etc/alternatives, but ends up being a symlink to nothing (there seems to be no 64-bit OpenCL shared object on my system). The "Additional Drivers" window in Linux still shows nvidia_340, even after I've upgraded. So I'll try purging, then (re)installing the latest NVIDIA driver. Just want to confirm that OpenCL for Astropulse is what I need. |
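A symlink that "ends up being a symlink to nothing" can be confirmed from the shell. The following is only a minimal sketch: check_link is a hypothetical helper name, not part of BOINC or the NVIDIA driver, and it assumes GNU readlink as shipped with Ubuntu.

```shell
#!/bin/sh
# check_link PATH
# Hypothetical helper: reports whether PATH exists, and whether a real
# file sits at the end of its symlink chain.
check_link() {
    path="$1"
    # Neither a file nor a symlink: nothing is installed at this path.
    if [ ! -e "$path" ] && [ ! -L "$path" ]; then
        echo "missing: $path"
        return 1
    fi
    # A symlink whose final target does not exist is dangling.
    if [ -L "$path" ] && [ ! -e "$path" ]; then
        echo "dangling: $path -> $(readlink "$path")"
        return 2
    fi
    # readlink -f follows every hop (e.g. through /etc/alternatives).
    echo "ok: $path -> $(readlink -f "$path")"
}
```

Calling check_link /usr/lib/libOpenCL.so on the machine described above would be expected to report the link as dangling, with the first hop of the chain shown after the arrow.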
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
ralphw wrote: "Thanks for the tips."
Yes, Astropulse AND the new SETI@home MBv7 App use OpenCL. You have to manually install the Linux CUDA App. Additional Drivers just shows which drivers are available in the repository for your installed version of Linux; it is different for each version of the OS. It will always show the same drivers available for your OS no matter what driver you have installed. If you have manually installed a driver from nVidia, Additional Drivers should say something similar to 'continue using manually installed driver'. Which driver did you install? Oops, I forgot clinfo is an AMD thing. Forget about clinfo with nVidia; you'll just have to go with what BOINC says. On my machine I have the manually installed driver 346.59, libnvidia-opencl.so.346.59 is in /usr/lib/x86_64-linux-gnu, and I didn't have to make any links to get OpenCL to work. It's sometimes different depending on your OS & driver. If you have libnvidia-opencl.so.xxx and BOINC still doesn't see OpenCL, try making a link to it in /usr/lib and naming it libOpenCL.so & libOpenCL.so.1. See how that works. |
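The linking workaround TBar describes could be scripted roughly as follows. This is a sketch under assumptions: link_opencl is a hypothetical function name, and the default directories are simply the ones mentioned in the post, which may differ on other systems and driver versions.

```shell
#!/bin/sh
# link_opencl [SRC_DIR] [DEST_DIR]
# Hypothetical helper: finds the first libnvidia-opencl.so.* in SRC_DIR
# and points DEST_DIR/libOpenCL.so and DEST_DIR/libOpenCL.so.1 at it.
link_opencl() {
    src_dir="${1:-/usr/lib/x86_64-linux-gnu}"   # assumed driver library location
    dest_dir="${2:-/usr/lib}"                   # where the links are created
    lib=""
    for candidate in "$src_dir"/libnvidia-opencl.so.*; do
        # An unmatched glob stays literal, so check that the file exists.
        if [ -f "$candidate" ]; then
            lib="$candidate"
            break
        fi
    done
    if [ -z "$lib" ]; then
        echo "no libnvidia-opencl.so.* found in $src_dir" >&2
        return 1
    fi
    for name in libOpenCL.so libOpenCL.so.1; do
        # Only replace missing entries or existing symlinks, never a real file.
        if [ -e "$dest_dir/$name" ] && [ ! -L "$dest_dir/$name" ]; then
            echo "skipping $dest_dir/$name: a regular file already exists" >&2
            continue
        fi
        ln -sfn "$lib" "$dest_dir/$name"
        echo "linked $dest_dir/$name -> $lib"
    done
}
```

Writing into /usr/lib needs root, so the function would have to be called from a root shell. Trying it first against a scratch directory, for example link_opencl /usr/lib/x86_64-linux-gnu /tmp/test, shows what it would link without touching the system.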
Zombu2 Joined: 24 Feb 01 Posts: 1615 Credit: 49,315,423 RAC: 0 |
The 250 is a waste of power anyway; grab another 750 Ti. That gets you around 22k RAC, a little more if you overclock the cards.
I came down with a bad case of i don't give a crap |
ralphw Joined: 7 May 99 Posts: 78 Credit: 18,032,718 RAC: 38 |
Despite making the links, something still wasn't right with OpenCL. So I resolved the problem by purging and re-installing the NVIDIA package. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
It's good to hear you got 352 working. The fallback that seems to work for everyone is to install the nVidia Toolkit 7.5; it also installs driver 352. I just installed a fresh copy of Ubuntu 15.10 and it also has driver 352 in Additional Drivers; I believe Ubuntu 14.04 and 15.04 have driver 346. I gave up on the 352 in Additional Drivers after failing to get BOINC to see OpenCL. In fact, I had to give up on BOINC as well, at least every BOINC up to 7.4.22. The only copy that would work with Ubuntu 15.10 is BOINC 7.4.22, and it has a bug where the last task started doesn't update the progress. Something about;
tbar@TBarsIntel:~/BOINC$ ./boincmgr
I pasted libwebkitgtk into the Package Manager's Filter box, installed libwebkitgtk-1.0-0, and then BOINC worked....mostly. The first thing I saw was NO USABLE GPU FOUND. Great. I installed nVidia-Modprobe and that gave me CUDA, but nothing would get BOINC to see OpenCL using driver 352.41 from the repository. So, I installed driver 346.59 that I had downloaded from nVidia. That gave me OpenCL but not CUDA. I installed Modprobe again, since it went away with the purge, but still no CUDA. So, I made a link to libcuda.so.346.59, moved it to /usr/lib, named it libcuda.so, and that gave me CUDA. Success! So far, it looks to be working about the same as Ubuntu 14.04.3, except the last task started doesn't update...oh well. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
ralphw wrote: "Despite making the links, something still wasn't right with OpenCL."
OK, I see some completed tasks now. It appears you are using the 340 driver from the Ubuntu 12.04.x repository and getting many OpenCL detection Errors, http://setiathome.berkeley.edu/result.php?resultid=4455996022. With my 750Ti I was getting Kernel Panics with my Mac in anything other than Yosemite. Even then I had to install the nVidia WebDriver 346.xx to get it to work in Yosemite. So, I would suggest upgrading to at least Ubuntu 14.04.x. I'm not going to suggest 15.10 because the version of BOINC at Berkeley doesn't work correctly with Ubuntu 15.10. I'm either going to have to relearn how to compile the BOINC Manager or go back to Ubuntu 14.04.3, which I still have on another partition. I would also suggest downloading the nVidia driver 346.59 I linked to previously, since it gives you OpenCL without any mods and runs APs extremely well with the 750Ti. The way I install a driver is the same with nVidia or AMD.
1) Download the driver, unzip it, move the driver part to your home folder and set the execute bit.
2) Hit ctrl+alt+F1 to drop into the console and log in.
3) Enter sudo stop lightdm to stop the xScreen; with 15.xx it's sudo service lightdm stop
4) Purge the existing driver with something like sudo apt-get purge "nvidia.*"
5) Enter dir to print the driver name, then install the driver with sudo ./whatever
6) Follow the instructions and build for Ubuntu.
Hopefully that will stop the Panics and the Errors. Oh, if you use Dual cards in 14.04.x you will probably have this problem, https://bugs.launchpad.net/ubuntu/+source/ubuntu-drivers-common/+bug/1310489 You can solve that by commenting out the lines in the GPU_manager as explained in the thread,
a) Edit /etc/init/gpu-manager.conf commenting out lines until it looks like this:
Good luck... |
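The console part of TBar's install steps can be laid out as a dry run that only prints the commands, so nothing is stopped or purged by accident. This is a sketch, not a tested installer: print_driver_steps is a hypothetical name, and the installer file name is whatever .run file was actually downloaded.

```shell
#!/bin/sh
# print_driver_steps UBUNTU_RELEASE INSTALLER
# Hypothetical helper: prints (does not run) the commands from the steps
# in the post, picking the lightdm stop command by Ubuntu release.
print_driver_steps() {
    release="$1"      # e.g. 14.04 or 15.10
    installer="$2"    # downloaded driver file in the home folder
    major="${release%%.*}"
    # Step 1: set the execute bit on the downloaded driver.
    echo "chmod +x ./$installer"
    # Step 3: the post gives a different stop command for 15.xx.
    if [ "$major" -ge 15 ]; then
        echo "sudo service lightdm stop"
    else
        echo "sudo stop lightdm"
    fi
    # Step 4: purge the existing driver packages.
    echo "sudo apt-get purge \"nvidia.*\""
    # Step 5: run the installer.
    echo "sudo ./$installer"
}
```

For example, print_driver_steps 14.04 NVIDIA-Linux-x86_64.340.93.run lists the four commands in order. They still have to be typed at the text console reached with ctrl+alt+F1, since stopping lightdm ends the graphical session.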
ralphw Joined: 7 May 99 Posts: 78 Credit: 18,032,718 RAC: 38 |
Things are better now - I adjusted the Memory Low Gap setting on my BIOS (setting it to 3). Now I'm crunching happily and correctly on GPU 0 and GPU 1, both GTX 750 Ti, with no OpenCL detection problems. HTTP connectivity to the S@home servers has been spotty today.
Along the way, I:
- updated Ubuntu to 12.04.5
- crashed my box running the nvidia-settings utility from the "upgrade"
- reinstalled my driver (NVIDIA-Linux-x86_64.340.93.run from NVIDIA's site)
- ran apt-get purge nvidia-*, reinstalled the 340 driver.
I'll experiment with more recent drivers now that both cards are installed. I no longer see the OpenCL detection problems you mentioned earlier - http://setiathome.berkeley.edu/result.php?resultid=4465917759 shows one of the last two. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
It appears you are still receiving an OpenCL error;
setiathome_7.08_x86_64-pc-linux-gnu__opencl_nvidia_sah: /usr/lib/x86_64-linux-gnu/libOpenCL.so.1: no version information available...
http://setiathome.berkeley.edu/result.php?resultid=4471342927
You are also getting many 'Validation inconclusives'. This is what we saw at Beta when using drivers older than 350.xx with setiathome_7.08_x86_64-pc-linux-gnu__opencl_nvidia_sah. Most of the tasks were labeled inconclusive, with a few eventually being Invalid. The OpenCL version Error is probably also caused by the driver. You're probably going to have to update the driver to around 352.xx to solve the inconclusives, which will probably solve the version error as well. You could try the 346.59 driver, but I'm beginning to doubt it will be any better than the other pre-350 drivers with that App. |
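Whether the running driver is on the pre-350 side can be read off the GPU line BOINC prints at startup. The sketch below assumes that line contains text of the form "(driver 352.55, ..." and treats "driver version unknown", as seen in the first post, as undetermined. driver_major and check_driver are hypothetical names, and the 350 threshold is taken from TBar's remark rather than from any official requirement.

```shell
#!/bin/sh
# driver_major: reads BOINC startup log text on stdin and prints the major
# driver version from the first "(driver NNN.NN" it finds, or nothing.
driver_major() {
    sed -n 's/.*(driver \([0-9][0-9]*\)\.[0-9][0-9]*.*/\1/p' | head -n 1
}

# check_driver: classifies the driver against the 350 threshold from the post.
check_driver() {
    major=$(driver_major)
    if [ -z "$major" ]; then
        echo "unknown"
    elif [ "$major" -ge 350 ]; then
        echo "ok ($major)"
    else
        echo "older than 350 ($major)"
    fi
}
```

The startup lines can be piped in from wherever the client's messages are saved, for example by pasting them into a file and running check_driver < that file.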
ralphw Joined: 7 May 99 Posts: 78 Credit: 18,032,718 RAC: 38 |
I added a GTX 950, though I'm not currently seeing the peak GFLOPS of that card reflected in the time it takes to complete GPU workunits.
OpenCL: GPU 0: GeForce GTX 950 (driver 352.55, device OpenCL 1.2 CUDA, 2047MB, 1790MB available, 3208 GFLOPS peak)
OpenCL: GPU 1: GeForce GTX 750 Ti (driver 352.55, device OpenCL 1.2 CUDA, 2048MB, 2011MB available, 2409 GFLOPS peak)
OpenCL: GPU 2: GeForce GTX 750 Ti (driver 352.55, device OpenCL 1.2 CUDA, 2048MB, 2011MB available, 2409 GFLOPS peak)
At any rate, I updated the driver from 340.X to 352.55 as well. I'll play around to see if I can get CUDA workunits to process, but I think I'm ready to try an optimized client now. |