Two Nvidia cards, one showing neither being used
Author | Message |
---|---|
MarkJ Joined: 17 Feb 08 Posts: 1139 Credit: 80,854,192 RAC: 5 |
If you’re under Debian, all you need to install is: sudo apt install nvidia-kernel-dkms. If you want OpenCL, then: sudo apt install nvidia-opencl-icd. As I write this, Buster has the 418.74 driver and Buster backports has 430.64. If you are on Stretch, it has 390.116 and Stretch backports has 418.74. BOINC blog |
MarkJ Joined: 17 Feb 08 Posts: 1139 Credit: 80,854,192 RAC: 5 |
If you’re under Debian all you need to install is: I might add they’re under the non-free category, so make sure your /etc/apt/sources.list has non-free after the URL on each line. Typically you’d have “main contrib non-free” without the quotes. BOINC blog |
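To check the non-free point above, a small sketch like the following lists any active entries that lack the component. The function name `missing_non_free` is made up for illustration, and it assumes the classic one-line sources.list format, reading the text on standard input rather than touching the system:

```shell
# Print every active "deb"/"deb-src" line that does not list the non-free component.
# missing_non_free is a hypothetical helper name; it reads sources.list text on stdin.
missing_non_free() {
    awk '
        ($1 == "deb" || $1 == "deb-src") {
            found = 0
            # Scan every field after the type; this also copes with [option] fields.
            for (i = 2; i <= NF; i++) if ($i == "non-free") found = 1
            if (!found) print
        }
    '
}

# Usage (path as given in the post above):
#   missing_non_free < /etc/apt/sources.list
```

No output means every active line already has non-free; commented-out lines are ignored.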
Siran d'Vel'nahr Joined: 23 May 99 Posts: 7379 Credit: 44,181,323 RAC: 238 |
Greetings Radjin, I ran the command after purging everything nvidia and I can see that only nouveau is shown where before it was: Kernel modules: nouveau, nvidia_drm, nvidia Being a relative noob to Linux, I don't understand why you purge everything NVIDIA after installing the NVIDIA driver. I remember, something several months ago, about blackballing, er, blacklisting nouveau. ;) Don't ask me how I did it, I do not remember and would have to search the Internet again to find out. Heck, it may have been something I read here in these fora. This is what I get when I run that command you posted: rick@Minty-Winders:~$ lspci | grep ' VGA ' | cut -d" " -f 1 | xargs -i lspci -v -s {} 01:00.0 VGA compatible controller: NVIDIA Corporation TU116 [GeForce GTX 1660 Ti] (rev a1) (prog-if 00 [VGA controller]) Subsystem: eVga.com. Corp. Device 1267 Flags: bus master, fast devsel, latency 0, IRQ 149 Memory at de000000 (32-bit, non-prefetchable) [size=16M] Memory at c0000000 (64-bit, prefetchable) [size=256M] Memory at d0000000 (64-bit, prefetchable) [size=32M] I/O ports at e000 [size=128] [virtual] Expansion ROM at 000c0000 [disabled] [size=128K] Capabilities: <access denied> Kernel driver in use: nvidia Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia 02:00.0 VGA compatible controller: NVIDIA Corporation TU116 [GeForce GTX 1660 Ti] (rev a1) (prog-if 00 [VGA controller]) Subsystem: eVga.com. Corp. Device 1266 Flags: bus master, fast devsel, latency 0, IRQ 150 Memory at dc000000 (32-bit, non-prefetchable) [size=16M] Memory at a0000000 (64-bit, prefetchable) [size=256M] Memory at b0000000 (64-bit, prefetchable) [size=32M] I/O ports at d000 [size=128] [virtual] Expansion ROM at dd000000 [disabled] [size=512K] Capabilities: <access denied> Kernel driver in use: nvidia Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia Have a great day! :) Siran CAPT Siran d'Vel'nahr - L L & P _\\// Winders 11 OS? "What a piece of junk!" - L. 
Skywalker "Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath |
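In output like the above, the line that matters is "Kernel driver in use". "Kernel modules" only lists the modules available for the card, so nouveau appearing there does not mean it is loaded. A minimal sketch for pulling out just that information, assuming `lspci -v` text on standard input (`drivers_in_use` is a hypothetical name):

```shell
# Summarise `lspci -v` output as "<slot> <driver in use>" for each VGA device.
# drivers_in_use is a hypothetical helper name; it reads lspci -v text on stdin.
drivers_in_use() {
    awk '
        # Device header lines start with the hexadecimal bus address.
        /^[0-9a-fA-F]/ { slot = $1; vga = ($0 ~ / VGA compatible controller/) }
        # Indented detail lines belong to the most recent header.
        vga && /Kernel driver in use:/ { print slot, $NF }
    '
}

# Usage:
#   lspci -v | drivers_in_use
```

A card with no "Kernel driver in use" line at all produces no output here, which in itself suggests that no driver has bound to it.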
Jord Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
Yet the drivers will not run. There must be some missing dependency that keeps the drivers from activating. Or you have a damaged video card. You said all worked fine until you added the 710b, so what happens when you take that one out and then install the drivers? If that works, try exchanging the cards, taking the GTX 1650 out and only putting the GT 710b in. Does that work with those drivers, or does it work when you install the drivers? If it doesn't, you found your culprit. If the 710b works in the PCIe slot of the 1650, try either the 1650 or this 710b solely in the other PCIe slot that the 710b was in originally, to exclude that it's a damaged PCIe slot. |
Radjin Joined: 2 May 00 Posts: 105 Credit: 14,928,529 RAC: 102 |
I tried it once (very early version) and didn't really find any advantage on the small & simple set of partitions I needed. In general such things don't really come into play unless you have large disc arrays with multiple (dynamic) partitions which are not the usual case for the home user. Beware that if one gets things wrong it is possible not just to destroy the partition you were working on, but the whole array, and there is very little chance of rescuing it. Installing Ubuntu on an old laptop to play with it. On my web server rig, do you recommend desktop or server? Radjin~ |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
Whatever you’re comfortable with. But Server is CLI only. No desktop environment. Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
Radjin Joined: 2 May 00 Posts: 105 Credit: 14,928,529 RAC: 102 |
Whatever you’re comfortable with. Thanks. It sounds like everyone that knows the OS uses the desktop version. I’ll go with that. Radjin~ |
Radjin Joined: 2 May 00 Posts: 105 Credit: 14,928,529 RAC: 102 |
Yet the drivers will not run. There must be some missing dependency that keeps the drivers from activating. Or you have a damaged video card. You said all worked fine until you added the 710b, so what happens when you take that one out and then install the drivers? I did all the above multiple times as I tried to install the drivers three different ways. However, I took your advice and did it again, except this time I completely purged anything to do with nvidia and opencl, removed the 710B card and reinstalled using this page: https://www.kinetica.com/docs/install/nvidia_deb.html and it started working. I would like to add the 710 card back in, but I think I will wait until I have a few days together to troubleshoot. Thanks for the info. Radjin~ |
Radjin Joined: 2 May 00 Posts: 105 Credit: 14,928,529 RAC: 102 |
nvidia-smi Wed Jan 8 20:55:46 2020 +-----------------------------------------------------------------------------+ | NVIDIA-SMI 440.44 Driver Version: 440.44 CUDA Version: 10.2 | |-------------------------------+----------------------+----------------------+ | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC | | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | |===============================+======================+======================| | 0 GeForce GTX 1650 Off | 00000000:01:00.0 Off | N/A | | 54% 55C P0 46W / 75W | 276MiB / 3911MiB | 86% Default | +-------------------------------+----------------------+----------------------+ +-----------------------------------------------------------------------------+ | Processes: GPU Memory | | GPU PID Type Process name Usage | |=============================================================================| | 0 1603 C ..._x86_64-pc-linux-gnu__opencl_nvidia_SoG 265MiB | +-----------------------------------------------------------------------------+ Radjin~ |
Radjin Joined: 2 May 00 Posts: 105 Credit: 14,928,529 RAC: 102 |
01:00.0 VGA compatible controller: NVIDIA Corporation TU107 (rev a1) (prog-if 00 [VGA controller]) Subsystem: ZOTAC International (MCO) Ltd. TU107 Flags: bus master, fast devsel, latency 0, IRQ 16 Memory at f5000000 (32-bit, non-prefetchable) [size=16M] Memory at d0000000 (64-bit, prefetchable) [size=256M] Memory at e0000000 (64-bit, prefetchable) [size=32M] I/O ports at 4000 [size=128] [virtual] Expansion ROM at 000c0000 [disabled] [size=128K] Capabilities: <access denied> Kernel driver in use: nvidia Kernel modules: nouveau, nvidia_drm, nvidia VGA compatible controller: NVIDIA Corporation TU107 (rev a1) (prog-if 00 [VGA controller]) Subsystem: ZOTAC International (MCO) Ltd. TU107 Flags: bus master, fast devsel, latency 0, IRQ 28 Memory at e3000000 (32-bit, non-prefetchable) [size=16M] Memory at d0000000 (64-bit, prefetchable) [size=256M] Memory at e0000000 (64-bit, prefetchable) [size=32M] I/O ports at 3000 [size=128] [virtual] Expansion ROM at 000c0000 [disabled] [size=128K] Capabilities: <access denied> Kernel driver in use: nvidia Kernel modules: nouveau, nvidia_drm, nvidia Here is the 1650 card info before (not working) and after (working); the only difference is the IRQ. I didn’t notice it before, but looking at an earlier post, both cards were showing an IRQ of 16. If I remember my old BBS days, this causes both devices to fail. What do you guys think? Radjin~ |
Wiggo Joined: 24 Jan 00 Posts: 37277 Credit: 261,360,520 RAC: 489 |
|
Grant (SSSF) Joined: 19 Aug 99 Posts: 13878 Credit: 208,696,464 RAC: 304 |
I didn’t notice it before but looking in an earlier post both cards were showing an IRQ of 16. If I remember my old BBS days this causes both devices to fail. What do you guys think? IRQ sharing has been possible for years, and PCI E doesn't actually use IRQs at all. Grant Darwin NT |
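On the IRQ question, one way to see how the NVIDIA driver's interrupts are actually delivered is to look at /proc/interrupts. The sketch below assumes the usual layout, where the IRQ number is the first column and the interrupt controller type (for example PCI-MSI or IO-APIC) appears somewhere on the line; the exact columns vary between kernels, and `irq_kinds` is a hypothetical name:

```shell
# Classify the interrupt lines that mention the nvidia driver.
# irq_kinds is a hypothetical helper name; it reads /proc/interrupts text on stdin.
irq_kinds() {
    awk '
        /nvidia/ {
            irq = $1
            sub(/:$/, "", irq)   # "149:" -> "149"
            kind = ($0 ~ /MSI/) ? "message-signalled (MSI)" : "line-based, possibly shared"
            print "IRQ " irq ": " kind
        }
    '
}

# Usage:
#   irq_kinds < /proc/interrupts
```

If the driver is using message-signalled interrupts, two cards reporting the same legacy IRQ number in lspci should not by themselves cause a conflict.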
Radjin Joined: 2 May 00 Posts: 105 Credit: 14,928,529 RAC: 102 |
You still have some sort of big problem there as your error count is mounting fast. It’s all the CUDA WUs that downloaded. I guess I can’t process them? Radjin~ |
Radjin Joined: 2 May 00 Posts: 105 Credit: 14,928,529 RAC: 102 |
I didn’t notice it before but looking in an earlier post both cards were showing an IRQ of 16. If I remember my old BBS days this causes both devices to fail. What do you guys think? IRQ sharing has been possible for years, and PCI E doesn't actually use IRQs at all. Well, burst a bubble, I thought I had it figured out. At least the 1650 appears to be working. Radjin~ |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13878 Credit: 208,696,464 RAC: 304 |
Well, burst a bubble, I thought I had it figured out. At least the 1650 appears to be working. No, it's not. As Wiggo pointed out, all it is doing is producing errors. Computer ID 8816958 Run time 1 sec CPU time Validate state Invalid Cuda error 'Couldn't get cuda device count ' in file 'cuda/cudaAcceleration.cu' in line 138 : invalid device ordinal. setiathome_CUDA: cudaGetDeviceCount() call failed. setiathome_CUDA: No CUDA devices found setiathome_CUDA: Found 0 CUDA device(s): In cudaAcc_initializeDevice(): Boinc passed DevPref 1 setiathome_CUDA: CUDA Device 1 specified, checking... Device cannot be used Cuda device initialisation retry 1 of 6, waiting 5 secs... Cuda error 'Couldn't get cuda device count ' in file 'cuda/cudaAcceleration.cu' in line 138 : invalid device ordinal. setiathome_CUDA: cudaGetDeviceCount() call failed. setiathome_CUDA: No CUDA devices found setiathome_CUDA: Found 0 CUDA device(s): In cudaAcc_initializeDevice(): Boinc passed DevPref 1 setiathome_CUDA: CUDA Device 1 specified, checking... Device cannot be used Cuda device initialisation retry 2 of 6, waiting 5 secs... Cuda error 'Couldn't get cuda device count ' in file 'cuda/cudaAcceleration.cu' in line 138 : invalid device ordinal. setiathome_CUDA: cudaGetDeviceCount() call failed. setiathome_CUDA: No CUDA devices found setiathome_CUDA: Found 0 CUDA device(s): In cudaAcc_initializeDevice(): Boinc passed DevPref 1 setiathome_CUDA: CUDA Device 1 specified, checking... Device cannot be used Cuda device initialisation retry 3 of 6, waiting 5 secs... Cuda error 'Couldn't get cuda device count ' in file 'cuda/cudaAcceleration.cu' in line 138 : invalid device ordinal. setiathome_CUDA: cudaGetDeviceCount() call failed. setiathome_CUDA: No CUDA devices found setiathome_CUDA: Found 0 CUDA device(s): In cudaAcc_initializeDevice(): Boinc passed DevPref 1 setiathome_CUDA: CUDA Device 1 specified, checking... Device cannot be used etc.
I'd suggest exiting BOINC, rebooting, checking to see that the driver has started, then starting BOINC to see if it will start processing WUs. After BOINC starts, check the Event log, e.g.: 9/01/2020 14:55:51 | | CUDA: NVIDIA GPU 0: GeForce RTX 2060 (driver version 431.60, CUDA version 10.1, compute capability 7.5, 4096MB, 3556MB available, 14054 GFLOPS peak) 9/01/2020 14:55:51 | | CUDA: NVIDIA GPU 1: GeForce GTX 1070 (driver version 431.60, CUDA version 10.1, compute capability 6.1, 4096MB, 3556MB available, 6852 GFLOPS peak) 9/01/2020 14:55:51 | | OpenCL: NVIDIA GPU 0: GeForce RTX 2060 (driver version 431.60, device version OpenCL 1.2 CUDA, 6144MB, 3556MB available, 14054 GFLOPS peak) 9/01/2020 14:55:51 | | OpenCL: NVIDIA GPU 1: GeForce GTX 1070 (driver version 431.60, device version OpenCL 1.2 CUDA, 8192MB, 3556MB available, 6852 GFLOPS peak) If you don't have the CUDA line in there, it can't process WUs using CUDA. If you don't have an OpenCL line in there, you can't process SoG WUs, as they require OpenCL. Grant Darwin NT |
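The Event log check described above can be scripted. This is only a sketch: `detected_apis` is a hypothetical name, it expects the log text on standard input, and the log file path in the usage comment is an assumption that depends on how BOINC was installed:

```shell
# Count the GPU detection lines BOINC wrote at startup, per API.
# detected_apis is a hypothetical helper name; it reads event-log text on stdin.
detected_apis() {
    awk '
        /CUDA: NVIDIA GPU [0-9]+:/   { cuda++ }
        /OpenCL: NVIDIA GPU [0-9]+:/ { opencl++ }
        END { printf "CUDA devices: %d\nOpenCL devices: %d\n", cuda, opencl }
    '
}

# Usage (path is an assumption; adjust to your BOINC data directory):
#   detected_apis < /var/lib/boinc-client/stdoutdae.txt
```

Note that a log file covering several restarts will count each startup's lines, so it is best run against the messages from the most recent start only. A CUDA count of zero would match the "No CUDA devices found" errors in the task output quoted above.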
Richard Haselgrove Joined: 4 Jul 99 Posts: 14684 Credit: 200,643,578 RAC: 874 |
01:00.0 VGA compatible controller: NVIDIA Corporation TU107 (rev a1) (prog-if 00 [VGA controller]) There's something fishy there. According to the Wikipedia list of Nvidia GPUs, a GeForce GTX 1650 card should have a TU117 chip. TU107 doesn't appear anywhere in the list. |
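The chip name that lspci prints is looked up in the local pci.ids database, so it can be stale or wrong for newer cards. The numeric vendor:device ID shown by `lspci -nn` is a firmer thing to compare against a reference list. A minimal sketch, assuming `lspci -nn` text on standard input (`vga_ids` is a hypothetical name):

```shell
# Print "<slot> <vendor:device>" for each VGA device in `lspci -nn` output.
# vga_ids is a hypothetical helper name; it reads lspci -nn text on stdin.
vga_ids() {
    awk '
        / VGA compatible controller/ {
            # The ID is the last bracketed xxxx:xxxx field on the line.
            for (i = NF; i >= 1; i--) {
                if ($i ~ /^\[[0-9a-f][0-9a-f][0-9a-f][0-9a-f]:[0-9a-f][0-9a-f][0-9a-f][0-9a-f]\]$/) {
                    print $1, substr($i, 2, 9)   # strip the surrounding brackets
                    break
                }
            }
        }
    '
}

# Usage:
#   lspci -nn | vga_ids
```

If the ID matches the expected card but the name looks odd, refreshing the database (for example with update-pciids) may be all that is needed.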
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.