Two Nvidia cards, one showing neither being used
Author | Message |
---|---|
Radjin Joined: 2 May 00 Posts: 105 Credit: 14,928,529 RAC: 102 |
I have two video cards installed. I installed the driver directly from the Nvidia site; both cards take the same driver. === lspci | grep ' VGA ' | cut -d" " -f 1 | xargs -i lspci -v -s {} 01:00.0 VGA compatible controller: NVIDIA Corporation TU107 (rev a1) (prog-if 00 [VGA controller]) Subsystem: ZOTAC International (MCO) Ltd. TU107 Flags: bus master, fast devsel, latency 0, IRQ 16 Memory at f5000000 (32-bit, non-prefetchable) [size=16M] Memory at d0000000 (64-bit, prefetchable) [size=256M] Memory at e0000000 (64-bit, prefetchable) [size=32M] I/O ports at 4000 [size=128] [virtual] Expansion ROM at 000c0000 [disabled] [size=128K] Capabilities: <access denied> Kernel driver in use: nvidia Kernel modules: nouveau, nvidia_drm, nvidia 04:00.0 VGA compatible controller: NVIDIA Corporation GK208 [GeForce GT 710B] (rev a1) (prog-if 00 [VGA controller]) Subsystem: ZOTAC International (MCO) Ltd. GK208B [GeForce GT 710] Flags: fast devsel, IRQ 16 [virtual] Memory at f3000000 (32-bit, non-prefetchable) [size=16M] Memory at e8000000 (64-bit, prefetchable) [size=128M] Memory at f0000000 (64-bit, prefetchable) [size=32M] I/O ports at 2000 [size=128] [virtual] Expansion ROM at f4000000 [disabled] [size=512K] Capabilities: <access denied> Kernel driver in use: nvidia Kernel modules: nouveau, nvidia_drm, nvidia === The first card is a GeForce GTX 1650, the second a GeForce GT 710B. The first shows up in my computers list on the BOINC site; the second does not show up at all. BOINC is downloading CUDA WUs, but they are not being processed, and BOINC Manager shows the GPU as missing. This started a couple of days ago when I tried to upgrade the drivers from the Debian repository, since nvidia-detect said the cards work with that standard install. When that failed, I uninstalled and went back to the drivers directly from Nvidia via [sh NVIDIA-Linux-x86_64-440.44.run]. 
Computer: https://setiathome.berkeley.edu/show_host_detail.php?hostid=8816958 Stderr output from a failed wu below: === Stderr output <core_client_version>7.14.2</core_client_version> <![CDATA[ <message> too many boinc_temporary_exit()s</message> <stderr_txt> mporary exit (180 secs) v8 task detected Cuda error 'Couldn't get cuda device count ' in file 'cuda/cudaAcceleration.cu' in line 138 : invalid device ordinal. setiathome_CUDA: cudaGetDeviceCount() call failed. setiathome_CUDA: No CUDA devices found setiathome_CUDA: Found 0 CUDA device(s): In cudaAcc_initializeDevice(): Boinc passed DevPref 1 setiathome_CUDA: CUDA Device 1 specified, checking... Device cannot be used Cuda device initialisation retry 1 of 6, waiting 5 secs... Cuda error 'Couldn't get cuda device count ' in file 'cuda/cudaAcceleration.cu' in line 138 : invalid device ordinal. setiathome_CUDA: cudaGetDeviceCount() call failed. setiathome_CUDA: No CUDA devices found setiathome_CUDA: Found 0 CUDA device(s): In cudaAcc_initializeDevice(): Boinc passed DevPref 1 setiathome_CUDA: CUDA Device 1 specified, checking... Device cannot be used Cuda initialisation FAILED, Initiating Boinc temporary exit (180 secs) </stderr_txt> ]]> === The last part repeats 6 times so I did not paste here. I must have missed some install step. Suggestions please. Radjin~ |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Please post the first 30 lines of the startup log from BOINC Manager. Did you add a cc_config.xml telling BOINC to use all GPUs? Otherwise only the most capable GPU will be used. |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
You need to add a flag to your cc_config.xml to allow BOINC to use all GPUs when you have mismatched GPUs. Create cc_config.xml with the following contents and place it in the BOINC main directory: <cc_config> <options> <use_all_gpus>1</use_all_gpus> </options> </cc_config> Next, about the errors: I believe the cuda60 app doesn't work well. Just wait it out and the servers will send you SoG and sah apps again, which seem to have processed properly on your system as of a few days ago. Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
Radjin Joined: 2 May 00 Posts: 105 Credit: 14,928,529 RAC: 102 |
Ian&Steve C. wrote: "You need to add a flag to your CC config to allow BOINC to use all GPUs when you have mismatched GPUs." I added the lines above, so my cc_config.xml now looks like this: <cc_config> <log_flags> <task>1</task> <file_xfer>1</file_xfer> <sched_ops>1</sched_ops> </log_flags> </cc_config> <cc_config> <options> <use_all_gpus>1</use_all_gpus> </options> </cc_config> Thanks to you both for that tidbit. I rebooted and need to run for a bit; we’ll see how it works while I’m gone. Radjin~ |
Radjin Joined: 2 May 00 Posts: 105 Credit: 14,928,529 RAC: 102 |
Zalster wrote: "First 30 lines of the startup log from BOINC Manager." Thanks to you both for that tidbit; I just added the code. I have to run, so I rebooted; let’s see how it runs while I am gone for a few hours. Radjin~ |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
<cc_config> <log_flags> <task>1</task> <file_xfer>1</file_xfer> <sched_ops>1</sched_ops> </log_flags> <options> <use_all_gpus>1</use_all_gpus> </options> </cc_config> Might be a better option |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
Zalster wrote: <cc_config> <log_flags> <task>1</task> <file_xfer>1</file_xfer> <sched_ops>1</sched_ops> </log_flags> <options> <use_all_gpus>1</use_all_gpus> </options> </cc_config> +1 |
Radjin Joined: 2 May 00 Posts: 105 Credit: 14,928,529 RAC: 102 |
Zalster wrote: <cc_config> <log_flags> <task>1</task> <file_xfer>1</file_xfer> <sched_ops>1</sched_ops> </log_flags> <options> <use_all_gpus>1</use_all_gpus> </options> </cc_config> Updated and rebooted again. Thanks for that. Radjin~ |
Radjin Joined: 2 May 00 Posts: 105 Credit: 14,928,529 RAC: 102 |
Zalster wrote: "First 30 lines of the startup log from BOINC Manager." What is the name of the startup log? Client state? I didn’t see a file named "start up" in the boinc-client directory. Radjin~ |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
It's in the GUI. Just hit Ctrl+Shift+E with the boincmgr GUI running. That brings up the log; you can copy and paste the first few lines from there. |
Wiggo Joined: 24 Jan 00 Posts: 36755 Credit: 261,360,520 RAC: 489 |
There's no drivers being listed for that rig so I'd look further into that. ;-) Cheers. |
Radjin Joined: 2 May 00 Posts: 105 Credit: 14,928,529 RAC: 102 |
Wiggo wrote: "There's no drivers being listed for that rig so I'd look further into that. ;-)" I saw drivers in use in the output from my first post. Does that not mean they are installed? Same question given that the card shows up under computers on the BOINC site. In the first post I described running the installer from Nvidia; I didn’t get any errors. Is there another step I missed? Radjin~ |
Wiggo Joined: 24 Jan 00 Posts: 36755 Credit: 261,360,520 RAC: 489 |
Wiggo wrote: "There's no drivers being listed for that rig so I'd look further into that. ;-)" Radjin wrote: "I saw drivers being used in the output from the first post. That does not mean they are installed?" That rig only shows "NVIDIA GeForce GTX 1650 (3911MB)", where it should be showing something more like "NVIDIA GeForce GTX 1650 (3911MB) driver: 440.44 OpenCL: 1.2" for it to work. ;-) Cheers. |
Radjin Joined: 2 May 00 Posts: 105 Credit: 14,928,529 RAC: 102 |
Thanks for that info. At least I know what to look for. Radjin~ |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Ian&Steve C. wrote: "Next, about the errors: I believe the cuda60 app doesn't work well. Just wait it out and the servers will send you SoG and sah apps again, which seem to have processed properly on your system as of a few days ago." Or speed things up with this in cc_config.xml: <no_cuda>1</no_cuda> That will prevent the CUDA60 app from being used and force the SAH and SoG OpenCL apps on the host. You still need to see OpenCL being detected on BOINC startup for even that to work. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
I wonder how he got and processed SoG and sah tasks already, though. |
Radjin Joined: 2 May 00 Posts: 105 Credit: 14,928,529 RAC: 102 |
Here is the BOINC log. === Tue 31 Dec 2019 03:18:06 PM PST | | Starting BOINC client version 7.14.2 for x86_64-pc-linux-gnu Tue 31 Dec 2019 03:18:06 PM PST | | log flags: file_xfer, sched_ops, task Tue 31 Dec 2019 03:18:06 PM PST | | Libraries: libcurl/7.64.0 OpenSSL/1.1.1d zlib/1.2.11 libidn2/2.0.5 libpsl/0.20.2 (+libidn2/2.0.5) libssh2/1.8.0 nghttp2/1.36.0 librtmp/2.3 Tue 31 Dec 2019 03:18:06 PM PST | | Data directory: /var/lib/boinc-client Tue 31 Dec 2019 03:18:07 PM PST | | CUDA: NVIDIA GPU 0: GeForce GTX 1650 (driver version unknown, CUDA version 10.2, compute capability 7.5, 3912MB, 3851MB available, 2984 GFLOPS peak) Tue 31 Dec 2019 03:18:07 PM PST | | App version needs OpenCL but GPU doesn't support it Tue 31 Dec 2019 03:18:07 PM PST | SETI@home | Application uses missing NVIDIA GPU Tue 31 Dec 2019 03:18:07 PM PST | | App version needs OpenCL but GPU doesn't support it Tue 31 Dec 2019 03:18:07 PM PST | SETI@home | Application uses missing NVIDIA GPU Tue 31 Dec 2019 03:18:07 PM PST | | App version needs OpenCL but GPU doesn't support it Tue 31 Dec 2019 03:18:07 PM PST | SETI@home | Application uses missing NVIDIA GPU Tue 31 Dec 2019 03:18:07 PM PST | SETI@home | Missing coprocessor for task 26dc19aa.10543.2112.9.36.254.vlar_0 Tue 31 Dec 2019 03:18:07 PM PST | SETI@home | Missing coprocessor for task blc56_2bit_guppi_58692_58180_HIP21489_0021.22559.409.21.44.208.vlar_0 Tue 31 Dec 2019 03:18:07 PM PST | SETI@home | Missing coprocessor for task 13se08ab.7651.8661.14.41.174_2 Tue 31 Dec 2019 03:18:07 PM PST | SETI@home | Missing coprocessor for task 26dc19aa.5186.12746.6.33.76.vlar_1 Tue 31 Dec 2019 03:18:07 PM PST | SETI@home | Missing coprocessor for task blc56_2bit_guppi_58692_60738_HIP23083_0029.1830.409.21.44.18.vlar_1 Tue 31 Dec 2019 03:18:07 PM PST | SETI@home | Missing coprocessor for task 26dc19aa.5169.17245.5.32.51_1 Tue 31 Dec 2019 03:18:07 PM PST | SETI@home | Missing coprocessor for task 26dc19aa.5186.18063.6.33.3.vlar_1 Tue 31 Dec 2019 03:18:07 PM PST | SETI@home | Missing coprocessor for task 26dc19aa.5169.19699.5.32.49_0 Tue 31 Dec 2019 03:18:07 PM PST | SETI@home | Missing coprocessor for task blc56_2bit_guppi_58692_58497_HIP21594_0022.11036.409.21.44.128.vlar_1 Tue 31 Dec 2019 03:18:07 PM PST | SETI@home | Missing coprocessor for task 26dc19aa.17700.9474.14.41.81_1 Tue 31 Dec 2019 03:18:07 PM PST | SETI@home | Missing coprocessor for task blc56_2bit_guppi_58692_61070_HIP23512_0030.13547.818.21.44.14.vlar_1 Tue 31 Dec 2019 03:18:07 PM PST | SETI@home | Missing coprocessor for task 26dc19aa.3075.3748.16.43.253_1 === My system does not process CUDA? Radjin~ |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14679 Credit: 200,643,578 RAC: 874 |
Keith Myers wrote: "Or speed things up with this in cc_config.xml" I'm worried about that line. It doesn't appear as an option in the User Manual, and it isn't written out in the full template when you change a logging option. Also, it should be written in the config report at startup - and it isn't there in the log Radjin has just posted. |
Radjin Joined: 2 May 00 Posts: 105 Credit: 14,928,529 RAC: 102 |
Ian&Steve C. wrote: "I wonder how he got and processed SoG and sah tasks already tho." It was working until I did the update and added the second GPU. Now no matter what I do, it does not load. I tried the standard Debian nvidia-driver install, and that locks the system to the point where I have to do a hard restart and then use recovery to remove it. Then I tried an nvidia-driver install from backports, as described in step 5 here: https://linuxusers.net/debian/how_install_debian_10_buster_with_nvidia.php That ran but was not seeing the drivers. Lastly I went back to the drivers directly from Nvidia as described above, and that’s where I am now. I have tried each step above with and without the second card installed. Radjin~ |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
You are going to need to figure out how to get the Nvidia OpenCL drivers installed somehow. I don't know what Debian has as an equivalent to the standard Ubuntu distro command: sudo apt-get install ocl-icd-libopencl1 I thought that Jord proved that <no_cuda>1</no_cuda> works over at GitHub. I see it referenced several times in conversation. |
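
For anyone following along: the cc_config.xml pitfall earlier in the thread (two separate <cc_config> blocks in one file instead of a single merged one) can be caught with a quick well-formedness check. What follows is only a minimal sketch in Python, not anything BOINC ships. The function name check_cc_config and the two sample strings are made up for illustration, and the check covers only the <use_all_gpus> flag discussed above.

```python
import xml.etree.ElementTree as ET

# Sample contents taken from the thread: BROKEN has two root elements,
# MERGED is the single-root version suggested as the better option.
BROKEN = """<cc_config> <log_flags> <task>1</task> </log_flags> </cc_config>
<cc_config> <options> <use_all_gpus>1</use_all_gpus> </options> </cc_config>"""

MERGED = """<cc_config> <log_flags> <task>1</task> </log_flags>
<options> <use_all_gpus>1</use_all_gpus> </options> </cc_config>"""


def check_cc_config(text):
    """Return a list of problems found in cc_config.xml text (empty if none).

    Hypothetical helper: it only checks well-formedness, the root tag,
    and whether <use_all_gpus>1</use_all_gpus> sits under <options>.
    """
    try:
        root = ET.fromstring(text)
    except ET.ParseError as err:
        # A second <cc_config> root element ends up here.
        return [f"not well-formed XML: {err}"]

    problems = []
    if root.tag != "cc_config":
        problems.append(f"root element is <{root.tag}>, expected <cc_config>")

    flag = root.findtext("options/use_all_gpus")
    if flag is None or flag.strip() != "1":
        problems.append("<use_all_gpus>1</use_all_gpus> not set under <options>")
    return problems


if __name__ == "__main__":
    for name, text in (("two-root version", BROKEN), ("merged version", MERGED)):
        print(name, "->", check_cc_config(text) or "looks OK")
```

To run it against a real file, read the file's text and pass it to check_cc_config. The location depends on the install; the log above reports /var/lib/boinc-client as the data directory, so that is one place to look, but treat the path as an assumption. An empty result only means the file parses and the flag is present; it says nothing about whether the driver and OpenCL are detected.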