Only 1 out of 3 GPUs being used, cc_config not working.
Author | Message |
---|---|
Ian&Steve C. Send message Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
So I have a system as follows:

Supermicro X9DRi-LN4F+ v1.10
2x Xeon E5-2690 (v1)
32GB RAM
2x GTX 750 Ti FTW
1x GTX 1060 SC

Running on Ubuntu 17.10 x64 with the "special sauce". I have tried reinstalling the latest WHQL drivers (390.48, from the PPA), to no avail.

This system was previously running fine with 2x 750 Tis and crunching on both cards, but of course I can't leave well enough alone :) So I added the 1060, and now BOINC/SETI is ONLY crunching on the 1060 and the 750 Tis are sitting idle doing nothing.

I do have my cc_config.xml file as follows (from my previous working setup):

    <cc_config>
      <options>
        <use_all_gpus>1</use_all_gpus>
      </options>
    </cc_config>

app_config.xml:

    <app_config>
      <app>
        <name>astropulse_v7</name>
        <gpu_versions>
          <gpu_usage>1</gpu_usage>
          <cpu_usage>1.0</cpu_usage>
        </gpu_versions>
      </app>
      <app>
        <name>setiathome_v8</name>
        <gpu_versions>
          <gpu_usage>1</gpu_usage>
          <cpu_usage>1.0</cpu_usage>
        </gpu_versions>
      </app>
    </app_config>

app_info.xml:

    <app_info>
      <app>
        <name>setiathome_v8</name>
      </app>
      <file_info>
        <name>setiathome_x41p_zi3v_x86_64-pc-linux-gnu_cuda90</name>
        <executable/>
      </file_info>
      <file_info>
        <name>libcudart.so.9.0</name>
      </file_info>
      <file_info>
        <name>libcufft.so.9.0</name>
      </file_info>
      <app_version>
        <app_name>setiathome_v8</app_name>
        <platform>x86_64-pc-linux-gnu</platform>
        <version_num>801</version_num>
        <plan_class>cuda90</plan_class>
        <cmdline></cmdline>
        <coproc>
          <type>NVIDIA</type>
          <count>1</count>
        </coproc>
        <avg_ncpus>0.1</avg_ncpus>
        <max_ncpus>0.1</max_ncpus>
        <file_ref>
          <file_name>setiathome_x41p_zi3v_x86_64-pc-linux-gnu_cuda90</file_name>
          <main_program/>
        </file_ref>
        <file_ref>
          <file_name>libcudart.so.9.0</file_name>
        </file_ref>
        <file_ref>
          <file_name>libcufft.so.9.0</file_name>
        </file_ref>
      </app_version>
      <app>
        <name>astropulse_v7</name>
      </app>
      <file_info>
        <name>astropulse_7.08_x86_64-pc-linux-gnu__opencl_nvidia_100</name>
        <executable/>
      </file_info>
      <file_info>
        <name>AstroPulse_Kernels_r2751.cl</name>
      </file_info>
      <file_info>
        <name>ap_cmdline_7.08_x86_64-pc-linux-gnu__opencl_nvidia_100.txt</name>
      </file_info>
      <app_version>
        <app_name>astropulse_v7</app_name>
        <platform>x86_64-pc-linux-gnu</platform>
        <version_num>708</version_num>
        <plan_class>opencl_nvidia_100</plan_class>
        <coproc>
          <type>NVIDIA</type>
          <count>1</count>
        </coproc>
        <avg_ncpus>0.1</avg_ncpus>
        <max_ncpus>0.1</max_ncpus>
        <file_ref>
          <file_name>astropulse_7.08_x86_64-pc-linux-gnu__opencl_nvidia_100</file_name>
          <main_program/>
        </file_ref>
        <file_ref>
          <file_name>AstroPulse_Kernels_r2751.cl</file_name>
        </file_ref>
        <file_ref>
          <file_name>ap_cmdline_7.08_x86_64-pc-linux-gnu__opencl_nvidia_100.txt</file_name>
          <open_name>ap_cmdline.txt</open_name>
        </file_ref>
      </app_version>
      <app>
        <name>setiathome_v8</name>
      </app>
      <file_info>
        <name>MBv8_8.22r3711_sse41_x86_64-pc-linux-gnu</name>
        <executable/>
      </file_info>
      <app_version>
        <app_name>setiathome_v8</app_name>
        <platform>x86_64-pc-linux-gnu</platform>
        <version_num>800</version_num>
        <file_ref>
          <file_name>MBv8_8.22r3711_sse41_x86_64-pc-linux-gnu</file_name>
          <main_program/>
        </file_ref>
      </app_version>
      <app>
        <name>astropulse_v7</name>
      </app>
      <file_info>
        <name>ap_7.05r2728_sse3_linux64</name>
        <executable/>
      </file_info>
      <app_version>
        <app_name>astropulse_v7</app_name>
        <version_num>704</version_num>
        <platform>x86_64-pc-linux-gnu</platform>
        <plan_class></plan_class>
        <file_ref>
          <file_name>ap_7.05r2728_sse3_linux64</file_name>
          <main_program/>
        </file_ref>
      </app_version>
    </app_info>

Do I have to add something different when the GPUs are not matching in the same system? All my other machines have matching GPU types and "just work". Help please.

Output of nvidia-smi:

    @SIERRA-SPARE:~$ nvidia-smi
    Fri May 18 21:30:24 2018
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 390.48                 Driver Version: 390.48                    |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  GeForce GTX 750 Ti  Off  | 00000000:03:00.0  On |                  N/A |
    | 42%   46C    P0     2W /  52W |    317MiB /  1997MiB |      2%      Default |
    +-------------------------------+----------------------+----------------------+
    |   1  GeForce GTX 106...  Off  | 00000000:04:00.0 Off |                  N/A |
    | 46%   64C    P2    78W / 120W |   1809MiB /  3019MiB |     93%      Default |
    +-------------------------------+----------------------+----------------------+
    |   2  GeForce GTX 750 Ti  Off  | 00000000:82:00.0 Off |                  N/A |
    | 42%   37C    P8     1W /  65W |     13MiB /  2002MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID   Type   Process name                             Usage      |
    |=============================================================================|
    |    0      1332      G   /usr/lib/xorg/Xorg                            10MiB |
    |    0      1592      G   /usr/bin/gnome-shell                          50MiB |
    |    0      1838      G   /usr/lib/xorg/Xorg                           114MiB |
    |    0      2000      G   /usr/bin/gnome-shell                         125MiB |
    |    1      3100      C   ...me_x41p_zi3v_x86_64-pc-linux-gnu_cuda90  1797MiB |
    +-----------------------------------------------------------------------------+

Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
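A quick way to rule out the file contents themselves is to parse the posted cc_config.xml and confirm the option is really set. This is only a sketch: the helper name `use_all_gpus_enabled` is made up for illustration, and it checks the text of the file only, not whether the BOINC client actually found and loaded it.

```python
import xml.etree.ElementTree as ET


def use_all_gpus_enabled(cc_config_text: str) -> bool:
    """Return True if <options><use_all_gpus> holds a non-zero value.

    Hypothetical helper: it inspects the XML text only and says nothing
    about where the file lives or whether the client has read it.
    """
    root = ET.fromstring(cc_config_text)
    if root.tag != "cc_config":
        return False
    value = root.findtext("options/use_all_gpus")
    return value is not None and value.strip() not in ("", "0")


# The cc_config.xml quoted in the post above.
POSTED_CC_CONFIG = (
    "<cc_config> <options> <use_all_gpus>1</use_all_gpus> "
    "</options> </cc_config>"
)

if __name__ == "__main__":
    print(use_all_gpus_enabled(POSTED_CC_CONFIG))
```

If this reports the option as set while the client still skips GPUs, the contents are fine and the next thing to look at is where the file is stored.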
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Which folder is the cc_config in? |
Ian&Steve C. Send message Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
/var/lib/boinc-client/projects/setiathome.berkeley.edu/

Here is the log showing that BOINC is ignoring it:

    18-May-2018 21:10:16 [---] Data directory: /var/lib/boinc-client
    18-May-2018 21:10:19 [---] CUDA: NVIDIA GPU 0: GeForce GTX 1060 3GB (driver version 390.48, CUDA version 9.1, compute capability 6.1, 3019MB, 2952MB available, 4228 GFLOPS peak)
    18-May-2018 21:10:19 [---] CUDA: NVIDIA GPU 1 (not used): GeForce GTX 750 Ti (driver version 390.48, CUDA version 9.1, compute capability 5.0, 1997MB, 1959MB available, 1622 GFLOPS peak)
    18-May-2018 21:10:19 [---] CUDA: NVIDIA GPU 2 (not used): GeForce GTX 750 Ti (driver version 390.48, CUDA version 9.1, compute capability 5.0, 2002MB, 1964MB available, 1622 GFLOPS peak)
    18-May-2018 21:10:19 [---] OpenCL: NVIDIA GPU 0: GeForce GTX 1060 3GB (driver version 390.48, device version OpenCL 1.2 CUDA, 3019MB, 2952MB available, 4228 GFLOPS peak)
    18-May-2018 21:10:19 [---] OpenCL: NVIDIA GPU 1 (ignored by config): GeForce GTX 750 Ti (driver version 390.48, device version OpenCL 1.2 CUDA, 1997MB, 1959MB available, 1622 GFLOPS peak)
    18-May-2018 21:10:19 [---] OpenCL: NVIDIA GPU 2 (ignored by config): GeForce GTX 750 Ti (driver version 390.48, device version OpenCL 1.2 CUDA, 2002MB, 1964MB available, 1622 GFLOPS peak)
    18-May-2018 21:10:19 [SETI@home] Found app_info.xml; using anonymous platform

Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
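The tell-tale markers in that startup log are "(not used)" and "(ignored by config)". As a rough sketch, the lines can be scanned for those markers like this; the regular expression and the function name `skipped_gpus` are assumptions based only on the log lines quoted in this thread, not on any documented BOINC log format.

```python
import re

# Pattern inferred from the log lines in this thread (an assumption,
# not an official format): API, GPU index, optional status, model name.
GPU_LINE = re.compile(
    r"(?P<api>CUDA|OpenCL): NVIDIA GPU (?P<index>\d+)"
    r"(?: \((?P<status>[^)]*)\))?: (?P<name>.+?) \("
)


def skipped_gpus(log_text):
    """Return (api, index, name, status) for every GPU the client skipped."""
    skipped = []
    for line in log_text.splitlines():
        match = GPU_LINE.search(line)
        if match and match.group("status"):
            skipped.append((
                match.group("api"),
                int(match.group("index")),
                match.group("name"),
                match.group("status"),
            ))
    return skipped


# Three of the log lines posted above.
SAMPLE_LOG = "\n".join([
    "18-May-2018 21:10:19 [---] CUDA: NVIDIA GPU 0: GeForce GTX 1060 3GB "
    "(driver version 390.48, CUDA version 9.1, compute capability 6.1, "
    "3019MB, 2952MB available, 4228 GFLOPS peak)",
    "18-May-2018 21:10:19 [---] CUDA: NVIDIA GPU 1 (not used): "
    "GeForce GTX 750 Ti (driver version 390.48, CUDA version 9.1, "
    "compute capability 5.0, 1997MB, 1959MB available, 1622 GFLOPS peak)",
    "18-May-2018 21:10:19 [---] OpenCL: NVIDIA GPU 1 (ignored by config): "
    "GeForce GTX 750 Ti (driver version 390.48, device version OpenCL 1.2 "
    "CUDA, 1997MB, 1959MB available, 1622 GFLOPS peak)",
])

if __name__ == "__main__":
    for entry in skipped_gpus(SAMPLE_LOG):
        print(entry)
```

An empty result after a restart would suggest the client is now using every detected GPU.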
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
> /var/lib/boinc-client/projects/setiathome.berkeley.edu/

cc_config.xml should be in the boinc-client directory: /var/lib/boinc-client

Grant Darwin NT |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Move the cc_config.xml and put it in the BOINC folder. It's sitting 2 folders down from where it needs to be placed. Once you do that, close out BOINC and relaunch it. It should re-read that XML and begin to use the cards. I also want to see if the event log says it sees the cc_config.xml after you move it. |
Ian&Steve C. Send message Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
> /var/lib/boinc-client/projects/setiathome.berkeley.edu/

> Move the cc_config.xml and put it in the boinc folder. It's sitting 2 folders down from where it needs to be placed. Once you do that, close out boinc and relaunch it. It should re-read that xml and begin to use the cards. I also want to see if the event log says it sees the cc_config.xml after you move it.

Yup, this was it. I always thought the cc_config file went in the same directory as the app_config files, but maybe that's only Windows.

Here's the log now:

    18-May-2018 22:01:17 [---] Data directory: /var/lib/boinc-client
    18-May-2018 22:01:22 [---] CUDA: NVIDIA GPU 0: GeForce GTX 1060 3GB (driver version 390.48, CUDA version 9.1, compute capability 6.1, 3019MB, 2952MB available, 4228 GFLOPS peak)
    18-May-2018 22:01:22 [---] CUDA: NVIDIA GPU 1: GeForce GTX 750 Ti (driver version 390.48, CUDA version 9.1, compute capability 5.0, 1997MB, 1959MB available, 1622 GFLOPS peak)
    18-May-2018 22:01:22 [---] CUDA: NVIDIA GPU 2: GeForce GTX 750 Ti (driver version 390.48, CUDA version 9.1, compute capability 5.0, 2002MB, 1964MB available, 1622 GFLOPS peak)
    18-May-2018 22:01:22 [---] OpenCL: NVIDIA GPU 0: GeForce GTX 1060 3GB (driver version 390.48, device version OpenCL 1.2 CUDA, 3019MB, 2952MB available, 4228 GFLOPS peak)
    18-May-2018 22:01:22 [---] OpenCL: NVIDIA GPU 1: GeForce GTX 750 Ti (driver version 390.48, device version OpenCL 1.2 CUDA, 1997MB, 1959MB available, 1622 GFLOPS peak)
    18-May-2018 22:01:22 [---] OpenCL: NVIDIA GPU 2: GeForce GTX 750 Ti (driver version 390.48, device version OpenCL 1.2 CUDA, 2002MB, 1964MB available, 1622 GFLOPS peak)
    18-May-2018 22:01:22 [SETI@home] Found app_info.xml; using anonymous platform

nvidia-smi:

    @SIERRA-SPARE:~$ nvidia-smi
    Fri May 18 22:08:16 2018
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 390.48                 Driver Version: 390.48                    |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  GeForce GTX 750 Ti  Off  | 00000000:03:00.0  On |                  N/A |
    | 51%   72C    P0    21W /  52W |   1781MiB /  1997MiB |     93%      Default |
    +-------------------------------+----------------------+----------------------+
    |   1  GeForce GTX 106...  Off  | 00000000:04:00.0 Off |                  N/A |
    | 47%   66C    P2   103W / 120W |   1825MiB /  3019MiB |    100%      Default |
    +-------------------------------+----------------------+----------------------+
    |   2  GeForce GTX 750 Ti  Off  | 00000000:82:00.0 Off |                  N/A |
    | 46%   59C    P0    29W /  65W |   1490MiB /  2002MiB |     88%      Default |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID   Type   Process name                             Usage      |
    |=============================================================================|
    |    0      1332      G   /usr/lib/xorg/Xorg                            15MiB |
    |    0      1523      G   /usr/bin/gnome-shell                          50MiB |
    |    0      1756      G   /usr/lib/xorg/Xorg                           111MiB |
    |    0      1918      G   /usr/bin/gnome-shell                         109MiB |
    |    0      2663      C   ...me_x41p_zi3v_x86_64-pc-linux-gnu_cuda90  1460MiB |
    |    1      2723      C   ...me_x41p_zi3v_x86_64-pc-linux-gnu_cuda90  1797MiB |
    |    2      2706      C   ...me_x41p_zi3v_x86_64-pc-linux-gnu_cuda90  1460MiB |
    +-----------------------------------------------------------------------------+

Thanks for the quick help guys!

Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
> yup. this was it. i always thought the cc_config file went in the same directory as the app_config files. but maybe that's only windows.

Nope, I'm on Windows and it's in my BOINC directory. I'd say it's a case of: things that affect BOINC as a whole go in its directory; things that affect only a project go in that project's directory. Glad it was easily sorted.

Grant Darwin NT |
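That rule of thumb can be written down as a small lookup. This is a sketch under assumptions: the function `expected_dir` and the two sets are illustrative names, only the three files discussed in this thread are covered, and the Linux paths are the ones quoted above.

```python
from pathlib import PurePosixPath

# Files that configure the BOINC client as a whole (per this thread).
CLIENT_WIDE = {"cc_config.xml"}
# Files that configure a single project (per this thread).
PER_PROJECT = {"app_config.xml", "app_info.xml"}


def expected_dir(filename, data_dir, project_dir_name):
    """Return the directory where `filename` is expected to live.

    Hypothetical helper: client-wide files go in the BOINC data
    directory, per-project files go in projects/<project> below it.
    """
    data = PurePosixPath(data_dir)
    if filename in CLIENT_WIDE:
        return data
    if filename in PER_PROJECT:
        return data / "projects" / project_dir_name
    raise ValueError(f"no placement rule known for {filename!r}")


if __name__ == "__main__":
    for name in ("cc_config.xml", "app_config.xml", "app_info.xml"):
        print(name, "->", expected_dir(
            name, "/var/lib/boinc-client", "setiathome.berkeley.edu"))
```

On Windows the data directory is different, but the same split between client-wide and per-project files applies.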
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
From the above it would appear you have the monitor attached to one of the 2 GB 750 Ti GPUs. As you can see, that 750 Ti is using most of its video RAM:

    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  GeForce GTX 750 Ti  Off  | 00000000:03:00.0  On |                  N/A |
    | 51%   72C    P0    21W /  52W |   1781MiB /  1997MiB |     93%      Default |
    +-------------------------------+----------------------+----------------------+
    |   1  GeForce GTX 106...  Off  | 00000000:04:00.0 Off |                  N/A |
    | 47%   66C    P2   103W / 120W |   1825MiB /  3019MiB |    100%      Default |
    +-------------------------------+----------------------+----------------------+
    |   2  GeForce GTX 750 Ti  Off  | 00000000:82:00.0 Off |                  N/A |
    | 46%   59C    P0    29W /  65W |   1490MiB /  2002MiB |     88%      Default |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID   Type   Process name                             Usage      |
    |=============================================================================|
    |    0      1332      G   /usr/lib/xorg/Xorg                            15MiB |
    |    0      1523      G   /usr/bin/gnome-shell                          50MiB |
    |    0      1756      G   /usr/lib/xorg/Xorg                           111MiB |
    |    0      1918      G   /usr/bin/gnome-shell                         109MiB |
    |    0      2663      C   ...me_x41p_zi3v_x86_64-pc-linux-gnu_cuda90  1460MiB |
    |    1      2723      C   ...me_x41p_zi3v_x86_64-pc-linux-gnu_cuda90  1797MiB |
    |    2      2706      C   ...me_x41p_zi3v_x86_64-pc-linux-gnu_cuda90  1460MiB |
    +-----------------------------------------------------------------------------+

1781MiB of 1997MiB. That means if you opened enough browser windows, that card would run out of vRAM and start trashing tasks with errors. Since you have a 1060 with over a GB of vRAM available, it would be best if you used that GPU for the monitor. I have run a 3 GB 1060 out of vRAM before, but it takes quite a bit more than it does with a 2 GB GPU. The best is a 4 GB card; so far I haven't managed to run one of those out of vRAM. |
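To put numbers on that headroom, the Memory-Usage column can be turned into free MiB per card. A minimal sketch, assuming the "usedMiB / totalMiB" text format shown in the nvidia-smi output above; the function name `headroom_mib` is made up for illustration.

```python
import re

# Matches the Memory-Usage cell as printed by nvidia-smi in this thread,
# e.g. "1781MiB / 1997MiB".
MEMORY_USAGE = re.compile(r"(\d+)MiB\s*/\s*(\d+)MiB")


def headroom_mib(memory_usage_cell):
    """Return free vRAM in MiB for one Memory-Usage cell (hypothetical helper)."""
    match = MEMORY_USAGE.search(memory_usage_cell)
    if match is None:
        raise ValueError(f"unrecognised Memory-Usage cell: {memory_usage_cell!r}")
    used, total = (int(group) for group in match.groups())
    return total - used


# Values copied from the second nvidia-smi output in this thread.
CARDS = {
    "GPU 0 (750 Ti, display attached)": "1781MiB /  1997MiB",
    "GPU 1 (1060)": "1825MiB /  3019MiB",
    "GPU 2 (750 Ti)": "1490MiB /  2002MiB",
}

if __name__ == "__main__":
    for label, cell in CARDS.items():
        print(label, headroom_mib(cell), "MiB free")
```

With these figures the display card has 216 MiB free, against 1194 MiB on the 1060 and 512 MiB on the other 750 Ti.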
Ian&Steve C. Send message Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
> From the above it would appear you have the Monitor attached to one of the 2 GB 750 Ti GPUs. As you can see, the 750 Ti is using Most of the Video RAM;

Yes, I have the monitor connected to one of the 750 Tis, mainly because my KVM switch is VGA only, and the 1060 does not support analog out, while the 750 Ti does via the DVI-I port. This system is a cruncher only; I'm not regularly using any browsers, just the BOINC window and a terminal running htop.

Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |