How to use multiple GPUs

Zalster Volunteer tester Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242	Message 1959366 - Posted: 9 Oct 2018, 4:05:19 UTC - in response to Message 1959351.

"OK, adding GPUs is fairly easy (reboot, reboot, reboot). In the status column for GPU tasks, e.g. Running (0.18 CPUs + 1 NVIDIA GPU): where is the 0.18 configured (in this instance; I see that almost all PCs are different), and where does the value come from, if that makes sense? Thanks."

The value of 0.18 is set in one of two places. The first is app_info.xml: as you read through it, you will find the value of 0.18 listed for the CUDA applications. The other place is app_config.xml. If you have the latter, whatever value you set in app_config.xml overrides the 0.18.

As for where the value comes from: from what I remember, it was a value that the original creators of Lunatics came up with, i.e. for cuda 32 and cuda 42. However, it was never followed up on once we went to CUDA 5.0. How do I know that? Because I experimented with different values after I started with it and found that 0.18 was too low. Off the top of my head I couldn't remember the value I finally settled on, but it was higher than 0.18.

Found it. 0.35 was the actual amount each CUDA 5.0 task needed to run correctly. For SoG, I found that it needs 0.97 of a core per work unit, i.e. you might as well just set it to 1.
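As a rough illustration of the app_config.xml override described in this post, a minimal sketch might look like the following. The app name setiathome_v8 is an assumption (use whatever name your client reports for the application), and the 1.0 CPU figure simply follows the SoG suggestion above rather than any official recommendation.

```xml
<app_config>
  <app>
    <!-- Assumed app name; it must match the name the project uses -->
    <name>setiathome_v8</name>
    <gpu_versions>
      <!-- GPU fraction per task: 1.0 means one task per GPU -->
      <gpu_usage>1.0</gpu_usage>
      <!-- CPU fraction budgeted per GPU task (the "0.18 CPUs" figure in the status column) -->
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

Note that cpu_usage is what the client budgets when scheduling tasks; it is not a cap on how much CPU the application really uses.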

Bravo6 Joined: 28 Oct 99 Posts: 52 Credit: 44,947,756 RAC: 0	Message 1960411 - Posted: 15 Oct 2018, 15:52:25 UTC Last modified: 15 Oct 2018, 15:55:10 UTC

Is there an overall effect (negative or positive) of setting "use at most 0% of CPUs" in the manager? (I am going to try running several [5] GPUs.) Would it be better to do this in the config files? Also, I do not see much of a performance effect from additional system RAM?

Thanks.

"Don't worry about it, nothing is gonna be O.K. anyway.........."

Tom M Volunteer tester Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462	Message 1960417 - Posted: 15 Oct 2018, 16:13:35 UTC - in response to Message 1960411.

"Is there an overall effect (neg or pos) of setting use at most 0% of CPUs in the manager (going to try running several [5] GPUs). Would it be better to do in the config files? Also I do not see much performance effect from additional system RAM? THANX"

Excellent question! Depending on how heavily your CPU is loaded, reducing the number of CPU cores that are processing SETI tasks can speed up your overall production. I think the numbers bandied around are that either something like 10% of the available cores, or 1-3 cores, should be left idle.

I have no experience managing the CPUs proper using app_config.xml and/or app_info.xml, BUT you can control the total number of tasks the project will run using <project_max_concurrent>5</project_max_concurrent> inside the app_config.xml file. It needs to go inside the outermost pair of tags (<app_config> ... </app_config>).

And you can control the number of CPU cores you use. If you are using less than 0.50 (?) CPUs per GPU, then this will not control the number of GPUs that are being run. If you are running 1 CPU per GPU, it will.

I only have one machine right now that is running GPU tasks only. I set it up that way by making one of the "locations" on the SETI website for my computers "GPU only".

Since it will not harm your system to set it to 0% of CPU cores, you should be able to experiment and get immediate feedback. If the GPUs stop processing when you do this, then either you have a 1-CPU-per-GPU setup or I am wrong.

Tom

A proud member of the OFA (Old Farts Association).
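For readers unsure where that line belongs, here is a minimal sketch of an app_config.xml that contains only the project-wide limit. The value 5 is the one used in the post; nothing else is assumed beyond the file living in the project's folder.

```xml
<app_config>
  <!-- Direct child of the outer <app_config> element, not nested inside an <app> block -->
  <!-- Limits the total number of this project's tasks (CPU and GPU together) running at once -->
  <project_max_concurrent>5</project_max_concurrent>
</app_config>
```

Because the limit lives in one project's folder, it applies to that project only, which matters if the same host also runs other projects.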

Zalster Volunteer tester Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242	Message 1960493 - Posted: 16 Oct 2018, 1:46:31 UTC - in response to Message 1960417.

I've never understood why people continue to limit CPU usage by using the % option. It makes no sense. You tell the computer it can only use 10% of all CPUs. So what are all the work units (both CPU and GPU) supposed to do? Split that 10% among all of them? Because that is what you are telling it to do. Others will say that isn't so, but that's not what I've seen.

Tell the computer it can use 100% of all cores, then limit how many work units you have running at any time with <project_max_concurrent> in app_config.xml.

My 2 cents...

TBar Volunteer tester Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1962364 - Posted: 29 Oct 2018, 6:00:45 UTC - in response to Message 1960493. Last modified: 29 Oct 2018, 6:39:29 UTC

"I've never understood why people continue to limit CPU usage by using the % option."

I've always used the CPU % setting to set the number of CPUs to run. I've never had a problem with it either, and it's global. I have had problems with <project_max_concurrent> because it's not global. For example, set the SETI preferences to a maximum of 6 while running 3 GPUs and 3 CPUs. Now switch a GPU to run SETI Beta. That results in SETI launching another CPU task, making the total 4 CPU and 3 GPU tasks, which is too many for me. I never could get it to work correctly, because it's not global. So I don't use an app_config file; the old-fashioned way works fine for me.

To prove the % setting only affects the CPU tasks, look at this machine. It is set to run 49% of 8 CPUs, with -nobs on the GPUs. Obviously 3 CPU tasks are running and the GPUs are using 2 full CPUs, one apiece:
https://setiathome.berkeley.edu/results.php?hostid=6796475&offset=220&show_names=0&state=0&appid=

It's the same when running 24% (one) CPU and 3 GPUs with -nobs. One CPU task will run and the 3 GPUs will use 3 full CPUs, proving the CPU % setting only affects the CPU tasks, not the GPU tasks.

Also, my mining machine is set to run one CPU for when I decide to bunker tasks. Right now I'm not running any CPU tasks, but it still shows around 40-60% CPU usage even though the setting says 24%. When I do change it to run one CPU while bunkering, that 40-60% goes up by about 12.5%, again proving the CPU % setting only affects the CPUs:
https://setiathome.berkeley.edu/results.php?hostid=6813106&offset=1100
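The arithmetic behind those percentages can be sketched as follows, assuming the client rounds the allowed CPU count down to a whole number. That rounding rule is an assumption that fits the task counts reported in this post, not something the post itself states.

```latex
\[
n_{\text{usable}} = \left\lfloor \frac{p}{100} \cdot n_{\text{cores}} \right\rfloor
\]
\[
\lfloor 0.49 \times 8 \rfloor = \lfloor 3.92 \rfloor = 3,
\qquad
\lfloor 0.24 \times 8 \rfloor = \lfloor 1.92 \rfloor = 1,
\qquad
\frac{1}{8} = 12.5\%
\]
```

The last term is why one extra CPU task on an 8-core machine adds about 12.5% to the observed CPU load.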

jrs Joined: 1 Feb 16 Posts: 3 Credit: 73,979,603 RAC: 127	Message 1968654 - Posted: 4 Dec 2018, 9:46:56 UTC

Hi. I am not able to make both my Nvidia 1060 and 1070 GPUs work with BOINC. They work with other software such as NiceHash miner. The OS is Windows 10 Pro. In the log file I get these messages:

04.12.2018 09.47.32 | | Running under account jrs
04.12.2018 09.47.33 | | CUDA: NVIDIA GPU 0: GeForce GTX 1070 (driver version 417.22, CUDA version 10.0, compute capability 6.1, 4096MB, 3560MB available, 6803 GFLOPS peak)
04.12.2018 09.47.33 | | CUDA: NVIDIA GPU 1 (not used): GeForce GTX 1060 3GB (driver version 417.22, CUDA version 10.0, compute capability 6.1, 3072MB, 2487MB available, 4111 GFLOPS peak)
04.12.2018 09.47.33 | | OpenCL: NVIDIA GPU 0: GeForce GTX 1070 (driver version 417.22, device version OpenCL 1.2 CUDA, 8192MB, 3560MB available, 6803 GFLOPS peak)
04.12.2018 09.47.33 | | OpenCL: NVIDIA GPU 1 (ignored by config): GeForce GTX 1060 3GB (driver version 417.22, device version OpenCL 1.2 CUDA, 3072MB, 2487MB available, 4111 GFLOPS peak)

What I have tested:

Driver. I have stopped Windows from making driver updates and installed the newest Nvidia driver with the fresh option. In the log it now seems to be OK. When I check the driver under System it is 25.21.14.1722. It has the same date as my new installation. Is it supposed to have another number?

Nvidia Control Panel. I have turned everything on. I have also set the 1060 to be the primary OpenGL device, without any change.

app_config.xml. I had a similar problem with an old computer. I removed this file from the project folder and it started working, but then with only one operation for each GPU. (Also 2 GPUs, but old GTX 560s.)

cc_config.xml. Added <use_all_gpus>1</use_all_gpus> under options.

Remove and reinstall. I have uninstalled BOINC and Oracle VM and installed the newest version.

Any suggestions?

Richard Steen
Norway

Jord Volunteer tester Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3	Message 1968665 - Posted: 4 Dec 2018, 11:47:18 UTC - in response to Message 1968654.

"cc_config.xml Added <use_all_gpus>1</use_all_gpus> under options."

Did you exit and restart BOINC completely after adding this? By completely I don't just mean exiting BOINC Manager and restarting it, because that may restart only the Manager, not the client. Command decisions, including all detection of CPUs and GPUs, are made only at BOINC startup.

Can you also post the contents of your cc_config.xml file?

Is .xml the only extension on cc_config.xml? If you edited the file with Notepad, it may have added .txt to the end (and with Windows still hiding known extensions by default, you may never know!).
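For reference, a minimal cc_config.xml that sets only this option might look like the sketch below. It assumes no other options or log flags are needed; a file written out by BOINC Manager will contain many more lines.

```xml
<cc_config>
  <options>
    <!-- Ask the client to use every detected GPU; it must appear only once in the file -->
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>
```

The file belongs in the BOINC data directory and, as noted above, GPU detection happens only when the client itself starts.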

jrs Joined: 1 Feb 16 Posts: 3 Credit: 73,979,603 RAC: 127	Message 1968666 - Posted: 4 Dec 2018, 12:17:44 UTC - in response to Message 1968665. Last modified: 4 Dec 2018, 13:06:22 UTC

I have rebooted the computer several times. Is there any command that I can add when starting BOINC to force it?

My cc_config.xml is below. The extension is xml.

<cc_config>
  <log_flags>
    <file_xfer>1</file_xfer>
    <sched_ops>1</sched_ops>
    <task>1</task>
    <app_msg_receive>0</app_msg_receive>
    <app_msg_send>0</app_msg_send>
    <async_file_debug>0</async_file_debug>
    <benchmark_debug>0</benchmark_debug>
    <checkpoint_debug>0</checkpoint_debug>
    <coproc_debug>0</coproc_debug>
    <cpu_sched>0</cpu_sched>
    <cpu_sched_debug>0</cpu_sched_debug>
    <cpu_sched_status>0</cpu_sched_status>
    <dcf_debug>0</dcf_debug>
    <disk_usage_debug>0</disk_usage_debug>
    <file_xfer_debug>0</file_xfer_debug>
    <gui_rpc_debug>0</gui_rpc_debug>
    <heartbeat_debug>0</heartbeat_debug>
    <http_debug>0</http_debug>
    <http_xfer_debug>0</http_xfer_debug>
    <idle_detection_debug>0</idle_detection_debug>
    <mem_usage_debug>0</mem_usage_debug>
    <network_status_debug>0</network_status_debug>
    <notice_debug>0</notice_debug>
    <poll_debug>0</poll_debug>
    <priority_debug>0</priority_debug>
    <proxy_debug>0</proxy_debug>
    <rr_simulation>0</rr_simulation>
    <rrsim_detail>0</rrsim_detail>
    <sched_op_debug>0</sched_op_debug>
    <scrsave_debug>0</scrsave_debug>
    <slot_debug>0</slot_debug>
    <state_debug>0</state_debug>
    <statefile_debug>0</statefile_debug>
    <suspend_debug>0</suspend_debug>
    <task_debug>0</task_debug>
    <time_debug>0</time_debug>
    <trickle_debug>0</trickle_debug>
    <unparsed_xml>0</unparsed_xml>
    <work_fetch_debug>0</work_fetch_debug>
  </log_flags>
  <options>
    <use_all_gpus>1</use_all_gpus>
    <abort_jobs_on_exit>0</abort_jobs_on_exit>
    <allow_multiple_clients>0</allow_multiple_clients>
    <allow_remote_gui_rpc>0</allow_remote_gui_rpc>
    <disallow_attach>0</disallow_attach>
    <dont_check_file_sizes>0</dont_check_file_sizes>
    <dont_contact_ref_site>0</dont_contact_ref_site>
    <lower_client_priority>0</lower_client_priority>
    <dont_suspend_nci>0</dont_suspend_nci>
    <dont_use_vbox>0</dont_use_vbox>
    <dont_use_wsl>0</dont_use_wsl>
    <exit_after_finish>0</exit_after_finish>
    <exit_before_start>0</exit_before_start>
    <exit_when_idle>0</exit_when_idle>
    <fetch_minimal_work>0</fetch_minimal_work>
    <fetch_on_update>0</fetch_on_update>
    <force_auth>default</force_auth>
    <http_1_0>0</http_1_0>
    <http_transfer_timeout>300</http_transfer_timeout>
    <http_transfer_timeout_bps>10</http_transfer_timeout_bps>
    <max_event_log_lines>2000</max_event_log_lines>
    <max_file_xfers>8</max_file_xfers>
    <max_file_xfers_per_project>2</max_file_xfers_per_project>
    <max_stderr_file_size>0</max_stderr_file_size>
    <max_stdout_file_size>0</max_stdout_file_size>
    <max_tasks_reported>0</max_tasks_reported>
    <proxy_info>
      <socks_server_name></socks_server_name>
      <socks_server_port>80</socks_server_port>
      <http_server_name></http_server_name>
      <http_server_port>80</http_server_port>
      <socks5_user_name></socks5_user_name>
      <socks5_user_passwd></socks5_user_passwd>
      <socks5_remote_dns>0</socks5_remote_dns>
      <http_user_name></http_user_name>
      <http_user_passwd></http_user_passwd>
      <no_proxy></no_proxy>
      <no_autodetect>0</no_autodetect>
    </proxy_info>
    <rec_half_life_days>10.000000</rec_half_life_days>
    <report_results_immediately>0</report_results_immediately>
    <run_apps_manually>0</run_apps_manually>
    <save_stats_days>30</save_stats_days>
    <skip_cpu_benchmarks>0</skip_cpu_benchmarks>
    <simple_gui_only>0</simple_gui_only>
    <start_delay>0.000000</start_delay>
    <stderr_head>0</stderr_head>
    <suppress_net_info>0</suppress_net_info>
    <unsigned_apps_ok>0</unsigned_apps_ok>
    <use_all_gpus>0</use_all_gpus>
    <use_certs>0</use_certs>
    <use_certs_only>0</use_certs_only>
    <vbox_window>0</vbox_window>
  </options>
</cc_config>

Here is what I will test next. I have another computer with a 1060 GPU. After work I will take out the 1070 GPU and replace it with the other 1060 card. The problem computer will then have two 1060s. They are the same brand and type. This will force the startup to make some changes.

Richard

Jord Volunteer tester Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3	Message 1968694 - Posted: 4 Dec 2018, 22:10:00 UTC - in response to Message 1968666. Last modified: 4 Dec 2018, 22:16:55 UTC

You manually added <use_all_gpus>1</use_all_gpus> to your cc_config.xml file, but there is already one in there. Remove the line you added, scroll down and change the 0 on the later line to 1, then save the file. Each option line should be used only once in cc_config.xml.

A full cc_config.xml file is saved to the data directory when you add an exclusive app via the menus in BOINC Manager, or when you make a change in the Event Log options menu and save it. When you next edit that file and add anything to the top, BOINC will read the file, find your added line that switches something on, then continue down to the bottom and find the line that switches it off again. That is the problem you have.

Always check whether the line you want is already in the file. The options are in alphabetical order, so you can easily glance at the ones starting with U.

By default BOINC uses only the best GPU it detects; any lesser GPUs are not used. It decides this based on what it detects in the drivers for compute capability, software version, available memory and speed.
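Applied to the file posted earlier in the thread, the fix can be sketched as below. This is an abbreviated excerpt: most of the options are omitted for brevity, and the comments are added here for illustration only.

```xml
<cc_config>
  <options>
    <!-- Delete this manually added line... -->
    <use_all_gpus>1</use_all_gpus>
    <abort_jobs_on_exit>0</abort_jobs_on_exit>
    <!-- (other options omitted for brevity) -->
    <unsigned_apps_ok>0</unsigned_apps_ok>
    <!-- ...and change this existing line from 0 to 1 -->
    <use_all_gpus>0</use_all_gpus>
    <use_certs>0</use_certs>
  </options>
</cc_config>
```

After the edit there should be exactly one use_all_gpus line in the file, set to 1, followed by a full restart of the BOINC client.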

jrs Joined: 1 Feb 16 Posts: 3 Credit: 73,979,603 RAC: 127	Message 1968772 - Posted: 5 Dec 2018, 7:31:37 UTC - in response to Message 1968694. Last modified: 5 Dec 2018, 7:48:34 UTC

Hi. It is now working. Thank you for the help.

The two GPUs are now identical. I have replaced the 1070 with a 1060, so this computer now has two 1060s. I made a clean GPU installation with the same driver. BOINC found the cards without problems.

04.12.2018 20.15.50 | | CUDA: NVIDIA GPU 0: GeForce GTX 1060 3GB (driver version 417.22, CUDA version 10.0, compute capability 6.1, 3072MB, 2487MB available, 4111 GFLOPS peak)
04.12.2018 20.15.50 | | CUDA: NVIDIA GPU 1: GeForce GTX 1060 3GB (driver version 417.22, CUDA version 10.0, compute capability 6.1, 3072MB, 2487MB available, 4111 GFLOPS peak)
04.12.2018 20.15.50 | | OpenCL: NVIDIA GPU 0: GeForce GTX 1060 3GB (driver version 417.22, device version OpenCL 1.2 CUDA, 3072MB, 2487MB available, 4111 GFLOPS peak)
04.12.2018 20.15.50 | | OpenCL: NVIDIA GPU 1: GeForce GTX 1060 3GB (driver version 417.22, device version OpenCL 1.2 CUDA, 3072MB, 2487MB available, 4111 GFLOPS peak)

My next build is a second-hand GPU mining rig. I haven't purchased any GPUs yet, but I will make sure they are all the same. This motherboard has only one PCIe 3.0 slot, but 11 PCIe 2.0 slots. Since BOINC might drop GPUs that are not as fast, I will place an old GPU in the PCIe 3.0 slot and use identical GPUs in the 11 PCIe 2.0 slots. I guess BOINC will find these to be the same, and faster than the one I place in the PCIe 3.0 slot. This rig has a slow CPU, which I will replace with one that has more cores.

The mistake of having two <use_all_gpus> lines might have been the reason for my problem. I guess the last one, <use_all_gpus>0</use_all_gpus>, is the one BOINC uses. I will test it with the new rig.

Richard

©2024 University of California
