Message boards :
Number crunching :
Checking SOG command line again.
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Hiya, crunchers. Just wanted to check again to see if I am making the most of my GPUs. I have 2 GTX 1080 Ti cards with 11 GB VRAM and 1 Katana 1070 with 8 GB VRAM. Currently using the following command line for SOG:
-sbs 2048 -period_iterations_num 1 -tt 1500 -high_perf -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 512 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -pref_wg_num_per_cu 6 -oclfft_tune_cw 256 -high_prec_timer
Am I leaving any blood on the table, or has this already been properly tweaked to take best advantage of the cards, and especially the VRAM? Thanks for looking. Meow? EDIT: I am currently running 2 WUs per GPU, VRAM is in P0 state, I have enough CPU spared to service the GPU tasks, and I usually see 97-98% GPU usage. "Freedom is just Chaos, with better lighting." Alan Dean Foster
Mike Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80 |
-pref_wg_num_per_cu 6 seems a little bit much to me. Especially device 2, which seems to be a 1070, is producing some invalids. Otherwise it all looks OK to me. With each crime and every kindness we birth our future.
Ian&Steve C. Send message Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
Let me double up on this for a second: what about using the same cmdline values on 3GB 1060s, but with 1 WU per GPU instead of 2? Seti@Home classic workunits: 29,492 CPU time: 134,419 hours
Mike Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80 |
Let me double up on this for a second.
That's too much for a 1060. Try this one instead:
-sbs 1024 -period_iterations_num 10 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 512 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 256
With each crime and every kindness we birth our future.
Ian&Steve C. Send message Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
What specifically is too much about it? I'm currently running the following cmdline values on my Windows machines (4x 1060s between the 2 machines) for the MB WUs, and it seems to be working fine:
-hp -period_iterations_num 1 -high_perf -sbs 2048 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 32 -oclfft_tune_cw 32
Are these significantly less intense than what the OP posted? It looks pretty similar, minus a few commands. I really don't know what these commands are actually doing; this was just a combo of recommendations from previous posts and the MB readme file. My Linux machines with the "special sauce" app don't appear to even have the cmdline values set; is that not necessary with that app? I had them set previously, before the special app, but the mb_cmdline text file seems to be missing after I dropped the necessary "special app" files into the project folder. Strange. Seti@Home classic workunits: 29,492 CPU time: 134,419 hours
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
A period_iterations_num of 1 is best left for a 1070 Ti, 1080, 1080 Ti, or Titan; otherwise it causes significant screen lag. -sbs 2048 says you want to use 2 GB of your 3 GB to run the OpenCL buffer. The only problem is, you can't use that much for OpenCL on Nvidia cards: they are restricted by design to use, I think, 26% of the RAM for OpenCL applications. Now, if it were an ATI card, you could run anywhere from 40-67% of the RAM depending on the manufacturer. So 1024 is a better number for your 3GB card. Keith knows the exact numbers; I just know it's less than 30%.
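To put rough numbers on the allocation cap described above, here is a minimal Python sketch; the ~25% fraction and the card sizes are taken from this thread, and the helper name is made up for illustration. For what it's worth, OpenCL exposes the real per-device limit as `CL_DEVICE_MAX_MEM_ALLOC_SIZE` (via `clGetDeviceInfo`), which the spec only guarantees to be at least a quarter of global memory.

```python
# Sketch: estimate the largest single buffer a card is likely to grant,
# assuming the ~25% single-allocation cap for OpenCL on NVIDIA
# discussed in this thread. max_sbs_mb is a hypothetical helper,
# not part of the SOG application.

def max_sbs_mb(vram_mb, alloc_fraction=0.25):
    """Largest single buffer (MB) the driver is likely to grant."""
    return int(vram_mb * alloc_fraction)

for card, vram in [("GTX 1060 3GB", 3072),
                   ("GTX 1070 8GB", 8192),
                   ("GTX 1080 Ti 11GB", 11264)]:
    print(card, "->", max_sbs_mb(vram), "MB usable per buffer")
```

Under that assumption, a 3 GB card tops out around 768 MB per buffer, which is why -sbs 1024 is already more than the driver will honour there.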
Ian&Steve C. Send message Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
Thanks. period_iterations_num 1 is fine; they are just crunchers. They sit crunching 95% of the time and only really get used when my dad or I go poking around to see how they are doing. And ah, thanks for the update about the sbs parameter. Is having the value too high causing any kind of real performance detriment? I mean, it's asking for 2 GB but only getting ~900 MB (according to GPU-Z); does that really matter? Seti@Home classic workunits: 29,492 CPU time: 134,419 hours
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
It will only matter if you are running more than 1 work unit at a time, i.e., if you ran 2 at a time at 1024 each, you could run out of memory. The reason we set it lower is that some people run more than 1 at a time, and we don't want to run out of memory. Otherwise a task will pause and wait for free memory before running.
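Zalster's caution about multiple simultaneous work units can be sketched as a quick budget check, treating the ~25% figure from this thread as a shared per-card budget, as his example implies. The helper names are hypothetical, not part of the SOG app:

```python
# Sketch: do N simultaneous work units, each asking for sbs_mb for its
# main buffer, fit under an assumed per-card OpenCL budget?

def fits_in_budget(sbs_mb, tasks_per_gpu, vram_mb, alloc_fraction=0.25):
    budget_mb = vram_mb * alloc_fraction
    return sbs_mb * tasks_per_gpu <= budget_mb

# 2 tasks at -sbs 1024 on a 3 GB card: 2048 MB requested vs ~768 MB budget
print(fits_in_budget(1024, 2, 3072))   # False -> a task would stall waiting for memory
```

This is why the recommended value is set conservatively: a single task at 1024 may squeak by after the app resizes it, but two at once clearly oversubscribe the card.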
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Zalster has the correct figure for OpenCL applications, or close enough: 25% is the limit. So setting the sbs figure to 2048 for a 3GB card doesn't work, and the application will size it down accordingly in use anyway. Look at the stderr.txt output and you will see the task actually used 781 MB of memory on a 3GB card, or 1536 MB on a 6GB 1060 card. Best to set the parameter within spec to begin with, rather than have the application jump through hoops to arrive at the correct amount. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours A proud member of the OFA (Old Farts Association)
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
The app will use only what it considers necessary. Unless other options insist, it will not use all of the GPU RAM allowed by -sbs N. The most memory-consuming place is the unroll inside PulseFind, and memory usage there depends on the number of CUs the GPU has, the number of workgroups (WG) the user asked for, and the number of threads in a single WG. In the early days the app used all allowable memory, and that caused speed degradation due to misconfiguration; now it's harder to achieve the same. SETI apps news We're not gonna fight them. We're gonna transcend them.
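Raistmer's description of how the PulseFind unroll scales can be sketched as a rough product of the three factors he names. The per-thread buffer size below is a placeholder assumption, not the app's real figure, and the CU count for a 1060 is likewise assumed:

```python
# Rough sketch of how unroll memory could scale with the factors
# Raistmer names: compute units (CUs), workgroups requested per CU
# (cf. -pref_wg_num_per_cu), and threads per workgroup. bytes_per_thread
# is a made-up placeholder, not the SOG app's real buffer size.

def unroll_mem_mb(compute_units, wg_per_cu, threads_per_wg, bytes_per_thread=4096):
    total_bytes = compute_units * wg_per_cu * threads_per_wg * bytes_per_thread
    return total_bytes / (1024 * 1024)

# e.g. an assumed 10-CU card with -pref_wg_num_per_cu 6 and 64-thread workgroups
print(unroll_mem_mb(10, 6, 64))
```

The point of the sketch is just the shape of the dependency: raising -pref_wg_num_per_cu multiplies the unroll footprint, which is consistent with Mike's earlier remark that 6 workgroups per CU is a lot for a smaller card.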
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.