GPU task stuck - cannot process any more GPU work

Author	Message
David@home Volunteer tester Send message Joined: 16 Jan 03 Posts: 755 Credit: 5,040,916 RAC: 28	Message 1888888 - Posted: 9 Sep 2017, 20:39:30 UTC Many thanks. For the first run I used the mid-range card suggestions from the readme file. The first such GPU work unit completed OK: https://setiathome.berkeley.edu/result.php?result_name=10se08ad.30634.8661.7.34.162_1 Is there any way to check that those command line switches were actually read? I did click "read config files" in BOINC Manager, but I would like confirmation that something did change. When I have a few work units completed I will try your recommended command line, but again I would like a way to double-check that the settings are being applied, so I am comfortable I am doing this right. ID: 1888888 ·

Zalster Volunteer tester Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242	Message 1888890 - Posted: 9 Sep 2017, 20:43:14 UTC - in response to Message 1888888. You will see them in the stderr output of the returned work unit. But since your computers are hidden, only you can check to see if they are there. ID: 1888890 ·

Mike Volunteer tester Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80	Message 1888892 - Posted: 9 Sep 2017, 20:52:06 UTC You can check stderr.txt in your local slots folder before the task has finished. With each crime and every kindness we birth our future. ID: 1888892 ·

David@home Volunteer tester Send message Joined: 16 Jan 03 Posts: 755 Credit: 5,040,916 RAC: 28	Message 1888894 - Posted: 9 Sep 2017, 21:15:42 UTC Thanks Mike and Zalster, found a GPU work unit stderr.txt in my slots folder. It looks good:
Maximum single buffer size set to:192MB
SpikeFind FFT size threshold override set to:2048
TUNE: kernel 1 now has workgroup size of (64,1,4)
So, now to try the recommended cmd line switches from Keith. ID: 1888894 ·
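The check described above (open stderr.txt in a slots folder and look for the lines that echo the switches) can be sketched as a small script. This is only an illustration: the three patterns are taken from the stderr lines quoted in the post, the function names are made up for this example, and the `slots/<n>/stderr.txt` layout under the BOINC data directory is an assumption about a default install.

```python
import re
from pathlib import Path

# Confirmation lines as quoted in the post above; other switches may print
# different messages that are not covered here.
CONFIRMATION_PATTERNS = [
    re.compile(r"Maximum single buffer size set to:\s*\d+MB"),
    re.compile(r"SpikeFind FFT size threshold override set to:\s*\d+"),
    re.compile(r"TUNE: kernel \d+ now has workgroup size of \(\d+,\d+,\d+\)"),
]


def find_confirmations(stderr_text):
    """Return the lines of stderr_text that match a known confirmation pattern."""
    hits = []
    for line in stderr_text.splitlines():
        if any(pattern.search(line) for pattern in CONFIRMATION_PATTERNS):
            hits.append(line.strip())
    return hits


def scan_slots(boinc_data_dir):
    """Map each slots/<n>/stderr.txt path (assumed layout) to its confirmation lines."""
    results = {}
    for path in sorted(Path(boinc_data_dir).glob("slots/*/stderr.txt")):
        text = path.read_text(errors="replace")
        results[str(path)] = find_confirmations(text)
    return results
```

Calling `scan_slots(r"C:\ProgramData\BOINC")` while a GPU task is running would list, per slot, which of the three messages appeared. An empty list for a GPU slot would suggest the command line file was not picked up.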

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304	Message 1888904 - Posted: 9 Sep 2017, 21:43:12 UTC If it's a dedicated cruncher
-tt 1500 -hp -period_iterations_num 1 -high_perf -high_prec_timer -sbs 256 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 -cpu_lock
in mb_cmdline_win_x86_SSE3_OpenCL_NV_SoG.txt found in the project directory.
If it's not a dedicated cruncher, or the system becomes too sluggish, set tt to 600, remove -high_perf and see how responsive the system is then.
Also in the project directory I have an app_config.xml file
<app_config>
  <app>
    <name>setiathome_v8</name>
    <gpu_versions>
      <gpu_usage>1.00</gpu_usage>
      <cpu_usage>1.00</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
That reserves 1 CPU core for each GPU WU, but if you run out of GPU work, it will process CPU work again.
Grant
Darwin NT
ID: 1888904 ·
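To make the fallback in the post above concrete (set tt to 600 and remove -high_perf), here is a minimal sketch that derives the gentler command line from the aggressive one. The helper names are hypothetical and the parsing is a plain whitespace split; it assumes every switch starts with "-" and that values never do, which holds for the line quoted here but is not a statement about how the application itself parses the file.

```python
AGGRESSIVE = (
    "-tt 1500 -hp -period_iterations_num 1 -high_perf -high_prec_timer "
    "-sbs 256 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 "
    "-oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 "
    "-oclfft_tune_bn 64 -oclfft_tune_cw 64 -cpu_lock"
)


def parse_switches(cmdline):
    """Split a command line into (switch, [values]) pairs, preserving order."""
    switches = []
    for token in cmdline.split():
        if token.startswith("-") and not token[1:].isdigit():
            switches.append((token, []))      # a new switch
        elif switches:
            switches[-1][1].append(token)     # a value for the last switch
    return switches


def format_switches(switches):
    """Join (switch, [values]) pairs back into a single command line."""
    return " ".join(" ".join([name, *values]) for name, values in switches)


def make_responsive(cmdline, tt="600"):
    """Apply the fallback from the post: set -tt to 600 and drop -high_perf."""
    adjusted = []
    for name, values in parse_switches(cmdline):
        if name == "-high_perf":
            continue
        if name == "-tt":
            values = [tt]
        adjusted.append((name, values))
    return format_switches(adjusted)


if __name__ == "__main__":
    print(make_responsive(AGGRESSIVE))
```

The printed line is what would go into mb_cmdline_win_x86_SSE3_OpenCL_NV_SoG.txt on a machine that is also used interactively. All other switches are passed through unchanged.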

David@home Volunteer tester Send message Joined: 16 Jan 03 Posts: 755 Credit: 5,040,916 RAC: 28	Message 1888909 - Posted: 9 Sep 2017, 22:01:34 UTC - in response to Message 1888904. Seeing several odd warning messages in the stderr file now: WARNING: can't open binary kernel file for oclFFT plan: C:\ProgramData\BOINC/projects/setiathome.berkeley.edu\MB_clFFTplan_GeForceGTX750Ti_524288_gr256_lr16_wg256_tw0_ls512_bn32_cw32_r3557.bin_37653, continue with recompile... See this work unit for an example: https://setiathome.berkeley.edu/result.php?result_name=10oc08af.2724.81755.13.40.207_1 Is this a problem or can it be ignored? ID: 1888909 ·

Mike Volunteer tester Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80	Message 1888911 - Posted: 9 Sep 2017, 22:05:19 UTC No worries, that's just because some kernels are being created for the first time. It shouldn't happen on the next tasks. With each crime and every kindness we birth our future. ID: 1888911 ·

David@home Volunteer tester Send message Joined: 16 Jan 03 Posts: 755 Credit: 5,040,916 RAC: 28	Message 1888912 - Posted: 9 Sep 2017, 22:06:41 UTC - in response to Message 1888911.
"No worries, that's just because some kernels are being created for the first time. It shouldn't happen on the next tasks."
Good news, thanks. ID: 1888912 ·

David@home Volunteer tester Send message Joined: 16 Jan 03 Posts: 755 Credit: 5,040,916 RAC: 28	Message 1888914 - Posted: 9 Sep 2017, 22:11:22 UTC - in response to Message 1888904.
"If it's a dedicated cruncher
-tt 1500 -hp -period_iterations_num 1 -high_perf -high_prec_timer -sbs 256 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 -cpu_lock
in mb_cmdline_win_x86_SSE3_OpenCL_NV_SoG.txt found in the project directory.
If it's not a dedicated cruncher, or the system becomes too sluggish, set tt to 600, remove -high_perf and see how responsive the system is then.
Also in the project directory I have an app_config.xml file
<app_config>
  <app>
    <name>setiathome_v8</name>
    <gpu_versions>
      <gpu_usage>1.00</gpu_usage>
      <cpu_usage>1.00</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
That reserves 1 CPU core for each GPU WU, but if you run out of GPU work, it will process CPU work again."
Thanks Grant, I will try those tomorrow as it is getting late now. I have seen other posts about reserving a CPU core for the GPU, but have been puzzled by it. I assume the GPU work units need a helping hand from the CPU for certain tasks, but should we dedicate a whole CPU core to this, or let it context switch along with all the other Windows background tasks? Dedicating a CPU core sounds like losing some useful CPU work unit crunching time. ID: 1888914 ·

Mike Volunteer tester Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80	Message 1888916 - Posted: 9 Sep 2017, 22:20:34 UTC Last modified: 9 Sep 2017, 22:21:26 UTC The -tt switch will not help much on a 750 Ti. Just try different values with -period_iterations_num xx; 50 is the default. Decrease it to speed up, or increase it to get rid of screen lag. With each crime and every kindness we birth our future. ID: 1888916 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304	Message 1888917 - Posted: 9 Sep 2017, 22:34:23 UTC - in response to Message 1888916. Last modified: 9 Sep 2017, 22:36:13 UTC
"The -tt switch will not help much on a 750 Ti."
Probably depends on the system. When I tried SoG on my C2D (32-bit OS, 2 GTX 750 Tis and 4GB RAM), dropping -high_perf and changing tt to 600 made the system usable; prior to that, screen and input lag was 30-60 seconds. As it says in the sample settings: experimentation required.
It will be interesting to see how David's system does with aggressive settings. Plenty of cores, plenty of RAM and plenty of clock speed (compared to my poor old C2D) will more than likely offset such aggressive settings. Can only try and see how it goes.
David@home: "I assume the GPU work units need a helping hand from the CPU for certain tasks but should we dedicate a whole CPU to this or let it context switch along with all the other Windows background tasks? Dedicating a CPU sounds like losing some useful CPU workunit crunching time?"
Even with a lowly GTX 750 Ti, the loss of CPU processing is more than offset by the boost in GPU processing, and it's just 1 core that is used, unless you add more GPUs. And if you run out of GPU work, it will pick up CPU work till you get more GPU work.
It's an issue with the NVidia OpenCL implementation. AFAIK the OpenCL application on AMD cards doesn't take a CPU core to support each GPU WU being processed. However, even the Linux CUDA special application, which doesn't use much CPU time at all, gives its best output when you give it a whole CPU core to support it. The fact is, the faster you process the GPU work, the more CPU support is needed to keep it fed with WUs.
Grant
Darwin NT
ID: 1888917 ·
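The core-reservation arithmetic discussed above can be illustrated with a deliberately simplified model. This is a sketch, not a description of BOINC's actual scheduler: it assumes each running GPU task sets aside `cpu_usage` cores, that `gpu_usage` determines how many tasks share one GPU, and that whatever is left is available for CPU tasks. The function name and the 8-core example machine are hypothetical.

```python
import math


def cpu_tasks_alongside_gpu(cpu_cores, gpu_count, gpu_usage=1.0, cpu_usage=1.0):
    """Estimate how many CPU tasks can still run while the GPUs are busy.

    Simplified model: each GPU runs floor(1 / gpu_usage) tasks at once, and
    each of those tasks reserves cpu_usage cores.
    """
    gpu_tasks = gpu_count * math.floor(1.0 / gpu_usage)
    reserved_cores = gpu_tasks * cpu_usage
    return max(0, math.floor(cpu_cores - reserved_cores))


if __name__ == "__main__":
    # Hypothetical 8-core machine with one GPU and the app_config.xml values
    # from the thread (gpu_usage 1.00, cpu_usage 1.00).
    print(cpu_tasks_alongside_gpu(cpu_cores=8, gpu_count=1))
    # A dual-core machine with two GPUs, as in the C2D example.
    print(cpu_tasks_alongside_gpu(cpu_cores=2, gpu_count=2))
    # No GPU work available: all cores go back to CPU tasks.
    print(cpu_tasks_alongside_gpu(cpu_cores=8, gpu_count=0))
```

Under this model, the cost of the reservation is one CPU task per GPU, which is the trade-off weighed in the post: one core's worth of CPU work against the faster GPU throughput.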

Mike Volunteer tester Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80	Message 1888921 - Posted: 9 Sep 2017, 22:47:33 UTC I always make suggestions based on hardware, and I also know why it says that in the readme. We shouldn't confuse people more than necessary. I only make suggestions when helpful; otherwise I stay out of the discussion. With each crime and every kindness we birth our future. ID: 1888921 ·

David@home Volunteer tester Send message Joined: 16 Jan 03 Posts: 755 Credit: 5,040,916 RAC: 28	Message 1889268 - Posted: 11 Sep 2017, 16:39:36 UTC Hi All, Just a note to say many thanks to everybody who helped me get through the GPU tasks aborting issue and on to tuning my graphics card. I am amazed at the performance increase this has provided, and it no doubt helped a lot with the silver badge that I have now earned for my RAC. Many thanks. ID: 1889268 ·
