OpenCL NV MultiBeam v8 SoG edition for Windows
Author | Message |
---|---|
Sleepy Joined: 21 May 99 Posts: 219 Credit: 98,947,784 RAC: 28,360 |
I am also testing the SoG version. Thank you for your efforts, Raistmer! I am combining it with Franceschini's DLLs. Analysing the data (and comparing, where possible, with the crunching times of wingmates) is very difficult, because of the erratic credit system combined with the different behaviour of SoG on WUs with different AR. Still, it seems (and the data confirm) that there are WUs that get crunched faster than with the other software versions, and others with big improvements. It is difficult for me to draw any reliable conclusion at the moment. Anyhow, I have two crunchers involved. The one with a 660 seems even more stable than with AP crunching and with the Lunatics version. The other cruncher, with a 650 Ti, often experiences driver resets. Of course, I am trying to play with the mb_cmdline.txt file, but no joy here so far either. My latest command line for this cruncher is: -sbs 128 -spike_fft_thresh 2048 -tune 1 16 1 2 -period_iterations_num 60 -total_GPU_instances_num 2 but I am not at all convinced that this is any good. I got here just by changing parameters, without yet having found the correct combination, so I cannot even regard it as a starting point. I have read the documentation, but it is still not really clear to me when increasing a parameter makes life easier for the GPU and when it works the opposite way. It seems some parameters are lighter when high (e.g. iterations, if I am not wrong) and some are lighter when low (e.g. sbs). I would be very grateful for a more detailed explanation of this aspect, or for some links pointing to one (other than the readme included in the distribution package, which I have already read many times and which still leaves me with some doubts). Thank you in advance! Sleepy |
Mike Joined: 17 Feb 01 Posts: 34382 Credit: 79,922,639 RAC: 80 |
I have read the documentation, but it is still not really clear to me when increasing a parameter eases life to the GPU and when in works opposite. Hello Sleepy. It depends; those params are GPU (host) dependent. We don't have much experience with Nvidia cards. I have tested a couple of NV cards, but of course not enough. Those results are documented in the readme file. With each crime and every kindness we birth our future. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Well, -period_iterations_num N divides a single call into N separate calls, so it should "make life easier" for the GPU. With -sbs N the situation is more complex. It defines the maximum size of the memory buffer that will be used to unroll periods in PulseFind. The more unrolling, the more items can be used in a single workgroup (especially important for low FFT values, which most often occur in VLAR tasks). On the other hand, too much unrolling could decrease the number of separate workgroups, leaving some CUs idle. To investigate real configs more deeply, I added a -v 8 switch in the new builds (I hope Eric will deploy them on Beta soon, maybe even today). Running with -v 8 prints PulseFind configs to stderr, so one can see how -sbs N affects the PulseFind geometry and whether the currently used heuristic works OK. New builds are available as standalone packs here: https://cloud.mail.ru/public/DMkN/x4BRCYuAV |
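The trade-off described above can be checked by hand from a single -v 8 line. The following is a minimal Python sketch, not part of the app: it takes one geometry line copied from the stderr output posted later in this thread, derives the workgroup count as NDRange divided by WG size in each dimension, and estimates how evenly those workgroups spread over the CUs. The function names are hypothetical, and the "one workgroup per CU at a time" model is a deliberate simplification (real GPUs usually keep several workgroups resident per CU), so treat the result as a rough indicator only.

```python
import math
import re

def parse_geometry(line):
    """Extract NDRange, WG size and CU count from one '-v 8' style line."""
    ndrange = re.search(r"NDRange=\{(\d+),(\d+),(\d+)\}", line)
    wg = re.search(r"WG=\{(\d+),(\d+),(\d+)\}", line)
    cu = re.search(r"CU num=(\d+)", line)
    return (
        tuple(int(x) for x in ndrange.groups()),
        tuple(int(x) for x in wg.groups()),
        int(cu.group(1)),
    )

def workgroup_count(ndrange, wg):
    """Number of workgroups = product over dimensions of global size / local size."""
    return math.prod(g // l for g, l in zip(ndrange, wg))

def occupancy(wg_num, cu_num):
    """Rounds needed and CUs busy in the last round.

    ASSUMPTION: each CU runs exactly one workgroup at a time.
    """
    rounds = math.ceil(wg_num / cu_num)
    busy_in_last_round = wg_num - (rounds - 1) * cu_num
    return rounds, busy_in_last_round

# Sample line copied from the stderr output quoted in this thread.
SAMPLE = ("pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},"
          "single_period_size=1.6MB, WG num=51, CU num=16")

ndrange, wg, cu_num = parse_geometry(SAMPLE)
wg_num = workgroup_count(ndrange, wg)
rounds, busy = occupancy(wg_num, cu_num)
print(f"workgroups={wg_num}, rounds={rounds}, CUs busy in last round={busy}/{cu_num}")
```

For the sample line the computed count matches the logged "WG num=51". Under the simplified model, the last round would keep only 3 of the 16 CUs busy, which is the kind of idle-CU effect that a different -sbs value might change.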
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Are there any big differences/changes in this new build which would make it necessary for me to upgrade from SoG Build 3366? No. Just an attempt at better auto-handling of low-end devices, and the -v 8 option to help with PulseFind tuning. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"Together with" doesn't mean "because of". -sbs 384 was one of your first tasks with this app so binary cache of kernels formed. When you switched to another sbs value binary cache was already formed so no new warnings received. mb_cmdline_win_x86_SSE3_OpenCL_NV.txt “current†-no_cpu_lock -sbs 192 -instances_per_device 3 -period_iterations_num 40 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 16 -oclfft_tune_cw 16 Looks OK. NV build doesn't have CPUlock enabled by default so -no_cpu_lock excessive in this case but should do no harm. |
Sleepy Joined: 21 May 99 Posts: 219 Credit: 98,947,784 RAC: 28,360 |
Thank you, Raistmer. It really seems there is no fixed rule for getting the recipe right. I will experiment a bit more to find the right balance. Thank you again! Stefano |
Joe Januzzi Joined: 13 Apr 03 Posts: 54 Credit: 307,134,110 RAC: 492 |
Raistmer, Thanks so much for all the great work and support that you do for SETI. NV build doesn't have CPUlock enabled by default so -no_cpu_lock excessive in this case but should do no harm. I removed the -no_cpu_lock from the file. Thanks Joe Real Join Date: Joe Januzzi (ID 253343) 29 Sep 1999, 22:30:36 UTC Try to learn something new everyday. |
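Dropping a redundant switch from the command-line file can also be done programmatically. The sketch below is only an illustration in Python: the option string is the one quoted earlier in this thread, while the helper names and the rule "a token starting with '-' followed by a letter begins a new option" are assumptions made for this example, not documented behaviour of the app.

```python
def parse_cmdline(text):
    """Group tokens into {option: [values]}.

    ASSUMPTION: a token that starts with '-' followed by a letter is an
    option name; every other token is a value of the preceding option.
    """
    options = {}
    current = None
    for token in text.split():
        if token.startswith("-") and len(token) > 1 and token[1].isalpha():
            current = token
            options[current] = []
        elif current is not None:
            options[current].append(token)
    return options

def format_cmdline(options):
    """Rebuild a single-line command string, keeping the original option order."""
    return " ".join(" ".join([name, *values]) for name, values in options.items())

# Option string as quoted in this thread.
CURRENT = (
    "-no_cpu_lock -sbs 192 -instances_per_device 3 -period_iterations_num 40 "
    "-spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 "
    "-oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 16 -oclfft_tune_cw 16"
)

options = parse_cmdline(CURRENT)
options.pop("-no_cpu_lock", None)  # redundant on the NV build, per the reply above
cleaned = format_cmdline(options)
print(cleaned)
```

Keeping the options in a dictionary also makes it easy to change one value at a time (for example only -sbs) between test runs, so that any change in behaviour can be attributed to a single parameter.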
Joe Januzzi Joined: 13 Apr 03 Posts: 54 Credit: 307,134,110 RAC: 492 |
Well, -period_iterations_num N divides single call to N separate calls so should "make life more easy" for GPU. FYI, I ran with the -v 8 switch. CPU usage is higher both with and without the switch. Here are some WUs from before I turned off the -v 8 switch. I think I will go back to Rev. #3366 if I can't tune Rev. #3381. I hope the data will give you some good info, because this is over my head. As always, I am very grateful for any help. Thanks Joe http://setiathome.berkeley.edu/workunit.php?wuid=2074646207 http://setiathome.berkeley.edu/workunit.php?wuid=2074646670 http://setiathome.berkeley.edu/workunit.php?wuid=2074646844 without switch: http://setiathome.berkeley.edu/workunit.php?wuid=2074646604 http://setiathome.berkeley.edu/workunit.php?wuid=2074646641 http://setiathome.berkeley.edu/workunit.php?wuid=2074646616 mb_cmdline_win_x86_SSE3_OpenCL_NV.txt -sbs 256 -instances_per_device 3 -period_iterations_num 40 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 16 -oclfft_tune_cw 16 -v 8 17mr10af.25361.3343.4.31.218_1 http://setiathome.berkeley.edu/result.php?resultid=4751543823 Stderr output <core_client_version>7.4.42</core_client_version> <![CDATA[ <stderr_txt> , pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 3; PulsePoTLen=480
NDRange={64,17,1}, WG={32,1,1},single_period_size=2.7MB, WG num=34, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 4; PulsePoTLen=480 NDRange={64,17,1}, WG={32,1,1},single_period_size=2MB, WG num=34, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 5; PulsePoTLen=480 NDRange={64,17,1}, WG={32,1,1},single_period_size=1.6MB, WG num=34, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 3; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=2.7MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 4; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=2MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 5; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=1.6MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 3; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=2.7MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 4; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=2MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 5; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=1.6MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 3; WG size doesn't fill wave starting 
PC_find_pulse_kernel_cl, pass 3; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=2.7MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 4; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=2MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 5; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=1.6MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 3; PulsePoTLen=480 NDRange={64,17,1}, WG={32,1,1},single_period_size=2.7MB, WG num=34, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 4; PulsePoTLen=480 NDRange={64,17,1}, WG={32,1,1},single_period_size=2MB, WG num=34, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 5; PulsePoTLen=480 NDRange={64,17,1}, WG={32,1,1},single_period_size=1.6MB, WG num=34, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 3; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=2.7MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 4; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=2MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 5; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=1.6MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 3; WG size 
doesn't fill wave starting PC_find_pulse_kernel_cl, pass 3; PulsePoTLen=480 NDRange={64,17,1}, WG={32,1,1},single_period_size=2.7MB, WG num=34, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 4; PulsePoTLen=480 NDRange={64,17,1}, WG={32,1,1},single_period_size=2MB, WG num=34, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 5; PulsePoTLen=480 NDRange={64,17,1}, WG={32,1,1},single_period_size=1.6MB, WG num=34, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 3; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=2.7MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 4; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=2MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 5; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=1.6MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 3; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=2.7MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 4; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=2MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 5; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=1.6MB, WG num=68, CU num=16 WARNING: suboptimal config for 
PC_find_pulse_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 3; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=2.7MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 4; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=2MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 5; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=1.6MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; 
PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, 
WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: 
suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size 
doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; 
PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, 
CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG 
size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; 
PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU 
num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; 
WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, 
pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, 
WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: 
suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 3; PulsePoTLen=480 NDRange={64,17,1}, WG={32,1,1},single_period_size=2.7MB, WG num=34, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 4; PulsePoTLen=480 NDRange={64,17,1}, WG={32,1,1},single_period_size=2MB, WG num=34, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 5; PulsePoTLen=480 NDRange={64,17,1}, WG={32,1,1},single_period_size=1.6MB, WG num=34, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 3; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=2.7MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 4; PulsePoTLen=240 NDRange={128,17,1}, WG={32,1,1},single_period_size=2MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 5; PulsePoTLen=240 
NDRange={128,17,1}, WG={32,1,1},single_period_size=1.6MB, WG num=68, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU 
num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl 
pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting 
PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 
NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 3; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2.7MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU 
num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size 
doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 
NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: 
suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave 
starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 4; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=2MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 
NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU 
num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl 
pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_partial_kernel_cl pass 5; WG size doesn't fill wave starting 
PC_find_pulse_partial_kernel_cl, pass 5; PulsePoTLen=960 NDRange={32,17,3}, WG={32,1,1},single_period_size=1.6MB, WG num=51, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 3; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 3; PulsePoTLen=480 NDRange={64,17,1}, WG={32,1,1},single_period_size=2.7MB, WG num=34, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 4; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 4; PulsePoTLen=480 NDRange={64,17,1}, WG={32,1,1},single_period_size=2MB, WG num=34, CU num=16 WARNING: suboptimal config for PC_find_pulse_kernel_cl pass 5; WG size doesn't fill wave starting PC_find_pulse_kernel_cl, pass 5; PulsePoTLen=480 NDRange={64,17,1}, WG={32,1,1},single_period_size=1.6MB, WG num=34, CU num=16 WARNING: suboptimal config f Real Join Date: Joe Januzzi (ID 253343) 29 Sep 1999, 22:30:36 UTC Try to learn something new everyday. |
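For anyone reading the warnings above, the "WG num" figure can be reproduced from the NDRange and WG values printed in the same line. The sketch below does that arithmetic in Python. The second helper is only an assumption about what an uneven fit might look like: it supposes work-groups are handed out to compute units `CU num` at a time, which the thread does not confirm, and it is not the app's own check for "doesn't fill wave".

```python
from math import prod


def work_group_count(ndrange, wg):
    """Work-groups launched: NDRange divided by WG size, per dimension."""
    return prod(n // w for n, w in zip(ndrange, wg))


def last_round_fill(wg_num, cu_num):
    """Fraction of compute units busy in the final round, ASSUMING
    work-groups are scheduled cu_num at a time (a simplification)."""
    remainder = wg_num % cu_num
    return 1.0 if remainder == 0 else remainder / cu_num


# Values copied from the log lines above.
partial = work_group_count((32, 17, 3), (32, 1, 1))  # PC_find_pulse_partial_kernel_cl
full = work_group_count((64, 17, 1), (32, 1, 1))     # PC_find_pulse_kernel_cl

print(partial, full)                  # matches "WG num=51" and "WG num=34"
print(last_round_fill(partial, 16))   # CU num=16 in the log
print(last_round_fill(full, 16))
```

Under that assumption, neither 51 nor 34 work-groups divides evenly across 16 compute units, so the last round would leave most of them idle.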
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Joe, is there a new kernel in this version? Edit: Raistmer, what's the difference between your non-SoG version and the SoG version? |
Joe Januzzi Send message Joined: 13 Apr 03 Posts: 54 Credit: 307,134,110 RAC: 492 |
Joe, is there a new kernel in this version? Zalster, thanks for pointing that out; it should be MultiBeam_Kernels_r3381.cl. I'll update my file. Joe Real Join Date: Joe Januzzi (ID 253343) 29 Sep 1999, 22:30:36 UTC Try to learn something new everyday. |
Joe Januzzi Send message Joined: 13 Apr 03 Posts: 54 Credit: 307,134,110 RAC: 492 |
The right Kernel (MultiBeam_Kernels_r3381.cl) seem to lower my CPU usage:) I think I'll do the -v 8 switch test over. Thanks again Zalster. Joe app_info.xml <app_info> <app> <name>setiathome_v7</name> </app> <file_info> <name>AKv8c_r2549_winx86-64_AVXxjfs.exe</name> <executable/> </file_info> <file_info> <name>libfftw3f-3-3-4_x64.dll</name> <executable/> </file_info> <file_info> <name>cmdline_AKv8c_r2549_winx86-64_AVXxjfs.txt</name> </file_info> <app_version> <app_name>setiathome_v7</app_name> <version_num>700</version_num> <platform>windows_intelx86</platform> <file_ref> <file_name>AKv8c_r2549_winx86-64_AVXxjfs.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>cmdline_AKv8c_r2549_winx86-64_AVXxjfs.txt</file_name> <open_name>mb_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>setiathome_v7</app_name> <version_num>700</version_num> <platform>windows_x86_64</platform> <file_ref> <file_name>AKv8c_r2549_winx86-64_AVXxjfs.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>cmdline_AKv8c_r2549_winx86-64_AVXxjfs.txt</file_name> <open_name>mb_cmdline.txt</open_name> </file_ref> </app_version> <app> <name>astropulse_v7</name> </app> <file_info> <name>AP7_win_x64_AVX_CPU_r2692.exe</name> <executable/> </file_info> <file_info> <name>libfftw3f-3-3-4_x64.dll</name> <executable/> </file_info> <file_info> <name>ap_cmdline_win_x64_AVX_CPU.txt</name> </file_info> <app_version> <app_name>astropulse_v7</app_name> <version_num>703</version_num> <platform>windows_x86_64</platform> <plan_class>sse2</plan_class> <cmdline></cmdline> <file_ref> <file_name>AP7_win_x64_AVX_CPU_r2692.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x64_AVX_CPU.txt</file_name> <open_name>ap_cmdline.txt</open_name> 
</file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>703</version_num> <platform>windows_intelx86</platform> <plan_class>sse2</plan_class> <cmdline></cmdline> <file_ref> <file_name>AP7_win_x64_AVX_CPU_r2692.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x64_AVX_CPU.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>703</version_num> <platform>windows_x86_64</platform> <plan_class>sse</plan_class> <cmdline></cmdline> <file_ref> <file_name>AP7_win_x64_AVX_CPU_r2692.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x64_AVX_CPU.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>703</version_num> <platform>windows_intelx86</platform> <plan_class>sse</plan_class> <cmdline></cmdline> <file_ref> <file_name>AP7_win_x64_AVX_CPU_r2692.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x64_AVX_CPU.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>701</version_num> <platform>windows_x86_64</platform> <cmdline></cmdline> <file_ref> <file_name>AP7_win_x64_AVX_CPU_r2692.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x64_AVX_CPU.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>701</version_num> <platform>windows_intelx86</platform> <cmdline></cmdline> <file_ref> 
<file_name>AP7_win_x64_AVX_CPU_r2692.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x64_AVX_CPU.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>700</version_num> <platform>windows_x86_64</platform> <cmdline></cmdline> <file_ref> <file_name>AP7_win_x64_AVX_CPU_r2692.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x64_AVX_CPU.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>700</version_num> <platform>windows_intelx86</platform> <cmdline></cmdline> <file_ref> <file_name>AP7_win_x64_AVX_CPU_r2692.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x64_AVX_CPU.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app> <name>astropulse_v7</name> </app> <file_info> <name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</name> <executable/> </file_info> <file_info> <name>libfftw3f-3-3-4_x86.dll</name> <executable/> </file_info> <file_info> <name>AstroPulse_Kernels_r2887.cl</name> </file_info> <file_info> <name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</name> </file_info> <app_version> <app_name>astropulse_v7</app_name> <version_num>710</version_num> <platform>windows_x86_64</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>opencl_nvidia_100</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> 
<file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>710</version_num> <platform>windows_x86_64</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>cuda_opencl_100</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>710</version_num> <platform>windows_x86_64</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>opencl_nvidia_cc1</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>710</version_num> <platform>windows_x86_64</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>cuda_opencl_cc1</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> 
<file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>705</version_num> <platform>windows_x86_64</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>opencl_nvidia_100</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>705</version_num> <platform>windows_x86_64</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>cuda_opencl_100</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>705</version_num> <platform>windows_x86_64</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>opencl_nvidia_cc1</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> 
<file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>705</version_num> <platform>windows_x86_64</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>cuda_opencl_cc1</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>710</version_num> <platform>windows_intelx86</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>opencl_nvidia_100</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>710</version_num> <platform>windows_intelx86</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>cuda_opencl_100</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> 
</file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>710</version_num> <platform>windows_intelx86</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>opencl_nvidia_cc1</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>710</version_num> <platform>windows_intelx86</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>cuda_opencl_cc1</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>705</version_num> <platform>windows_intelx86</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>opencl_nvidia_100</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> 
<file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>705</version_num> <platform>windows_intelx86</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>cuda_opencl_100</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>705</version_num> <platform>windows_intelx86</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>opencl_nvidia_cc1</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>astropulse_v7</app_name> <version_num>705</version_num> <platform>windows_intelx86</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> <plan_class>cuda_opencl_cc1</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>AP7_win_x86_SSE2_OpenCL_NV_r2887.exe</file_name> <main_program/> </file_ref> 
<file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>AstroPulse_Kernels_r2887.cl</file_name> </file_ref> <file_ref> <file_name>ap_cmdline_win_x86_SSE2_OpenCL_NV.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app> <name>setiathome_v8</name> </app> <file_info> <name>MB8_win_x64_AVX_VS2010_r3330.exe</name> <executable/> </file_info> <file_info> <name>libfftw3f-3-3-4_x64.dll</name> <executable/> </file_info> <file_info> <name>mb_cmdline_win_x64_AVX_VS2010.txt</name> </file_info> <app_version> <app_name>setiathome_v8</app_name> <version_num>800</version_num> <platform>windows_x86_64</platform> <api_version>7.5.0</api_version> <file_ref> <file_name>MB8_win_x64_AVX_VS2010_r3330.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>mb_cmdline_win_x64_AVX_VS2010.txt</file_name> <open_name>mb_cmdline.txt</open_name> </file_ref> </app_version> <app_version> <app_name>setiathome_v8</app_name> <version_num>800</version_num> <platform>windows_intelx86</platform> <api_version>7.5.0</api_version> <file_ref> <file_name>MB8_win_x64_AVX_VS2010_r3330.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x64.dll</file_name> </file_ref> <file_ref> <file_name>mb_cmdline_win_x64_AVX_VS2010.txt</file_name> <open_name>mb_cmdline.txt</open_name> </file_ref> </app_version> <app> <name>setiathome_v8</name> </app> <file_info> <name>MB8_win_x86_SSE3_OpenCL_NV_r3381_SoG.exe</name> <executable/> </file_info> <file_info> <name>libfftw3f-3-3-4_x86.dll</name> <executable/> </file_info> <file_info> <name>MultiBeam_Kernels_r3381.cl</name> </file_info> <file_info> <name>mb_cmdline_win_x86_SSE3_OpenCL_NV.txt</name> </file_info> <app_version> <app_name>setiathome_v8</app_name> <version_num>800</version_num> <platform>windows_intelx86</platform> <avg_ncpus>0.04</avg_ncpus> <max_ncpus>0.2</max_ncpus> 
<plan_class>opencl_nvidia_SoG</plan_class> <cmdline></cmdline> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>MB8_win_x86_SSE3_OpenCL_NV_r3381_SoG.exe</file_name> <main_program/> </file_ref> <file_ref> <file_name>libfftw3f-3-3-4_x86.dll</file_name> </file_ref> <file_ref> <file_name>MultiBeam_Kernels_r3381.cl</file_name> </file_ref> <file_ref> <file_name>mb_cmdline_win_x86_SSE3_OpenCL_NV.txt</file_name> <open_name>mb_cmdline.txt</open_name> </file_ref> </app_version> </app_info> Real Join Date: Joe Januzzi (ID 253343) 29 Sep 1999, 22:30:36 UTC Try to learn something new everyday. |
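A long hand-edited app_info.xml like the one above is easy to get subtly wrong, so a quick consistency check can help before restarting BOINC. The sketch below is a hypothetical helper, not a BOINC tool: `check_app_info` and the sample file names are made up for illustration. It assumes, following the pattern of the posted file, that every `<file_ref>` belongs inside an `<app_version>` and names a file declared in a top-level `<file_info>`.

```python
import xml.etree.ElementTree as ET


def check_app_info(xml_text):
    """Return a list of human-readable problems found in an app_info.xml string."""
    root = ET.fromstring(xml_text)
    declared = {fi.findtext("name", "").strip() for fi in root.findall("file_info")}
    problems = []
    # Assumption: <file_ref> is only expected inside <app_version>.
    for ref in root.findall("file_ref"):
        problems.append("stray file_ref: " + ref.findtext("file_name", "").strip())
    # Every referenced file should have a matching <file_info> declaration.
    for version in root.findall("app_version"):
        for ref in version.findall("file_ref"):
            name = ref.findtext("file_name", "").strip()
            if name not in declared:
                problems.append("undeclared file: " + name)
    return problems


# Hypothetical, cut-down sample; the file names are placeholders.
SAMPLE = """
<app_info>
  <app><name>example_app</name></app>
  <file_info><name>app.exe</name><executable/></file_info>
  <file_ref><file_name>kernels.cl</file_name></file_ref>
  <app_version>
    <app_name>example_app</app_name>
    <file_ref><file_name>app.exe</file_name><main_program/></file_ref>
    <file_ref><file_name>kernels.cl</file_name></file_ref>
  </app_version>
</app_info>
"""

for problem in check_app_info(SAMPLE):
    print(problem)
```

To check a real file, read it from disk and pass the contents in, e.g. `check_app_info(open("app_info.xml").read())`.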
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Joe, I'm seeing an interesting (or concerning) pattern developing. Let me know if you see it too. The good part first: the amount of CPU required is similar to CUDA to begin with. Now the concerning part: after about 75% of the computation is done on a work unit, I begin to see an increase in the total number of CPU cores used until the work unit ends. After the work unit ends, total CPU usage drops down again. Not a big thing unless you start all the work units at the same time; if they are similar in run time, they all finish at about the same time, causing a high level of total CPU usage. In my case, after I restarted the work units and they finished, the system went to 100% of all cores for about 3 minutes. As the work units finished and dropped off, it backed off to about 35% of all cores. This made the computer sluggish and unresponsive. Given time, the individual work units will not finish together, so completion will be staggered and this spike in total CPU usage should lessen. However, when first starting the system back up, there might well be lock-ups for a few minutes (for example, coming back from Tuesday maintenance once new work is downloaded). If I could stagger the start times of the work units, this occurrence could be lessened, though the law of averages says eventually you will get several reaching completion at the same time (even a broken clock is right twice a day). I'll keep an eye on them and see how it goes. Hopefully I won't get a lock-up or crash. APs seem to do OK with them. Edit: Not sure yet, but it looks like they might be 2-3 minutes slower than the 3366 build. |
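The staggering idea can be illustrated with a toy model. Everything in the sketch below is a made-up assumption for illustration, not a measurement: each task is taken to use 10% of a core until it reaches 75% progress and 100% of a core afterwards, all tasks run for the same 100 time units, and there are 8 tasks.

```python
def task_cpu_percent(progress_percent, low=10, high=100, knee=75):
    """CPU use of one task (percent of one core) at a given progress.
    The low/high/knee values are hypothetical."""
    return high if progress_percent >= knee else low


def peak_total_cpu(start_offsets, duration=100):
    """Highest summed CPU use over time for tasks started at the given offsets."""
    peak = 0
    for t in range(max(start_offsets) + duration):
        total = 0
        for start in start_offsets:
            if start <= t < start + duration:
                progress = (t - start) * 100 // duration
                total += task_cpu_percent(progress)
        peak = max(peak, total)
    return peak


synced = peak_total_cpu([0] * 8)                       # all 8 tasks start together
staggered = peak_total_cpu([i * 25 for i in range(8)]) # starts spread 25 units apart

print(synced, staggered)
```

In this model the synchronized start peaks at eight fully loaded cores, while the staggered start never has more than one task in its heavy phase at a time.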
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Yes, because -v 8 means more verbose output, it slows down processing (and increases CPU usage, because the CPU is responsible for processing fprintf()). This option is only for debugging and for tuning the -sbs N value. When tuning ends, it should be switched off or removed.
It did, thanks. More work required in this area and your logs show that. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
The difference is the same as before: the SoG build uses the SoG path, and the non-SoG build doesn't. The reason I still provide both is to allow a speed comparison on a wider range of hosts (on beta) before release. For now it seems SoG is faster, especially on NV, but I want to be more sure. If SoG is slower on some configs, we need to find those configs and devise a plan class that would exclude such hosts from app distribution. Better to do that before release if possible, because BOINC's natural selection mechanism sometimes doesn't work right. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
There is some uncertainty about when the CPU usage increases. "They finished" followed by a CPU load increase could mean that the increase happens on new task start, not at task end. On task start there is indeed some CPU-only time while the app's state initialization occurs. But as a task progresses (and this holds true after 75% of progress), non-SoG searches constitute less and less of the workload (which is why the SoG path is so non-linear in its % done reporting), so the SoG app communicates with the CPU less and less. Usually that's when CPU load in the SoG build drops a lot compared to the CPU usage of the non-SoG version. But you report a CPU usage increase in that area? It's worth checking more carefully how the CPU usage increase correlates with task progress stages, because that would be valuable info. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Heh, it took over 6000 valid WU's before I got the first invalid with SoG. However I doubt it is a real invalid, just a difference between the apps, when it comes to the order they find the signals: To preserve available info on this case: SoG: Pulse: peak=3.108338, time=13.45, period=0.9503, d_freq=1420006799.68, score=1.022, chirp=-3.5473, fft_len=512 Pulse: peak=6.418058, time=73.96, period=2.602, d_freq=1420005226.29, score=1.009, chirp=5.6759, fft_len=128 Pulse: peak=5.558758, time=80.68, period=1.733, d_freq=1420005264.46, score=1.021, chirp=5.6759, fft_len=128 Pulse: peak=5.660017, time=80.68, period=1.737, d_freq=1420005264.45, score=1.04, chirp=6.6214, fft_len=128 Pulse: peak=5.608606, time=80.68, period=1.733, d_freq=1420005264.51, score=1.03, chirp=7.5678, fft_len=128 Pulse: peak=5.448968, time=100.7, period=1.783, d_freq=1420005341.76, score=1, chirp=-7.5678, fft_len=128 Pulse: peak=5.766086, time=100.7, period=1.783, d_freq=1420005342.35, score=1.058, chirp=-11.352, fft_len=128 Pulse: peak=5.745099, time=73.96, period=1.786, d_freq=1420005213.07, score=1.054, chirp=-15.135, fft_len=128 Pulse: peak=5.683627, time=100.7, period=1.783, d_freq=1420005343.04, score=1.043, chirp=-15.135, fft_len=128 Pulse: peak=5.83024, time=73.96, period=1.786, d_freq=1420005219.37, score=1.07, chirp=-16.081, fft_len=128 Pulse: peak=6.33339, time=73.96, period=2.317, d_freq=1420005226.54, score=1, chirp=17.027, fft_len=128 Pulse: peak=10.06626, time=80.68, period=3.493, d_freq=1420005341.03, score=1.04, chirp=17.027, fft_len=128 Pulse: peak=5.553938, time=73.96, period=1.786, d_freq=1420005225.73, score=1.019, chirp=-17.027, fft_len=128 Pulse: peak=6.593595, time=73.96, period=2.317, d_freq=1420005220.24, score=1.041, chirp=17.973, fft_len=128 Pulse: peak=10.49988, time=80.68, period=3.493, d_freq=1420005341.09, score=1.085, chirp=17.973, fft_len=128 Pulse: peak=6.643026, time=73.96, period=2.382, d_freq=1420005213.87, score=1.048, chirp=18.919, fft_len=128 Pulse: 
peak=10.59672, time=80.68, period=3.493, d_freq=1420005341.08, score=1.095, chirp=18.919, fft_len=128 Pulse: peak=10.10738, time=94.13, period=4.175, d_freq=1420005303.12, score=1.037, chirp=19.865, fft_len=128 Pulse: peak=6.532632, time=73.96, period=2.382, d_freq=1420005207.58, score=1.031, chirp=19.865, fft_len=128 Pulse: peak=6.212343, time=80.68, period=1.747, d_freq=1420005341.15, score=1.141, chirp=19.865, fft_len=128 Pulse: peak=6.00212, time=80.68, period=1.747, d_freq=1420005341.14, score=1.102, chirp=20.811, fft_len=128 Pulse: peak=5.650387, time=80.68, period=1.747, d_freq=1420005341.21, score=1.038, chirp=21.757, fft_len=128 Spike: peak=25.8012, time=56.41, d_freq=1420005916.71, chirp=-0.014788, fft_len=4k Spike: peak=25.89606, time=56.41, d_freq=1420005916.83, chirp=0.029576, fft_len=4k Spike: peak=25.35256, time=56.41, d_freq=1420005916.6, chirp=-0.059153, fft_len=4k Spike: peak=25.63649, time=56.41, d_freq=1420005916.95, chirp=0.073941, fft_len=4k Spike: peak=24.56402, time=56.41, d_freq=1420005916.48, chirp=-0.10352, fft_len=4k Spike: peak=25.03482, time=56.41, d_freq=1420005917.07, chirp=0.11831, fft_len=4k Spike: peak=24.11559, time=56.41, d_freq=1420005917.19, chirp=0.16267, fft_len=4k Spike: peak=24.89538, time=56.41, d_freq=1420005917.07, chirp=-0.17746, fft_len=4k OpenCL queue synchronized SETI@Home Informational message -9 result_overflow NOTE: The number of results detected equals the storage space allocated. 
Best spike: peak=27.01501, time=60.4, d_freq=1420013971.7, chirp=-11.073, fft_len=128k Best autocorr: peak=16.52522, time=100.7, delay=0.025293, d_freq=1420010420.34, chirp=6.504, fft_len=128k Best gaussian: peak=3.146213, mean=0.6692209, ChiSq=1.396932, time=39.43, d_freq=1420007177.9, score=-7.616322, null_hyp=1.8053, chirp=-0.0073941, fft_len=16k Best pulse: peak=6.25204, time=80.68, period=1.747, d_freq=1420005341.08, score=1.148, chirp=18.919, fft_len=128 Best triplet: peak=0, time=-2.121e+011, period=0, d_freq=0, chirp=0, fft_len=0 Flopcounter: 1869834161008.562700 Spike count: 8 Autocorr count: 0 Pulse count: 22 Triplet count: 0 Gaussian count: 0 CPU: Spike count: 30 Autocorr count: 0 Pulse count: 0 Triplet count: 0 Gaussian count: 0 CUDA: Spike count: 30 Autocorr count: 0 Pulse count: 0 Triplet count: 0 Gaussian count: 0 Unfortunately, the 2 other apps provide too little info about found signals in their logs, and the task has definitely been purged already, so an offline re-run is impossible. I added more verbose signal output to stock too, but it probably won't appear in the live app soon, only when the current stock CPU build is replaced... And yes, this discrepancy is most probably caused by a different processing order. Though I tried to emulate the serial processing order, from time to time such discrepancies will happen until the validator is changed to deal with overflows more correctly. An overflowed result is just a subset of all signals, and the selection of one such subset or another is quite arbitrary. It's good that the rate of such invalids is low enough not to pose any real problem for SoG, even with the current validator implementation. |
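The point about overflowed results being an arbitrary subset can be shown with a small sketch. The limit of 30 is taken from the counts in the log above, where each app reports exactly 30 signals. Everything else is a made-up assumption: the pool of 40 spikes and 40 pulses and the two processing orders are chosen only so that the counts happen to match the log, and they are not how either app actually orders its searches.

```python
from collections import Counter

MAX_SIGNALS = 30  # each app in the log above reports exactly 30 signals


def reported(signals_in_processing_order, limit=MAX_SIGNALS):
    """Signals that make it into the result before the overflow cut-off."""
    return signals_in_processing_order[:limit]


# Hypothetical task with more detectable signals than the result can hold.
spikes = [("spike", i) for i in range(40)]
pulses = [("pulse", i) for i in range(40)]

# Two hypothetical processing orders over the same signals.
serial_order = spikes + pulses
mixed_order = spikes[:8] + pulses + spikes[8:]

serial_counts = Counter(kind for kind, _ in reported(serial_order))
mixed_counts = Counter(kind for kind, _ in reported(mixed_order))

print(dict(serial_counts))
print(dict(mixed_counts))
```

Both runs see the same underlying signals, yet the reported sets differ, which is the kind of mismatch a validator comparing overflowed results signal by signal would flag.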
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
The very first time it happened was just after I installed the new build. All work units on 4 GPUs started at exactly the same time. When the first units approached 60% complete, I began to see total CPU core usage increasing. When they approached the mid-80s percent complete, the system was at 96% of all cores. As they passed 90% complete, 100% of all cores were used. This lasted roughly 3 minutes. Currently, I have all the work units spread out evenly from 1% up to 76%, and total CPU load is at 69%. I think a lot of that has to do with the work units being spread out at different percentages of completion. On the rare occasion that I get several work units with close completion times, I begin to see total CPU increase, but not to the level I saw when I first installed. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
The next time you see such coherence, please also record the AR of the tasks involved. What if you load the same number of tasks per GPU in an offline bench on just a single GPU, leaving the other 3 idle? Do you see the same CPU behavior? It's an important observation because it can indicate a possible issue with SoG on some systems. With an ATi card I saw a CPU usage increase in SoG (just the opposite of the "usual" NV response), and all of that increase was system time. It turned out that the ATi OpenCL runtime switches to busy-wait after some amount of time. The SoG build allows very long periods of decoupling between CPU and GPU, and such a long period triggered the busy-wait loop inside the ATi runtime. The solution was to force (unneeded!) syncing between CPU and GPU just to keep the runtime from getting into the busy-wait loop. And CPU usage did drop back. Of course, such senseless arbitrary syncing decreases app performance, but it saves CPU, so a total performance win is possible (given that we can't improve the badly implemented runtime itself). Is that the case with the NV OpenCL runtime too? It uses busy-wait even with frequent syncing, and before your report it seemed that (opposite to ATi's) long periods of decoupling switched it out of busy-wait, hence CPU usage drops a lot for VHARs. That's why it's important to record at what ARs you see the CPU load increase and to reproduce this in an offline benchmark. |
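The busy-wait effect described above can be put into a back-of-the-envelope model. This is only a sketch under stated assumptions, not the behavior of any real OpenCL runtime: it supposes the runtime sleeps for the first `spin_after_ms` of every wait and spins (burning CPU) for the rest, and it ignores the cost of the extra syncs themselves. The function name and all numbers are hypothetical.

```python
def cpu_spin_ms(gpu_work_ms, sync_interval_ms, spin_after_ms):
    """CPU time spent spinning while the GPU does gpu_work_ms of work.

    The host syncs every sync_interval_ms. By ASSUMPTION, each wait is
    free for its first spin_after_ms and a busy-wait after that.
    """
    spinning = 0
    remaining = gpu_work_ms
    while remaining > 0:
        wait = min(sync_interval_ms, remaining)
        spinning += max(0, wait - spin_after_ms)
        remaining -= wait
    return spinning


# One long decoupled stretch: the wait runs far past the spin threshold.
print(cpu_spin_ms(gpu_work_ms=1000, sync_interval_ms=1000, spin_after_ms=50))

# Same GPU work with forced syncs every 40 ms: no wait reaches the threshold.
print(cpu_spin_ms(gpu_work_ms=1000, sync_interval_ms=40, spin_after_ms=50))
```

In this model the forced syncs remove the spinning entirely, which is the trade-off described: extra synchronization costs some GPU throughput but can free the CPU.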
Chris Adamek Send message Joined: 15 May 99 Posts: 251 Credit: 434,772,072 RAC: 236 |
Here's a link to my computer that has been running SoG since you posted it over on main. Basically I see the same as everyone else. I only run 1 WU at a time because GPU utilization is at 96-98% all the time. Shorties use practically no CPU, and the CPU time goes up to more or less equal the run time for any of the .44 AR or below ranges. Overall, though, throughput for the machine has increased. Do you need to look at the ATI variant on a wider range of cards? I've got a good variety, but they are all in Macs, so I'd have to see about convincing Xcode to spit something useful out of your sources in the repository, unless there is already a Mac version out there somewhere. http://setiathome.berkeley.edu/results.php?hostid=7251681 Thanks, Chris |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Yes, that would be interesting, especially in comparison with the corresponding non-SoG ones. AFAIK TBar and Urs did some ATi builds for OS X. Worth looking at them (perhaps non-SoG). Maybe they could do SoG ones too (-D SIGNALS_ON_GPU in compiler options)... |