Astropulse v7 on ATi, with max workgroup size 128, not running
Author | Message |
---|---|
Joined: 11 Dec 09 Posts: 74 Credit: 1,248,766 RAC: 0 |
The HD6370m in my notebook with Catalyst 14.4 does not seem to care for the AP v7 app. It is restarting every 4-5 seconds with the same error: Running on device number: 0 Priority of worker thread raised successfully Priority of process adjusted successfully, below normal priority class used OpenCL platform detected: Advanced Micro Devices, Inc. BOINC assigns device 0 Info: BOINC provided OpenCL device ID used Used GPU device parameters are: Number of compute units: 2 Single buffer allocation size: 256MB Total device global memory: 1024MB max WG size: 128 -unroll default value used: 2 -ffa_block default value used: 512 -ffa_block_fetch default value used: 256 Build features: Non-graphics BLANKIT OpenCL TWIN_FFA OCL_ZERO_COPY COMBINED_DECHIRP_KERNEL FFTW USE_INCREASED_PRECISION USE_SSE2 x86 CPUID: Intel(R) Core(TM) i3 CPU M 390 @ 2.67GHz Cache: L1=64K L2=256K CPU features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3 SSSE3 SSE4.1 SSE4.2 AstroPulse v7 Windows x86 rev 2601, V7 match, by Raistmer with support of Lunatics.kwsn.net team. SSE2 OpenCL version by Raistmer oclFFT fix for ATI GPUs by Urs Echternacht ffa threshold mods by Joe Segur SSE3 dechirping by JDWhale Combined dechirp kernel by Frizz Number of OpenCL platforms: 1 OpenCL Platform Name: AMD Accelerated Parallel Processing Number of devices: 1 Max compute units: 2 Max work group size: 128 Max clock frequency: 750Mhz Max memory allocation: 536870912 Cache type: None Cache line size: 0 Cache size: 0 Global memory size: 1073741824 Constant buffer size: 65536 Max number of constant args: 8 Local memory type: Scratchpad Local memory size: 32768 Queue properties: Out-of-Order: No Name: Cedar Vendor: Advanced Micro Devices, Inc. 
Driver version: 1445.5 (VM) Version: OpenCL 1.2 AMD-APP (1445.5) Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_image2d_from_buffer_read_only cl_khr_spir cl_khr_gl_event state.fold_buf_size_short=65536; state.fold_buf_size_long=262144 WARNING: can't open binary kernel file for oclFFT plan: S:\BOINC/projects/setiweb.ssl.berkeley.edu_beta\AP_clFFTplan_Cedar_32768_gr64_lr8_wg256_tw0_r2601.bin_14455VM, continue with recompile... ERROR: clFFT_CreatePlan failed: -46 Modifying ap_cmdline_7.02_windows_intelx86__opencl_ati_100.txt with parameters such as -tune 1 32 4 1 does not change the error or the warning, but it does add the TUNE: line to the output: TUNE: kernel 1 now has workgroup size of (32,4,1). I would have expected the _wg256_ portion of the plan file name to change when using the tune setting. |
Joined: 18 Aug 05 Posts: 2423 Credit: 15,878,738 RAC: 0 |
What does -oclFFT 64 8 128 give? News about SETI opt app releases: https://twitter.com/Raistmer |
Joined: 11 Dec 09 Posts: 74 Credit: 1,248,766 RAC: 0 |
What -oclFFT 64 8 128 gives? Running on device number: 0 Priority of worker thread raised successfully Priority of process adjusted successfully, below normal priority class used OpenCL platform detected: Advanced Micro Devices, Inc. BOINC assigns device 0 Info: BOINC provided OpenCL device ID used Used GPU device parameters are: Number of compute units: 2 Single buffer allocation size: 256MB Total device global memory: 1024MB max WG size: 128 -unroll default value used: 2 -ffa_block default value used: 512 -ffa_block_fetch default value used: 256 Build features: Non-graphics BLANKIT OpenCL TWIN_FFA OCL_ZERO_COPY COMBINED_DECHIRP_KERNEL FFTW USE_INCREASED_PRECISION USE_SSE2 x86 CPUID: Intel(R) Core(TM) i3 CPU M 390 @ 2.67GHz Cache: L1=64K L2=256K CPU features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3 SSSE3 SSE4.1 SSE4.2 AstroPulse v7 Windows x86 rev 2601, V7 match, by Raistmer with support of Lunatics.kwsn.net team. SSE2 OpenCL version by Raistmer oclFFT fix for ATI GPUs by Urs Echternacht ffa threshold mods by Joe Segur SSE3 dechirping by JDWhale Combined dechirp kernel by Frizz Number of OpenCL platforms: 1 OpenCL Platform Name: AMD Accelerated Parallel Processing Number of devices: 1 Max compute units: 2 Max work group size: 128 Max clock frequency: 750Mhz Max memory allocation: 536870912 Cache type: None Cache line size: 0 Cache size: 0 Global memory size: 1073741824 Constant buffer size: 65536 Max number of constant args: 8 Local memory type: Scratchpad Local memory size: 32768 Queue properties: Out-of-Order: No Name: Cedar Vendor: Advanced Micro Devices, Inc. 
Driver version: 1445.5 (VM) Version: OpenCL 1.2 AMD-APP (1445.5) Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_image2d_from_buffer_read_only cl_khr_spir cl_khr_gl_event state.fold_buf_size_short=65536; state.fold_buf_size_long=262144 WARNING: can't open binary kernel file for oclFFT plan: S:\BOINC/projects/setiweb.ssl.berkeley.edu_beta\AP_clFFTplan_Cedar_32768_gr64_lr8_wg256_tw0_r2601.bin_14455VM, continue with recompile... ERROR: clFFT_CreatePlan failed: -46 |
Joined: 18 Aug 05 Posts: 2423 Credit: 15,878,738 RAC: 0 |
Nope, the option was inactive. There should be an indication of the option at the very beginning of the stderr output. Sorry, I named it a little differently than in the code :) -oclFFT_plan A B C : overrides the defaults for FFT 32k plan generation. Read the oclFFT code and the explanations in its comments before any tweaking. Please try again. |
Joined: 11 Dec 09 Posts: 74 Credit: 1,248,766 RAC: 0 |
Nope, the option was inactive. There should be an indication of the option at the very beginning of the stderr output. OK, it looks like it might be running. It is now up to 00:04:00 with no restarts. The output from stderr.txt looks like this: Running on device number: 0 oclFFT plan class overrides requested: global radix 64; local radix 8; max workgroup size 128 Priority of worker thread raised successfully Priority of process adjusted successfully, below normal priority class used OpenCL platform detected: Advanced Micro Devices, Inc. BOINC assigns device 0 Info: BOINC provided OpenCL device ID used Used GPU device parameters are: Number of compute units: 2 Single buffer allocation size: 256MB Total device global memory: 1024MB max WG size: 128 -unroll default value used: 2 -ffa_block default value used: 512 -ffa_block_fetch default value used: 256 Build features: Non-graphics BLANKIT OpenCL TWIN_FFA OCL_ZERO_COPY COMBINED_DECHIRP_KERNEL FFTW USE_INCREASED_PRECISION USE_SSE2 x86 CPUID: Intel(R) Core(TM) i3 CPU M 390 @ 2.67GHz Cache: L1=64K L2=256K CPU features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3 SSSE3 SSE4.1 SSE4.2 AstroPulse v7 Windows x86 rev 2601, V7 match, by Raistmer with support of Lunatics.kwsn.net team. SSE2 OpenCL version by Raistmer oclFFT fix for ATI GPUs by Urs Echternacht ffa threshold mods by Joe Segur SSE3 dechirping by JDWhale Combined dechirp kernel by Frizz Number of OpenCL platforms: 1 OpenCL Platform Name: AMD Accelerated Parallel Processing Number of devices: 1 Max compute units: 2 Max work group size: 128 Max clock frequency: 750Mhz Max memory allocation: 536870912 Cache type: None Cache line size: 0 Cache size: 0 Global memory size: 1073741824 Constant buffer size: 65536 Max number of constant args: 8 Local memory type: Scratchpad Local memory size: 32768 Queue properties: Out-of-Order: No Name: Cedar Vendor: Advanced Micro Devices, Inc. 
Driver version: 1445.5 (VM) Version: OpenCL 1.2 AMD-APP (1445.5) Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_image2d_from_buffer_read_only cl_khr_spir cl_khr_gl_event state.fold_buf_size_short=65536; state.fold_buf_size_long=262144 WARNING: can't open binary kernel file for oclFFT plan: S:\BOINC/projects/setiweb.ssl.berkeley.edu_beta\AP_clFFTplan_Cedar_32768_gr64_lr8_wg128_tw0_r2601.bin_14455VM, continue with recompile... |
Joined: 18 Aug 05 Posts: 2423 Credit: 15,878,738 RAC: 0 |
Fine, could you also try this build: https://www.dropbox.com/s/nl2ualzqzaf9acp/AP7_win_x86_SSE2_OpenCL_ATI_r2605.7z without any additional options, please. I need to check that the stock config can work on your device too. |
Joined: 11 Dec 09 Posts: 74 Credit: 1,248,766 RAC: 0 |
Fine, could you also try this build: https://www.dropbox.com/s/nl2ualzqzaf9acp/AP7_win_x86_SSE2_OpenCL_ATI_r2605.7z without any additional options, please. I need to check that the stock config can work on your device too. So far so good. Here is the stderr output while waiting for the task to complete: Running on device number: 0 Priority of worker thread raised successfully Priority of process adjusted successfully, below normal priority class used OpenCL platform detected: Advanced Micro Devices, Inc. BOINC assigns device 0 Info: BOINC provided OpenCL device ID used Used GPU device parameters are: Number of compute units: 2 Single buffer allocation size: 256MB Total device global memory: 1024MB max WG size: 128 -unroll default value used: 2 -ffa_block default value used: 512 -ffa_block_fetch default value used: 256 Build features: Non-graphics BLANKIT OpenCL TWIN_FFA OCL_ZERO_COPY COMBINED_DECHIRP_KERNEL FFTW USE_INCREASED_PRECISION USE_SSE2 x86 CPUID: Intel(R) Core(TM) i3 CPU M 390 @ 2.67GHz Cache: L1=64K L2=256K CPU features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3 SSSE3 SSE4.1 SSE4.2 AstroPulse v7 Windows x86 rev 2605, V7 match, by Raistmer with support of Lunatics.kwsn.net team. SSE2 OpenCL version by Raistmer oclFFT fix for ATI GPUs by Urs Echternacht ffa threshold mods by Joe Segur SSE3 dechirping by JDWhale Combined dechirp kernel by Frizz Built with uncommitted modifications Number of OpenCL platforms: 1 OpenCL Platform Name: AMD Accelerated Parallel Processing Number of devices: 1 Max compute units: 2 Max work group size: 128 Max clock frequency: 750Mhz Max memory allocation: 536870912 Cache type: None Cache line size: 0 Cache size: 0 Global memory size: 1073741824 Constant buffer size: 65536 Max number of constant args: 8 Local memory type: Scratchpad Local memory size: 32768 Queue properties: Out-of-Order: No Name: Cedar Vendor: Advanced Micro Devices, Inc. 
Driver version: 1445.5 (VM) Version: OpenCL 1.2 AMD-APP (1445.5) Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_image2d_from_buffer_read_only cl_khr_spir cl_khr_gl_event state.fold_buf_size_short=65536; state.fold_buf_size_long=262144 INFO: can't open binary kernel file: S:\BOINC/projects/setiweb.ssl.berkeley.edu_beta\AstroPulse_Kernels_r2605.cl_Cedar.bin_V7_TWIN_FFA_14455VM, continue with recompile... INFO: binary kernel file created |
Joined: 18 Aug 05 Posts: 2423 Credit: 15,878,738 RAC: 0 |
Fine, thanks. 7.03 can be expected soon then. |
Joined: 11 Dec 09 Posts: 74 Credit: 1,248,766 RAC: 0 |
Finished & currently pending. http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=17490657 Seems I always have troublesome hardware. All the way back from HD3850 & hybrid app development. |
Joined: 18 Aug 05 Posts: 2423 Credit: 15,878,738 RAC: 0 |
A WG of 128 is a big limitation indeed. Even my entry-level C-60 APU has a WG of 256... |
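
The two oclFFT plan file names quoted in the logs above differ only in the `wg` field: `wg256` in the failing runs and `wg128` once `-oclFFT_plan 64 8 128` was active. The Python sketch below parses those names and compares the `wg` value against the device's reported maximum work group size. It is an illustration only, not code from the AstroPulse app. The reading of `gr`, `lr`, and `wg` as global radix, local radix, and max workgroup size is an assumption inferred from the "overrides requested" line in stderr, and `parse_plan_name` and `plan_fits_device` are hypothetical helper names.

```python
import re

# Pattern inferred from the two plan file names quoted in this thread.
# Assumption: gr = global radix, lr = local radix, wg = max workgroup size,
# and the number after the device name is the FFT size (32768 = "32k").
PLAN_NAME = re.compile(
    r"AP_clFFTplan_(?P<device>[^_]+)_(?P<fft_size>\d+)"
    r"_gr(?P<global_radix>\d+)_lr(?P<local_radix>\d+)_wg(?P<max_wg>\d+)"
)


def parse_plan_name(name):
    """Return the fields encoded in a plan file name (hypothetical helper)."""
    match = PLAN_NAME.search(name)
    if match is None:
        raise ValueError("not a recognised plan file name: " + name)
    fields = match.groupdict()
    # Keep the device name as text and convert every other field to int.
    return {k: (v if k == "device" else int(v)) for k, v in fields.items()}


def plan_fits_device(name, device_max_wg):
    """True if the plan's workgroup size does not exceed the device limit."""
    return parse_plan_name(name)["max_wg"] <= device_max_wg


# File names copied from the stderr output above.
failing = "AP_clFFTplan_Cedar_32768_gr64_lr8_wg256_tw0_r2601.bin_14455VM"
working = "AP_clFFTplan_Cedar_32768_gr64_lr8_wg128_tw0_r2601.bin_14455VM"

# "Max work group size: 128" as reported for the Cedar device.
DEVICE_MAX_WG = 128

if __name__ == "__main__":
    for name in (failing, working):
        fields = parse_plan_name(name)
        verdict = "fits" if plan_fits_device(name, DEVICE_MAX_WG) else "exceeds limit"
        print(fields["device"], "wg", fields["max_wg"], verdict)
```

Under these assumptions, the default plan asks for a workgroup of 256 on a device that reports a maximum of 128, which is consistent with `clFFT_CreatePlan` failing until the override was applied. The thread does not state the exact cause of error -46, so this reading is an inference.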