After updating to the latest Omega driver, only one GPU core is doing work
Author | Message |
---|---|
Otaku_jhp Joined: 14 Sep 99 Posts: 3 Credit: 6,021,647 RAC: 0 |
I own two R9 295x2 cards, and after updating to the latest Omega driver (14.501.1003), only one of the four cores is available for work. The event log appears to indicate that the other three cores are ignored by configuration, but I don't remember changing anything other than the GPU driver.
1/15/2015 5:51:09 PM | | OpenCL: AMD/ATI GPU 0: Hawaii (driver version 1642.5 (VM), device version OpenCL 2.0 AMD-APP (1642.5), 4096MB, 4096MB available, 3583 GFLOPS peak)
1/15/2015 5:51:09 PM | | OpenCL: AMD/ATI GPU 1 (ignored by config): Hawaii (driver version 1642.5 (VM), device version OpenCL 1.2 AMD-APP (1642.5), 3072MB, 3072MB available, 3583 GFLOPS peak)
1/15/2015 5:51:09 PM | | OpenCL: AMD/ATI GPU 2 (ignored by config): Hawaii (driver version 1642.5 (VM), device version OpenCL 1.2 AMD-APP (1642.5), 3072MB, 3072MB available, 3583 GFLOPS peak)
1/15/2015 5:51:09 PM | | OpenCL: AMD/ATI GPU 3 (ignored by config): Hawaii (driver version 1642.5 (VM), device version OpenCL 1.2 AMD-APP (1642.5), 3072MB, 3072MB available, 3583 GFLOPS peak)
I can post the full clinfo upon request. Anyone else see this same issue, or have any ideas? My system used to absolutely blast through work units, but now it's just kind of puttering along at a leisurely pace. :) |
Jord Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
Check the contents of the cc_config.xml file in the data directory (default at c:\programdata\boinc\), or else post those contents here. |
Otaku_jhp Joined: 14 Sep 99 Posts: 3 Credit: 6,021,647 RAC: 0 |
Jord wrote: "Check the contents of the cc_config.xml file in the data directory (default at c:\programdata\boinc\), and else post those contents here."
Interestingly enough, I don't see that particular file listed, only the following:
C:\ProgramData\BOINC\get_current_version.xml
C:\ProgramData\BOINC\get_project_config.xml
C:\ProgramData\BOINC\global_prefs.xml
C:\ProgramData\BOINC\lookup_account.xml
C:\ProgramData\BOINC\master_setiathome.berkeley.edu.xml
C:\ProgramData\BOINC\sched_reply_setiathome.berkeley.edu.xml
C:\ProgramData\BOINC\sched_request_setiathome.berkeley.edu.xml
C:\ProgramData\BOINC\statistics_setiathome.berkeley.edu.xml
C:\ProgramData\BOINC\account_setiathome.berkeley.edu.xml
C:\ProgramData\BOINC\all_projects_list.xml
C:\ProgramData\BOINC\client_state.xml
C:\ProgramData\BOINC\client_state_prev.xml
C:\ProgramData\BOINC\coproc_info.xml
C:\ProgramData\BOINC\daily_xfer_history.xml
What might I be missing here?
As an additional data point, I have confirmed that rolling back to amd-catalyst-14-9-win7-win8.1-64bit-dd-ccc-whql does allow all four GPU cores to simultaneously do work as it used to. Re-upgrading to amd-catalyst-omega-14.12-with-dotnet45-win8.1-64bit again results in only a single GPU core doing work.
Here is the full clinfo output from the system while running the new driver:
Number of platforms: 1 Platform Profile: FULL_PROFILE Platform Version: OpenCL 2.0 AMD-APP (1642.5) Platform Name: AMD Accelerated Parallel Processing Platform Vendor: Advanced Micro Devices, Inc. 
Platform Extensions: cl_khr_icd cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_event_callback cl_amd_offline_devices Platform Name: AMD Accelerated Parallel Processing Number of devices: 5 Device Type: CL_DEVICE_TYPE_GPU Vendor ID: 1002h Board name: AMD Radeon R9 200 Series Device Topology: PCI[ B#3, D#0, F#0 ] Max compute units: 44 Max work items dimensions: 3 Max work items[0]: 256 Max work items[1]: 256 Max work items[2]: 256 Max work group size: 256 Preferred vector width char: 4 Preferred vector width short: 2 Preferred vector width int: 1 Preferred vector width long: 1 Preferred vector width float: 1 Preferred vector width double: 1 Native vector width char: 4 Native vector width short: 2 Native vector width int: 1 Native vector width long: 1 Native vector width float: 1 Native vector width double: 1 Max clock frequency: 1018Mhz Address bits: 64 Max memory allocation: 3018326016 Image support: Yes Max number of images read arguments: 128 Max number of images write arguments: 64 Max image 2D width: 16384 Max image 2D height: 16384 Max image 3D width: 2048 Max image 3D height: 2048 Max image 3D depth: 2048 Max samplers within kernel: 16 Max size of kernel argument: 1024 Alignment (bits) of base address: 2048 Minimum alignment (bytes) for any datatype: 128 Single precision floating point capability Denorms: No Quiet NaNs: Yes Round to nearest even: Yes Round to zero: Yes Round to +ve and infinity: Yes IEEE754-2008 fused multiply-add: Yes Cache type: Read/Write Cache line size: 64 Cache size: 16384 Global memory size: 4294967296 Constant buffer size: 65536 Max number of constant args: 8 Local memory type: Scratchpad Local memory size: 32768 Max pipe arguments: 16 Max pipe active reservations: 16 Max pipe packet size: 3018326016 Max global variable size: 2716493312 Max global variable preferred total size: 4294967296 Max read/write image args: 64 Max on device events: 1024 Queue on device max size: 524288 Max on device queues: 1 Queue on 
device preferred size: 16384 SVM capabilities: Coarse grain buffer: Yes Fine grain buffer: Yes Fine grain system: No Atomics: No Preferred platform atomic alignment: 0 Preferred global atomic alignment: 0 Preferred local atomic alignment: 0 Kernel Preferred work group size multiple: 64 Error correction support: 0 Unified memory for Host and Device: 0 Profiling timer resolution: 1 Device endianess: Little Available: Yes Compiler available: Yes Execution capabilities: Execute OpenCL kernels: Yes Execute native function: No Queue on Host properties: Out-of-Order: No Profiling : Yes Queue on Device properties: Out-of-Order: Yes Profiling : Yes Platform ID: 00007FFBBC156B60 Name: Hawaii Vendor: Advanced Micro Devices, Inc. Device OpenCL C version: OpenCL C 2.0 Driver version: 1642.5 (VM) Profile: FULL_PROFILE Version: OpenCL 2.0 AMD-APP (1642.5) Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_khr_image2d_from_buffer cl_khr_spir cl_khr_subgroups cl_khr_gl_event cl_khr_depth_images Device Type: CL_DEVICE_TYPE_GPU Vendor ID: 1002h Board name: AMD Radeon R9 200 Series Device Topology: PCI[ B#3, D#0, F#0 ] Max compute units: 44 Max work items dimensions: 3 Max work items[0]: 256 Max work items[1]: 256 Max work items[2]: 256 Max work group size: 256 Preferred vector width char: 4 Preferred vector width short: 2 Preferred vector width int: 1 Preferred vector width long: 1 Preferred vector width float: 1 Preferred vector width double: 1 Native vector width char: 4 Native vector width short: 2 Native vector width int: 1 Native vector 
width long: 1 Native vector width float: 1 Native vector width double: 1 Max clock frequency: 1018Mhz Address bits: 32 Max memory allocation: 3019112448 Image support: Yes Max number of images read arguments: 128 Max number of images write arguments: 8 Max image 2D width: 16384 Max image 2D height: 16384 Max image 3D width: 2048 Max image 3D height: 2048 Max image 3D depth: 2048 Max samplers within kernel: 16 Max size of kernel argument: 1024 Alignment (bits) of base address: 2048 Minimum alignment (bytes) for any datatype: 128 Single precision floating point capability Denorms: No Quiet NaNs: Yes Round to nearest even: Yes Round to zero: Yes Round to +ve and infinity: Yes IEEE754-2008 fused multiply-add: Yes Cache type: Read/Write Cache line size: 64 Cache size: 16384 Global memory size: 3221225472 Constant buffer size: 65536 Max number of constant args: 8 Local memory type: Scratchpad Local memory size: 32768 Max pipe arguments: 0 Max pipe active reservations: 0 Max pipe packet size: 0 Max global variable size: 0 Max global variable preferred total size: 0 Max read/write image args: 0 Max on device events: 0 Queue on device max size: 0 Max on device queues: 0 Queue on device preferred size: 0 SVM capabilities: Coarse grain buffer: No Fine grain buffer: No Fine grain system: No Atomics: No Preferred platform atomic alignment: 0 Preferred global atomic alignment: 0 Preferred local atomic alignment: 0 Kernel Preferred work group size multiple: 64 Error correction support: 0 Unified memory for Host and Device: 0 Profiling timer resolution: 1 Device endianess: Little Available: Yes Compiler available: Yes Execution capabilities: Execute OpenCL kernels: Yes Execute native function: No Queue on Host properties: Out-of-Order: No Profiling : Yes Queue on Device properties: Out-of-Order: No Profiling : No Platform ID: 00007FFBBC156B60 Name: Hawaii Vendor: Advanced Micro Devices, Inc. 
Device OpenCL C version: OpenCL C 1.2 Driver version: 1642.5 (VM) Profile: FULL_PROFILE Version: OpenCL 1.2 AMD-APP (1642.5) Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event Device Type: CL_DEVICE_TYPE_GPU Vendor ID: 1002h Board name: AMD Radeon R9 200 Series Device Topology: PCI[ B#3, D#0, F#0 ] Max compute units: 44 Max work items dimensions: 3 Max work items[0]: 256 Max work items[1]: 256 Max work items[2]: 256 Max work group size: 256 Preferred vector width char: 4 Preferred vector width short: 2 Preferred vector width int: 1 Preferred vector width long: 1 Preferred vector width float: 1 Preferred vector width double: 1 Native vector width char: 4 Native vector width short: 2 Native vector width int: 1 Native vector width long: 1 Native vector width float: 1 Native vector width double: 1 Max clock frequency: 1018Mhz Address bits: 32 Max memory allocation: 3019112448 Image support: Yes Max number of images read arguments: 128 Max number of images write arguments: 8 Max image 2D width: 16384 Max image 2D height: 16384 Max image 3D width: 2048 Max image 3D height: 2048 Max image 3D depth: 2048 Max samplers within kernel: 16 Max size of kernel argument: 1024 Alignment (bits) of base address: 2048 Minimum alignment (bytes) for any datatype: 128 Single precision floating point capability Denorms: No Quiet NaNs: Yes Round to nearest even: Yes Round to zero: Yes Round to +ve and infinity: Yes IEEE754-2008 fused multiply-add: Yes Cache type: Read/Write Cache line size: 64 
Cache size: 16384 Global memory size: 3221225472 Constant buffer size: 65536 Max number of constant args: 8 Local memory type: Scratchpad Local memory size: 32768 Max pipe arguments: 0 Max pipe active reservations: 0 Max pipe packet size: 0 Max global variable size: 0 Max global variable preferred total size: 0 Max read/write image args: 0 Max on device events: 0 Queue on device max size: 0 Max on device queues: 0 Queue on device preferred size: 0 SVM capabilities: Coarse grain buffer: No Fine grain buffer: No Fine grain system: No Atomics: No Preferred platform atomic alignment: 0 Preferred global atomic alignment: 0 Preferred local atomic alignment: 0 Kernel Preferred work group size multiple: 64 Error correction support: 0 Unified memory for Host and Device: 0 Profiling timer resolution: 1 Device endianess: Little Available: Yes Compiler available: Yes Execution capabilities: Execute OpenCL kernels: Yes Execute native function: No Queue on Host properties: Out-of-Order: No Profiling : Yes Queue on Device properties: Out-of-Order: No Profiling : No Platform ID: 00007FFBBC156B60 Name: Hawaii Vendor: Advanced Micro Devices, Inc. 
Device OpenCL C version: OpenCL C 1.2 Driver version: 1642.5 (VM) Profile: FULL_PROFILE Version: OpenCL 1.2 AMD-APP (1642.5) Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event Device Type: CL_DEVICE_TYPE_GPU Vendor ID: 1002h Board name: AMD Radeon R9 200 Series Device Topology: PCI[ B#3, D#0, F#0 ] Max compute units: 44 Max work items dimensions: 3 Max work items[0]: 256 Max work items[1]: 256 Max work items[2]: 256 Max work group size: 256 Preferred vector width char: 4 Preferred vector width short: 2 Preferred vector width int: 1 Preferred vector width long: 1 Preferred vector width float: 1 Preferred vector width double: 1 Native vector width char: 4 Native vector width short: 2 Native vector width int: 1 Native vector width long: 1 Native vector width float: 1 Native vector width double: 1 Max clock frequency: 1018Mhz Address bits: 32 Max memory allocation: 3019112448 Image support: Yes Max number of images read arguments: 128 Max number of images write arguments: 8 Max image 2D width: 16384 Max image 2D height: 16384 Max image 3D width: 2048 Max image 3D height: 2048 Max image 3D depth: 2048 Max samplers within kernel: 16 Max size of kernel argument: 1024 Alignment (bits) of base address: 2048 Minimum alignment (bytes) for any datatype: 128 Single precision floating point capability Denorms: No Quiet NaNs: Yes Round to nearest even: Yes Round to zero: Yes Round to +ve and infinity: Yes IEEE754-2008 fused multiply-add: Yes Cache type: Read/Write Cache line size: 64 
Cache size: 16384 Global memory size: 3221225472 Constant buffer size: 65536 Max number of constant args: 8 Local memory type: Scratchpad Local memory size: 32768 Max pipe arguments: 0 Max pipe active reservations: 0 Max pipe packet size: 0 Max global variable size: 0 Max global variable preferred total size: 0 Max read/write image args: 0 Max on device events: 0 Queue on device max size: 0 Max on device queues: 0 Queue on device preferred size: 0 SVM capabilities: Coarse grain buffer: No Fine grain buffer: No Fine grain system: No Atomics: No Preferred platform atomic alignment: 0 Preferred global atomic alignment: 0 Preferred local atomic alignment: 0 Kernel Preferred work group size multiple: 64 Error correction support: 0 Unified memory for Host and Device: 0 Profiling timer resolution: 1 Device endianess: Little Available: Yes Compiler available: Yes Execution capabilities: Execute OpenCL kernels: Yes Execute native function: No Queue on Host properties: Out-of-Order: No Profiling : Yes Queue on Device properties: Out-of-Order: No Profiling : No Platform ID: 00007FFBBC156B60 Name: Hawaii Vendor: Advanced Micro Devices, Inc. 
Device OpenCL C version: OpenCL C 1.2 Driver version: 1642.5 (VM) Profile: FULL_PROFILE Version: OpenCL 1.2 AMD-APP (1642.5) Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event Device Type: CL_DEVICE_TYPE_CPU Vendor ID: 1002h Board name: Max compute units: 8 Max work items dimensions: 3 Max work items[0]: 1024 Max work items[1]: 1024 Max work items[2]: 1024 Max work group size: 1024 Preferred vector width char: 16 Preferred vector width short: 8 Preferred vector width int: 4 Preferred vector width long: 2 Preferred vector width float: 8 Preferred vector width double: 4 Native vector width char: 16 Native vector width short: 8 Native vector width int: 4 Native vector width long: 2 Native vector width float: 8 Native vector width double: 4 Max clock frequency: 4716Mhz Address bits: 64 Max memory allocation: 4272134144 Image support: Yes Max number of images read arguments: 128 Max number of images write arguments: 64 Max image 2D width: 8192 Max image 2D height: 8192 Max image 3D width: 2048 Max image 3D height: 2048 Max image 3D depth: 2048 Max samplers within kernel: 16 Max size of kernel argument: 4096 Alignment (bits) of base address: 1024 Minimum alignment (bytes) for any datatype: 128 Single precision floating point capability Denorms: Yes Quiet NaNs: Yes Round to nearest even: Yes Round to zero: Yes Round to +ve and infinity: Yes IEEE754-2008 fused multiply-add: Yes Cache type: Read/Write Cache line size: 64 Cache size: 16384 Global memory size: 17088536576 Constant 
buffer size: 65536 Max number of constant args: 8 Local memory type: Global Local memory size: 32768 Max pipe arguments: 16 Max pipe active reservations: 16 Max pipe packet size: 4272134144 Max global variable size: 1879048192 Max global variable preferred total size: 1879048192 Max read/write image args: 64 Max on device events: 0 Queue on device max size: 0 Max on device queues: 0 Queue on device preferred size: 0 SVM capabilities: Coarse grain buffer: Yes Fine grain buffer: Yes Fine grain system: Yes Atomics: Yes Preferred platform atomic alignment: 0 Preferred global atomic alignment: 0 Preferred local atomic alignment: 0 Kernel Preferred work group size multiple: 1 Error correction support: 0 Unified memory for Host and Device: 1 Profiling timer resolution: 69 Device endianess: Little Available: Yes Compiler available: Yes Execution capabilities: Execute OpenCL kernels: Yes Execute native function: Yes Queue on Host properties: Out-of-Order: No Profiling : Yes Queue on Device properties: Out-of-Order: No Profiling : No Platform ID: 00007FFBBC156B60 Name: AMD FX(tm)-9590 Eight-Core Processor Vendor: AuthenticAMD Device OpenCL C version: OpenCL C 1.2 Driver version: 1642.5 (sse2,avx,fma4) Profile: FULL_PROFILE Version: OpenCL 1.2 AMD-APP (1642.5) Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_spir cl_khr_gl_event |
Otaku_jhp Joined: 14 Sep 99 Posts: 3 Credit: 6,021,647 RAC: 0 |
Had to add the following to the cc_config.xml.
<cc_config>
  <options>
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>
Still curious as to why the driver update changed the behavior though. Strange. |
BilBg Joined: 27 May 07 Posts: 3720 Credit: 9,385,827 RAC: 0 |
Your first post shows "(ignored by config)" three times. That is why Ageless asked you to check cc_config.xml.
At that point, the "(ignored by config)" lines implied that your cc_config.xml contained lines like this:
<ignore_ati_dev>N</ignore_ati_dev>: ignore (don't use) a specific ATI GPU. You can ignore more than one.
But yes, if the GPUs/video cards of the same vendor are not reported as identical (in your case OpenCL 2.0 vs 1.2 and 4096MB vs 3072MB), you need <use_all_gpus>1</use_all_gpus>.
Maybe the current BOINC version says "(ignored by config)" even if that means the default of <use_all_gpus>0</use_all_gpus> is used. (If no cc_config.xml exists, BOINC uses the defaults.)
- ALF - "Find out what you don't do well ..... then don't do it!" :) |
Jord Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
BilBg wrote: "Maybe the current BOINC version says "(ignored by config)" even if that means the default of <use_all_gpus>0</use_all_gpus> is used."
Good question. No, according to the source code, "(ignored by config)" is explicitly mentioned when the user has <ignore_X_dev> set in cc_config.xml. When there is more than one GPU in the system and they're (all) different, so that only the best GPU is used, that message is not used on the other one(s). Example given: http://boinc.berkeley.edu/dev/forum_thread.php?id=7988 |
mz24cn Joined: 28 Jun 15 Posts: 1 Credit: 0 RAC: 0 |
I posted a similar issue on the AMD forum: OpenCL 2.0 driver version 1642 vs 1800: https://community.amd.com/thread/183480 |
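
For anyone who would rather script the fix from this thread than edit the file by hand, here is a minimal Python sketch. It is not part of BOINC: the function names `ignored_gpus` and `enable_all_gpus` are made up for illustration, and the regular expression only assumes the event log wording quoted in the first post. The first function lists the device numbers that the log marks "(ignored by config)"; the second creates cc_config.xml if it is missing, or updates it, so that `<use_all_gpus>` is 1.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumption: matches startup lines worded like the ones quoted in this thread,
# e.g. "OpenCL: AMD/ATI GPU 1 (ignored by config): Hawaii (...)".
IGNORED_GPU = re.compile(r"GPU (\d+) \(ignored by config\)")


def ignored_gpus(log_text: str) -> list[int]:
    """Return the device numbers the event log text marks as ignored."""
    return [int(number) for number in IGNORED_GPU.findall(log_text)]


def enable_all_gpus(config_path: Path) -> None:
    """Create or update cc_config.xml so that <use_all_gpus> is set to 1."""
    if config_path.exists():
        tree = ET.parse(str(config_path))
        root = tree.getroot()
    else:
        # No file yet: start from an empty <cc_config> root element.
        root = ET.Element("cc_config")
        tree = ET.ElementTree(root)

    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")

    flag = options.find("use_all_gpus")
    if flag is None:
        flag = ET.SubElement(options, "use_all_gpus")
    flag.text = "1"

    tree.write(str(config_path), encoding="utf-8")
```

With the default data directory mentioned above, the call would be `enable_all_gpus(Path(r"C:\ProgramData\BOINC") / "cc_config.xml")`. The client most likely has to re-read its config files or be restarted before the change shows up in the event log, and writing into that directory may need administrator rights.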