Linux AMD - no GPU support ?

Falken
Volunteer tester
Joined: 18 May 99
Posts: 18
Credit: 1,067,649
RAC: 1,471
United Kingdom
Message 1979018 - Posted: 6 Feb 2019, 23:18:32 UTC

I'm fairly sure this should work, because

# lspci | grep VGA
00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Wani [Radeon R5/R6/R7 Graphics] (rev 84)

and

#clinfo |grep 'loader Profile'
Device open failed, aborting...
Device open failed, aborting...
Device open failed, aborting...
Device open failed, aborting...
Device open failed, aborting...
ICD loader Profile OpenCL 2.2

but still boinc reports :
...
OpenCL CPU: pthread-AMD Opteron(tm) X3421 APU (OpenCL driver vendor: The pocl project, driver version 1.2,
...
No usable GPUs found

Bill
Volunteer tester
Joined: 30 Nov 05
Posts: 247
Credit: 3,808,919
RAC: 5,929
United States
Message 1979053 - Posted: 7 Feb 2019, 3:53:56 UTC - in response to Message 1979018.  

Seems like we're having all sorts of problems with AMD GPUs. Is this coming from the event log? Are you able to download any GPU tasks?

This may be unrelated, but we've been trying to troubleshoot some problems with AMD APUs/GPUs not computing tasks at all here: https://setiathome.berkeley.edu/forum_thread.php?id=83829
Seti@home classic: 1,456 results, 1.613 years CPU time

Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 9613
Credit: 882,881,495
RAC: 1,746,410
United States
Message 1979058 - Posted: 7 Feb 2019, 4:47:01 UTC

I think the issue is either that the drivers needed for the Opteron APU are missing, or that the BOINC code does not recognize this type of device as an AMD device.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

Juha
Volunteer tester
Joined: 7 Mar 04
Posts: 388
Credit: 1,828,235
RAC: 154
Finland
Message 1979139 - Posted: 7 Feb 2019, 17:48:28 UTC - in response to Message 1979018.  

First of all, show logs or command outputs in full. By giving us only a line or two you omitted the part that would have shown that OpenCL GPU drivers are not installed.

Second, what do you have installed?

Falken
Volunteer tester
Joined: 18 May 99
Posts: 18
Credit: 1,067,649
RAC: 1,471
United Kingdom
Message 1979176 - Posted: 7 Feb 2019, 20:56:16 UTC - in response to Message 1979139.  

How do I get fuller logs ?
Since moving to systemd (Fedora 2x-something) the only way I know is using a 3rd party ('boinctui') program.

/var/log/boinc is missing as far as I can tell.

How can I tell if I'm trying to download GPU tasks ? I assume if the client says "no GPU support" it won't even try ?

What library could I be missing ? What does setiathome/boinc need to find ?

Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 9613
Credit: 882,881,495
RAC: 1,746,410
United States
Message 1979179 - Posted: 7 Feb 2019, 21:05:12 UTC - in response to Message 1979176.  

Do you have the BOINC Manager installed? In the Manager is a menu choice to display the Event Log. We need to see the first 30 lines in the Event Log after first BOINC startup. This will show the gpus detected by the client and what drivers are installed for them by the system. We are looking whether you have the AMD OpenCL drivers installed or just the stock video drivers. There are two different components to the drivers, one for displaying graphics and one for compute. If the compute component is missing you won't be able to use the card for BOINC.
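
As a rough illustration of what to look for in those startup lines, here is a minimal Python sketch that sorts OpenCL detection messages into GPUs and CPUs. The regular expressions are modelled only on the log lines quoted in this thread, so other client versions may word them differently, and the function name `summarize_startup` is made up for this example.

```python
import re

# Patterns modelled on BOINC startup lines quoted in this thread (assumption:
# other client versions may format these messages differently).
GPU_LINE = re.compile(r"\bOpenCL: (?P<vendor>.+?) GPU (?P<index>\d+): (?P<name>.+?) \(")
CPU_LINE = re.compile(r"\bOpenCL CPU: (?P<name>.+?) \(OpenCL driver vendor: (?P<vendor>[^,]+),")


def summarize_startup(log_text):
    """Return the OpenCL devices reported at startup, split into GPUs and CPUs."""
    summary = {"gpus": [], "cpus": [], "no_usable_gpus": False}
    for line in log_text.splitlines():
        gpu = GPU_LINE.search(line)
        cpu = CPU_LINE.search(line)
        if gpu:
            summary["gpus"].append((gpu["vendor"], gpu["name"]))
        elif cpu:
            summary["cpus"].append((cpu["vendor"].strip(), cpu["name"]))
        elif "No usable GPUs found" in line:
            summary["no_usable_gpus"] = True
    return summary
```

If `gpus` is empty and `no_usable_gpus` is set, the client only saw a CPU OpenCL platform, which points at a missing compute driver rather than a project setting.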

Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 9613
Credit: 882,881,495
RAC: 1,746,410
United States
Message 1979181 - Posted: 7 Feb 2019, 21:10:29 UTC
Last modified: 7 Feb 2019, 21:16:12 UTC

You have only downloaded and crunched CPU tasks so far. You would see a GPU task identified with an 8.22 (opencl) type for the application if you had downloaded the GPU application for your card and received tasks for it. Since I see none in your list of tasks, I assume BOINC does not recognize any valid GPU and therefore hasn't downloaded an application yet.
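
A small Python sketch of that check, assuming the application label ends with a parenthesised type such as "(opencl)" as described above. The label strings and the function name `is_gpu_application` are illustrative and not taken from the project's task pages.

```python
def is_gpu_application(app_label):
    """Guess whether a task's application label names an OpenCL (GPU) build.

    Assumption: the application type appears in parentheses at the end of
    the label, e.g. "... 8.22 (opencl)". Labels without one count as CPU.
    """
    label = app_label.strip()
    if "(" not in label or not label.endswith(")"):
        return False  # no parenthesised type, so treat it as a CPU application
    app_type = label[label.rindex("(") + 1 : -1].lower()
    return app_type.startswith("opencl")
```

For example, `is_gpu_application("SETI@home v8 8.22 (opencl)")` is true, while a label with no parenthesised type is treated as a CPU application.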

I just remembered a thread at Einstein with detailed instructions on how to install the OpenCL drivers for AMD.

https://einsteinathome.org/it-it/content/quick-guide-how-install-opencl-amd-gpus-linux-kubuntu-1804-and-similar-distro

Juha
Volunteer tester
Joined: 7 Mar 04
Posts: 388
Credit: 1,828,235
RAC: 154
Finland
Message 1979329 - Posted: 8 Feb 2019, 15:03:01 UTC - in response to Message 1979176.  

How do I get fuller logs ?


Easy. Don't pipe the output to grep.

Since moving to systemd (Fedora 2x-something) the only way I know is using a 3rd party ('boinctui') program.


The logs are probably handled by systemd-journald now. sudo journalctl -u boinc-client (or maybe the unit is just boinc, I can't remember) should give you the logs.
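
For a headless box, the same lookup can be scripted. This is a minimal Python sketch that builds and runs a journalctl command for one unit. The unit name defaults to boinc-client, which is a guess that may differ per distribution, and the helper names are invented for this example.

```python
import subprocess


def journal_command(unit="boinc-client", current_boot=True):
    """Build a journalctl command line for one systemd unit."""
    # --output=cat prints only the message text, without the journal prefix.
    cmd = ["journalctl", "--no-pager", "--output=cat", f"--unit={unit}"]
    if current_boot:
        cmd.append("--boot")  # limit output to the current boot
    return cmd


def read_unit_log(unit="boinc-client", count=30):
    """Run journalctl and return the first `count` log lines of the unit.

    Reading another service's journal may require root or membership of a
    journal-reading group, depending on the system.
    """
    result = subprocess.run(
        journal_command(unit), capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()[:count]
```

Calling `read_unit_log()` right after restarting the client should return the startup lines where GPU detection is reported.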


Now, the stuff you have installed. You have installed Fedora 29 (according to your host list), BOINC (since the host is listed here) and pocl (according to your opening post).

pocl is an OpenCL CPU library. No project that I know of has OpenCL CPU apps, which makes pocl pretty useless, but it shouldn't cause any harm either.

What else? Have you installed any package with amd in name, or mesa, or opencl, or anything at all?

Falken
Volunteer tester
Joined: 18 May 99
Posts: 18
Credit: 1,067,649
RAC: 1,471
United Kingdom
Message 1979572 - Posted: 9 Feb 2019, 14:46:42 UTC - in response to Message 1979329.  
Last modified: 9 Feb 2019, 15:19:06 UTC

This machine runs headless. X is not running. So the manager is a bit pointless ?

I have these packages
# rpm -qa|grep -i opencl
mesa-libOpenCL-18.2.8-1.fc29.x86_64
opencl-filesystem-1.0-8.fc29.noarch
mesa-libOpenCL-devel-18.2.8-1.fc29.x86_64

I'm trying the weird extra AMD bits from https://einsteinathome.org/it-it/content/quick-guide-how-install-opencl-amd-gpus-linux-kubuntu-1804-and-similar-distro but this doesn't appear to be supported for Fedora 29 ?

Thanks to you guys, I can now get logs with "journalctl --unit=boinc-client |less" :

-- Reboot --
Feb 09 14:38:29 bookcase systemd[1]: Started Berkeley Open Infrastructure Network Computing Client.
Feb 09 14:38:52 bookcase boinc[862]: 09-Feb-2019 14:38:52 [---] cc_config.xml not found - using defaults
Feb 09 14:38:52 bookcase boinc[862]: 09-Feb-2019 14:38:52 [---] Starting BOINC client version 7.14.2 for x86_64-pc-linux-gnu
Feb 09 14:38:52 bookcase boinc[862]: 09-Feb-2019 14:38:52 [---] log flags: file_xfer, sched_ops, task
Feb 09 14:38:52 bookcase boinc[862]: 09-Feb-2019 14:38:52 [---] Libraries: libcurl/7.61.1 OpenSSL/1.1.1a zlib/1.2.11 brotli/1.0.5 libidn2/2.0.5 libpsl/0.20.2 (+libidn2/2.0.5) libssh/0.8.6/openssl/zlib nghttp2/1.34.0
Feb 09 14:38:52 bookcase boinc[862]: 09-Feb-2019 14:38:52 [---] Data directory: /var/lib/boinc
Feb 09 14:38:55 bookcase boinc[862]: 09-Feb-2019 14:38:55 [---] OpenCL CPU: pthread-AMD Opteron(tm) X3421 APU (OpenCL driver vendor: The pocl project, driver version 1.2, device version OpenCL 1.2 pocl HSTR: pthread-x86_64-unknown-linux-gnu-bdver4)
Feb 09 14:38:55 bookcase boinc[862]: 09-Feb-2019 14:38:55 [---] No usable GPUs found
Feb 09 14:38:55 bookcase boinc[862]: 09-Feb-2019 14:38:55 [---] [libc detection] gathered: 2.28, GNU libc

Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 9613
Credit: 882,881,495
RAC: 1,746,410
United States
Message 1979576 - Posted: 9 Feb 2019, 16:35:47 UTC

You'll have to search for Fedora answers on how to install the official AMD drivers FROM AMD. Only the official AMD drivers have compute OpenCL that is usable by BOINC.

Juha
Volunteer tester
Joined: 7 Mar 04
Posts: 388
Credit: 1,828,235
RAC: 154
Finland
Message 1979604 - Posted: 9 Feb 2019, 20:15:59 UTC - in response to Message 1979576.  

Mesa OpenCL seems to be slowly maturing. There is at least one Einstein participant who is using it. Though judging from his posts Mesa OpenCL has a habit of going from broken to working to broken from one release to another. But in his case detecting always worked. It's the science apps that don't always work.

That said, AMD's own drivers might work better.

@Falken

Could you post the full clinfo output so that we can see what it found?

It may be that running headless is the problem here. It's often X that is responsible for loading drivers, but you don't run X. If it's easy, could you start X and restart the BOINC client service to see if that fixes anything? Please get the clinfo output before and after loading X.

Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 9613
Credit: 882,881,495
RAC: 1,746,410
United States
Message 1979621 - Posted: 9 Feb 2019, 21:08:23 UTC
Last modified: 9 Feb 2019, 21:12:06 UTC

Since AMD has endorsed the open software mission, their efforts for gpu compute seem to be directed toward the ROCm initiative. But ROCm seems to be only available and working in the latest kernel releases. I see the most news about ROCm on the Phoronix.com website.
https://www.phoronix.com/scan.php?page=search&q=ROCm
This article highlights some of the issues with Fedora and ROCm
https://www.phoronix.com/scan.php?page=news_item&px=ROCm-Possible-For-Fedora

Falken
Volunteer tester
Joined: 18 May 99
Posts: 18
Credit: 1,067,649
RAC: 1,471
United Kingdom
Message 1979736 - Posted: 10 Feb 2019, 16:57:45 UTC - in response to Message 1979604.  
Last modified: 10 Feb 2019, 16:58:44 UTC

clinfo

[root@bookcase rra]# clinfo
Device open failed, aborting...
Device open failed, aborting...
cl_get_gt_device(): error, unknown device: ffffffff
Device open failed, aborting...
Device open failed, aborting...
cl_get_gt_device(): error, unknown device: ffffffff
Number of platforms                               3
  Platform Name                                   Portable Computing Language
  Platform Vendor                                 The pocl project
  Platform Version                                OpenCL 1.2 pocl 1.2 RelWithDebInfo, LLVM 7.0.0, SLEEF, DISTRO, POCL_DEBUG
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             POCL

  Platform Name                                   Clover
  Platform Vendor                                 Mesa
  Platform Version                                OpenCL 1.1 Mesa 18.2.8
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             MESA

  Platform Name                                   Intel Gen OCL Driver
  Platform Vendor                                 Intel
  Platform Version                                OpenCL 2.0 beignet 1.3
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short cl_khr_gl_sharing
  Platform Extensions function suffix             Intel
Device open failed, aborting...
cl_get_gt_device(): error, unknown device: ffffffff

  Platform Name                                   Portable Computing Language
Number of devices                                 1
  Device Name                                     pthread-AMD Opteron(tm) X3421 APU
  Device Vendor                                   AuthenticAMD
  Device Vendor ID                                0x6c636f70
  Device Version                                  OpenCL 1.2 pocl HSTR: pthread-x86_64-unknown-linux-gnu-bdver4
  Driver Version                                  1.2
  Device OpenCL C Version                         OpenCL C 1.2 pocl
  Device Type                                     CPU
  Device Available                                Yes
  Device Profile                                  FULL_PROFILE
  Max compute units                               4
  Max clock frequency                             2100MHz
  Device Partition                                (core)
    Max number of sub-devices                     4
    Supported partition types                     equally, by counts
  Max work item dimensions                        3
  Max work item sizes                             4096x4096x4096
  Max work group size                             4096
  Compiler Available                              Yes
  Linker Available                                Yes
  Preferred work group size multiple              8
  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                               16 / 16      
    int                                                  8 / 8       
    long                                                 4 / 4       
    half                                                 0 / 0        (n/a)
    float                                                8 / 8       
    double                                               4 / 4        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  Yes
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              5643812864 (5.256GiB)
  Error Correction support                        No
  Max memory allocation                           2147483648 (2GiB)
  Unified memory for Host and Device              Yes
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        1048576 (1024KiB)
  Global Memory cache line size                   64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            134217728 pixels
    Max 1D or 2D image array size                 2048 images
    Max 2D image size                             8192x8192 pixels
    Max 3D image size                             2048x2048x2048 pixels
    Max number of read image args                 128
    Max number of write image args                128
  Local memory type                               Global
  Local memory size                               524288 (512KiB)
  Max constant buffer size                        524288 (512KiB)
  Max number of constant args                     8
  Max size of kernel argument                     1024
  Queue properties                                
    Out-of-order execution                        No
    Profiling                                     Yes
  Prefer user sync for interop                    Yes
  Profiling timer resolution                      1ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            Yes
  printf() buffer size                            16777216 (16MiB)
  Built-in kernels                                
  Device Extensions                               cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_fp64 cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp64

  Platform Name                                   Clover
Number of devices                                 0

  Platform Name                                   Intel Gen OCL Driver
Number of devices                                 0

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  Portable Computing Language
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [POCL]
  clCreateContext(NULL, ...) [default]            Success [POCL]
  clCreateContext(NULL, ...) [other]              <error: no devices in non-default plaforms>
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 Portable Computing Language
    Device Name                                   pthread-AMD Opteron(tm) X3421 APU
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  Success (1)
    Platform Name                                 Portable Computing Language
    Device Name                                   pthread-AMD Opteron(tm) X3421 APU
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 Portable Computing Language
    Device Name                                   pthread-AMD Opteron(tm) X3421 APU

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.2.12
  ICD loader Profile                              OpenCL 2.2


Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 9613
Credit: 882,881,495
RAC: 1,746,410
United States
Message 1979746 - Posted: 10 Feb 2019, 18:52:30 UTC

The problem is that the Pocl driver is only for cpus and does not handle AMD gpus. The Mesa driver is a work in progress for OpenCL compute apparently. I found this quote from someone who manages a university compute cluster.

POCL is a mediocre substitute, but it's not a vendor solution. It doesn't have any interop capabilities, neither can it be paired with any GPUs context. Any app of mine that could make use of CPU and GPU at the same time (well made load balancing with CPU sub-device to break off a part of it to manage the GPUs) now has to be made multi-platform code, not just multi-device. THIS SUCKS!


Juha
Volunteer tester
Joined: 7 Mar 04
Posts: 388
Credit: 1,828,235
RAC: 154
Finland
Message 1979749 - Posted: 10 Feb 2019, 19:30:11 UTC - in response to Message 1979736.  

Well, clinfo didn't find the AMD GPU either so it's not a problem with just BOINC. That's some sort of a good thing I suppose.
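
The relevant part of that long listing is the device count per platform. A minimal Python sketch that pulls it out, assuming the layout shown in the clinfo output posted in this thread (a "Platform Name" line followed by a "Number of devices" line); the function names are illustrative.

```python
def devices_per_platform(clinfo_text):
    """Map each OpenCL platform name to the device count clinfo reported."""
    counts = {}
    current = None
    for line in clinfo_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Platform Name"):
            # Remember the most recent platform; the count follows on a later line.
            current = stripped[len("Platform Name"):].strip()
        elif stripped.startswith("Number of devices") and current is not None:
            counts[current] = int(stripped.split()[-1])
    return counts


def empty_platforms(clinfo_text):
    """List platforms that loaded but exposed no devices."""
    return [name for name, n in devices_per_platform(clinfo_text).items() if n == 0]
```

On the output above this would report one device for Portable Computing Language (the CPU) and none for Clover or the Intel platform, which matches BOINC's "No usable GPUs found".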

The only AMD GPU you have is the one integrated with the CPU, right?

I think at this point it would be best that you grab a spare monitor and keyboard somewhere and troubleshoot things as if the machine was going to be a normal daily use machine. Right now it's hard to tell if the problem is coming from incompatible drivers or trying to run the machine headless.

Falken
Volunteer tester
Joined: 18 May 99
Posts: 18
Credit: 1,067,649
RAC: 1,471
United Kingdom
Message 1979946 - Posted: 11 Feb 2019, 21:20:31 UTC - in response to Message 1979749.  


The only AMD GPU you have is the one integrated with the CPU, right?


This:
# lspci|grep VGA
00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Wani [Radeon R5/R6/R7 Graphics] (rev 84)


is an Opteron™ X3421

So yes ?

Falken
Volunteer tester
Joined: 18 May 99
Posts: 18
Credit: 1,067,649
RAC: 1,471
United Kingdom
Message 1979964 - Posted: 11 Feb 2019, 22:19:27 UTC - in response to Message 1979746.  

Hmm. So reading around, it seems AMD's driver is now (4.20.x) in the upstream kernel, but their userland (== OpenCL libraries) is not in Fedora, or not very well packaged.

I'm going to try https://github.com/RadeonOpenCompute/Experimental_ROC/tree/master/distro_install_scripts/Fedora/Fedora_29/rpm_install (found on https://github.com/RadeonOpenCompute/ROCm/issues/567) later...

Falken
Volunteer tester
Joined: 18 May 99
Posts: 18
Credit: 1,067,649
RAC: 1,471
United Kingdom
Message 1980673 - Posted: 16 Feb 2019, 12:26:54 UTC - in response to Message 1979964.  

I'm not sure if it was the half-aborted install of https://github.com/RadeonOpenCompute/Experimental_ROC/tree/master/distro_install_scripts/Fedora/Fedora_29/rpm_install or my removal of a "nomodeset" from the kernel command line options (turns out this doesn't lead to a dead display any more, probably been there for years !), but now :

# sudo journalctl --unit=boinc-client |grep -i opencl
says

Feb 16 12:08:51 bookcase boinc[865]: 16-Feb-2019 12:08:51 [---] OpenCL: AMD/ATI GPU 0: AMD Radeon R7 Graphics (CARRIZO, DRM 3.27.0, 4.20.5-200.fc29.x86_64, LLVM 7.0.0) (driver version 18.2.8, device version OpenCL 1.1 Mesa 18.2.8, 3072MB, 3072MB available, 512 GFLOPS peak)
Feb 16 12:08:51 bookcase boinc[865]: 16-Feb-2019 12:08:51 [---] OpenCL CPU: pthread-AMD Opteron(tm) X3421 APU (OpenCL driver vendor: The pocl project, driver version 1.2, device version OpenCL 1.2 pocl HSTR: pthread-x86_64-unknown-linux-gnu-bdver4)
...
Requesting new tasks for CPU and AMD/ATI GPU

hurrah !
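
Since it is not clear from the above which change fixed detection, one cheap thing to rule out on a similar machine is a leftover "nomodeset" on the kernel command line. This is a minimal Python sketch that checks for it; the sample command line in the usage note is hypothetical and the helper names are invented for this example.

```python
def kernel_has_option(cmdline, option="nomodeset"):
    """Return True if `option` appears as a standalone word on the kernel command line."""
    return option in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

For instance, `kernel_has_option("root=/dev/sda1 ro nomodeset quiet")` is true, and `kernel_has_option(read_cmdline())` checks the running system.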

Is there an easy way to see a workunit is GPU-enabled ?

Brent Norman
Volunteer tester
Joined: 1 Dec 99
Posts: 2766
Credit: 546,153,320
RAC: 844,493
Canada
Message 1980674 - Posted: 16 Feb 2019, 12:40:30 UTC - in response to Message 1980673.  


Bill G
Joined: 1 Jun 01
Posts: 1136
Credit: 169,752,437
RAC: 85,313
United States
Message 1980689 - Posted: 16 Feb 2019, 15:20:03 UTC - in response to Message 1980673.  


Is there an easy way to see a workunit is GPU-enabled ?

Go to Project/Account/Tasks and it will show you there.

SETI@home classic workunits 4,019
SETI@home classic CPU time 34,348 hours


 