Does Intel GPU on Linux actually work?

Message boards : Number crunching : Does Intel GPU on Linux actually work?

Gordon Lack
Joined: 23 Aug 09
Posts: 14
Credit: 727,367
RAC: 3
United Kingdom
Message 2015003 - Posted: 11 Oct 2019, 10:57:37 UTC

I enabled Intel GPU computing for my Linux systems, and several jobs were downloaded.
They each ran in ~20s (some in much less), which looked odd, and indeed the first one is now marked as Invalid when compared against two Nvidia runs.

All(?) of my runs include this text in the output log:
SETI@Home Informational message -9 result_overflow
NOTE: The number of results detected equals the storage space allocated.
See https://setiathome.berkeley.edu/workunit.php?wuid=3688076536 for an example.
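For anyone who wants to check their own task logs for this message, here is a minimal Python sketch. It only searches log text for the marker quoted above; the function name and the sample text are made up for illustration, and it does not fetch anything from the project site.

```python
import re

# Marker text copied from the log lines quoted above.
OVERFLOW_MARKER = re.compile(r"-9\s+result_overflow")


def has_result_overflow(log_text: str) -> bool:
    """Return True if the task's output log contains the -9 overflow marker."""
    return OVERFLOW_MARKER.search(log_text) is not None


# Sample text built from the two log lines quoted in this post.
sample_log = (
    "SETI@Home Informational message -9 result_overflow\n"
    "NOTE: The number of results detected equals the storage space allocated.\n"
)

print(has_result_overflow(sample_log))
```

Pasting the stderr text of a finished task into `has_result_overflow` is enough to tell whether that task ended in an overflow.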

So, has anyone used an Intel GPU on Linux successfully?
ID: 2015003
Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 2015006 - Posted: 11 Oct 2019, 11:15:28 UTC

It's certainly not doing well at all. Until you hear from someone who can help, I'd suggest disabling it for the time being.

Cheers.
ID: 2015006
Mike (Special Project $75 donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 34253
Credit: 79,922,639
RAC: 80
Germany
Message 2015018 - Posted: 11 Oct 2019, 12:41:26 UTC

Yes, your wingmen have no -9, so it's on your end.

It could be either a driver or temp issue.

Disable iGPU.


With each crime and every kindness we birth our future.
ID: 2015018
Joseph Stateson (Project Donor)
Volunteer tester
Joined: 27 May 99
Posts: 309
Credit: 70,759,933
RAC: 3
United States
Message 2015033 - Posted: 11 Oct 2019, 14:27:56 UTC - in response to Message 2015003.  


So, has anyone used an Intel GPU on Linux successfully?


I gave up trying to get my Haswell CPU to crunch in Ubuntu. All indications are that only 6th gen and higher are supported. However, that Haswell works fine in Windows. The same "beta open cl" app worked (but overheated) on my Surface Pro 4 (Skylake CPU).

https://einsteinathome.org/content/surprise-i-picked-intel-open-cl-app

Running Ubuntu 18.04.3, I was unable to get the HD7950 and the RX series of AMD boards to co-exist. The same goes for the HD7950 and S9000, although they are both Tahiti based. I'm not sure if Nvidia is better under Linux, as I don't have a mix of different Nvidia boards like I do for AMD. I have had no problem running a mix of Nvidia and AMD under Windows, but I would not want to try that under Ubuntu.
ID: 2015033
Gordon Lack
Joined: 23 Aug 09
Posts: 14
Credit: 727,367
RAC: 3
United Kingdom
Message 2015038 - Posted: 11 Oct 2019, 14:36:50 UTC - in response to Message 2015033.  
Last modified: 11 Oct 2019, 14:41:48 UTC

I gave up trying to get my Haswell CPU to crunch in Ubuntu.
Not the issue here, though.
The system in question is a Coffee Lake one and runs Collatz jobs with no problems.
As do my Ivy Bridge and Broadwell systems.

I am using the standard Beignet packages on Kubuntu.
ID: 2015038
Gordon Lack
Joined: 23 Aug 09
Posts: 14
Credit: 727,367
RAC: 3
United Kingdom
Message 2015054 - Posted: 11 Oct 2019, 15:53:35 UTC - in response to Message 2015038.  

I am using the standard Beignet packages on Kubuntu.
Which would appear to be at least part of the problem.

It turns out you can now run the Intel-produced OpenCL runtime on standard kernels:
https://software.intel.com/en-us/articles/opencl-drivers
so I've installed those, and the GPU job now runs in a more-expected time and doesn't produce the "-9 result_overflow" message.
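When switching from one OpenCL implementation to another, it can help to see which ones are registered on the machine. The sketch below is a rough Python helper, assuming the usual Linux ICD loader layout where each vendor drops a `.icd` file into `/etc/OpenCL/vendors`; that path, the function name and the output format are assumptions about a typical install, not something taken from the Intel article.

```python
from pathlib import Path


def list_opencl_icds(vendor_dir: str = "/etc/OpenCL/vendors") -> dict[str, str]:
    """Map each .icd file name to the library it names (its first non-empty line)."""
    found: dict[str, str] = {}
    directory = Path(vendor_dir)
    if not directory.is_dir():
        # No ICD registry at this path (assumed default location).
        return found
    for icd in sorted(directory.glob("*.icd")):
        lines = [
            line.strip()
            for line in icd.read_text(errors="replace").splitlines()
            if line.strip()
        ]
        found[icd.name] = lines[0] if lines else ""
    return found


if __name__ == "__main__":
    for name, library in list_opencl_icds().items():
        print(f"{name}: {library}")
```

If more than one Intel entry shows up, it may be worth checking which one the application is actually picking up before comparing run times.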

HOWEVER!!!! It turns out that one of the earlier results with a "-9 result_overflow" message has now been validated! The second job run (using an Nvidia GPU on an anonymous platform; it's a GTX 1070 on an Intel Mac) also produced the same message. See:
https://setiathome.berkeley.edu/workunit.php?wuid=3688076542

So, is it possible that there is a generic bug which Beignet is just more likely to trigger?
ID: 2015054
Eric B
Joined: 9 Mar 00
Posts: 88
Credit: 168,875,085
RAC: 762
United States
Message 2015066 - Posted: 11 Oct 2019, 18:30:33 UTC - in response to Message 2015054.  

Will this conflict with my Nvidia GPU? I'd like to run both the Nvidia 1660 Ti and the Intel UHD 630 (I have an i9-9900K system).
ID: 2015066
Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2015075 - Posted: 11 Oct 2019, 19:08:56 UTC - in response to Message 2015054.  

We do get a constant flux of "noisy" work units. Satellite transmissions intercepted or radar pulses cause RFI. So the -9 outcome is correct and expected. If your wingmen validated a noisy task, it is likely real. If on the other hand your card produces nothing but -9 results on all tasks, you have a hardware or software problem.
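That rule of thumb can be written down as a small Python sketch. It is only an illustration: the function names and the threshold of 1.0 ("nothing but -9 results") are assumptions, and the input is simply a list of booleans saying whether each of a host's recent tasks overflowed.

```python
def overflow_fraction(task_overflowed: list[bool]) -> float:
    """Fraction of a host's tasks that ended in a -9 overflow."""
    if not task_overflowed:
        return 0.0
    return sum(task_overflowed) / len(task_overflowed)


def host_looks_faulty(task_overflowed: list[bool], threshold: float = 1.0) -> bool:
    """Flag a host whose overflow fraction reaches the (assumed) threshold."""
    if not task_overflowed:
        return False  # no history, nothing to judge
    return overflow_fraction(task_overflowed) >= threshold


# Hypothetical histories: one host overflowing on everything, one only occasionally.
always_overflowing = [True] * 8
mostly_normal = [False, False, True, False, False, False]

print(host_looks_faulty(always_overflowing))
print(host_looks_faulty(mostly_normal))
```

A lower threshold could be used to catch hosts that overflow on most, rather than all, tasks, at the cost of flagging hosts that just happened to draw a run of noisy work units.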

While it may be possible to run both an external GPU and the internal Intel one, most people report it isn't worth the bother, as the performance is so low compared to a proper external GPU. It also severely reduces the performance of any CPU tasks running on the host.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2015075
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 2015101 - Posted: 11 Oct 2019, 22:34:22 UTC - in response to Message 2015066.  
Last modified: 11 Oct 2019, 22:36:22 UTC

Will this conflict with my Nvidia GPU? I'd like to run both the Nvidia 1660 Ti and the Intel UHD 630 (I have an i9-9900K system).
I use the iGPU to drive a monitor (no crunching), and that allowed me to use the more aggressive command line settings on my GTX 750 Tis. If they were driving a monitor it would have made the system unusable. With the iGPU driving the monitor, it wasn't an issue and they were able to pump out as much as they were capable of. Using the Linux special application, the add-on GPU will still be capable of maximum crunching; however, crunching on the iGPU will impact CPU performance, and may impact the output of the add-on video card.
Intel iGPU for the monitor, add-on GPU for crunching.
Grant
Darwin NT
ID: 2015101
Gordon Lack
Joined: 23 Aug 09
Posts: 14
Credit: 727,367
RAC: 3
United Kingdom
Message 2015140 - Posted: 12 Oct 2019, 1:32:24 UTC - in response to Message 2015075.  

We do get a constant flux of "noisy" work units. Satellite transmissions intercepted or radar pulses cause RFI. So the -9 outcome is correct and expected. If your wingmen validated a noisy task, it is likely real. If on the other hand your card produces nothing but -9 results on all tasks, you have a hardware or software problem.
With Beignet they were all reporting a -9 outcome - and finished in ~10s. Wingmen were (mostly) producing completed runs.
Using the Intel OpenCL code mine are all now completing without this error and take ~30mins.
Given that it's the same hardware I reckon it being a hardware problem is unlikely.
If it's a software problem then it's one that is shared by the Nvidia code on a Mac.

I'm a little confused as to how a -9 result can be expected in one set-up when it is not seen in a different one.
ID: 2015140
Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2015176 - Posted: 12 Oct 2019, 16:10:29 UTC - in response to Message 2015140.  

Only the correct OpenCL drivers produce a good result. The other varieties of OpenCL are incompatible with the apps.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2015176
Gordon Lack
Joined: 23 Aug 09
Posts: 14
Credit: 727,367
RAC: 3
United Kingdom
Message 2015337 - Posted: 13 Oct 2019, 19:34:05 UTC - in response to Message 2015176.  

Only the correct OpenCL drivers produce a good result. The other varieties of OpenCL are incompatible with the apps.
Beignet runs OpenCL on an Intel GPU. And it runs. If it were incompatible it should be stopping with an error.
ID: 2015337
Gordon Lack
Joined: 23 Aug 09
Posts: 14
Credit: 727,367
RAC: 3
United Kingdom
Message 2015338 - Posted: 13 Oct 2019, 19:40:33 UTC - in response to Message 2015176.  

Only the correct OpenCL drivers produce a good result. The other varieties of OpenCL are incompatible with the apps.
It seems that various NVidia ones also produce the -9 error.

I'm surprised that these aren't just flagged as an error rather than being marked as successful.
ID: 2015338
Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 2015340 - Posted: 13 Oct 2019, 20:01:12 UTC

There are work units that produce legitimate -9 overflow results. ;-)

Cheers.
ID: 2015340
Gordon Lack
Joined: 23 Aug 09
Posts: 14
Credit: 727,367
RAC: 3
United Kingdom
Message 2015356 - Posted: 13 Oct 2019, 23:44:15 UTC - in response to Message 2015341.  
Last modified: 13 Oct 2019, 23:44:42 UTC

Yes, -9 overflow is not necessarily an error.
There's tons and tons of legitimate -9 overflow results.
But there are also some that are not legitimate. So whatever produces them is indeterminate, which makes any -9 result a lottery.
ID: 2015356
rob smith (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22160
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2015398 - Posted: 14 Oct 2019, 9:39:37 UTC

If you are concerned about any -9 then the best thing to do is to track its progress through to validation. If it is a "real -9" then it will validate as such; otherwise you will get an "invalid" or "error" returned against your task.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2015398
Gordon Lack
Joined: 23 Aug 09
Posts: 14
Credit: 727,367
RAC: 3
United Kingdom
Message 2015465 - Posted: 14 Oct 2019, 23:36:55 UTC - in response to Message 2015398.  

If you are concerned about any -9 then the best thing to do is to track its progress through to validation. If it is a "real -9" then it will validate as such; otherwise you will get an "invalid" or "error" returned against your task.
I don't follow that logic.

I was getting -9 errors for some tasks that others ran to completion - my job was marked as invalid (which is fine).

But, if I can get an erroneous -9 then so can my wingman - and if we both achieve it then the job gets marked as valid (which has happened).
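To make the concern concrete, here is a toy Python sketch of validation by agreement. It is not the project's validator (which compares the reported signals, not a single label); the host names, the outcome labels and the quorum of 2 are all assumptions for illustration.

```python
from collections import Counter


def validate_by_agreement(results: dict[str, str], quorum: int = 2) -> dict[str, str]:
    """Mark a result 'valid' when at least `quorum` hosts reported the same outcome."""
    counts = Counter(results.values())
    agreed = {outcome for outcome, n in counts.items() if n >= quorum}
    if not agreed:
        # Nobody agrees yet, so nothing can be decided.
        return {host: "pending" for host in results}
    return {
        host: ("valid" if outcome in agreed else "invalid")
        for host, outcome in results.items()
    }


# Case 1: one overflow against two completed runs (hypothetical host names).
print(validate_by_agreement(
    {"beignet_host": "overflow", "wingman_a": "complete", "wingman_b": "complete"}
))

# Case 2: two hosts that both overflow agree with each other.
print(validate_by_agreement({"beignet_host": "overflow", "mac_host": "overflow"}))
```

In the second case both results come out as valid whether or not the overflow was genuine, which is the scenario described above: agreement alone cannot tell a real noisy task from two hosts failing the same way.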
ID: 2015465
Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 2015467 - Posted: 14 Oct 2019, 23:48:07 UTC

There's a thread here called, "Invalid Host Messaging", where bad hosts are singled out. ;-)

Cheers.
ID: 2015467
Gordon Lack
Joined: 23 Aug 09
Posts: 14
Credit: 727,367
RAC: 3
United Kingdom
Message 2015714 - Posted: 17 Oct 2019, 12:08:30 UTC - in response to Message 2015467.  
Last modified: 17 Oct 2019, 12:09:31 UTC

There's a thread here called, "Invalid Host Messaging", where bad hosts are singled out. ;-)

Cheers.

The issue here is that my set-up wasn't producing errors - it was just that ~75% of results were not validated. And it's the ~25% which were validated that is the problem, as I suspect those should not have been either.
ID: 2015714
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2015718 - Posted: 17 Oct 2019, 12:46:29 UTC - in response to Message 2015714.  
Last modified: 17 Oct 2019, 12:53:14 UTC

The issue is you don't understand what a Noise-bomb is. A 'normal' -9 Overflow is a task that has thousands or even millions of signals in it. It only takes a second or two for the App to find and report the 30 signal limit. This means that if the 'problem' that creates the constant -9 Overflow on a particular 'Bad' App/GPU takes longer than a second or two, the 'Correct' result will be reached before the 'Incorrect' result. This is how a 'Bad' App/GPU can sometimes produce a 'Good' result.

There are many -9 Overflows that are perfectly normal, but most machines don't constantly spit out a -9 Overflow, and a good way to spot a bad -9 Overflow is if the wingman didn't also produce a -9 Overflow. Unfortunately, some particular problems will produce the same bad result on the same type of hardware, such as the AMD RX 5700s. Those are easy to find though: the first indication is when a Host that doesn't have a history of producing Invalids has an Invalid pop up, and investigation reveals that the two other machines are producing mostly Invalids.

If you want to test a particular Task to see if it is in fact a 'valid' -9 Overflow, you would run the task in the Benchmark App on a 'known' good CPU. Thankfully, it doesn't take long to verify a -9 Overflow on a CPU.
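The 30 signal limit is the reason a noisy task finishes so quickly, and a rough Python sketch shows the idea. This is not the app's code: the function name, the use of plain floats for detections and the return shape are assumptions, and only the limit of 30 comes from the post above.

```python
from typing import Iterable

SIGNAL_LIMIT = 30  # the limit named in the post above


def collect_signals(
    detections: Iterable[float], limit: int = SIGNAL_LIMIT
) -> tuple[list[float], bool]:
    """Collect detections until the store is full; return (signals, overflowed)."""
    signals: list[float] = []
    for value in detections:
        signals.append(value)
        if len(signals) >= limit:
            # Store is full: stop early and report an overflow.
            return signals, True
    return signals, False


# A hypothetical noise-bomb: a huge stream of detections, abandoned after 30.
noisy_stream = (float(i) for i in range(1_000_000))
noisy_signals, noisy_overflow = collect_signals(noisy_stream)
print(len(noisy_signals), noisy_overflow)

# A hypothetical quiet task: a handful of detections, processed to the end.
quiet_signals, quiet_overflow = collect_signals([1.5, 2.5, 3.5])
print(len(quiet_signals), quiet_overflow)
```

Because the loop returns as soon as the store fills, a task full of interference ends after a tiny part of the data, while a quiet task has to be processed all the way through.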
ID: 2015718

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.