Revisiting the Onboard Intel GPU Issue

Message boards : Number crunching : Revisiting the Onboard Intel GPU Issue

AllgoodGuy

Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 266
United States
Message 1996543 - Posted: 3 Jun 2019, 7:42:27 UTC

Thanks to Wiggo, Grant (SSSF), betreger, and Bill for your insight in this post:
https://setiathome.berkeley.edu/forum_thread.php?id=84238

Wanted to update and give some accounting for further reference for others, with some charts which I will update a bit later when they level out.

As you guys had suggested, the onboard Intel GPU is a serious hog, and appears to have been slowing me down despite its apparent usefulness. I have three machines: a Windows HP laptop, a 2014 Mac mini (i5 dual core), and the 2018 Mac mini (i7 six core), the last of which has the two Radeon eGPUs, an RX 580 and a Vega 56.

It took a few days for the machines to work through their queued work units, but the first two are recovering very nicely. You'll see in the charts the point where I removed the mb_blah_intel.txt files and let them finish chomping away at the remaining work units. They finished about 12 hrs apart, and then you see the dramatic increases in performance. They're far from showing their actual ability just yet, but the effect of removing the Intel GPU is absolutely shocking, more so on the Windows platform. I'd have to draw strictly from supposition, but believe this is perhaps a Windows optimization from the manufacturer to better inline the shared memory scheme. betreger perhaps said it best, "The bottomline is do not use the Igpu, they are a waste of resources." I never even considered it would profoundly starve the CPUs of resources. However, the charts are not lying. I'll update them once more when they stabilize a bit to show their actual potential. Give me a bit to figure out how to post those.
ID: 1996543
AllgoodGuy

Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 266
United States
Message 1996544 - Posted: 3 Jun 2019, 7:51:56 UTC

ID: 1996544
Profile Bill Special Project $75 donor
Volunteer tester
Joined: 30 Nov 05
Posts: 282
Credit: 6,916,194
RAC: 60
United States
Message 1996566 - Posted: 3 Jun 2019, 15:12:58 UTC - in response to Message 1996543.  

I'm glad to hear that you've been a bit more efficient in crunching since you stopped running the iGPU. Can I ask how you disabled the iGPU? The reason I ask is that I was attempting to use my iGPU for Einstein, and found that even though I wasn't crunching any iGPU tasks, it appeared that having it enabled was still slowing down my laptop. I am attempting to solve the problem on the BOINC forums over here. I guess what I'm getting at is: check your event log and see if BOINC is constantly suspending computation for no reason. If it is, you may have to add <ignore_intel_dev></ignore_intel_dev> to cc_config to help speed things up even more.
Seti@home classic: 1,456 results, 1.613 years CPU time
ID: 1996566
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1996569 - Posted: 3 Jun 2019, 15:26:37 UTC - in response to Message 1996566.  

I would think that simply setting to not use the iGPU in computing preferences for every project would be sufficient.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1996569
Profile Bill Special Project $75 donor
Volunteer tester
Joined: 30 Nov 05
Posts: 282
Credit: 6,916,194
RAC: 60
United States
Message 1996571 - Posted: 3 Jun 2019, 15:33:00 UTC - in response to Message 1996569.  

I would think that simply setting to not use the iGPU in computing preferences for every project would be sufficient.
I would agree, but I found that if I did just that and had the laptop plugged into the docking station, I would get symptoms similar to thrashing. Ignoring it in cc_config got rid of that symptom. I'm guessing it's a bug, but I'm not sure if it's a problem I'm having alone or if others can replicate it.
Seti@home classic: 1,456 results, 1.613 years CPU time
ID: 1996571
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1996573 - Posted: 3 Jun 2019, 15:42:10 UTC - in response to Message 1996571.  

Sounds like BOINC is detecting normal iGPU duties of driving the laptop screen as "suspend computing when computer in use" and also "suspend gpu computing when computer in use" is taking precedence as designed.

Not sure how excluding the iGPU device would change the behavior. If the computer is busy for whatever reason, suspend computing. Unless you turn that off in the computing preferences of course.
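
For anyone who wants to rule those "in use" rules out locally, here is a minimal sketch of a global_prefs_override.xml in the BOINC data directory. The element names are given as I recall them from the BOINC preferences documentation and should be checked against your client version; the values are illustrative assumptions, not recommendations.

```xml
<global_preferences>
    <!-- 1 = keep CPU tasks running while the user is active -->
    <run_if_user_active>1</run_if_user_active>
    <!-- 1 = keep GPU tasks running while the user is active -->
    <run_gpu_if_user_active>1</run_gpu_if_user_active>
    <!-- Assumed meaning: 0 = never suspend because of non-BOINC CPU usage -->
    <suspend_cpu_usage>0</suspend_cpu_usage>
</global_preferences>
```

If the suspensions stop with these overrides in place, the cause is more likely a preference rule than the iGPU itself. The client has to re-read its local preferences (or be restarted) before the file takes effect.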
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1996573
Profile Bill Special Project $75 donor
Volunteer tester
Joined: 30 Nov 05
Posts: 282
Credit: 6,916,194
RAC: 60
United States
Message 1996576 - Posted: 3 Jun 2019, 15:56:39 UTC - in response to Message 1996573.  

Sounds like BOINC is detecting normal iGPU duties of driving the laptop screen as "suspend computing when computer in use" and also "suspend gpu computing when computer in use" is taking precedence as designed.

Not sure how excluding the iGPU device would change the behavior. If the computer is busy for whatever reason, suspend computing. Unless you turn that off in the computing preferences of course.
I agree, something weird is happening and it doesn't make sense. Task Manager wasn't showing any other activity high enough to suspend Boinc (actually, no activity of any significant percentage). When the computer is docked, the iGPU is driving the laptop display, and the Quadro is driving the external monitor. I'm not sure why having the laptop undocked makes it work better, because it should be just driving the laptop display with the iGPU.

As much as I'm curious why this is a problem, I'm not spending any time troubleshooting it unless someone can identify this as a bug and wants to troubleshoot it. I only wanted the iGPU enabled to try it for Einstein, but I suspect either it would not perform well enough to run, or it would bog down the laptop in the background while using it enough that it wouldn't be worth it.
Seti@home classic: 1,456 results, 1.613 years CPU time
ID: 1996576
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22215
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1996579 - Posted: 3 Jun 2019, 16:15:13 UTC

One of the options within BOINC is "suspend while on battery", make sure that isn't ticked.
Or, if it isn't see what happens when you plug a PSU in.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1996579
Profile Bill Special Project $75 donor
Volunteer tester
Joined: 30 Nov 05
Posts: 282
Credit: 6,916,194
RAC: 60
United States
Message 1996580 - Posted: 3 Jun 2019, 16:18:29 UTC - in response to Message 1996579.  

One of the options within BOINC is "suspend while on battery", make sure that isn't ticked.
Or, if it isn't see what happens when you plug a PSU in.
I do have it suspend while on battery. I'm only crunching when plugged into the PSU, whether docked or undocked.
Seti@home classic: 1,456 results, 1.613 years CPU time
ID: 1996580
Profile Cliff Harding
Volunteer tester
Joined: 18 Aug 99
Posts: 1432
Credit: 110,967,840
RAC: 67
United States
Message 1996583 - Posted: 3 Jun 2019, 16:42:31 UTC - in response to Message 1996580.  

One of the options within BOINC is "suspend while on battery", make sure that isn't ticked.
Or, if it isn't see what happens when you plug a PSU in.
I do have it suspend while on battery. I'm only crunching when plugged into the PSU, whether docked or undocked.


I have a desktop I7/7700k with an iGPU. Even though I don't have the 'Use Intel GPU' checked in my global project preferences, I also have the following argument in my cc_config.xml file -- <ignore_intel_dev>N</ignore_intel_dev> ( replace N with the iGPU device #, usually 0). This will require the client to be restarted, and will explicitly ignore the iGPU which you will see at the beginning of the event file. Some might say it's overkill, but better safe than sorry.
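
For context, a minimal sketch of where that line sits in a complete cc_config.xml. The device number 0 is an assumption; use whatever number the Event Log reports for the Intel GPU at startup.

```xml
<cc_config>
    <options>
        <!-- Assumption: the Intel GPU is device 0; confirm in the Event Log startup lines -->
        <ignore_intel_dev>0</ignore_intel_dev>
    </options>
</cc_config>
```

The file lives in the BOINC data directory, and the tag goes inside the options section rather than at the top level.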


I don't buy computers, I build them!!
ID: 1996583
Profile Bill Special Project $75 donor
Volunteer tester
Joined: 30 Nov 05
Posts: 282
Credit: 6,916,194
RAC: 60
United States
Message 1996585 - Posted: 3 Jun 2019, 16:59:00 UTC - in response to Message 1996583.  

I have a desktop I7/7700k with an iGPU. Even though I don't have the 'Use Intel GPU' checked in my global project preferences, I also have the following argument in my cc_config.xml file -- <ignore_intel_dev>N</ignore_intel_dev> ( replace N with the iGPU device #, usually 0). This will require the client to be restarted, and will explicitly ignore the iGPU which you will see at the beginning of the event file. Some might say it's overkill, but better safe than sorry.
I do have ignore_intel_dev in my cc_config. I don't have N or 0 in at all; it looks exactly like this in my cc_config:

<ignore_intel_dev></ignore_intel_dev>

When I have this line in my cc_config, everything is fine and I have no problems. The problem I have is when I delete this line from my cc_config in attempt to download iGPU tasks for Einstein, I still get intermittent computation suspensions. In this situation, I have 'Use Intel GPU' unchecked in Seti, but checked in Einstein. Even though I am allowing iGPU tasks for Einstein, none are being downloaded. So, even though the iGPU is allowed to grab tasks for one project, it isn't, and at the same time I get the thrashing symptom.

I appreciate everyone's help with trying to troubleshoot this, but I feel we're straying away from AllgoodGuy's original post...especially as my problem is more related to Einstein than Seti. My intent was not to sidetrack this thread, but to just point out a potential problem that could be slowing down AllgoodGuy's computers further. I agree with Cliff, adding the ignore_intel_dev line to cc_config is the right way to go if you do not want to use the iGPU at all.
Seti@home classic: 1,456 results, 1.613 years CPU time
ID: 1996585
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1996587 - Posted: 3 Jun 2019, 17:34:05 UTC - in response to Message 1996585.  

At the risk of further hijacking the OP content, Einstein normally requires a fairly hefty gpu for crunching with as much bandwidth as allowed. The iGPU would not be a good fit. Also what does the event log show for work_fetch_debug for Einstein when you attempted?
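
In case it helps, a minimal sketch of how that log flag is switched on in cc_config.xml. The flag name is the one mentioned above; the assumption is that it sits in the log_flags section alongside any options you already have.

```xml
<cc_config>
    <log_flags>
        <!-- 1 = log the client's work-fetch decisions for each project and resource -->
        <work_fetch_debug>1</work_fetch_debug>
    </log_flags>
</cc_config>
```

After the client re-reads its config files, the Event Log should show why it is or isn't requesting Intel GPU work from Einstein. The output is verbose, so set the flag back to 0 once you have what you need.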
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1996587
AllgoodGuy

Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 266
United States
Message 1996598 - Posted: 3 Jun 2019, 19:04:15 UTC
Last modified: 3 Jun 2019, 19:12:40 UTC

Bill,

The iGPU has some definite bugs. If you remember my last post, that's exactly why I was trying to utilize it: it was flatly ignoring the per-project exclusion section in my cc_config.

I blocked it both at the web preferences page, as well as the Nvidia GPUs as I have none, and in the cc_config as you've shown above, except I explicitly put in dev 0
(<ignore_intel_dev>0</ignore_intel_dev>).

It may be bypassing yours as you don't have a number in there. The tag looks like it expects a numeric argument, but I have no proof of this. Raistmer would be the only one I could think of who would definitively know the answer to that, except programmers for the actual BOINC client you have installed.

**Edit

The only other thing I can think of, which might be why you're having an issue of bypassing, would be that perhaps the GPUs were reordered by the OS, in such a fashion that your numbering changed in the client_state.xml file. I'm not sure how this file obtains its ordering, but I've seen some mention that the ordering is dependent upon the perceived power of individual cards. Make sure that the device_num corresponds to the device_num in that file...just an idea.
ID: 1996598
Profile Bill Special Project $75 donor
Volunteer tester
Joined: 30 Nov 05
Posts: 282
Credit: 6,916,194
RAC: 60
United States
Message 1996599 - Posted: 3 Jun 2019, 19:19:15 UTC - in response to Message 1996587.  

At the risk of further hijacking the OP content, Einstein normally requires a fairly hefty gpu for crunching with as much bandwidth as allowed. The iGPU would not be a good fit. Also what does the event log show for work_fetch_debug for Einstein when you attempted?
I was going to post event log info, but then I realized I would totally be hijacking. I started a new thread in Einstein here. Oddly enough, I removed the exclusion line in cc_config, and so far I have not had any computation suspensions.
Seti@home classic: 1,456 results, 1.613 years CPU time
ID: 1996599
AllgoodGuy

Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 266
United States
Message 1996600 - Posted: 3 Jun 2019, 19:25:43 UTC - in response to Message 1996599.  

I'm not so much worried about hijacking as I am getting you straight :)

This might be of limited use for your specific circumstances if you want to try in your cc_config

<exclude_gpu>
<url>http://setiathome.berkeley.edu/</url>
<type>intel</type>
<app>astropulse_v7</app>
</exclude_gpu>
<exclude_gpu>
<url>http://setiathome.berkeley.edu/</url>
<type>intel</type>
<app>setiathome_v8</app>
</exclude_gpu>

But remember that this is what was being bypassed on my Mac mini 2018 with the eGPUs.
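
One thing that may be worth checking, offered as an assumption rather than a diagnosis: as far as I recall, the client configuration documentation lists the accepted type values as NVIDIA, ATI, and intel_gpu, so a type of "intel" might simply not match anything. A hedged sketch of the same exclusion with that spelling and an explicit device number:

```xml
<cc_config>
    <options>
        <exclude_gpu>
            <url>http://setiathome.berkeley.edu/</url>
            <!-- Assumption: documented values are NVIDIA, ATI, intel_gpu -->
            <type>intel_gpu</type>
            <!-- Optional; assumes the iGPU is device 0. Omit to match every device of this type -->
            <device_num>0</device_num>
            <!-- Optional; omit to exclude the GPU for all of the project's apps -->
            <app>setiathome_v8</app>
        </exclude_gpu>
    </options>
</cc_config>
```

The url has to match the project URL exactly as the client lists it, so compare it against what the Event Log prints for the project at startup.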
ID: 1996600
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1996604 - Posted: 3 Jun 2019, 19:42:41 UTC - in response to Message 1996598.  

I believe you are correct in that the ignore statement is not acted upon if there is no numerical argument.
https://boinc.berkeley.edu/wiki/Client_configuration#Application_configuration
The number value to use is the gpu number that BOINC assigns the gpu in the Event Log startup when it polls for devices and looks for graphics drivers. Likely gpu0 but you need to check for sure.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1996604
Profile Bill Special Project $75 donor
Volunteer tester
Joined: 30 Nov 05
Posts: 282
Credit: 6,916,194
RAC: 60
United States
Message 1996605 - Posted: 3 Jun 2019, 19:47:13 UTC - in response to Message 1996604.  

I believe you are correct in that the ignore statement is not acted upon if there is no numerical argument.
https://boinc.berkeley.edu/wiki/Client_configuration#Application_configuration
The number value to use is the gpu number that BOINC assigns the gpu in the Event Log startup when it polls for devices and looks for graphics drivers. Likely gpu0 but you need to check for sure.
I disagree that you need a numerical value. Yes, it is probably better to do so, but it seems to have worked (with respect to disabling the device) without one. I don't have the iGPU disabled right now, but when I did the Intel GPU would not even be listed in the computer profile.

Here's my computer in question: https://setiathome.berkeley.edu/show_host_detail.php?hostid=8582028
Seti@home classic: 1,456 results, 1.613 years CPU time
ID: 1996605
AllgoodGuy

Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 266
United States
Message 1996608 - Posted: 3 Jun 2019, 19:56:49 UTC - in response to Message 1996604.  

The number value to use is the gpu number that BOINC assigns the gpu in the Event Log startup when it polls for devices and looks for graphics drivers. Likely gpu0 but you need to check for sure.

That would present a problem if it is done from dynamic management from the OS, as it is theoretically, and actually from observation, possible to reorganize devices and their identifiers. In most practice, I haven't seen any reordering of devices happen for onboard devices unless specifically blocked from the BIOS, which I've never seen for an onboard video device.

However, this might be happening if the difference is docking and undocking on a laptop, as docks tend to have their own devices and identifiers. Outside of scripting for these conditions and watching the Windows Registry inside the HKEY_CURRENT_CONFIG or whatever it is named (going from memory here), I couldn't really find a better solution, except perhaps swapping configs based on what I was doing at the time.

It might be worth noting for the two conditions.
ID: 1996608
AllgoodGuy

Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 266
United States
Message 1996611 - Posted: 3 Jun 2019, 20:08:56 UTC - in response to Message 1996605.  
Last modified: 3 Jun 2019, 20:25:15 UTC

The <exclude_gpu> tag I posted above may use a device number, but excludes the entire brand of GPU without it. I'm not sure if the other tag is so favorable.

Anything is possible at this point; we are pretty much spitting in the wind as to why we are getting these anomalies in our clients. The fact that we are seeing similar symptoms across Windows and macOS clients suggests that the issue is not specifically OS related, but perhaps client related.

I'll bet you're going to hate my next post when it comes to actually understanding the command line options for the mb_blah.txt files. :) It will be a few days, still formulating a lot of my questions, and awaiting 2 new GPUs that I'm excited about. Vega Frontier Editions, provided they don't arrive DOA.

** Edit

I'll update the graphs above around that time, and include the previous 40 days stats from BOINCstats at that time to show the improvement as these CPUs show their real potentials. Grant (SSSF) really threw me for a loop with that question of...did you measure the performance without it? Hello Mr. Obvious, long time listener, first time caller.
ID: 1996611
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1996618 - Posted: 3 Jun 2019, 21:08:52 UTC - in response to Message 1996611.  

That's the difference between <exclude_gpu> and <ignore_intel_dev>N</ignore_intel_dev>
The gpu exclude is used to exclude a particular device from a project or an app. I use it to exclude my Turing RTX 2080 from GPUGrid since they don't have an app that doesn't error with any Turing card yet.
But I use the 2080 for Seti, Einstein and MilkyWay just fine.
The ignore means that BOINC totally ignores that device or devices from all projects.
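
To make the contrast concrete, a minimal sketch with both forms in one cc_config.xml. The project URL is a hypothetical placeholder and the device numbers are assumptions; substitute the values from your own Event Log.

```xml
<cc_config>
    <options>
        <!-- Scoped: keep NVIDIA device 0 away from one project only.
             The URL is a hypothetical placeholder, not a real project. -->
        <exclude_gpu>
            <url>https://project.example/</url>
            <type>NVIDIA</type>
            <device_num>0</device_num>
        </exclude_gpu>
        <!-- Global: the client drops Intel device 0 for every project -->
        <ignore_intel_dev>0</ignore_intel_dev>
    </options>
</cc_config>
```

With the first block the card still crunches for every other project; with the second the device should not appear in the client's GPU list at all after a restart.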
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1996618


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.