NVidia 436.xx and later drivers can cause very long compute times especially on Arecibo VHAR work units

BoincSpy
Volunteer tester

Joined: 3 Apr 99
Posts: 146
Credit: 124,775,115
RAC: 353
Canada
Message 2013323 - Posted: 25 Sep 2019, 21:28:56 UTC
Last modified: 25 Sep 2019, 21:29:09 UTC

Just saw this happening. I have rebooted to see if the GPUs were stuck... Nope.

I have a GeForce GTX 1060 6GB and a GeForce GTX 1070 Ti using the MB8_win_x86_SSE3_OpenCL_NV_SoG_r3584.exe application. The progress rate is 0.720% per hour.
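To put that rate in perspective, here is a rough back-of-the-envelope sketch. It assumes the progress rate stays constant for the whole task, which may not hold in practice; the function name `hours_to_finish` is just made up for illustration.

```python
def hours_to_finish(rate_percent_per_hour: float, percent_done: float = 0.0) -> float:
    """Estimate remaining hours, assuming a constant progress rate."""
    if rate_percent_per_hour <= 0:
        raise ValueError("progress rate must be positive")
    return (100.0 - percent_done) / rate_percent_per_hour


# Rate reported in this post: 0.720 % per hour, starting from 0 %.
total_hours = hours_to_finish(0.720)
print(f"{total_hours:.1f} hours (~{total_hours / 24:.1f} days)")
```

At 0.720% per hour a task would need roughly 139 hours, close to six days, if nothing changed along the way.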

GeForce driver: 436.30 (updated about a month ago)

WorkUnits being processed: 21jn12ac.5582.19699.15.42.209 and 21jn12ac.24517.23380.3.30.153

I will wait until the WUs are complete to see if it's going to clear itself...

Any ideas on what may be happening?

Thanks in advance...
BoincSpy
ID: 2013323
Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 2013325 - Posted: 25 Sep 2019, 22:02:55 UTC

It's those latest drivers combined with certain work units; roll back to an earlier 430.xx driver. ;-)

Cheers.
ID: 2013325
Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2013341 - Posted: 26 Sep 2019, 1:30:49 UTC - in response to Message 2013323.  

The 436 drivers do not work with VHAR tasks, and those 21jn12ac tasks are at AR=2.7. Revert to the 430 or 431 series drivers.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2013341
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 2013351 - Posted: 26 Sep 2019, 6:57:05 UTC

Does anyone have information as to whether this is just a temporary problem or a permanent change in the drivers?

Because I do play video games and always like to have the latest drivers, if this is a permanent change then I will have to stop crunching on my two Windows machines and update the drivers.
ID: 2013351
Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2013352 - Posted: 26 Sep 2019, 7:38:21 UTC - in response to Message 2013351.  

I don't have any inside information. I just have my suspicion that the problem started with the 436 drivers' new "feature", "integer scaling".

It sounds like some math operation and may be the cause of the issues with compute. It only happens on VHAR work at Seti, but it is causing issues on other projects too.

You can always just abort any VHAR tasks when you get them instead of crunching them.
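For anyone who wants to pick those tasks out, here is a minimal sketch of the idea. It assumes you already know each task's angle range (AR), and it assumes a VHAR cutoff of about 1.127, which is my recollection of the usual definition and is not stated anywhere in this thread. The `Task` type and the third task name are made up for illustration; the first two tasks and AR=2.7 come from the posts above.

```python
from typing import List, NamedTuple

# Assumption: tasks with an angle range above roughly 1.127 count as VHAR.
VHAR_THRESHOLD = 1.127


class Task(NamedTuple):
    name: str
    angle_range: float


def vhar_tasks(tasks: List[Task], threshold: float = VHAR_THRESHOLD) -> List[Task]:
    """Return the tasks whose angle range is above the VHAR threshold."""
    return [t for t in tasks if t.angle_range > threshold]


tasks = [
    Task("21jn12ac.5582.19699.15.42.209", 2.7),
    Task("21jn12ac.24517.23380.3.30.153", 2.7),
    Task("example_mid_ar_task", 0.42),  # hypothetical non-VHAR task
]

to_abort = vhar_tasks(tasks)
for task in to_abort:
    print(f"VHAR, candidate to abort: {task.name} (AR={task.angle_range})")
```

This only flags candidates; actually aborting them is still done by hand in the BOINC Manager or whatever tool you normally use.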
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2013352
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 2013353 - Posted: 26 Sep 2019, 7:56:58 UTC - in response to Message 2013351.  

Because I do play video games and always like to have the latest drivers, if this is a permanent change then I will have to stop crunching on my two Windows machines and update the drivers.
For gaming as well as crunching, it's worth checking out exactly what new drivers have that differs from the existing ones. Unless new drivers address an issue you are having with a particular game, or bring support for a new game that you want to play, there is no benefit in updating video drivers, even for a gaming system.
If it ain't broke, don't fix it.
Grant
Darwin NT
ID: 2013353
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 2013354 - Posted: 26 Sep 2019, 8:02:46 UTC

You can always just abort any VHAR tasks when you get them instead of crunching them.


Well, I don't really micromanage my SETI tasks except during the WOW event; the rest of the year they are "set and forget". I check Boinc Tasks maybe twice a day to see that things look OK, but other than that, no.

As I really need to update my drivers, I will stop GPU crunching until the problem is fixed, if it ever is.
ID: 2013354
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 2013356 - Posted: 26 Sep 2019, 8:35:33 UTC - in response to Message 2013354.  
Last modified: 26 Sep 2019, 8:36:31 UTC

As I really need to update my drivers, I will stop GPU crunching until the problem is fixed, if it ever is.
Do you really need to?
I'm not being facetious, but unless you've got a new game and you need a new driver in order to play it, or the new driver addresses an issue that you're having with your present games, or you've got a new video card that's not supported by your current driver then there is no need to upgrade your video driver. Ever.
Grant
Darwin NT
ID: 2013356
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 2013364 - Posted: 26 Sep 2019, 11:48:22 UTC
Last modified: 26 Sep 2019, 11:51:41 UTC

Do you really need to?
I'm not being facetious, but unless you've got a new game and you need a new driver in order to play it, or the new driver addresses an issue that you're having with your present games, or you've got a new video card that's not supported by your current driver then there is no need to upgrade your video driver. Ever.


Sorry, I have to disagree:

Security Bulletin: NVIDIA GPU Display Driver - August 2019

So basically, any GeForce driver for Windows before 431.60 is not secure.
ID: 2013364
Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2013391 - Posted: 26 Sep 2019, 16:29:29 UTC - in response to Message 2013364.  

Do you really need to?
I'm not being facetious, but unless you've got a new game and you need a new driver in order to play it, or the new driver addresses an issue that you're having with your present games, or you've got a new video card that's not supported by your current driver then there is no need to upgrade your video driver. Ever.


Sorry, I have to disagree:

Security Bulletin: NVIDIA GPU Display Driver - August 2019

So basically, any GeForce driver for Windows before 431.60 is not secure.

So the 431.60 drivers are secure. And they don't cause issues with compute. Seems like a no-brainer.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2013391
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 2013472 - Posted: 27 Sep 2019, 7:08:32 UTC - in response to Message 2013364.  

Sorry, I have to disagree:

Security Bulletin: NVIDIA GPU Display Driver - August 2019

So basically, any GeForce driver for Windows before 431.60 is not secure.
Yep, security is one reason to update even if there is no other benefit (gaming or compute) in doing so.
Given that all of the security issues addressed require local physical access to the computer, I would consider the updates necessary only if you allow other untrusted people to physically use your systems.
If they don't have physical access to your computer, they can't set them up to allow the exploits.

Or as Keith suggested, upgrade to the v431.60 drivers. They have the security patch, and still compute correctly. AFAIK, none of the later patches have anything to do with security. And if they don't address an issue with one of your existing games, or support a new game you wish to play, you don't need to upgrade to them.
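Putting the version advice from this thread into one place, here is a minimal sketch. It assumes version strings of the form "major.minor" with a two-digit minor part, and it assumes that everything from 431.60 up to (but not including) 436 behaves like 431.60, which nobody in this thread has actually confirmed. The function names and the labels are made up for illustration.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int]:
    """Turn a driver version such as '436.30' into a comparable tuple (436, 30)."""
    major, minor = text.split(".")
    return int(major), int(minor)


def classify_driver(version: str) -> str:
    """Classify a GeForce Windows driver according to the reports in this thread."""
    v = parse_version(version)
    if v >= (436, 0):
        return "affected"          # very long compute times reported on VHAR work
    if v >= (431, 60):
        return "patched-and-ok"    # has the August 2019 security fix, computes correctly
    return "ok-but-unpatched"      # computes correctly, lacks the security fix


# 436.30, 436.48 and 431.60 are mentioned in the thread; 430.50 is a made-up example.
for version in ["430.50", "431.60", "436.30", "436.48"]:
    print(version, "->", classify_driver(version))
```

Comparing tuples of integers avoids the usual trap of comparing version strings or floats directly.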
Grant
Darwin NT
ID: 2013472
Jared

Joined: 29 Jul 07
Posts: 9
Credit: 1,176,291
RAC: 0
United States
Message 2013567 - Posted: 28 Sep 2019, 4:02:07 UTC

So I'm running on another machine temporarily, a 3700X with a 2060 super clocked. Initially, SoG work units were processed in about 4 minutes; now it's more like 20. I didn't change any settings that would have had this effect. Can someone advise on config file settings, such as how to dedicate a full core to the GPU? Below are some examples of the change over time.

Fast WUs:

Task 8706493142 WU 3665872549
Task 8076493172 WU 3665872554

After getting slow:

Task 8079095203 WU 3667112846
Task 8079095467 WU 3667112712

Thanks for any help in advance.
ID: 2013567
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2013568 - Posted: 28 Sep 2019, 4:04:41 UTC - in response to Message 2013567.  

You updated the video drivers to the 436 series, which is known to cause issues.

Revert to the 431 drivers and your issues should go away.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2013568
Steve Hayes
Joined: 15 May 99
Posts: 3
Credit: 11,896,037
RAC: 33
United States
Message 2013851 - Posted: 30 Sep 2019, 23:20:10 UTC

I have had this same issue with my 3700X and RTX 2060 since it was updated to the 436.30 driver. I'm glad I found this thread!
ID: 2013851
Mr. Kevvy (Crowdfunding Project Donor, Special Project $250 donor)
Volunteer moderator
Volunteer tester
Joined: 15 May 99
Posts: 3776
Credit: 1,114,826,392
RAC: 3,319
Canada
Message 2013852 - Posted: 30 Sep 2019, 23:22:10 UTC - in response to Message 2013851.  
Last modified: 30 Sep 2019, 23:23:40 UTC

Excellent... that makes me glad I renamed and pinned it. :^)
I figured there are going to be a lot of people having this issue...
ID: 2013852
IntenseGuy
Joined: 25 Sep 00
Posts: 190
Credit: 23,498,825
RAC: 9
United States
Message 2013931 - Posted: 1 Oct 2019, 22:59:57 UTC

Version 436.48 is out now. I am trying to test it.
ID: 2013931
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 2013956 - Posted: 2 Oct 2019, 6:16:31 UTC - in response to Message 2013931.  

Version 436.48 is out now. I am trying to test it.
Given there is no mention of a fix to compute errors under certain circumstances in the release notes, I expect it to still be broken.
Grant
Darwin NT
ID: 2013956
IntenseGuy
Joined: 25 Sep 00
Posts: 190
Credit: 23,498,825
RAC: 9
United States
Message 2013983 - Posted: 2 Oct 2019, 14:20:33 UTC

Yes, some tasks are still failing. Oh well.
ID: 2013983
Jeff

Joined: 8 May 99
Posts: 5
Credit: 98,361,983
RAC: 150
United States
Message 2014007 - Posted: 2 Oct 2019, 18:41:59 UTC - in response to Message 2013983.  

Yes, some tasks are still failing. Oh well.


Thank you for not only taking the time to try it out, but also sharing the results with us here.
You saved me the hassle of giving it a try and then having to revert to the latest good driver.
I appreciate you sharing your findings here with us. By sharing, many benefit from your efforts.

The knowledge that the latest still doesn't work properly is, in itself, good information to know.
ID: 2014007
BoincSpy
Volunteer tester

Joined: 3 Apr 99
Posts: 146
Credit: 124,775,115
RAC: 353
Canada
Message 2014053 - Posted: 3 Oct 2019, 5:48:10 UTC

Thank you everyone for the feedback.

Based on what I have read from the conversations, am I correct in stating that the issue will most likely not get fixed?

Regards,
BoincSpy.
ID: 2014053

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.