NVidia 436.xx and later drivers can cause very long compute times especially on Arecibo VHAR work units


Jacob Klein
Volunteer tester
Joined: 15 Apr 11
Posts: 149
Credit: 9,783,406
RAC: 9
United States
Message 2030467 - Posted: 2 Feb 2020, 4:49:00 UTC
Last modified: 2 Feb 2020, 4:49:12 UTC

I only test this on my main System 1 (RTX 2080, GTX 980 Ti, GTX 980), and I have now tested it.

The results are:

- Win 10 v1909 Release (fully patched)
> DDU and restart makes Windows Update install 432.00, which works fine for Arecibo VHAR

- Win 10 Insider Build 19555
> DDU and restart makes Windows Update install 450.12, which is broken for Arecibo VHAR, showing the same broken behavior as discussed.

Regards,
Jacob
ID: 2030467
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14655
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2030481 - Posted: 2 Feb 2020, 8:23:20 UTC - in response to Message 2030431.  

I think their versions of the drivers have been hit or miss regarding whether they have removed the OpenCL parts. Seems like I remember reports that some releases have been fine while others were missing.
When I first set up that Windows 10 test partition - at least two years ago - Microsoft sent me

* a CUDA (only) driver for the NVidia card
* an OpenCL driver for the Intel GPU
* and the Intel driver allowed OpenCL to run on the NVidia card, too
ID: 2030481
VelocityRC
Joined: 27 Sep 19
Posts: 23
Credit: 1,421,582
RAC: 86
United States
Message 2030531 - Posted: 2 Feb 2020, 15:15:31 UTC - in response to Message 2030340.  
Last modified: 2 Feb 2020, 15:18:04 UTC

Is there a way to check the provenance of a Windows Nvidia driver?

I think all these Windows hosts with driver version 432.00 are using a Microsoft-delivered driver about which we know nothing. The driver could be good or bad, we just don't know. Somebody needs to run the 432.00 driver against some known VHAR Arecibo work and see if the tasks stall out or complete normally in the benchmark tools.


Already answered. So you folks are telling me I need to check after W-10 updates and roll back my nVidia drivers if needed?
ID: 2030531
VelocityRC
Joined: 27 Sep 19
Posts: 23
Credit: 1,421,582
RAC: 86
United States
Message 2030532 - Posted: 2 Feb 2020, 15:32:11 UTC - in response to Message 2030374.  

Well, since I announced the availability of the test build in this thread (and the similar parallel ATI thread), I'd sort of assumed that people would report back here. I'd prefer the test reports to be publicly visible for peer review, rather than hidden in PMs, but I'll try to read it wherever you post. But be aware I can't see inside private team discussion groups.

But anyway - thanks for the positive vote. I won't do a public release this late in the day, UK time, but I'm minded to do it tomorrow morning - say any time after 12 hours from now.


A lot of this GPU language is over my head, and I'm not sure I have the time to learn all that is needed to understand some of this. I have heard of CUDA, but my knowledge is limited. If I recall, it is an alternate driver for nVidia. Is that correct? Us greenhorns would like a driver download link, preferably an nVidia one, but that seems to be an issue ATM. I don't mind using a third-party GPU driver with a link posted here.

JMHO

Bill S.

Alternative drivers are usually available only for the less common graphics boards for which no BOINC projects produce suitable workunits.

CUDA is a computer language for Nvidia boards only, and only Nvidia produces drivers that can use it. These drivers can now also use another computer language, OpenCL, which other GPU companies support in their own drivers. Microsoft (the source of Windows) edits these drivers to produce alternate versions with any CUDA and OpenCL support removed, and distributes those versions instead.

If you find any other third party drivers, don't expect them to be useful for any BOINC work.


OK, thanks for the clarification. I guess I'm doing the right thing for now, all seems to work.

WOW !!! I have to say to all how nice everyone is on this forum. I ask newb questions that might make some of the more experienced shake their heads, (how many times do we have to cover this) but so far everyone has been genuinely helpful. It's a fresh feeling from some of the other forums I frequent. THANK YOU !!!

Bill S
ID: 2030532
VelocityRC
Joined: 27 Sep 19
Posts: 23
Credit: 1,421,582
RAC: 86
United States
Message 2030535 - Posted: 2 Feb 2020, 15:50:10 UTC - in response to Message 2030335.  

@VelocityRC: You currently are showing 'NVIDIA GeForce GTX 1050 (2048MB) driver: 432.00'. That's earlier than the number in the thread title. You're good to continue as you are - no change needed.

Edit - while I was posting, Keith Myers suggested that one might fall into the gap between 'known good' and 'known bad'. If you have any problems with SETI tasks(*), you might be better following the link for general downloaders.

(*) apart from not getting any...

@everyone else: https://www.nvidia.com/Download/Find.aspx?lang=en-us is probably the place to look. Fill in everything, and you should get something like this:

[Screenshot of the driver search results, not preserved]

Don't use the ones I've crossed out: use the green one at the bottom.


431.60 is the one I installed from nVidia's "older versions" link a few weeks back. Why is my machine showing 432.00 when I look at it on here? nVidia doesn't show a driver in their list starting with 432 for 1050s on 64-bit W10. Did W10 do something? Do I need to roll back again?

Bill S.
ID: 2030535
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14655
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2030538 - Posted: 2 Feb 2020, 16:13:55 UTC - in response to Message 2030535.  
Last modified: 2 Feb 2020, 16:14:19 UTC

431.60 is the one I installed from nVidia's "older versions" link a few weeks back. Why is my machine showing 432.00 when I look at it on here? nVidia doesn't show a driver in their list starting with 432 for 1050s on 64-bit W10. Did W10 do something? Do I need to roll back again?

Bill S.
Your post prompted us to check exactly that.

We've now confirmed:

1) Version 432.00 is supplied by Microsoft when they're doing their monthly security updates.
2) v432.00 seems to be functionally equivalent to v431.60 - you can safely go on using it.
ID: 2030538
VelocityRC
Joined: 27 Sep 19
Posts: 23
Credit: 1,421,582
RAC: 86
United States
Message 2030539 - Posted: 2 Feb 2020, 16:20:26 UTC - in response to Message 2030538.  

431.60 is the one I installed from nVidia's "older versions" link a few weeks back. Why is my machine showing 432.00 when I look at it on here? nVidia doesn't show a driver in their list starting with 432 for 1050s on 64-bit W10. Did W10 do something? Do I need to roll back again?

Bill S.
Your post prompted us to check exactly that.

We've now confirmed:

1) Version 432.00 is supplied by Microsoft when they're doing their monthly security updates.
2) v432.00 seems to be functionally equivalent to v431.60 - you can safely go on using it.


OK thanks. I guess I'll have to keep closer tabs on things that W10 is doing.

Much appreciated. Bill S.
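
For anyone who wants to keep tabs on what Windows Update has installed, here is a minimal Python sketch. It assumes `nvidia-smi` (the command-line tool that ships with NVidia's driver) is on the PATH, which may not be the case on every Windows install. The function names `parse_smi_csv` and `query_gpus` are made up for illustration and are not part of any BOINC or NVidia tool.

```python
import shutil
import subprocess


def parse_smi_csv(text):
    """Parse 'name, driver_version' CSV lines into (name, version) tuples."""
    gpus = []
    for line in text.strip().splitlines():
        if "," not in line:
            continue  # skip blank or unexpected lines
        name, version = (field.strip() for field in line.rsplit(",", 1))
        gpus.append((name, version))
    return gpus


def query_gpus():
    """Ask nvidia-smi for each GPU's name and driver version.

    Returns an empty list if nvidia-smi is missing or fails
    (assumption: treating that as "nothing to report" is acceptable).
    """
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,driver_version",
             "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        )
    except subprocess.CalledProcessError:
        return []
    return parse_smi_csv(result.stdout)
```

Calling `query_gpus()` after each Windows update and comparing the version against the last one you saw should be enough to notice a silent driver swap such as 431.60 becoming 432.00.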
ID: 2030539
6BQ5
Joined: 7 Dec 18
Posts: 29
Credit: 12,725,636
RAC: 357
United States
Message 2030682 - Posted: 3 Feb 2020, 20:27:09 UTC

Check out this link :

https://www.pcgamer.com/nvidias-latest-gpu-driver-offers-a-more-flexible-framerate-cap-to-save-power/

Yes, it's from PC Gamer. It's an article about an update to the Nvidia drivers. You can skip past all the frame rate capping, but don't skip too fast!

I quote from the article, with emphasis.

" On top of it all, there is the usual round of bug fixes. Here they are:

[The Witcher 3: Wild Hunt - Blood and Wine]: The game may crash when a user reaches a specific cut scene.
[Maxwell GPUs][OpenCL]: SETI@Home shows driver TDR occuring on Maxwell GPUs using OpenCL.
[Call of Duty Modern Warfare]: Streaming of gameplay using OBS will randomly stop.
[Battleye][Low-Latency Mode]: Launching Battleye with NVIDIA ......
....
"

Does this help us any?

-=- Boris
ID: 2030682
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14655
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2030683 - Posted: 3 Feb 2020, 20:35:36 UTC - in response to Message 2030682.  

It's well worth checking out, but don't all rush at once - especially with maintenance coming up! We'll be testing

1) the stand-alone download from NVidia
2) the next Microsoft monthly roll-up

and gathering/sharing experiences.
ID: 2030683
Jacob Klein
Volunteer tester
Joined: 15 Apr 11
Posts: 149
Credit: 9,783,406
RAC: 9
United States
Message 2030685 - Posted: 3 Feb 2020, 20:43:31 UTC
Last modified: 3 Feb 2020, 21:43:02 UTC

Go ahead and rush. I'm told the fixes are in the new 442.19 driver version. I'll test them soon. :)
Edit: Fixed to read 442.19
ID: 2030685
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2030704 - Posted: 3 Feb 2020, 22:11:02 UTC

The release notes for 442.19 contain the same note about the SETI@home TDR fix for Maxwell cards.

Not sure why that is specific to Maxwell since those cards are 3 generations old. Unless they are finally getting around to fixing a bug complaint from 2014.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2030704
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14655
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2030707 - Posted: 3 Feb 2020, 22:21:17 UTC - in response to Message 2030704.  

Jacob's original bug report to NVidia mentioned both crashing on Maxwell cards, and stalling on Pascal/Turing cards. They probably filed it under the first part only, as seemingly the more serious problem.
ID: 2030707
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2030714 - Posted: 3 Feb 2020, 22:55:32 UTC - in response to Message 2030707.  

Jacob's original bug report to NVidia mentioned both crashing on Maxwell cards, and stalling on Pascal/Turing cards. They probably filed it under the first part only, as seemingly the more serious problem.

OK, that makes sense I guess. I wonder if the fix was directly in response to Jacob's bug report. He should be able to confirm as he should have been informed of the response to his report for a fix.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2030714
Jacob Klein
Volunteer tester
Joined: 15 Apr 11
Posts: 149
Credit: 9,783,406
RAC: 9
United States
Message 2030716 - Posted: 3 Feb 2020, 23:03:25 UTC
Last modified: 3 Feb 2020, 23:04:30 UTC

I reported the problem using the Driver Feedback links in the Driver Feedback threads.
And I made sure to provide a clear repro, with a OneDrive location for easy repro files.
And then I followed up with an NVIDIA contact to make sure that a bug was created, and routinely asked for updates.

So, while I don't have a bug ID or bug reference number, and I did not get any direct response about the bug report .... I do know that the squeaky wheel may have gotten some grease.
I'll take some credit. :)

Going to test these shortly.
ID: 2030716
Jacob Klein
Volunteer tester
Joined: 15 Apr 11
Posts: 149
Credit: 9,783,406
RAC: 9
United States
Message 2030720 - Posted: 3 Feb 2020, 23:10:53 UTC
Last modified: 3 Feb 2020, 23:11:01 UTC

LOL @ Seeing "SETI@Home" in the driver release notes. Epic.
ID: 2030720
Jacob Klein
Volunteer tester
Joined: 15 Apr 11
Posts: 149
Credit: 9,783,406
RAC: 9
United States
Message 2030722 - Posted: 3 Feb 2020, 23:29:40 UTC

Initial testing is looking very very promising :)
ID: 2030722
Mr. Kevvy Crowdfunding Project Donor*Special Project $250 donor
Volunteer moderator
Volunteer tester
Joined: 15 May 99
Posts: 3776
Credit: 1,114,826,392
RAC: 3,319
Canada
Message 2030724 - Posted: 3 Feb 2020, 23:34:11 UTC - in response to Message 2030720.  

LOL @ Seeing "SETI@Home" in the driver release notes. Epic.


It's becoming commonplace these days... 😃

Radeon Software Adrenalin 2020 Edition 20.1.1 Highlights

Fixed Issues

Fixed result overflows that can be experienced with Radeon RX 5700 series when using SETI@Home.

ID: 2030724
BoincSpy
Volunteer tester
Joined: 3 Apr 99
Posts: 146
Credit: 124,775,115
RAC: 353
Canada
Message 2030735 - Posted: 4 Feb 2020, 0:48:39 UTC

New Nvidia drivers are out (442.19) and I'm installing them; however, the release notes do not indicate any CUDA fixes...
ID: 2030735
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2030737 - Posted: 4 Feb 2020, 0:55:38 UTC - in response to Message 2030735.  

Because there was never any problem with CUDA.

The only issue was with the SoG app, which is OpenCL.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2030737
Jacob Klein
Volunteer tester
Joined: 15 Apr 11
Posts: 149
Credit: 9,783,406
RAC: 9
United States
Message 2030751 - Posted: 4 Feb 2020, 3:24:04 UTC
Last modified: 4 Feb 2020, 3:25:04 UTC

I think 442.19 has definitely fixed it!

In my testing of the 442.19 drivers, I had no problems processing VHAR work items, on my main rig (RTX 2080, GTX 980 Ti, GTX 980) using Windows 10.
All 3 GPUs acted correctly.
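
To sum up the version reports on this page, here is a rough Python sketch of how a host's driver could be sorted into buckets. It is only an illustration built from what posters reported here (431.60 and 432.00 fine, 450.12 from the Insider build broken, 442.19 fixed, and the thread title's "436.xx and later"). The function names and bucket labels are made up, and it only concerns the OpenCL (SoG) app, since CUDA was never affected.

```python
# Versions explicitly reported on this page (not an official NVidia list).
REPORTED_GOOD = {"431.60", "432.00"}
REPORTED_FIXED = {"442.19"}
REPORTED_BROKEN = {"450.12"}  # Windows Update build on Insider 19555


def parse_version(text):
    """Turn a driver string such as '442.19' into a comparable tuple (442, 19)."""
    major, minor = text.strip().split(".")
    return (int(major), int(minor))


def classify(version):
    """Return a hypothetical status label for an NVidia Windows driver version."""
    v = version.strip()
    if v in REPORTED_FIXED:
        return "reported fixed"
    if v in REPORTED_GOOD:
        return "reported good"
    if v in REPORTED_BROKEN:
        return "reported broken"
    number = parse_version(v)
    if number < (436, 0):
        # Earlier than the number in the thread title.
        return "earlier than 436.xx"
    if number < (442, 19):
        # 436.xx up to, but not including, the fixed release.
        return "suspect"
    # Anything newer that nobody here has tested yet.
    return "unknown"
```

Note that a higher version number alone proves nothing: 450.12 is numerically newer than 442.19 but was reported broken two days earlier, which is why it is listed explicitly instead of being handled by comparison.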
ID: 2030751


 