NVidia 436.xx and later drivers can cause very long compute times especially on Arecibo VHAR work units

Questions and Answers : GPU applications : NVidia 436.xx and later drivers can cause very long compute times especially on Arecibo VHAR work units
gstar

Joined: 12 Jan 05
Posts: 2
Credit: 6,687,402
RAC: 91
United States
Message 2015368 - Posted: 14 Oct 2019, 1:38:33 UTC

Task time remaining of 61 and 47 days.
I just installed a new NVidia card, and now I'm getting GPU jobs telling me they are going to take 61 days to complete.
I aborted them, and now have new jobs claiming 47 days to complete.
How do I prevent these long jobs from happening?
Mr. Kevvy
Volunteer moderator
Volunteer tester
Joined: 15 May 99
Posts: 3776
Credit: 1,114,826,392
RAC: 3,319
Canada
Message 2015377 - Posted: 14 Oct 2019, 3:41:01 UTC - in response to Message 2015368.  
Last modified: 14 Oct 2019, 3:43:11 UTC

The driver version is 436.48... the NVidia 436 series is known to cause this issue.

I would suggest downloading the 431.xx WHQL drivers from nvidia.com and installing them with the Clean Install option, which removes the existing driver.

Edit: As I did with the Number Crunching thread, I will rename and pin this one. There's no ETA for a fix from NVidia yet (as far as I know, they have yet to even acknowledge the issue), so there will doubtless be more people experiencing the problem.
gstar

Joined: 12 Jan 05
Posts: 2
Credit: 6,687,402
RAC: 91
United States
Message 2015382 - Posted: 14 Oct 2019, 4:22:04 UTC - in response to Message 2015377.  

Thank you for the quick response.
I've installed Nvidia version 431.86.
We'll see how this version runs.
Mr. Kevvy
Volunteer moderator
Volunteer tester
Joined: 15 May 99
Posts: 3776
Credit: 1,114,826,392
RAC: 3,319
Canada
Message 2015407 - Posted: 14 Oct 2019, 11:07:45 UTC - in response to Message 2015382.  

You're welcome. I think it will be much improved. :^)
ThomaZ

Joined: 24 Oct 03
Posts: 2
Credit: 5,751,137
RAC: 13
Norway
Message 2020655 - Posted: 26 Nov 2019, 20:53:21 UTC

Lately Boinc has stopped utilizing my RTX2080 for calculations. Everything was working fine for months, but one day I started getting "calculation error" on the GPU work units, and now Boinc will start a GPU task but get stuck immediately, before reaching 1% completion. It still says "running" even though it's not. If I remove the project and add it again, the GPU will calculate for some hours, but then get stuck again. I've tried removing and re-adding seti@home many times now, but to no avail. Suspending and resuming the GPU task does nothing, although aborting the GPU task will sometimes fix the problem for some hours before it returns. My GPU is set to always be in use. Everything is fine with CPU calculations though, and I haven't touched any settings since everything was running smoothly. Can anyone help? If more information is needed, please let me know.

Thanks!

Thomas
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2020658 - Posted: 26 Nov 2019, 21:39:08 UTC - in response to Message 2020655.  

Revert to the 431.xx drivers.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ThomaZ

Joined: 24 Oct 03
Posts: 2
Credit: 5,751,137
RAC: 13
Norway
Message 2020970 - Posted: 28 Nov 2019, 20:55:46 UTC - in response to Message 2020658.  

Thank you!
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13733
Credit: 208,696,464
RAC: 304
Australia
Message 2021037 - Posted: 29 Nov 2019, 6:33:12 UTC

Win10 will continue to update the driver as part of its Windows Update process. As you're not running the Pro version, you will need to edit the registry to stop Windows from updating the driver whenever it gets the urge to do so.
How to stop updates for drivers with Windows Update using the Registry
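
For reference, the registry change usually described for this is a policy value under the Windows Update policies key. The sketch below only generates the text of a .reg file; it does not touch the registry. The key path and the value name ExcludeWUDriversInQualityUpdate are assumptions taken from commonly published Windows 10 instructions, not from this thread, so check them against the linked guide before importing anything.

```python
# Sketch: build the text of a .reg file that asks Windows Update to skip drivers.
# ASSUMPTION: the key path and value name below come from commonly published
# Windows 10 instructions; they are not quoted anywhere in this thread.
POLICY_KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Windows\WindowsUpdate"
VALUE_NAME = "ExcludeWUDriversInQualityUpdate"


def build_reg_file(exclude_drivers: bool = True) -> str:
    """Return .reg file text; 1 excludes drivers from updates, 0 allows them."""
    data = 1 if exclude_drivers else 0
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{POLICY_KEY}]",
        f'"{VALUE_NAME}"=dword:{data:08x}',
        "",
    ]
    # .reg files conventionally use Windows line endings.
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(build_reg_file())
```

The printed text could be saved to a file with a .reg extension and imported with administrator rights. Generating the file with exclude_drivers=False would produce the value that allows driver updates again.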
Grant
Darwin NT
Poldek

Joined: 30 Mar 03
Posts: 11
Credit: 28,383,096
RAC: 35
Poland
Message 2024923 - Posted: 25 Dec 2019, 13:47:26 UTC
Last modified: 25 Dec 2019, 13:52:04 UTC

My computer has a GTX 1660 Ti graphics card with driver 441.66. GPU tasks (SETI@home) download normally and most of them are processed. Tasks with a short deadline are not processed. All tasks with a deadline more than one month away are processed normally. For tasks with a deadline of, for example, 2 weeks, processing stops at around 0.6%. Should I go back to driver 431.xx? Is that the only solution? How long could it take?
Thanks in advance.
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2024928 - Posted: 25 Dec 2019, 14:12:29 UTC - in response to Message 2024923.  

Changing the drivers to 431.60 is the only solution if you aren’t willing to change to Linux.

Poldek

Joined: 30 Mar 03
Posts: 11
Credit: 28,383,096
RAC: 35
Poland
Message 2024946 - Posted: 25 Dec 2019, 17:21:52 UTC
Last modified: 25 Dec 2019, 17:22:18 UTC

Thank you for your response. I have Linux, but on another computer without a GeForce ;-) Does NVIDIA know about this, and do they have any plans to fix it? Staying with the old driver is a bit uncomfortable.
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2024953 - Posted: 25 Dec 2019, 17:46:56 UTC - in response to Message 2024946.  

Nvidia is aware of the problem, but as I understand it, it’s low priority for them.

Poldek

Joined: 30 Mar 03
Posts: 11
Credit: 28,383,096
RAC: 35
Poland
Message 2024955 - Posted: 25 Dec 2019, 18:10:31 UTC

Thank you for the information.
RAF
Joined: 3 Dec 02
Posts: 1
Credit: 5,963,229
RAC: 18
United Kingdom
Message 2025456 - Posted: 29 Dec 2019, 15:22:45 UTC
Last modified: 29 Dec 2019, 15:24:22 UTC

I'm running SETI@home on a few computers/laptops.
One of them is a Ryzen 5 3600X with a Gigabyte RTX 2060 OC.
No matter how many cores I use or how much CPU time I allow, the CPU works fine (all I need is a better cooler, as the CPU reaches 85C max).
The graphics card works fast, when it works.
The GPU finishes one task but has trouble starting the next.
The task locks the GPU at 100% GPU clock and GPU memory clock, but no work is done on the new task.
Is there any way to resolve this problem?

Windows has installed all updates as well as new drivers.

Gigabyte X570 Pro WiFi
Ryzen 5 3600X
Gigabyte RTX 2060 OC Gaming Pro 6GB
16GB DDR4 3000
2x250GB Samsung 970 EVO (RAID 0)
Windows 10 Home 64bit



https://postimg.cc/xqfS1BXD
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2025457 - Posted: 29 Dec 2019, 15:43:06 UTC - in response to Message 2025456.  

As the title of this thread says, the issue is the video drivers: "436.xx and later". You are using 441.xx on most of your systems, which counts as "and later".

Revert to driver 431.60 and you will be fine.

Marcos

Joined: 28 Sep 00
Posts: 2
Credit: 246,588
RAC: 9
Spain
Message 2046513 - Posted: 23 Apr 2020, 14:12:51 UTC

Hi, I have an AMD 1700 with an Nvidia RTX 2060. After a clean install of Windows 10 (my old Nvidia 970 stopped working, so I decided to do a full reset), the system worked properly for the first few days. I had the 442 drivers, and SETI worked on those new WUs after it went into hibernation without issues. For the last two days I have not been able to process those Nvidia GPU WUs at all. The problem is not just that they run slowly: a task halts after 1, 2 or 10% is completed, and then I get a BSOD. Every attempt to process those WUs ends in a system reset.
I have tried several drivers (431, 436, 442, 445), and the same thing happens every time.
I have tried just playing games and the card works fine (max 58 ºC); it also works fine with milkyway@home GPU work units.
Is it related, or is it a different problem?
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13733
Credit: 208,696,464
RAC: 304
Australia
Message 2046623 - Posted: 23 Apr 2020, 22:00:56 UTC - in response to Message 2046513.  
Last modified: 23 Apr 2020, 22:01:23 UTC

Is it related, or is it a different problem?
Possibly related.
Nvidia dropped the fix from a whole group of drivers; the most recent driver that has the fix is v445.87. Driver v442.59 is the highest version (before v445.87) that I know worked.
And you need to get it from Nvidia; the ones delivered through Windows Update can be problematic.
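
The version reports scattered through this thread can be collected into a small check. This is only a sketch of what posters here have said, not an official compatibility list: the cut-off at 436.0 comes from the thread title, the "reported fixed" set holds the two versions named above, and the function and constant names are made up for illustration. Treating everything below 436 as unaffected is an assumption, since the thread only vouches for 431.xx.

```python
from typing import Tuple

# Versions taken from posts in this thread (community reports, not NVIDIA data).
FIRST_AFFECTED = (436, 0)                  # "436.xx and later" per the thread title
REPORTED_FIXED = {(442, 59), (445, 87)}    # versions a poster said contain the fix


def parse_version(text: str) -> Tuple[int, int]:
    """Turn a string such as '431.60' or 'v445.87' into (major, minor)."""
    major, minor = text.strip().lstrip("vV").split(".")[:2]
    return int(major), int(minor)


def classify(text: str) -> str:
    """Classify a Windows driver version according to the reports above."""
    version = parse_version(text)
    if version in REPORTED_FIXED:
        return "reported fixed"
    if version < FIRST_AFFECTED:
        # ASSUMPTION: anything older than 436 behaves like the 431.xx series.
        return "reported unaffected"
    return "possibly affected"


if __name__ == "__main__":
    for v in ("431.60", "436.48", "441.66", "442.59", "445.87"):
        print(v, "->", classify(v))
```

The installed version can usually be read with `nvidia-smi --query-gpu=driver_version --format=csv,noheader` and passed to classify() as a string.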

In a few weeks' time it won't be an issue here anyway, as Seti is winding up. Only dribs & drabs, with the occasional burst of work, are available now.
Grant
Darwin NT
Marcos

Joined: 28 Sep 00
Posts: 2
Credit: 246,588
RAC: 9
Spain
Message 2046663 - Posted: 24 Apr 2020, 1:16:40 UTC - in response to Message 2046623.  

Thank you, I will try v442.59 to see if it works. I know that I could simply abort all those work units and forget about them, but since I already have them (69, I think) I would like to complete them and give Seti@home a good farewell... for now... :)
Thanks!



 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.