NVidia 436.xx and later drivers can cause very long compute times especially on Arecibo VHAR work units
Author | Message |
---|---|
Richard Haselgrove Joined: 4 Jul 99 Posts: 14656 Credit: 200,643,578 RAC: 874 |
I have a host (8121358) with a 1050 Ti. It normally runs Windows 7, but has an alternative boot to Windows 10. I'll switch it over, push for updates, and see what comes. May take a while... Bah - wants a Windows 7 update before I can even shut it down... Into Win 10, downloading 1809 and NVidia 26.21.14.3200 - that looks like the one we want. Except my internet just crashed... It'll be 8508571 when it's ready - it does have a 1050, honest! |
W-K 666 Joined: 18 May 99 Posts: 19144 Credit: 40,757,560 RAC: 67 |
Or if the host computer does not do games, click the "Windows Driver Type" and choose Studio Driver. That one also works. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14656 Credit: 200,643,578 RAC: 874 |
Here we go. Device Manager says [screenshot not preserved], and that's the one I saw downloading from Microsoft. BOINC says: 01/02/2020 18:57:07 | | CUDA: NVIDIA GPU 0: GeForce GTX 1050 Ti (driver version 432.00, CUDA version 10.1, compute capability 6.1, 4096MB, 3376MB available, 2138 GFLOPS peak) 01/02/2020 18:57:07 | | OpenCL: NVIDIA GPU 0: GeForce GTX 1050 Ti (driver version 432.00, device version OpenCL 1.2 CUDA, 4096MB, 3376MB available, 2138 GFLOPS peak) 01/02/2020 18:57:07 | | OpenCL CPU: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz (OpenCL driver vendor: Intel(R) Corporation, driver version 7.6.0.0814, device version OpenCL 2.1 (Build 0)) 01/02/2020 18:57:07 | | app version refers to missing GPU type intel_gpu. I'll sort that last one out later. But I think we've confirmed what came from where. Next I'll run the same bench test (on overflow tasks) that I did for opencl_nvidia_sah six days ago - then hunt for some VHARs. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14656 Credit: 200,643,578 RAC: 874 |
I suspect 432.00 is likely to be good as well, since I think it was likely a "Release 430" driver, and the problems started with 436.02 from the "Release 435" driver branch. Comparing the release dates from the two screen shots, I think you're right. Microsoft's 432.00 is dated 24 July 2019; NVidia's 431.60 is dated 23 July 2019. The bench test (overflow tasks) ran at normal speed with high Q validation. I'll see if I've got any spare VHARs downstairs. |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I only have one VHAR task in my Test WUs directory. I can post it up to my Google drive if you can't find some VHAR tasks in your collection. This one definitely stalled out on Windows 10 with the later drivers. Angle range is 2.7. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14656 Credit: 200,643,578 RAC: 874 |
Still trying to find my Intel HD 530 - I'll give you a shout if I need that VHAR. Ta. |
Jacob Klein Joined: 15 Apr 11 Posts: 149 Credit: 9,783,406 RAC: 9 |
Richard, if you're looking for a VHAR to test the issue in this thread, try this link for the example that I've been testing with, along with repro steps: https://setiathome.berkeley.edu/forum_thread.php?id=84780&postid=2016218 "MBbench - OpenCL Testing\28oc11aa.6787.6611.5.32.85.wu" is the folder that you want. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14656 Credit: 200,643,578 RAC: 874 |
And thanks for that, too. Machine is currently installing 1909 feature update, and I'm gagging for a cup of coffee. Taking a break... |
robertmiles Joined: 16 Jan 12 Posts: 213 Credit: 4,117,756 RAC: 6 |
Well, since I announced the availability of the test build in this thread (and the similar parallel ATI thread), I'd sort of assumed that people would report back here. I'd prefer the test reports to be publicly visible for peer review, rather than hidden in PMs, but I'll try to read it wherever you post. But be aware I can't see inside private team discussion groups. Alternative drivers are usually available only for the less common graphics boards for which no BOINC projects produce suitable workunits. CUDA is a computer language for Nvidia boards only, and only Nvidia produces drivers that can use it. These drivers can now also use another computer language, OpenCL, which other GPU companies' drivers support as well. Microsoft (the source of Windows) edits these drivers to produce alternate versions with CUDA and OpenCL support removed, and distributes those versions instead. If you find any other third-party drivers, don't expect them to be useful for any BOINC work. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14656 Credit: 200,643,578 RAC: 874 |
These M$ updates take ages, don't they? Coffee, pizza, wine - we got there in the end. Downloaded Jacob's test suite, added Keith's WU. @ Jacob - test wouldn't run because the working directory 'Testdatas' was missing. Corrected that, runs fine. Windows 10 task manager shows utilisation, but you have to be careful how you interpret it. On the face of it, GPU utilisation is zero - but find and look at the 'Cuda' data (displayed in top-left window above). That wobbled at 97-98-99% throughout the tasks. Result (quick timetable): WU 21jn12ac.5081.67.5.32.189.wu, setiathome_8.22_windows_intelx86__opencl_nvidia_SoG.exe -verb -nog: Elapsed 362.870 secs, CPU 156.313 secs. WU 28oc11aa.6787.6611.5.32.85.wu, setiathome_8.22_windows_intelx86__opencl_nvidia_SoG.exe -verb -nog: Elapsed 393.679 secs, CPU 186.344 secs. I'd say that was normal for a working driver on this card, and it's still described as "driver version 432.00, device version OpenCL 1.2 CUDA". |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14656 Credit: 200,643,578 RAC: 874 |
Just realised why I can't see the Intel GPU in this version of BOINC - I haven't installed my own patch to make it visible! Updating to v7.16.something... That's better: 01/02/2020 22:32:09 | | Starting BOINC client version 7.16.3 for windows_x86_64 01/02/2020 22:32:10 | | CUDA: NVIDIA GPU 0: GeForce GTX 1050 Ti (driver version 432.00, CUDA version 10.1, compute capability 6.1, 4096MB, 3376MB available, 2138 GFLOPS peak) 01/02/2020 22:32:10 | | OpenCL: NVIDIA GPU 0: GeForce GTX 1050 Ti (driver version 432.00, device version OpenCL 1.2 CUDA, 4096MB, 3376MB available, 2138 GFLOPS peak) 01/02/2020 22:32:10 | | OpenCL: Intel GPU 0: Intel(R) HD Graphics 530 (driver version 26.20.100.7262, device version OpenCL 2.1 NEO, 3231MB, 3231MB available, 202 GFLOPS peak) |
Jacob Klein Joined: 15 Apr 11 Posts: 149 Credit: 9,783,406 RAC: 9 |
1) I assure you that Testdatas is there. If you used OneDrive to download it, I just tested that, and it apparently excludes empty folders, on zip creation. What a POS!! I will be filing that Feedback to Microsoft later. 2) I don't recommend trying to use Task Manager to monitor GPU stuffs. Either use GPU-Z to look at the "GPU Load" sensor, or use MSI Afterburner to view GPU Load. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14656 Credit: 200,643,578 RAC: 874 |
Temporary fix - put a 'placeholder.txt' or similar in the folder. Yes, I downloaded it in .zip format. Windows 10 task manager is OK, if you take care over how you read it. And it's always on the system! |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14656 Credit: 200,643,578 RAC: 874 |
And back to my nice, comfortable Windows 7. Remember that final update I had to do before shut-down? It broke the system - had to run Startup Repair. Be careful of .NET Framework v4.8 |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I'd say that pretty conclusively shows that the MS-supplied 432.00 driver is functionally equivalent to the official Nvidia downloadable 431.60 version. That high-AR test WU did in fact hang on testers' machines using a driver past the 431.60 series. My question and observation is: did MS respond to the community's concern about the high angle range tasks failing on the later drivers by automatically downgrading hosts to a version that is compatible? We haven't seen or heard anything official out of Nvidia on the matter, and I know there have been multiple bug reports and issues logged with them. Did Nvidia quietly, behind the scenes, suggest to MS to roll out the older driver to hosts running BOINC? I know that MS' telemetry probably advertises a BOINC platform host to their servers. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14656 Credit: 200,643,578 RAC: 874 |
I wondered about that, too - it seems like a long time for Microsoft to ignore newer revisions from NVidia. |
robertmiles Joined: 16 Jan 12 Posts: 213 Credit: 4,117,756 RAC: 6 |
I wondered about that, too - it seems like a long time for Microsoft to ignore newer revisions from NVidia. If Microsoft is still removing the CUDA and OpenCL sections of their versions of the Nvidia drivers they distribute, I would not expect them to care whether there are any problems in the sections they removed. |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I wondered about that, too - it seems like a long time for Microsoft to ignore newer revisions from NVidia. I think their versions of the drivers have been hit or miss regarding whether they have removed the OpenCL parts. I seem to remember reports that some releases have been fine while others were missing them. |
Jacob Klein Joined: 15 Apr 11 Posts: 149 Credit: 9,783,406 RAC: 9 |
From my experience, the version of NVIDIA drivers that gets installed for your PC by Windows Update depends on a combination of: (1) what version of Windows 10 you are using (in System > About), and (2) what NVIDIA GPU(s) are in your system. I highly doubt Microsoft would roll back any drivers. And I thought that on newer hardware they were issuing 436.30 by default as of this time. But I must retest. I will do some testing tonight to confirm that, on my 2 systems shown below, on both Win 10 Release and Win 10 Insider Fast. System 1: RTX 2080, GTX 980 Ti, GTX 980. System 2: GTX 970, GTX 1050, GTX 660 Ti. Regarding NVIDIA, I have been in communication with an NVIDIA QA support person who, again, claims our issue is being worked on, and still requires us to be patient. That is all the info I can give at this time. I know it sucks. My workaround continues to be to set Seti@Home to No New Tasks on PCs that I use 436-or-higher drivers on. If Seti@Home supplied a server-side block of some sort for these tasks that run indefinitely on my GPUs, then I might consider unsetting No New Tasks. |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Well, Richard's test this afternoon on a brand new Win 10 installation yielded him the 432.00 drivers on a GTX 1050 Ti. So something else is going on as far as which drivers are offered, probably depending on the hardware detected. [Edit] I'm sure that MS is shipping drivers that are compatible with the detected hardware. For example, someone who PM'd me today has a rig with a GTX 1080 Ti, certainly the top card for the Pascal generation, and was running on the 432.00 drivers. But that is sufficient for that generation. Anyone running the newest Turing cards like the new Supers would of course need a driver that recognizes those new models. And those have version numbers past the cutoff point for avoiding the issue with VHAR tasks. MS does not appear to be automatically upgrading the drivers to the newest releases just on principle. |
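
The rule of thumb that emerges from the posts above (431.60 and 432.00 ran the VHAR test normally, while trouble was reported starting with 436.02) can be sketched as a quick check against a BOINC startup log line. This is only a minimal sketch and not part of BOINC or SETI@home: the regular expression is an assumption based solely on the log lines quoted in this thread, and the names `FIRST_AFFECTED`, `parse_nvidia_driver`, and `is_affected` are hypothetical.

```python
import re

# Assumption taken from this thread: 436.02 is the first driver reported as affected.
FIRST_AFFECTED = (436, 2)

# Matches e.g. "NVIDIA GPU 0: GeForce GTX 1050 Ti (driver version 432.00, ..."
# The pattern is inferred from the log lines quoted above, nothing more.
_DRIVER_RE = re.compile(
    r"NVIDIA GPU \d+: (?P<model>.+?) \(driver version (?P<major>\d+)\.(?P<minor>\d+)"
)


def parse_nvidia_driver(log_line):
    """Return (model, (major, minor)) for an NVIDIA GPU log line, else None."""
    match = _DRIVER_RE.search(log_line)
    if match is None:
        return None
    version = (int(match.group("major")), int(match.group("minor")))
    return match.group("model"), version


def is_affected(version):
    """True if the driver is at or past the first version reported as affected."""
    return version >= FIRST_AFFECTED


if __name__ == "__main__":
    line = (
        "01/02/2020 18:57:07 | | CUDA: NVIDIA GPU 0: GeForce GTX 1050 Ti "
        "(driver version 432.00, CUDA version 10.1, compute capability 6.1, "
        "4096MB, 3376MB available, 2138 GFLOPS peak)"
    )
    parsed = parse_nvidia_driver(line)
    if parsed is not None:
        model, version = parsed
        status = "may be affected" if is_affected(version) else "predates the reported problem"
        print(f"{model}: driver {version[0]}.{version[1]:02d} {status}")
```

Comparing versions as integer tuples avoids the pitfalls of comparing strings or floats. The cutoff itself is only what testers in this thread observed, so it should be treated as provisional until NVIDIA says something official.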
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.