NVidia 436.xx and later drivers can cause very long compute times especially on Arecibo VHAR work units
Author | Message |
---|---|
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
8.16 (opencl_nvidia_sah) 9 Jul 2016, 20:20:13 UTC - let me see how that matches up to the code revision log. Edit - r3551 and r3556 are 30 October 2016 and 05 November respectively. I'm not going to distribute that without re-testing first. Edit2 - I got cuda42 VLARs. This may take some time... Edit3 - downloaded it from http://boinc2.ssl.berkeley.edu/beta/download/setiathome_8.16_windows_intelx86__opencl_nvidia_sah.exe. Thank you Eric for being tidy with your file names. Edit4 - anyone like to guess what OpenCL kernel goes with it? The string in the executable is 'MultiBeam_Kernels_r%d' |
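For anyone wanting to repeat the "string in the executable" check, here is a minimal sketch of a `strings`-style scan. It assumes the format string is stored as plain ASCII in the binary; the function name `find_strings` and the sample bytes are made up for illustration, and the full kernel file name (prefix, extension) is not something the post confirms.

```python
import re

def find_strings(data: bytes, needle: bytes, min_len: int = 4) -> list:
    """Return printable-ASCII runs of at least min_len bytes that contain needle."""
    # Build a pattern matching runs of printable ASCII (space through tilde).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii")
            for m in pattern.finditer(data)
            if needle in m.group()]

# Hypothetical sample standing in for the bytes of the real .exe.
sample = b"\x00\x01MultiBeam_Kernels_r%d\x00abc\x00other string\x00"
hits = find_strings(sample, b"Kernels_r")

# The %d is a printf-style placeholder for the revision number.
kernel_stem = hits[0] % 3486
```

On a real file, `data` would come from something like `pathlib.Path(exe_path).read_bytes()`. The revision 3486 used here is the one mentioned later in the thread for the `sah` build.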
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
Could you locate and send me the Kernel file, please? That would save time. (or just post the r number - must be 3486 or before) |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
Check! Ta. Strangely, that one was a fix for intel_gpu memory issues. He must have sent up a batch of builds for testing. Now to find some overflows for validation testing - shouldn't be too hard right now ;-) |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
Well, I can still drive the knabench - it's running fine. Interestingly, I have a v8.04 CPU reference app, timed about 20 minutes after it was deployed at Beta. But the baseline Main app is 8.00, two days later. I remember a last-minute flap about CPU accuracy before the full deployment of v8 - I'll check. Bench only got Q= 99.80% for the reference workunit - that's a bit low. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
Well, I've gathered two Arecibo and half-a-dozen BLC35 from another machine which still has data to crunch. I want to make it a realistic test on current work - that's what it'll have to cope with if we distribute it. Back to the bench... |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
I think this will do:
10 testWU(s) found
└─(_WisGenA.wu)
└─(_WisGenB.wu)
└─(21ja20ad.23426.17245.6.33.66.wu)
└─(21ja20ad.23903.6611.9.36.8.wu)
└─(blc35_2bit_guppi_58691_62810_HIP23311_0035.19050.409.22.45.121.vlar.wu)
└─(blc35_2bit_guppi_58691_63755_HIP23422_0038.31446.818.21.44.30.vlar.wu)
└─(blc35_2bit_guppi_58691_64069_HIP23311_0039.12742.0.22.45.81.vlar.wu)
└─(blc35_2bit_guppi_58691_64069_HIP23311_0039.14146.818.22.45.38.vlar.wu)
└─(blc35_2bit_guppi_58691_64387_HIP23535_0040.23091.818.22.45.12.vlar.wu)
└─(blc35_2bit_guppi_58692_01957_HIP80644_0118.23993.409.21.44.246.vlar.wu)
2 reference science app(s) found
└─(setiathome_8.00_windows_intelx86.exe -verb -nog)
└─(setiathome_8.16_windows_intelx86__opencl_nvidia_sah.exe -verb -nog)
1 science app(s) found
└─(MB8_win_x86_SSE3_OpenCL_NV_SoG_r3584.exe -verb -nog)
Should give me time to break for lunch, even if they are overflows. Good thing I put in both reference apps - I'm getting errors between 8.00 and 8.04 - and I'm still on the WisGens! Edit - that wasn't what I meant to do, was it? OK, it's giving useful info - I'll let it run. Evidently need more coffee... |
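The mix in that test list (two Arecibo, six BLC35, plus the two WisGen files) can be tallied from the file names alone. This is a rough sketch that only knows the naming patterns visible in this post: it assumes anything starting with `blc` is Breakthrough Listen data, anything starting with `_WisGen` is a bench helper file, and everything else is Arecibo. The function name `classify_workunit` is made up for the example.

```python
def classify_workunit(name: str) -> str:
    """Guess the workunit family from its file name (patterns from this post only)."""
    base = name[:-3] if name.endswith(".wu") else name
    if base.startswith("_WisGen"):
        return "wisgen"
    # Assumption: names not starting with 'blc' are Arecibo recordings.
    source = "BLC" if base.startswith("blc") else "Arecibo"
    if base.endswith(".vlar"):
        return source + " VLAR"
    return source

names = [
    "_WisGenA.wu",
    "21ja20ad.23426.17245.6.33.66.wu",
    "blc35_2bit_guppi_58691_62810_HIP23311_0035.19050.409.22.45.121.vlar.wu",
]
kinds = [classify_workunit(n) for n in names]
```

The `.vlar` suffix is the only angle-range hint carried in the name itself, so the sketch cannot tell a VHAR Arecibo task from a mid-range one.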
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
Well, several re-runs and re-configurations of the bench test suite later, I think I've found a setting which gives us the answer we were looking for. That's not as bad as it sounds. The configuration problems were things like not forcing the test to be done on the NVidia card - and with two GPUs in the system, I wasn't confident about the auto-detection. Anyway, I've now got a run with those 8 randomly-selected VHARs, and
1 reference science app(s) found
(setiathome_8.00_windows_intelx86.exe -verb -nog)
2 science app(s) found
(MB8_win_x86_SSE3_OpenCL_NV_SoG_r3584.exe -verb -nog)
(setiathome_8.16_windows_intelx86__opencl_nvidia_sah.exe -verb -nog)
definitively run on the GTX 1050 Ti with driver version 388.43.
And it's come up smelling of roses: Q 100.00 or 99.99 throughout, and the right signal counts. Bother. It's probably worth putting it in the installer after all, and that means I'll have to tear apart the one I had ready to go, and make space for an extra entry on the selection screen. cuda32 will probably have to go. Tomorrow, I think. |
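A quick way to spot the odd one out in a long run is to filter the Q figures the bench reports. The sketch below is illustrative only: the 99.9 cut-off is an assumption chosen so that 99.99 passes and the earlier 99.80 is flagged, not a project rule, and the workunit labels and the function name `low_q_results` are invented.

```python
def low_q_results(q_by_workunit: dict, threshold: float = 99.9) -> list:
    """Return the names whose Q percentage falls below the (assumed) threshold."""
    return sorted(name for name, q in q_by_workunit.items() if q < threshold)

# Hypothetical labels; the Q values are the ones quoted in this thread.
results = {"wu_a": 100.00, "wu_b": 99.99, "reference_wu": 99.80}
flagged = low_q_results(results)
```

Signal counts still need checking separately, since a high Q on its own does not say the right number of signals was reported.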
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Always happy to help Richard. Ha ha. I don't think anyone will miss CUDA32. Always crashed and burned on the current popular mix of hardware and drivers anyway. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
EdwardPF Send message Joined: 26 Jul 99 Posts: 389 Credit: 236,772,605 RAC: 374 |
If you need a simple-minded tester... that's me... I'm in! Ed F |
Wiggo Send message Joined: 24 Jan 00 Posts: 34930 Credit: 261,360,520 RAC: 489 |
"Always happy to help Richard. Ha ha. I don't think anyone will miss CUDA32. Always crashed and burned on the current popular mix of hardware and drivers anyway." But there are still people out there using 8xxx, 9xxx and other pre-Fermi GPUs that that app suits. ;-) Cheers. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
OK guys, let's make this an open Beta test. Lunatics Installer v0.46 64-bit is available for testing.
For NVidia: includes the 8.16 (opencl_nvidia_sah) application from Beta. This is intended as a temporary workaround for drivers above 431.60 on Windows 10. If Microsoft has updated your drivers, or you have updated them yourself for gaming, use the default option 'MB8_win_x86_SSE3_OpenCL_NV_sah_r3486' (the original name for the same file). If you have older drivers, or if you are running a different version of Windows, go ahead and choose the 'SoG' app - with minor update to r3584.
For AMD/ATI: updated all apps to match the v8.24 stock release, with safety patch for RX 5700-series 'NAVI' GPUs. You must also upgrade to the Adrenalin 2020 Edition 20.1.2 Optional driver or later.
The following link is for a Google Drive folder with two files: the main Installer, and a small configuration file. If you put both files in the same folder, and run the installer, it will run in 'test mode': it will extract the files you request and go through the installation process, but place the results into a separate test folder, leaving your main BOINC installation untouched. If you want to perform the actual installation for real, simply delete the configuration file.
The installation process (which is unchanged) has proved itself over the years, so you shouldn't have any problems. The installer will locate your BOINC installation, stop it, install the new files, and restart it. The restart process sometimes fails: if that happens, just wait a few seconds after the installer has closed, and restart it manually. Or you may prefer to stop and then restart BOINC manually - either will do.
The installer is designed to preserve all SETI tasks in your cache, and run them with the new applications. This is the potentially difficult part, and I'd like to hear if there are any problems with disappearing caches. 
Because it's a possibility, and we're likely to go into a long maintenance outage within 24 hours, you may prefer to defer testing until SETI is back up and running. Download from Lunatics v0.46 installer test, and enjoy. |
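The NVidia choice described in that post boils down to one comparison. Here is a minimal sketch of it, assuming driver versions always look like `431.60` (three-digit branch, two-digit minor); the function names are invented for the example and this is not how the installer itself decides anything.

```python
def parse_version(text: str) -> tuple:
    """Turn '431.60' into (431, 60) so versions compare numerically."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)

def recommend_nvidia_app(driver: str, windows_10: bool) -> str:
    """Pick the app name following the guidance in the post above."""
    # Drivers above 431.60 on Windows 10 get the older 'sah' build as a workaround.
    if windows_10 and parse_version(driver) > (431, 60):
        return "MB8_win_x86_SSE3_OpenCL_NV_sah_r3486"
    # Older drivers, or other Windows versions, can stay on SoG.
    return "MB8_win_x86_SSE3_OpenCL_NV_SoG_r3584"

choice_new = recommend_nvidia_app("436.02", windows_10=True)
choice_old = recommend_nvidia_app("388.43", windows_10=True)
```

Comparing tuples rather than floats or raw strings keeps the minor part exact, which matters for close versions such as 431.60 and 431.68.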
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
Minor refresh made to the installer, to provide a better return path from OpenCL_NV_sah to CUDA. If that's your route, please re-download the installer before using it to revert to CUDA. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
Well over 48 hours since I posted the refreshed installer. I've had precisely one PM saying it's looking OK, and no more reported quibbles. Are we good to go? I'd like to see some actual tester feedback, please - good or bad. |
robertmiles Send message Joined: 16 Jan 12 Posts: 213 Credit: 4,117,756 RAC: 6 |
"Well over 48 hours since I posted the refreshed installer. I've had precisely one PM saying it's looking OK, and no more reported quibbles. Are we good to go? I'd like to see some actual tester feedback, please - good or bad." Looks OK for me. I couldn't find the method you expected to report this, though. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
Well, since I announced the availability of the test build in this thread (and the similar parallel ATI thread), I'd sort of assumed that people would report back here. I'd prefer the test reports to be publicly visible for peer review, rather than hidden in PMs, but I'll try to read it wherever you post. But be aware I can't see inside private team discussion groups. But anyway - thanks for the positive vote. I won't do a public release this late in the day, UK time, but I'm minded to do it tomorrow morning - say any time after 12 hours from now. |
VelocityRC Send message Joined: 27 Sep 19 Posts: 23 Credit: 1,421,582 RAC: 86 |
"Well, since I announced the availability of the test build in this thread (and the similar parallel ATI thread), I'd sort of assumed that people would report back here. I'd prefer the test reports to be publicly visible for peer review, rather than hidden in PMs, but I'll try to read it wherever you post. But be aware I can't see inside private team discussion groups." A lot of this GPU language is over my head and I'm not sure I have the time to learn all that is needed to understand some of this. I have heard of CUDA but my knowledge is limited. If I recall, it is an alternate driver for nVidia. Is that correct? Us greenhorns would like a driver download link, preferably an nVidia one, but that seems to be an issue ATM. I don't mind using a third-party GPU driver with a link posted here. JMHO Bill S. |
Mr. Kevvy Send message Joined: 15 May 99 Posts: 3776 Credit: 1,114,826,392 RAC: 3,319 |
"A lot of this GPU language is over my head and I'm not sure I have the time to learn all that is needed to understand some of this. I have heard of CUDA but my knowledge is limited." Graphics, especially 3D, is all math/computation. CUDA is simply a framework to allow your GPU to use that capability to do actual computation like a CPU does. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874 |
@VelocityRC: You currently are showing 'NVIDIA GeForce GTX 1050 (2048MB) driver: 432.00'. That's earlier than the number in the thread title. Edit - while I was posting, Keith Myers suggested that one might fall into the gap between 'known good' and 'known bad'. If you have any problems with SETI tasks(*), you might be better following the link for general downloaders. (*) apart from not getting any... @everyone else: https://www.nvidia.com/Download/Find.aspx?lang=en-us is probably the place to look. Fill in everything, and you should get something like this: Don't use the ones I've crossed out: use the green one at the bottom. |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Is there a way to check the provenance of a Windows Nvidia driver? I think all these Windows hosts with driver version 432.00 are using a Microsoft-delivered driver about which we know nothing. The driver could be good or bad, we just don't know. Somebody needs to run the 432.00 driver against some known VHAR Arecibo work and see if the tasks stall out or complete normally in the benchmark tools. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Jacob Klein Send message Joined: 15 Apr 11 Posts: 149 Credit: 9,783,406 RAC: 9 |
Keith: I'm willing to do what you want, if you explain it more clearly. 1) Are you proposing that I DDU (to get rid of all current drivers), then see what Windows gives me, and if I get 432.00, test it? 2) Also, do you know which GPU is likely to get 432.00 from Windows Update? I think my newer GPUs get offered a newer driver from Windows Update, though not 100% sure. Note: I have previously tested 431.60 (public release from NVIDIA), as well as 431.68 (Hotfix release from NVIDIA). Both were found to be good. I suspect 432.00 is likely to be good as well, since I think it was likely a "Release 430" driver, and the problems started with 436.02 from the "Release 435" driver branch. The Release branches are found in the Release Notes of the drivers. Let me know, Jacob |
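The "known good / gap / known bad" picture from the last few posts can be written down as a small lookup. This sketch uses only the versions reported in this thread (431.60 and 431.68 tested good, trouble starting at 436.02); treating everything at or below 431.68 as good is an assumption, and the function name and status labels are invented for the example.

```python
def driver_status(driver: str) -> str:
    """Place a driver version relative to the versions reported in this thread."""
    major, minor = driver.strip().split(".")
    version = (int(major), int(minor))
    if version >= (436, 2):
        return "reported bad"      # problems reported from 436.02 onward
    if version <= (431, 68):
        return "reported good"     # assumption: nothing older is affected
    return "untested gap"          # e.g. 432.00 delivered by Windows Update

status_432 = driver_status("432.00")
```

Whether 432.00 really belongs to the "Release 430" branch would have to be confirmed from the driver's release notes, as Jacob says.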