NVidia 436.xx and later drivers can cause very long compute times especially on Arecibo VHAR work units
Author | Message |
---|---|
Ian&Steve C. Send message Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
GeForce Experience isn’t the culprit. But you did the right thing avoiding that software anyway. It’s been known to cause all kinds of problems, most unrelated to SETI. As for why you didn’t see anything until recently, well the problem only happens on one type of WU, Arecibo VHAR. So you probably didn’t get one until recently. Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
billy ewell 1931 Send message Joined: 1 Apr 03 Posts: 23 Credit: 24,295,322 RAC: 2 |
Of course and thanks; good information. Bill |
Jacob Klein Send message Joined: 15 Apr 11 Posts: 149 Credit: 9,783,406 RAC: 9 |
Could anyone having this problem please follow the steps below to report your findings to NVIDIA? If NVIDIA gets more useful reports, they will fix it. 1) Find my repro steps here: https://setiathome.berkeley.edu/forum_thread.php?id=84780&postid=2016218 2) Test the 431.60 drivers. Verify that they work for all your GPUs. 3) Test the 440.97 drivers. Verify that they FAIL for some of your GPUs. 4) Report your findings to NVIDIA, mentioning "SETI OpenCL", on the Driver Feedback page located here: https://forms.gle/kJ9Bqcaicvjb82SdA Hopefully the steps are straightforward enough to complete, but I admit they were written hastily. Thank you! |
Holdolin Send message Joined: 10 Apr 19 Posts: 68 Credit: 88,777,750 RAC: 30 |
Thanks for the information, all. While all my dedicated crunching rigs run Linux, I do have an old beat-up Windows machine that I decided to put to work as well, and on it I found the exact problem discussed in this thread. Sure enough, reverting to the 431.60 drivers fixed the issue. Again, thanks to all who contributed to the solution. |
Dave Dickinson Send message Joined: 30 Oct 05 Posts: 4 Credit: 605,985 RAC: 0 |
Thanks to all who posted above. I have just noticed that some SETI workunits were running really slowly and the GPU load was 10% instead of the usual 90%. Glad I read this thread, as I was starting to believe that my GPU had gone faulty. Will be rolling back the drivers shortly! |
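The symptom described above (GPU load near 10% instead of the usual 90%) can be checked from a script. This is only a rough sketch: it assumes `nvidia-smi` is on the PATH and uses its `--query-gpu=utilization.gpu` query, and the 50% cutoff is an arbitrary illustrative threshold, not a value from this thread.

```python
import subprocess


def parse_utilization(csv_text):
    """Turn nvidia-smi CSV output (one percentage per line) into a list.

    Lines that are not plain integers (for example "[N/A]") become None
    so that list positions still match GPU indices.
    """
    values = []
    for line in csv_text.splitlines():
        line = line.strip()
        if not line:
            continue
        values.append(int(line) if line.isdigit() else None)
    return values


def low_utilization_gpus(values, threshold=50):
    """Return indices of GPUs whose load is below the (assumed) threshold."""
    return [i for i, v in enumerate(values) if v is not None and v < threshold]


def query_nvidia_smi():
    """Ask nvidia-smi for the current load of every GPU, in percent."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=utilization.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_utilization(result.stdout)
```

Calling `low_utilization_gpus(query_nvidia_smi())` while a work unit is running lists the GPUs that look starved. A single low reading proves little, so sample several times before blaming the driver.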
Jacob Klein Send message Joined: 15 Apr 11 Posts: 149 Credit: 9,783,406 RAC: 9 |
I just heard back from NVIDIA, regarding my feedback about breaking SETI OpenCL, and NVIDIA is looking into it. Hurray! |
Holdolin Send message Joined: 10 Apr 19 Posts: 68 Credit: 88,777,750 RAC: 30 |
Well that's a good thing. Thanks for your effort. |
Lemon Wolf Send message Joined: 19 Jul 09 Posts: 9 Credit: 1,114,148 RAC: 0 |
Isn't that what I said a couple of posts ago? |
Jacob Klein Send message Joined: 15 Apr 11 Posts: 149 Credit: 9,783,406 RAC: 9 |
Yes. And no. Maybe. :) Thank you for reporting that you got a response from them too. That is good. My understanding was that they thought they fixed it... and now know that they didn't and are again looking at it. Maybe. Don't know for sure. But do know that they are for sure looking at it presently, which is good. Thank you again for helping with that! |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13855 Credit: 208,696,464 RAC: 304 |
"Isn't that what I said a couple of posts ago?" Someone here posting about contact from Nvidia directly to them is considerably more reassuring than a post from someone here who has heard from someone who has a contact at Nvidia. Grant Darwin NT |
Sebastian Brack Send message Joined: 22 Aug 99 Posts: 9 Credit: 11,186,556 RAC: 16 |
The Nvidia Quadro driver 440.xx also does not work correctly. https://setiathome.berkeley.edu/result.php?resultid=8156230119 |
Lemon Wolf Send message Joined: 19 Jul 09 Posts: 9 Credit: 1,114,148 RAC: 0 |
"Isn't that what I said a couple of posts ago?" "Someone here posting about contact from Nvidia directly to them is considerably more reassuring than a post from someone here who has heard from someone who has a contact at Nvidia." I will be sure to keep my mouth shut in the future then. I certainly don't want to spread any hearsay. |
Jacob Klein Send message Joined: 15 Apr 11 Posts: 149 Credit: 9,783,406 RAC: 9 |
Come on. Same team. Thank you for helping to work together toward getting this problem fixed. I hope you continue to do that. Thanks, Jacob |
Stephen Thomas Home - Delhi Send message Joined: 16 Sep 99 Posts: 11 Credit: 507,164,357 RAC: 357 |
I just tested NVIDIA GRD 441.08, but there was no change. I switched back to 431.60. |
Jacob Klein Send message Joined: 15 Apr 11 Posts: 149 Credit: 9,783,406 RAC: 9 |
I confirm. The 441.08 drivers do not fix the problem. We will have to continue waiting for a fix. 431.60 continues to be the workaround. |
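The driver reports in this thread can be summarized as a simple version check. A minimal sketch, assuming the cutoff implied by the thread title (436.xx and later); the function names are made up for illustration, and a later driver release may well fix the problem, so treat the result as "reported affected here", not as a guarantee.

```python
# Assumption taken from the thread title: trouble starts with 436.xx.
FIRST_REPORTED_AFFECTED = (436, 0)


def parse_driver_version(text):
    """Split a version such as '440.97' into a comparable (major, minor) tuple."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def is_reported_affected(version_text):
    """True if the version is at or above the first series reported as broken."""
    return parse_driver_version(version_text) >= FIRST_REPORTED_AFFECTED


if __name__ == "__main__":
    for version in ("431.60", "440.97", "441.08"):
        state = "reported affected" if is_reported_affected(version) else "reported OK"
        print(version, state)
```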
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Has anyone with Windows 10 tried the VHARs with the Non-SoG OpenCL App? ...There you go, from BETA, where you can find the Non-SoG App; https://setiweb.ssl.berkeley.edu/beta/result.php?resultid=34768072 Operating System : Microsoft Windows 10 Professional x64 Edition, (10.00.18362.00) Driver version: 440.97 WU true angle range is : 4.344386 OpenCL Platform Name: NVIDIA CUDA Number of devices: 1 Max compute units: 10 Max work group size: 1024 Max clock frequency: 1784Mhz Max memory allocation: 1610612736 Cache type: Read/Write Cache line size: 128 Cache size: 491520 Global memory size: 6442450944 Constant buffer size: 65536 Max number of constant args: 9 Local memory type: Scratchpad Local memory size: 49152 Queue properties: Out-of-Order: Yes Name: GeForce GTX 1060 6GB Vendor: NVIDIA Corporation Driver version: 440.97 Version: OpenCL 1.2 CUDA Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics Work Unit Info: ............... 
Credit multiplier is : 2.85 WU true angle range is : 4.344386 Used GPU device parameters are: Number of compute units: 10 Single buffer allocation size: 128MB Total device global memory: 6144MB max WG size: 1024 local mem type: Real FERMI path used: yes LotOfMem path: no LowPerformanceGPU path: no HighPerformanceGPU path: no period_iterations_num=50 Triplet: peak=7.870685, time=83.23, period=0.3428, d_freq=1419212646.48, chirp=0, fft_len=8 Triplet: peak=8.022648, time=4.461, period=0.0811, d_freq=1419210815.43, chirp=0, fft_len=16 Triplet: peak=8.850973, time=100.4, period=0.1901, d_freq=1419204101.56, chirp=0, fft_len=16 Spike: peak=24.75023, time=46.98, d_freq=1419207139.83, chirp=0.79302, fft_len=128k Spike: peak=24.16562, time=78.85, d_freq=1419208781.99, chirp=-33.037, fft_len=32k Triplet: peak=8.684459, time=46.9, period=0.2966, d_freq=1419211939.82, chirp=-41.094, fft_len=32 Spike: peak=24.23859, time=105.7, d_freq=1419210593.08, chirp=-85.017, fft_len=32k Spike: peak=24.04153, time=105.7, d_freq=1419210593.09, chirp=-85.076, fft_len=32k Best spike: peak=24.75023, time=46.98, d_freq=1419207139.83, chirp=0.79302, fft_len=128k Best autocorr: peak=17.08628, time=33.55, delay=6.5962, d_freq=1419209320.8, chirp=10.026, fft_len=128k Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.124e+011, d_freq=0, score=-12, null_hyp=0, chirp=0, fft_len=0 Best pulse: peak=2.854268, time=80.09, period=0.1032, d_freq=1419205759.15, score=0.8821, chirp=-82.188, fft_len=32 Best triplet: peak=8.850973, time=100.4, period=0.1901, d_freq=1419204101.56, chirp=0, fft_len=16 Spike count: 4 Autocorr count: 0 Pulse count: 0 Triplet count: 4 Gaussian count: 0 Another: https://setiweb.ssl.berkeley.edu/beta/result.php?resultid=34767640 |
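When comparing results like the one above, the two fields that matter most for this issue are the driver version and the angle range. A small sketch for pulling them out of pasted stderr text, assuming the labels appear exactly as in that output ("Driver version:" and "WU true angle range is :"); the function name is illustrative only.

```python
import re

# Patterns follow the labels as they appear in the stderr output quoted above.
DRIVER_RE = re.compile(r"Driver version:\s*([0-9]+\.[0-9]+)")
ANGLE_RE = re.compile(r"WU true angle range is :\s*([0-9]*\.?[0-9]+)")


def extract_fields(stderr_text):
    """Return the first driver version and angle range found, or None for each."""
    driver = DRIVER_RE.search(stderr_text)
    angle = ANGLE_RE.search(stderr_text)
    return {
        "driver_version": driver.group(1) if driver else None,
        "angle_range": float(angle.group(1)) if angle else None,
    }
```

The sketch deliberately does not decide whether a work unit counts as VHAR, since the thread does not state the angle-range cutoff.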
Aaron Send message Joined: 1 May 10 Posts: 19 Credit: 4,725,508 RAC: 36 |
Be nice when Nvidia fixes the issue. Until then I'll just keep swapping between drivers for games and crunching. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
At this point it's pretty clear the problem doesn't exist when using the Non-SoG version of the App. So, if you are using the Lunatics version of SoG, it's as simple as changing the App over in the app_info.xml to the version here: http://boinc2.ssl.berkeley.edu/beta/download/setiathome_8.16_windows_intelx86__opencl_nvidia_sah.exe Download the 8.16 App, place it in your setiathome.berkeley.edu folder, then change your app_info.xml to name the new App instead of the SoG version. Since Lunatics usually creates a 5+ page app_info.xml, it would probably be best to use find & replace ;-) I think both Apps use the same libfftw3f-3-3-4_x86.dll, so you probably don't need to change anything else. You can see the setiathome_8.16_windows_intelx86__opencl_nvidia_sah.exe App working with Arecibo VHARs at beta: https://setiweb.ssl.berkeley.edu/beta/results.php?hostid=87127&offset=220 |
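The find & replace step can also be scripted. This is a hedged sketch, not a tested procedure: the new executable name comes from the post above, but the SoG executable name differs between installs, so `old_name` is a placeholder you must look up in your own app_info.xml. The helper names are invented for illustration.

```python
from pathlib import Path

# Name of the Non-SoG App as given in the post above.
NEW_APP = "setiathome_8.16_windows_intelx86__opencl_nvidia_sah.exe"


def swap_app_name(xml_text, old_name, new_name=NEW_APP):
    """Replace every occurrence of old_name; return (new_text, replacement_count)."""
    count = xml_text.count(old_name)
    return xml_text.replace(old_name, new_name), count


def update_app_info(path, old_name, new_name=NEW_APP):
    """Rewrite app_info.xml in place, keeping a .bak copy of the original.

    old_name is the SoG executable name found in your own file (not given
    in this thread). Returns the number of replacements made.
    """
    path = Path(path)
    original = path.read_text(encoding="utf-8")
    updated, count = swap_app_name(original, old_name, new_name)
    if count:
        # Save the untouched file first so the change is easy to undo.
        path.with_name(path.name + ".bak").write_text(original, encoding="utf-8")
        path.write_text(updated, encoding="utf-8")
    return count
```

A return value of 0 means the old name was not found and nothing was written, which usually points to a typo in `old_name`.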
Jacob Klein Send message Joined: 15 Apr 11 Posts: 149 Credit: 9,783,406 RAC: 9 |
While that may be true... To my knowledge... Normal crunchers don't mess with app_info.xml or app_config.xml files. Normal crunchers will want NVIDIA to fix this as soon as possible. Normal crunchers will do semi-normal things like reverting drivers, as a workaround for this issue, if they're even aware of the issue and seeking a workaround. I am hopeful NVIDIA will help out the Normal crunchers soon! :) |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13855 Credit: 208,696,464 RAC: 304 |
"At this point it's pretty clear the problem doesn't exist when using the Non-SoG version of the App." So I guess the questions are: 1) What's different between the applications? 2) What's different about the recent drivers? Grant Darwin NT |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.