NVidia 436.xx and later drivers can cause very long compute times especially on Arecibo VHAR work units
Author | Message |
---|---|
Wiggo Joined: 24 Jan 00 Posts: 36873 Credit: 261,360,520 RAC: 489 |
They may even have to get their heads together with M$ heads to find out what has gone wrong. Cheers. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13855 Credit: 208,696,464 RAC: 304 |
"I'm just saying that it wouldn't be the first time that M$ itself has broken something driver wise over the years by throwing in an undocumented update and that in itself would not surprise me that it's happened again." True, but this issue is affecting Win10 systems across multiple builds. If it affected only systems after a particular build, then you could put the blame on M$, or some of the blame on M$ as well (even then, it could be a case of them fixing something that was broken, and Nvidia got caught out by making use of what was known to be a bug). At this stage, everything is pointing to Nvidia. (Of course, if Nvidia fixed an issue with their driver, and this is the result, then things will get very ugly.) Grant Darwin NT |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
"You can see this behavior in that Win 8.1 user's task, on their GTX 960 (Pascal):" The GTX 960 (and all 900 series) is Maxwell, not Pascal. Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
Jacob Klein Joined: 15 Apr 11 Posts: 149 Credit: 9,783,406 RAC: 9 |
You're right, I had a typo there. And it looks like I can't fix it. Dang. |
Wiggo Joined: 24 Jan 00 Posts: 36873 Credit: 261,360,520 RAC: 489 |
"You're right, I had a typo there. And it looks like I can't fix it. Dang." I wouldn't worry about that too much. ;-) But at least we can narrow the problem down to being Cheers. |
tullio Joined: 9 Apr 04 Posts: 8797 Credit: 2,930,782 RAC: 1 |
I am running Arecibo tasks via Science United on a Windows 8.1 PC with a GTX 1050 Ti and 436.30 driver. No problem. Tullio |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
"I am running Arecibo tasks via Science United on a Windows 8.1 PC with a GTX 1050 Ti and 436.30 driver. No problem." This is the host he’s talking about: https://setiathome.berkeley.edu/show_host_detail.php?hostid=8815395 But as has been stated, the problem only affects Windows 10 with drivers 436+ (so far). Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
tullio Joined: 9 Apr 04 Posts: 8797 Credit: 2,930,782 RAC: 1 |
Right. It is also running Milkyway@home (ten thousand tasks so far) and Asteroids@home (one thousand tasks), and it also supports a Linux virtual machine with SuSE Tumbleweed and kernel 5.7.1, updated very frequently. The VM also runs Science United tasks, including a long-range climateprediction.net task, which is a rare Linux task for Climateprediction.net. Tullio |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Here is another machine at Beta showing the non-SoG App 8.16 working while the SoG App 8.22 fails: https://setiweb.ssl.berkeley.edu/beta/results.php?hostid=88461&offset=40 Coprocessors: NVIDIA TITAN X (Pascal) (4095MB) driver: 441.08 OpenCL: 1.2 Operating System: Microsoft Windows 10 Core x64 Edition, (10.00.18362.00) You could probably find a few more if you looked a little further. |
Jacob Klein Joined: 15 Apr 11 Posts: 149 Credit: 9,783,406 RAC: 9 |
NVIDIA released 441.12 drivers today. I tested them, and they still have the "SETI OpenCL SoG VHAR" problems. Maxwell: Tasks crash with error. Pascal/Turing: Tasks run indefinitely with no load on the GPU. We must continue to be patient. |
tullio Joined: 9 Apr 04 Posts: 8797 Credit: 2,930,782 RAC: 1 |
After all my Milkyway@home and SETI@home tasks started producing computation errors on driver 436.30, I installed 441.12, and at least the SETI@home tasks are working. This is on a Science United PC with Windows 8.1. Tullio |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
"After all my Milkyway@home and SETI@home tasks started producing computation errors on driver 436.30, I installed 441.12, and at least the SETI@home tasks are working. This is on a Science United PC with Windows 8.1." The problem doesn’t exist on Win 8.1, only on Windows 10. The 441 driver has no change for this issue; it doesn’t fix anything. Looking at your specific errors on that machine, it looks like your whole system had an issue or the driver crashed. A simple reboot would likely have resolved your problem. Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
Jesse Viviano Joined: 27 Feb 00 Posts: 100 Credit: 3,949,583 RAC: 0 |
Unfortunately, the old drivers have major security vulnerabilities that have been patched in driver 441.12. I have no choice but to recommend that we update to 441.12 and then crunch with the CPU for now. The older insecure drivers have escalation of privilege, information disclosure, and denial of service vulnerabilities. See https://nvidia.custhelp.com/app/answers/detail/a_id/4907 for details. |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
"Unfortunately, the old drivers have major security vulnerabilities that have been patched in driver 441.12. I have no choice but to recommend that we update to 441.12 and then crunch with the CPU for now. The older insecure drivers have escalation of privilege, information disclosure, and denial of service vulnerabilities. See https://nvidia.custhelp.com/app/answers/detail/a_id/4907 for details." The opencl_nvidia_sah app still works. You do not need to stop GPU crunching, and you can use the newest drivers. Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13855 Credit: 208,696,464 RAC: 304 |
"Unfortunately, the old drivers have major security vulnerabilities that have been patched in driver 441.12. I have no choice but to recommend that we update to 441.12 and then crunch with the CPU for now. The older insecure drivers have escalation of privilege, information disclosure, and denial of service vulnerabilities. See https://nvidia.custhelp.com/app/answers/detail/a_id/4907 for details." And are only an issue if you allow the people that want to attack your system physical access to it to set up the attack. No physical access, no breach possible. Grant Darwin NT |
Bernie Vine Joined: 26 May 99 Posts: 9958 Credit: 103,452,613 RAC: 328 |
"And are only an issue if you allow the people that want to attack your system physical access to it to set up the attack. No physical access, no breach possible." Perhaps Nvidia need to make things a lot clearer then, as in the CVE list only one vulnerability is described as "The attacker requires local system access." and that CVE does not apply to GeForce on Windows. With no similar statement in the 5 CVEs that do affect GeForce cards on Windows, I am definitely in the "better safe than sorry" camp. Also, as a gamer who uses the GeForce Experience in-game overlay in online games, I will always have up-to-date drivers. All I have done is re-run the Lunatics installer and select Cuda 50; it now takes my 1660 Ti about 5 times longer to crunch, but that is better than nothing and hopefully it won't suffer the problem. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13855 Credit: 208,696,464 RAC: 304 |
"All I have done is re-run the Lunatics installer and select Cuda 50; it now takes my 1660 Ti about 5 times longer to crunch, but that is better than nothing and hopefully it won't suffer the problem." If running the CUDA50 application, it would be worth seeing how 2 WUs at a time go; while not a high-end card, the GTX 1660 Tis are very capable (particularly compared to what was high-end when CUDA50 made its appearance). 2 WUs at a time could result in more work per hour than just 1 WU. My cheat sheet. WUs per hour: 10, 15, 20, 30, 40, 50, 60, 70, 80, 90. 1x: 6, 4, 3, 2, 1.5, 1.2, 1, 0.86, 0.75, 0.67. 2x: 12, 8, 6, 4, 3, 2.4, 2, 1.7, 1.5, 1.3. 3x: 18, 12, 9, 6, 4.5, 3.6, 3, 2.58, 2.25, 2. The 10, 15, 20 etc. are the number of WUs per hour. The 1x, 2x, 3x are the number of WUs running at once. The 6, 4, 1, 0.75 etc. are the runtimes in min (0.75 = 45 sec). So for example, if it takes 4 min to process a WU, one at a time, that's 15 WUs per hour. When processing 2 at a time, you'd want the runtimes to be less than 8 min to be getting more than 15 WUs per hour output. With 3 WUs at a time, a runtime of less than 12 min would be needed to make it worthwhile. Grant Darwin NT |
Bernie Vine Joined: 26 May 99 Posts: 9958 Credit: 103,452,613 RAC: 328 |
"If running the CUDA50 application, it would be worth seeing how 2 WUs at a time go; while not a high-end card, the GTX 1660 Tis are very capable (particularly compared to what was high-end when CUDA50 made its appearance)." I decided to see how it performed first, but I had intended to try running with 2 WUs. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13855 Credit: 208,696,464 RAC: 304 |
"I decided to see how it performed first, but I had intended to try running with 2 WUs." Makes sense: if you don't have a 1 WU baseline, you'll have no idea if more is better, or if it's worse. Although it would be easier to get an idea of the baseline if we were back to just the 1 or 2 files of the same data being split at a time again... On my GTX 1070, I found 2 at a time generally gave the best performance, unless you had an Arecibo and a GBT WU running together. Then the Arecibo WU would take 2.5 to 3 times longer than usual to finish. Grant Darwin NT |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
You can check the performance of the different Apps at Beta, just look at the Application Details page. Here are a few running the three eligible Apps. Host 81711 (https://setiweb.ssl.berkeley.edu/beta/host_app_versions.php?hostid=81711): SETI@home v8 8.01 windows_intelx86 (cuda50) Average processing rate: 179.51 GFLOPS; SETI@home v8 8.16 windows_intelx86 (opencl_nvidia_sah) Average processing rate: 647.25 GFLOPS; SETI@home v8 8.22 windows_intelx86 (opencl_nvidia_SoG) Average processing rate: 599.67 GFLOPS. Host 88461 (https://setiweb.ssl.berkeley.edu/beta/host_app_versions.php?hostid=88461): SETI@home v8 8.01 windows_intelx86 (cuda50) Average processing rate: 101.27 GFLOPS; SETI@home v8 8.16 windows_intelx86 (opencl_nvidia_sah) Average processing rate: 427.21 GFLOPS; SETI@home v8 8.22 windows_intelx86 (opencl_nvidia_SoG) Average processing rate: 425.44 GFLOPS. Host 71641 (https://setiweb.ssl.berkeley.edu/beta/host_app_versions.php?hostid=71641): SETI@home v8 8.00 windows_intelx86 (cuda50) Average processing rate: 161.09 GFLOPS; SETI@home v8 8.16 windows_intelx86 (opencl_nvidia_sah) Average processing rate: 213.59 GFLOPS; SETI@home v8 8.22 windows_intelx86 (opencl_nvidia_SoG) Average processing rate: 333.04 GFLOPS. Host 88527 (https://setiweb.ssl.berkeley.edu/beta/host_app_versions.php?hostid=88527): SETI@home v8 8.01 windows_intelx86 (cuda50) Average processing rate: 136.00 GFLOPS; SETI@home v8 8.16 windows_intelx86 (opencl_nvidia_sah) Average processing rate: 519.36 GFLOPS; SETI@home v8 8.22 windows_intelx86 (opencl_nvidia_SoG) Average processing rate: 358.09 GFLOPS. Well, I know which App I wouldn't be using... |
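
Editor's note: the arithmetic behind Grant's cheat sheet earlier in the thread (break-even runtime in minutes = 60 × number of concurrent WUs ÷ target WUs per hour) can be sketched in a few lines of Python. This is only an illustrative sketch; the function names are made up for this example and are not part of BOINC or any SETI@home application.

```python
# Illustrative sketch only: these helper names are hypothetical,
# not part of BOINC or the SETI@home apps.

def break_even_runtime_min(wus_per_hour: float, concurrent: int) -> float:
    """Per-WU runtime (minutes) at which `concurrent` WUs running together
    yield exactly `wus_per_hour` work units per hour."""
    return 60.0 * concurrent / wus_per_hour


def throughput_wus_per_hour(runtime_min: float, concurrent: int) -> float:
    """WUs completed per hour when `concurrent` WUs each take `runtime_min`."""
    return 60.0 * concurrent / runtime_min


def is_worthwhile(baseline_runtime_min: float, concurrent: int,
                  concurrent_runtime_min: float) -> bool:
    """True if running `concurrent` WUs at once beats the 1-WU baseline."""
    baseline = throughput_wus_per_hour(baseline_runtime_min, 1)
    multi = throughput_wus_per_hour(concurrent_runtime_min, concurrent)
    return multi > baseline


if __name__ == "__main__":
    # Reprint the cheat sheet: columns are WUs per hour, rows are 1x/2x/3x.
    rates = [10, 15, 20, 30, 40, 50, 60, 70, 80, 90]
    print("WUs/hr " + " ".join(f"{r:>6}" for r in rates))
    for n in (1, 2, 3):
        row = " ".join(f"{break_even_runtime_min(r, n):>6.2f}" for r in rates)
        print(f"{n}x     {row}")
```

With a 4 min single-WU baseline (15 WUs per hour), `is_worthwhile(4, 2, 7.5)` is true, while a 2-at-a-time runtime of exactly 8 min only matches the baseline, in line with the worked example in the post.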
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.