NVidia 436.xx and later drivers can cause very long compute times especially on Arecibo VHAR work units
Author | Message |
---|---|
Jacob Klein Joined: 15 Apr 11 Posts: 149 Credit: 9,783,406 RAC: 9 |
I barely know what I'm doing. And I don't have plans to re-test. Hopefully the outputs from my runs, which took 2 days to get, are still useful. |
Jacob Klein Joined: 15 Apr 11 Posts: 149 Credit: 9,783,406 RAC: 9 |
Keith, how do I use the .cmd to run a CPU comparison? |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
"Keith, how do I use the .cmd to run a CPU comparison?" With difficulty. Two possibilities: 1) Use the rescmpv5 tool manually, by adapting the parameterised code in mb_validate.cmd. 2) Place the reference result files in ..\bench\Testdatas\ref (naming them according to convention - "ref-[appname]-[wuname].res") and run a live test on the same WU. I'll be doing either or both of those later today on your results - many thanks for those - but I'm running slow this morning, and about to go out for lunch. |
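The naming convention in option 2 can be sketched in a few lines of Python. This is only an illustration, not part of the bench scripts: the helper names `reference_name` and `reference_path` and the `C:\bench` location are hypothetical, and the assumption that the app name keeps its `.exe` suffix and the WU name keeps its `.wu` suffix is taken from the file names visible in the Knabench output later in the thread.

```python
from pathlib import PureWindowsPath


def reference_name(app_name: str, wu_name: str) -> str:
    """Build a file name following the "ref-[appname]-[wuname].res" convention."""
    return f"ref-{app_name}-{wu_name}.res"


def reference_path(bench_dir: str, app_name: str, wu_name: str) -> PureWindowsPath:
    """Place the reference file under <bench>\\Testdatas\\ref (hypothetical helper)."""
    return PureWindowsPath(bench_dir) / "Testdatas" / "ref" / reference_name(app_name, wu_name)


# "C:\bench" is an assumed install location, used purely for illustration.
print(reference_path(r"C:\bench", "RacerX_dev_0.exe", "28oc11aa.13844.8247.7.34.54.wu"))
```

`PureWindowsPath` is used so the sketch produces Windows-style separators on any platform without touching the disk.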
Jacob Klein Joined: 15 Apr 11 Posts: 149 Credit: 9,783,406 RAC: 9 |
Okay. Thanks. I'm not going to be redoing any testing. I did the best I could, and feel you all have valuable info if your goal is to validate the fix. From my brief inspections, I think the fix is good :) |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"Keith, how do I use the .cmd to run a CPU comparison?" I don't like the Lunatics MBbench application. I think it has been surpassed in simplicity and capability by Rick's benchmark application, benchMT: https://github.com/Ricks-Lab/benchMT Unfortunately, this tool is for Linux only. https://setiathome.berkeley.edu/forum_thread.php?id=83566#1965982 That is the one that I use. It already comes with CPU tasks preconfigured and run with the reference CPU application. It does both AP and MB. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours A proud member of the OFA (Old Farts Association) |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
"Hi folks," OK, I'm starting to work out how best to handle those now. I think I'm going to concentrate on the results from driver 442.19 and the OpenCL SoG app (so far as I'm aware, after re-reading the first 100 or so posts in this thread, that should be the only fixed version of the only app which was causing a problem). I've found 6 results from 28oc11aa.6787.6611.5.32.85 passing that filter (three GPUs in each of two machines). I'll need to rename the files to keep them separate. And the same 6 results from each of 28oc11aa.13844.8247.7.34.54, 28oc11aa.13844.8656.7.34.81, 28oc11aa.2079.16836.10.37.50, 28oc11aa.2079.18881.10.37.49, 28oc11aa.2090.18472.12.39.249, 28oc11aa.2090.22562.12.39.170, 28oc11aa.2108.22562.13.40.187, 28oc11aa.30967.11928.8.35.113 and 28oc11aa.30986.13155.9.36.98 in 'WUs from Old BOINC Data Copy'. I'd better make sure I've got all those WUs before I go much further... |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
"I'd better make sure I've got all those WUs before I go much further..." Yup, got 'em. Consolidated 10 WUs, 60 result files - now to test my bulk renaming skills. |
Jacob Klein Joined: 15 Apr 11 Posts: 149 Credit: 9,783,406 RAC: 9 |
I had to leave some of the work for you :) Hope you get it all figured out! |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
I think I've got it prepped up. KNAbench relies on lots of distinct file names in a single folder; your preferred style is identical file names in separate folders. It took a while to transfer from one standard to the other, but I got there - having each set in a tight, formatted structure helped a lot. Taking a breather to let my eyes relax in front of some telly - I'll go back and finish it off later. |
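The conversion described here (identical file names in separate folders, turned into distinct file names in a single folder) could be scripted roughly as follows. This is a sketch under stated assumptions, not the procedure actually used: the function `flatten_results`, the folder names, and the `result.sah` file name are all hypothetical, and appending `.exe` to the folder name simply mirrors the spoof app names that show up in the reference file names of the Knabench output.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def flatten_results(src_root: Path, dest_dir: Path, result_name: str, wu_name: str) -> List[str]:
    """Copy <src_root>/<set>/<result_name> to <dest_dir>/ref-<set>.exe-<wu_name>.res.

    Each subfolder of src_root is treated as one result set; its folder name
    becomes the spoof app name so the flattened files stay distinct.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for folder in sorted(p for p in src_root.iterdir() if p.is_dir()):
        source = folder / result_name
        if not source.is_file():
            continue  # skip sets that have no result for this WU
        target = dest_dir / f"ref-{folder.name}.exe-{wu_name}.res"
        if target.exists():
            # Refuse to overwrite: a clash means two sets mapped to one name.
            raise FileExistsError(str(target))
        shutil.copy2(source, target)
        created.append(target.name)
    return created


if __name__ == "__main__":
    # Self-contained demo in a temporary directory with dummy result files.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for set_name in ("RacerX_dev_0", "Speed_dev_0"):
            (root / "src" / set_name).mkdir(parents=True)
            (root / "src" / set_name / "result.sah").write_text("dummy result\n")
        print(flatten_results(root / "src", root / "ref", "result.sah",
                              "28oc11aa.6787.6611.5.32.85.wu"))
```

Copying rather than moving keeps the original folder-per-set layout intact, so a naming mistake can be corrected by deleting the flattened folder and running again.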
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
Unfortunately, it's not doing the automatic comparisons I had hoped for, but it's generating a local set of result files. I'll either have to hack the bench script, or write my own to compare them offline. Tomorrow. [All day at home, sheltering from an expected big storm. Hope the power stays on...] |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
All that file renaming yesterday (and again to correct my error this morning!) was worth it. This is what I wanted to see. All of Jacob's results have been renamed as reference results, with a spoof app name to indicate where they came from. The test was run on the GTX 1660 SUPER in my host 4292666, which is doing about 75% of the tasks in that machine (there's a 750 Ti in there as well). The machine itself is currently showing State: All (1841) · In progress (167) · Validation pending (508) · Validation inconclusive (87) · Valid (1075) · Invalid (0) · Error (4), which is pretty good. The four errors are all download errors from 28 December, so can't be blamed on the cards. And the winner is ... NVidia's 442.19 driver. You'll see the occasional Q= 99.99% (probably due to floating point rounding errors - that's why we can't use direct file comparison for this sort of work), but the majority are Q= 100.0%. If we saw that level of accuracy for a new application, we'd have no hesitation in saying it was ready.
= MB Knabench 2.10 W32-W64 2012-02-18 by Kna + Simon + Joe = mods: quick timetable, stderr, speedup/ratio, AppTimes = /ref/ by Raistmer = BOINC install detection by Richard Haselgrove
10 testWU(s) found
└─(28oc11aa.13844.8247.7.34.54.wu)
└─(28oc11aa.13844.8656.7.34.81.wu)
└─(28oc11aa.2079.16836.10.37.50.wu)
└─(28oc11aa.2079.18881.10.37.49.wu)
└─(28oc11aa.2090.18472.12.39.249.wu)
└─(28oc11aa.2090.22562.12.39.170.wu)
└─(28oc11aa.2108.22562.13.40.187.wu)
└─(28oc11aa.30967.11928.8.35.113.wu)
└─(28oc11aa.30986.13155.9.36.98.wu)
└─(28oc11aa.6787.6611.5.32.85.wu)
0 reference science app(s) found
1 science app(s) found
└─(MB8_win_x86_SSE3_OpenCL_NV_SoG_r3584.exe -v 0) ======================================
------------ Running app : MB8_win_x86_SSE3_OpenCL_NV_SoG_r3584.exe -v 0 with WU : 28oc11aa.13844.8247.7.34.54.wu
Started at : 10:29:47.723 Ended at : 10:35:52.465 364.516 secs Elapsed 24.991 secs CPU time
R2: .\ref\ref-RacerX_dev_0.exe-28oc11aa.13844.8247.7.34.54.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-RacerX_dev_1.exe-28oc11aa.13844.8247.7.34.54.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-RacerX_dev_2.exe-28oc11aa.13844.8247.7.34.54.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_0.exe-28oc11aa.13844.8247.7.34.54.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_1.exe-28oc11aa.13844.8247.7.34.54.wu.res Result : Strongly similar, Q= 99.99%
R2: .\ref\ref-Speed_dev_2.exe-28oc11aa.13844.8247.7.34.54.wu.res Result : Strongly similar, Q= 100.0%
------------ Running app : MB8_win_x86_SSE3_OpenCL_NV_SoG_r3584.exe -v 0 with WU : 28oc11aa.13844.8656.7.34.81.wu
Started at : 10:35:55.981 Ended at : 10:41:08.505 312.464 secs Elapsed 52.073 secs CPU time
R2: .\ref\ref-RacerX_dev_0.exe-28oc11aa.13844.8656.7.34.81.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-RacerX_dev_1.exe-28oc11aa.13844.8656.7.34.81.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-RacerX_dev_2.exe-28oc11aa.13844.8656.7.34.81.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_0.exe-28oc11aa.13844.8656.7.34.81.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_1.exe-28oc11aa.13844.8656.7.34.81.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_2.exe-28oc11aa.13844.8656.7.34.81.wu.res Result : Strongly similar, Q= 100.0%
------------ Running app : MB8_win_x86_SSE3_OpenCL_NV_SoG_r3584.exe -v 0 with WU : 28oc11aa.2079.16836.10.37.50.wu
Started at : 10:41:12.113 Ended at : 10:45:42.439 270.274 secs Elapsed 44.148 secs CPU time
R2: .\ref\ref-RacerX_dev_0.exe-28oc11aa.2079.16836.10.37.50.wu.res Result : Strongly similar, Q= 99.99%
R2: .\ref\ref-RacerX_dev_1.exe-28oc11aa.2079.16836.10.37.50.wu.res Result : Strongly similar, Q= 99.99%
R2: .\ref\ref-RacerX_dev_2.exe-28oc11aa.2079.16836.10.37.50.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_0.exe-28oc11aa.2079.16836.10.37.50.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_1.exe-28oc11aa.2079.16836.10.37.50.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_2.exe-28oc11aa.2079.16836.10.37.50.wu.res Result : Strongly similar, Q= 99.99%
------------ Running app : MB8_win_x86_SSE3_OpenCL_NV_SoG_r3584.exe -v 0 with WU : 28oc11aa.2079.18881.10.37.49.wu
Started at : 10:45:45.910 Ended at : 10:51:01.698 315.654 secs Elapsed 28.439 secs CPU time
R2: .\ref\ref-RacerX_dev_0.exe-28oc11aa.2079.18881.10.37.49.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-RacerX_dev_1.exe-28oc11aa.2079.18881.10.37.49.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-RacerX_dev_2.exe-28oc11aa.2079.18881.10.37.49.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_0.exe-28oc11aa.2079.18881.10.37.49.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_1.exe-28oc11aa.2079.18881.10.37.49.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_2.exe-28oc11aa.2079.18881.10.37.49.wu.res Result : Strongly similar, Q= 100.0%
------------ Running app : MB8_win_x86_SSE3_OpenCL_NV_SoG_r3584.exe -v 0 with WU : 28oc11aa.2090.18472.12.39.249.wu
Started at : 10:51:05.141 Ended at : 10:56:38.273 333.060 secs Elapsed 44.055 secs CPU time
R2: .\ref\ref-RacerX_dev_0.exe-28oc11aa.2090.18472.12.39.249.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-RacerX_dev_1.exe-28oc11aa.2090.18472.12.39.249.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-RacerX_dev_2.exe-28oc11aa.2090.18472.12.39.249.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_0.exe-28oc11aa.2090.18472.12.39.249.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_1.exe-28oc11aa.2090.18472.12.39.249.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_2.exe-28oc11aa.2090.18472.12.39.249.wu.res Result : Strongly similar, Q= 100.0%
------------ Running app : MB8_win_x86_SSE3_OpenCL_NV_SoG_r3584.exe -v 0 with WU : 28oc11aa.2090.22562.12.39.170.wu
Started at : 10:56:41.840 Ended at : 11:01:14.430 272.536 secs Elapsed 35.927 secs CPU time
R2: .\ref\ref-RacerX_dev_0.exe-28oc11aa.2090.22562.12.39.170.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-RacerX_dev_1.exe-28oc11aa.2090.22562.12.39.170.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-RacerX_dev_2.exe-28oc11aa.2090.22562.12.39.170.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_0.exe-28oc11aa.2090.22562.12.39.170.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_1.exe-28oc11aa.2090.22562.12.39.170.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_2.exe-28oc11aa.2090.22562.12.39.170.wu.res Result : Strongly similar, Q= 100.0%
------------ Running app : MB8_win_x86_SSE3_OpenCL_NV_SoG_r3584.exe -v 0 with WU : 28oc11aa.2108.22562.13.40.187.wu
Started at : 11:01:29.902 Ended at : 11:06:53.038 319.665 secs Elapsed 47.705 secs CPU time
R2: .\ref\ref-RacerX_dev_0.exe-28oc11aa.2108.22562.13.40.187.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-RacerX_dev_1.exe-28oc11aa.2108.22562.13.40.187.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-RacerX_dev_2.exe-28oc11aa.2108.22562.13.40.187.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_0.exe-28oc11aa.2108.22562.13.40.187.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_1.exe-28oc11aa.2108.22562.13.40.187.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_2.exe-28oc11aa.2108.22562.13.40.187.wu.res Result : Strongly similar, Q= 100.0%
------------ Running app : MB8_win_x86_SSE3_OpenCL_NV_SoG_r3584.exe -v 0 with WU : 28oc11aa.30967.11928.8.35.113.wu
Started at : 11:06:56.518 Ended at : 11:11:59.870 303.292 secs Elapsed 32.573 secs CPU time
R2: .\ref\ref-RacerX_dev_0.exe-28oc11aa.30967.11928.8.35.113.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-RacerX_dev_1.exe-28oc11aa.30967.11928.8.35.113.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-RacerX_dev_2.exe-28oc11aa.30967.11928.8.35.113.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_0.exe-28oc11aa.30967.11928.8.35.113.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_1.exe-28oc11aa.30967.11928.8.35.113.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_2.exe-28oc11aa.30967.11928.8.35.113.wu.res Result : Strongly similar, Q= 99.99%
------------ Running app : MB8_win_x86_SSE3_OpenCL_NV_SoG_r3584.exe -v 0 with WU : 28oc11aa.30986.13155.9.36.98.wu
Started at : 11:12:03.294 Ended at : 11:16:39.078 275.730 secs Elapsed 48.938 secs CPU time
R2: .\ref\ref-RacerX_dev_0.exe-28oc11aa.30986.13155.9.36.98.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-RacerX_dev_1.exe-28oc11aa.30986.13155.9.36.98.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-RacerX_dev_2.exe-28oc11aa.30986.13155.9.36.98.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_0.exe-28oc11aa.30986.13155.9.36.98.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_1.exe-28oc11aa.30986.13155.9.36.98.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_2.exe-28oc11aa.30986.13155.9.36.98.wu.res Result : Strongly similar, Q= 100.0%
------------ Running app : MB8_win_x86_SSE3_OpenCL_NV_SoG_r3584.exe -v 0 with WU : 28oc11aa.6787.6611.5.32.85.wu
Started at : 11:16:42.492 Ended at : 11:21:17.168 274.629 secs Elapsed 45.739 secs CPU time
R2: .\ref\ref-RacerX_dev_0.exe-28oc11aa.6787.6611.5.32.85.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-RacerX_dev_1.exe-28oc11aa.6787.6611.5.32.85.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-RacerX_dev_2.exe-28oc11aa.6787.6611.5.32.85.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_0.exe-28oc11aa.6787.6611.5.32.85.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_1.exe-28oc11aa.6787.6611.5.32.85.wu.res Result : Strongly similar, Q= 100.0%
R2: .\ref\ref-Speed_dev_2.exe-28oc11aa.6787.6611.5.32.85.wu.res Result : Strongly similar, Q= 100.0%
------------ (ignore the timings - I left BOINC running in the background, so the card was doing its day job at the same time) |
Jacob Klein Joined: 15 Apr 11 Posts: 149 Credit: 9,783,406 RAC: 9 |
Just to be clear, you compared what to what? I ask, because I provided results for 6 GPUs across 4 apps across 4 drivers... And I wasn't sure if anyone did a CPU comparison. Did your comparison just use the 442.19 OpenCL results that I provided? And what were you benchmarking as comparison - 442.19 OpenCL, on your end? |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
The object of the exercise was to confirm that NVidia had produced a correct, working driver at v442.19 - fixing the delays and hangups in recent history (which you have confirmed), without introducing any numerical errors along the way. With that brief, only the driver v442.19 results are significant, so I only took those. I compared them with a live test run on my GTX 1660 SUPER, using the Windows 7 version of driver v441.12 and the MB8_win_x86_SSE3_OpenCL_NV_SoG_r3584.exe application. I have sufficient confidence in each of those separate components to judge that it was a valid test - in effect, I was using the SoG r3584 as the reference app, and your results as the test pieces, reversing the process we went through three years ago before releasing the app. In an ideal world, we would have used a wider range of test WUs, including WUs plucked from the wild as examples of unexpected validation failures between otherwise reliable hosts. But this is good enough for now. |
Jacob Klein Joined: 15 Apr 11 Posts: 149 Credit: 9,783,406 RAC: 9 |
Okay. I'm glad you weren't comparing 442.19 against 442.19! Even more glad that results are good! :) Thanks for the validation. |
Alfred LK Cheng Joined: 12 Jul 99 Posts: 2 Credit: 157,511,087 RAC: 478 |
Actually, we are not testing a new application. We are testing a driver which is supposed to fix the problems that occurred after version 436.xx. If we compare the results of 442.19 against 436.xx, could we do a result-file-level comparison, which should be a numerically 100% match? |
VelocityRC Joined: 27 Sep 19 Posts: 23 Credit: 1,421,582 RAC: 86 |
Since we are on the topic of GPUs, I'm considering pulling my GTX 1050 SSC 2GB and getting a 1050 Ti 4GB. Looking at my production, would I see much improvement? BTW the MoBo has a PCIe 2.0 slot, so I'm not sure where the point of diminishing returns would be with these PCIe 3.0 cards. Thanks. |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
you'll see a small increase in production. nothing drastic. just an increase in line with the relative performance difference between those 2 cards. PCIe 2.0 x16 is more than enough. no worries there. Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
Jacob Klein Joined: 15 Apr 11 Posts: 149 Credit: 9,783,406 RAC: 9 |
Let's get back on topic, please. Richard, do we need any 432.00 validation? I'm not going to redo any of my testing. |
Alfred LK Cheng Joined: 12 Jul 99 Posts: 2 Credit: 157,511,087 RAC: 478 |
"Actually, we are not testing a new application. We are testing a driver which is supposed to fix the problems that occurred after version 436.xx. If we compare the results of 442.19 against 436.xx, could we do a result-file-level comparison, which should be a numerically 100% match?" Richard's result gives us confidence that the new driver does indeed fix the problem with the previous version. I am not suggesting further testing. I am suggesting this as an alternative to consider in the future, so that it might be easier to match the results. |
Jacob Klein Joined: 15 Apr 11 Posts: 149 Credit: 9,783,406 RAC: 9 |
If you find my .zip file from a few posts back, you'll see it has results from 431.60, 431.68, 432.00, and 442.19. Here's a link: https://setiathome.berkeley.edu/forum_thread.php?id=84694&postid=2031331 You are welcome to unzip that and do comparisons. Offhand, I think there are slight floating point discrepancies on each run, that are usually considered "normal" and "within tolerance". But I'm no expert, especially with SETI. |
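To illustrate why a byte-for-byte file comparison tends to be too strict for this kind of work, here is a small Python sketch. It relies only on the general fact that floating point addition is not associative, so the same arithmetic done in a different order can differ in the last digits. It makes no claim about what any particular driver does internally, and the tolerance value and function names are assumptions for illustration, not the criteria used by rescmpv5 or the SETI@home validator.

```python
import math
from typing import Sequence


def exact_match(a: Sequence[float], b: Sequence[float]) -> bool:
    """Strict comparison, equivalent in spirit to diffing the result files."""
    return list(a) == list(b)


def within_tolerance(a: Sequence[float], b: Sequence[float], rel_tol: float = 1e-5) -> bool:
    """Element-wise comparison that accepts small relative differences.

    rel_tol is a made-up value for this sketch.
    """
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b)
    )


# The same three numbers summed in two different orders.
run_a = [(0.1 + 0.2) + 0.3]
run_b = [0.1 + (0.2 + 0.3)]

print(run_a[0], run_b[0])              # the two sums differ in the last digit
print(exact_match(run_a, run_b))       # strict comparison rejects them
print(within_tolerance(run_a, run_b))  # tolerant comparison accepts them
```

A tolerance-based check of this general kind is what allows two runs to be judged "strongly similar" even when their result files would not be identical.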