How Does This Make Sense? (CUDA42 vs CUDA50 on Similar Machines)

Cruncher-American Send message Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340	Message 1763681 - Posted: 9 Feb 2016, 15:41:49 UTC I have been running stock since v8 came out. When the GPU versions came, I expected both of my crunchers to settle on the CUDA50 versions, as both of my crunchers have dual GTX 980s (thanks, Craigslist!). (One is an i7-4820K, the other is an i7-4790K). NOTE: For about 10-12 days, both ran 2 WUs/GPU, then I bumped them both to 3/GPU, which they had been running before v8. But Nooooo! The i7-4790K fairly quickly settled as using CUDA50, but the i7-4820K, after almost 3 weeks of running both CUDA50 and -42, finally has settled as CUDA42. How is this possible? Same graphics cards, but different CUDA seems rather strange to me. Can anyone please explain to me how this could happen? I mean, Maxwell is Maxwell, and definitely not Kepler, right? ID: 1763681 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1763687 - Posted: 9 Feb 2016, 16:04:38 UTC - in response to Message 1763681. Application details for i7-4790K cuda42: 100.99 GFLOPS cuda50: 127.37 GFLOPS Application details for i7-4820K cuda42: 113.13 GFLOPS cuda50: 100.49 GFLOPS So, the server is reacting as expected to the data in its possession, but the question is why the (averaged) speeds are so different. Same driver? Yes, check. Same motherboard, case? (case might affect cooling) Same PSU? (undervoltage might cause throttling) Or, most likely: same work mix? VHAR 'shorties' aren't as short as expected, these days, so a run of shorties would drive APR down. And we got a block of 'near VLAR' the other day, which ran even slower. If the second machine hit either of those groups of workunits while running cuda50, that could have forced the apps to swap places. ID: 1763687 ·
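A minimal sketch of the selection Richard describes, assuming (as a simplification) that the server just favours whichever app version has the higher APR on a given host. The real scheduler does more than this. The GFLOPS figures are the ones quoted in the post above; the function name `favoured_app` is made up for illustration.

```python
# APR figures (GFLOPS) as quoted in the post above.
apr = {
    "i7-4790K": {"cuda42": 100.99, "cuda50": 127.37},
    "i7-4820K": {"cuda42": 113.13, "cuda50": 100.49},
}


def favoured_app(host_apr):
    """Return the app version with the highest average processing rate.

    Assumption: the choice is a plain 'highest APR wins' comparison.
    """
    return max(host_apr, key=host_apr.get)


for host, rates in apr.items():
    print(host, "->", favoured_app(rates))
```

Under that assumption the two hosts end up on different apps simply because their averaged rates are inverted, which matches what was observed.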

Jeff Buck Volunteer tester Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0	Message 1763698 - Posted: 9 Feb 2016, 16:51:18 UTC - in response to Message 1763681.
"I have been running stock since v8 came out. When the GPU versions came, I expected both of my crunchers to settle on the CUDA50 versions, as both of my crunchers have dual GTX 980s (thanks, Craigslist!). (One is an i7-4820K, the other is an i7-4790K). NOTE: For about 10-12 days, both ran 2 WUs/GPU, then I bumped them both to 3/GPU, which they had been running before v8. But Nooooo! The i7-4790K fairly quickly settled as using CUDA50, but the i7-4820K, after almost 3 weeks of running both CUDA50 and -42, finally has settled as CUDA42. How is this possible? Same graphics cards, but different CUDA seems rather strange to me. Can anyone please explain to me how this could happen? I mean, Maxwell is Maxwell, and definitely not Kepler, right?"
It may not be as "settled" as you think. You had Cuda50 running as recently as yesterday, it appears. One thing that I found can be a problem is a mixed workload of AP and MB tasks. When an AP and an MB task run together, the MB run times suffer, thus driving down the APR for whichever Cuda flavor is currently favored. That causes the scheduler to tilt toward the other flavor for a while. If the APs then stop for a while, the scheduler sticks with the most recently favored Cuda... until the next burst of APs comes along. Then the flip-flopping tends to resume. It's kind of like musical chairs. When the [AP] music stops, one Cuda gets a seat and the other is left out (temporarily). ID: 1763698 ·

Cruncher-American Send message Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340	Message 1763699 - Posted: 9 Feb 2016, 16:55:51 UTC - in response to Message 1763687. Last modified: 9 Feb 2016, 17:00:25 UTC Richard - thanks for your response. Same case, yes - Cooler Master HAF 932; temps are all well below any problem areas. MBs different for the CPUs (one is skt 1150, the other 2011). Both 8gb RAM. I had noticed the inverted speeds for i7-4820K before, but just didn't think about it, except to note that it had been getting both CUDA42 and -50 from the servers for almost 3 weeks. Given the speed inversion, if I go Lunatics on these systems (as I am planning to do soon, having established stock RAC as best I can), can I specify CUDA50 for one and CUDA42 for the other (I forget how the setup goes for Lunatics graphics)? Jeff - thanks for that. Problem is, I never saw that before. Of course, I had been running Lunatics for a long time before v8. ID: 1763699 ·

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1763701 - Posted: 9 Feb 2016, 16:56:31 UTC - in response to Message 1763681. Last modified: 9 Feb 2016, 16:58:55 UTC
Distilling things down as best I can:
- what app is 'chosen' is based on what the server estimates your tasks will take.
- those estimates are connected to CreditNew, which we know is unstable.
- there isn't a 'huge' performance difference between Cuda4.2 and Cuda5, at least compared to how unstable those estimates are.
Let's say there were a 20%, +/- 10%, difference between the applications on the same host (there wouldn't be that much, but it'll illustrate). Next, let's say the machines are truly identical in usage/loading, temps, clocks, and the angle ranges of work they receive. Even under that impossible scenario, for one app the server estimates will have ~+/- 37% variation, and logic says 50% will be on the high side (estimating 37% too long) and 50% on the low side (estimating 37% too short). So the choice is swamped by noise.
Performing some statistical voodoo, under these ideal circumstances you get some probabilities. Chance of a given host receiving the 'correct' application (A), given B is the wrong application:
P(A | B wrong) = ( P(A estimate low) x P(B estimate high) ) / P(B estimate high) = ( 50% x 50% ) / 50% = 50%
So there is a 50-50 chance for each of your machines to get the right app or the wrong app, and you had two coin tosses. For the formula to become less of a useless coin toss, the estimates would need to be closer than the 'actual' difference on the same host. So the +/- 37% server estimate slop would need to be improved to better than the +/- 10% difference between apps, i.e. by a factor of about four or more. Quite doable in engineering terms, though I don't think the Will is there at the moment.
If really 'stuck', workarounds are to either force the issue on the one stuck on the 'wrong' app (I think a project reset might reset those numbers, though haven't checked), or to run anonymous platform.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1763701 ·
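The "swamped by noise" argument above can be checked with a rough Monte Carlo sketch. This is not the server's logic: it assumes each app's estimate is the true speed times a uniformly distributed error of +/- `slop`, and that the app with the higher estimate wins. The true speeds (100 and 110, i.e. a 10% gap) and the function name `pick_rate` are illustrative assumptions, not measured values.

```python
import random


def pick_rate(true_slow=100.0, true_fast=110.0, slop=0.37,
              trials=100_000, seed=1):
    """Fraction of trials in which the genuinely faster app is chosen.

    Assumptions: independent uniform error of +/- slop on each estimate,
    and a plain 'higher estimate wins' rule.
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        est_slow = true_slow * (1 + rng.uniform(-slop, slop))
        est_fast = true_fast * (1 + rng.uniform(-slop, slop))
        if est_fast > est_slow:
            correct += 1
    return correct / trials


print("slop 37%:", pick_rate(slop=0.37))
print("slop  2%:", pick_rate(slop=0.02))
```

With the slop at 37% the faster app wins only somewhat more often than a coin toss; once the slop is well below the real gap between the apps, it wins every time.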

Cruncher-American Send message Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340	Message 1763709 - Posted: 9 Feb 2016, 17:07:55 UTC - in response to Message 1763701.
"If really 'stuck' Workarounds are to either force the issue on the one stuck on the 'wrong' app (I think a project reset might reset those numbers, though haven't checked), or by running anonymous platform."
Verrryyyyy Interesting! OK, I give up, I will go with Lunatics in a week or two; again, can I force the issue there or not? Given the APR numbers quoted above, plus the fact that I have done a few thousand v8 on each machine already, would it be wise to force i7-4820K to CUDA42 (if doable)? ID: 1763709 ·

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1763712 - Posted: 9 Feb 2016, 17:11:32 UTC - in response to Message 1763687. Last modified: 9 Feb 2016, 17:12:12 UTC
"Application details for i7-4790K cuda42: 100.99 GFLOPS cuda50: 127.37 GFLOPS Application details for i7-4820K cuda42: 113.13 GFLOPS cuda50: 100.49 GFLOPS"
The numbers make it simpler to picture. 'Real' values are probably around 106 GFLOPS for the Cuda42 app and 115 GFLOPS for the Cuda50 one. That is way closer to one another than the server's foggy glasses can resolve, so to the server the apps look the same.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1763712 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1763713 - Posted: 9 Feb 2016, 17:17:32 UTC - in response to Message 1763701.
"If really 'stuck' Workarounds are to either force the issue on the one stuck on the 'wrong' app (I think a project reset might reset those numbers, though haven't checked), or by running anonymous platform."
I'd say 'no' to the project reset - that only resets the numbers on the local computer, not those on the server. If you really needed to rebase the server figures, you would need to acquire a new HostID - which is too drastic for this small cuda42/50 difference. Running Anonymous Platform (Lunatics) will certainly ensure that you run the same application consistently. ID: 1763713 ·

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1763714 - Posted: 9 Feb 2016, 17:23:02 UTC - in response to Message 1763709.
"If really 'stuck' Workarounds are to either force the issue on the one stuck on the 'wrong' app (I think a project reset might reset those numbers, though haven't checked), or by running anonymous platform."
"Verrryyyyy Interesting! OK, I give up, I will go with Lunatics in a week or two; again, can I force the issue there or not? Given the APR numbers quoted above, plus the fact that I have done a few thousand v8 on each machine already, would it be wise to force i7-4820K to CUDA42 (if doable)?"
I would try both using Lunatics: select Cuda50 first, then see what the APRs 'stabilise' at. Then, for the sake of comparison, do the same with Cuda4.2. If you were able to compare actual runtime medians and variance, you'd probably see some overlap in real performance depending on work mix, making the server's confusion partly understandable. Though as humans we can look at the runtimes and say one or another is clearly better, the server's view has some fairly bad cataracts.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1763714 ·
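One way to do the median/variance comparison suggested above, as a rough sketch. The runtime lists are placeholder values in seconds, not measurements from either host, and the "median +/- one standard deviation" overlap test is just one simple choice of criterion; the helper names `summarise` and `ranges_overlap` are made up here.

```python
from statistics import median, pstdev


def summarise(runtimes):
    """Return (median, population standard deviation) of task runtimes."""
    return median(runtimes), pstdev(runtimes)


def ranges_overlap(a, b):
    """True if the 'median +/- one std dev' intervals of a and b overlap."""
    (med_a, sd_a), (med_b, sd_b) = summarise(a), summarise(b)
    return (med_a - sd_a) <= (med_b + sd_b) and (med_b - sd_b) <= (med_a + sd_a)


# Placeholder runtimes in seconds (hypothetical, for illustration only).
cuda42_runs = [610, 640, 700, 655, 820]
cuda50_runs = [580, 600, 690, 615, 790]

print("cuda42 median/sd:", summarise(cuda42_runs))
print("cuda50 median/sd:", summarise(cuda50_runs))
print("overlap:", ranges_overlap(cuda42_runs, cuda50_runs))
```

If the intervals overlap for runtimes collected over a similar work mix, the two apps are close enough that noisy server estimates could easily rank them either way.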
