Upgraded to a GTX 1060
Author | Message |
---|---|
Mark Loukko Joined: 7 Jun 99 Posts: 52 Credit: 40,406,567 RAC: 108 |
I installed a GTX 1060 in my dedicated cruncher. The system used to have a 650 Ti. I picked the 1060 because it requires just a single 6-pin power connector and my old rig has only a single connector. I’m comparing the 1060 to my laptop (NVIDIA Quadro M3000M), which also crunches 24x7. To compare apples to apples I ran both systems like this: <gpu_usage>1</gpu_usage> <cpu_usage>1</cpu_usage> The run times (Arecibo / Guppi) are: M3000M 11 min / 22 min; 1060 15 min / 33 min. I was expecting better performance from the 1060. I then noticed the 1060 runs SETI@home V8.00 (cuda50) and the M3000M runs SETI@home v8.12 (opencl_nvidia_sah). I’m guessing this is causing the difference in performance, not the older CPU, since it’s not maxed out. I upgraded the rig so SETI retained the same computer ID. With a new computer I believe SETI would send many different applications to figure out the best one. Is there a way to ask SETI to reevaluate all the applications for my system? Or is it better to get a new computer ID (if that’s even possible)? Cheers Mark |
Shaggie76 Joined: 9 Oct 09 Posts: 282 Credit: 271,858,118 RAC: 196 |
I ran my analysis script that works out your stats for your pair of hosts (I double-checked and the oldest of the tasks was on your GTX 1060, so this isn't confounded by your old card). Host, API, Device, Credit/Hour, Work Units; 7992114, cuda, NVIDIA GeForce GTX 1060 6GB, 126.410155457742, 108; 7906450, cuda, NVIDIA Quadro M3000M, 126.410155457742, 108; 7906450, opencl, NVIDIA Quadro M3000M, 375.972593465412, 90. From what I can tell your new card has been getting fed CUDA work-units, which seem to yield less credit/hour than OpenCL tasks. Maybe give it time for the RAC to settle? I'm currently adding a CUDA vs OpenCL stat breakdown to my SETI@home v8 benchmarks. |
Al Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Shaggie, just wanted to let you know that you're a rock star, if no one's mentioned it to you recently. Thanks for all your work on providing this information to all of us. |
Shaggie76 Joined: 9 Oct 09 Posts: 282 Credit: 271,858,118 RAC: 196 |
Thanks buddy -- I just finished retrofitting my scripts and am re-running my scan of the DB to break down CUDA vs OpenCL -- preliminary results are fascinating. |
Stubbles Joined: 29 Nov 99 Posts: 358 Credit: 5,909,255 RAC: 0 |
Hey Mark, Cuda## is suboptimal if you don't configure it to run multiple tasks in parallel. To do so, you need to configure BOINC S@h to use only 1 GPU app (not the stock setup). The easiest way to do that is to run Lunatics v0.45 beta3 and select Cuda50. Then you'll need to set up app_config.xml to run multiple tasks on the GPU. On my GTX 750 Ti, nothing beats Cuda50 running 3 tasks at a time (~13 mins throughput/task) ...except possibly 4 tasks at a time (test to be done this week). NV_SoG takes 15-16 mins/task on average, even with commandlines. As for the GTX 1060, you're breaking new ground, so the optimal setup is TBD. Let me know what your intent is (highest RAC, highest throughput, etc.) and I can possibly guide you down a few different paths to your goal. Cheers, Rob :-) |
Stubbles Joined: 29 Nov 99 Posts: 358 Credit: 5,909,255 RAC: 0 |
Hey Shag! I concur with Al: WoW! For your GPU charts and stats, any plans on comparing Anonymous Platform to top stock output? My guess is that it would show a big benefit for optimization. Also, if you can run my 2 rigs through your script, that would be great. On both I task-swap: Guppis are processed on CPU, and nonVLARs on GPU. 7996377 is running NV_SoG with commandline; 8010413 is running Cuda50 with 3 tasks in parallel. FYI, late Friday I moved 7996377 to be colocated with the other, as I was getting many more nonVLARs than expected and wanted to rule out the ISP as a variable. All that to say 7996377 was down for a few hours during the early morning of Saturday (UTC). Cheers, Rob |
Shaggie76 Joined: 9 Oct 09 Posts: 282 Credit: 271,858,118 RAC: 196 |
Host, API, Device, Credit, Seconds, Credit/Hour, Work Units; 7996377, cpu, Intel Xeon W3550 @ 3.07GHz, 14626.79, 69005.37875, 763.077385471201, 189; 7996377, gpu, NVIDIA GeForce GTX 750 Ti, 58.47, 913.21, 230.496818913503, 1; 8010413, cpu, Intel Xeon W3550 @ 3.07GHz, 481.36, 2148.6725, 806.496103989789, 7; 8010413, gpu, NVIDIA GeForce GTX 750 Ti, 16087.02, 699745.91, 82.7632876053537, 193. Note this does not factor in the x3 for your second GPU so it's more like 241 Cr/hr for SoG vs 247 Cr/hr for Cuda50. I'm also not sure why your SoG host has only one GPU task validated at the moment (maybe related to colo downtime?). Honestly I'm not super confident in the aggregation of your stats -- the disproportion of CPU vs GPU tasks seems wrong. Scanning Anonymous Platform tasks is tricky -- I can do it but it isn't enabled by default because I can't tell how many tasks are in parallel (I assume this is relatively rare for people running stock and a winsorized mean will exclude them). It's also non-trivial to know if the GPU task is CUDA or OpenCL -- I could with an extra PHP query per work unit but that would be painful. I'd much rather petition for a tasks.gz dump so I can digest it without hurting the servers (I don't know who to ask for this though). |
Stubbles Joined: 29 Nov 99 Posts: 358 Credit: 5,909,255 RAC: 0 |
Shaggie76 wrote: "Host, API, Device, Credit, Seconds, Credit/Hour, Work Units; 7996377, cpu, Intel Xeon W3550 @ 3.07GHz, 14626.79, 69005.37875, 763.077385471201, 189; 7996377, gpu, NVIDIA GeForce GTX 750 Ti, 58.47, 913.21, 230.496818913503, 1; 8010413, cpu, Intel Xeon W3550 @ 3.07GHz, 481.36, 2148.6725, 806.496103989789, 7; 8010413, gpu, NVIDIA GeForce GTX 750 Ti, 16087.02, 699745.91, 82.7632876053537, 193" Perfect, thanks!!! It's as I suspected: when swapping tasks locally with Mr Kevvy's script or by another means (which I use on the 8010413), it seems the client reports the task as reassigned but the server ignores it. Host, API, Device, Credit, Seconds, Credit/Hour, Work Units; 7996377, cpu, Intel Xeon W3550 @ 3.07GHz, [u]14626.79[/u], 69005.37875, 763.077385471201, 189; 7996377, gpu, NVIDIA GeForce GTX 750 Ti, [b]58.47[/b], 913.21, 230.496818913503, 1; 8010413, cpu, Intel Xeon W3550 @ 3.07GHz, [b]481.36[/b], 2148.6725, 806.496103989789, 7; 8010413, gpu, NVIDIA GeForce GTX 750 Ti, [u]16087.02[/u], 699745.91, 82.7632876053537, 193. This will be an issue in the future if an AI is ever used to reassign tasks for optimal task throughput. In the meantime, when dealing with "anonymous platforms", your incredible script won't easily be able to determine if there is task-swapping going on unless you look for abnormal runtimes, such as > 60 mins on a GTX 750 Ti (since 3 Cuda50 tasks in parallel take ~39 mins and sometimes up to 50-some mins). Thanks again, RobG :-D |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Shaggie76 wrote: "Scanning Anonymous Platform tasks is tricky -- I can do it but it isn't enabled by default because I can't tell how many tasks are in parallel (I assume this is relatively rare for people running stock and a winsorized mean will exclude them). It's also non-trivial to know if the GPU task is CUDA or OpenCL -- I could with an extra PHP query per work unit but that would be painful. I'd much rather petition for a tasks.gz dump so I can digest it without hurting the servers (I don't know who to ask for this though)." With enough data for anonymous platform, for a given device you should see several distinct peaks in the run times, which heuristically you can interpret as the number of instances. If they are so close together that they become an inseparable blob, then it won't matter. The rub with this is that stock can run multi-instance too, via the app_config.xml interface. On the whole, with enough numbers I'd just trust the simplest interpretation, suggesting fairly linear progression of the Cuda devices, with the latest 1070 looking like it learned from the 750 Ti. Wouldn't be surprised if the RX480 and 1060 end up duelling. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Stubbles Joined: 29 Nov 99 Posts: 358 Credit: 5,909,255 RAC: 0 |
Shaggie76 wrote: "I'd much rather petition for a tasks.gz dump so I can digest it without hurting the servers (I don't know who to ask for this though)." A tasks.gz dump would be HUGE! My guess is you're more likely to get a VPN connection to SQL the server directly during non-peak times. I have no clue who you could ask. The volunteer devs might suggest a specific project staff member through a PM. |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Shaggie76 wrote: "I'd much rather petition for a tasks.gz dump so I can digest it without hurting the servers (I don't know who to ask for this though)." If you were really nice to Matt or Jeff they might be able to work something out (guessing). I can't imagine what the bribe cost might be though. Data mining is an awesomely powerful thing for good, but abuse by companies (like m$) and governments (NSA) has cast a dark cloud on statistics. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Mark Loukko Joined: 7 Jun 99 Posts: 52 Credit: 40,406,567 RAC: 108 |
Hi Rob, Right now my intent is just comparing the 1060 and M3000M. Of course this is impossible while using different applications. I'm not allowing any new tasks on the 1060. I will wait for the cache to clear and then uninstall BOINC and change my computer name. If my plan works (insert evil laugh here) this should give me a new computer ID and won't be mixing the stats from the old 650 TI. Also, it will be interesting to see which application SETI ends up saying is best for the 1060. Cheers Mark |
Stubbles Joined: 29 Nov 99 Posts: 358 Credit: 5,909,255 RAC: 0 |
Mark Loukko wrote: "I upgraded the rig so SETI retained the same computer ID. With a new computer I believe SETI would send many different applications to figure out the best one." In BOINC Manager, on the Projects tab, select the S@h project and then press the "Reset project" button on the left side. This will empty your "C:\ProgramData\BOINC\projects\setiathome.berkeley.edu" folder. As you stated, letting your cache run dry first is preferable (or at least abort all your tasks before you reset the project). So there's no need to go for a new computer ID. |
Mark Loukko Joined: 7 Jun 99 Posts: 52 Credit: 40,406,567 RAC: 108 |
Thanks Shaggie, The number of Work Units seems low for the M3000M card. Are "108" and "90" just recent work units? Cheers Mark |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14649 Credit: 200,643,578 RAC: 874 |
Mark Loukko wrote: "I upgraded the rig so SETI retained the same computer ID. With a new computer I believe SETI would send many different applications to figure out the best one." Unfortunately, this process won't clear the contents of the host_app_version table on the server: memory of the previous card/application performance is associated with the HostID on the server, not locally. So a completely clean slate does require a new HostID (and the long training period associated with it). |
Stubbles Joined: 29 Nov 99 Posts: 358 Credit: 5,909,255 RAC: 0 |
Mark Loukko wrote: "Right now my intent is just comparing the 1060 and M3000M. Of course this is impossible while using different applications. I'm not allowing any new tasks on the 1060. I will wait for the cache to clear and then uninstall BOINC and change my computer name. If my plan works (insert evil laugh here) this should give me a new computer ID and won't be mixing the stats from the old 650 TI. Also, it will be interesting to see which application SETI ends up saying is best for the 1060." Other than Shaggie's script, I don't know of an easy way to compare the output of 2 different GPUs on the same rig. You'll likely get NV_SoG as the default GPU app from stock, since it is optimized for 1 task/GPU but also requires (most of the time) a full CPU core to support it. In order to compare Cuda50 to NV_SoG, you'll need to configure Cuda50 to run 3+ tasks in parallel on each GPU (that has 2+ GB of RAM). Have you used Lunatics v0.45 beta3 yet to customize/optimize your S@h setup? It's great since you don't even need to empty your cache! (Keep in mind that it just won't rename the GPU app label shown in BOINC Manager for tasks that were already downloaded before running Lunatics to customize S@h.) |
Stubbles Joined: 29 Nov 99 Posts: 358 Credit: 5,909,255 RAC: 0 |
Richard Haselgrove wrote: "Unfortunately, this process won't clear the contents of the host_app_version table on the server: memory of the previous card/application performance is associated with the HostID on the server, not locally. So a completely clean slate does require a new HostID (and the long training period associated with it)." Thanks for the clarification, Richard! |
Shaggie76 Joined: 9 Oct 09 Posts: 282 Credit: 271,858,118 RAC: 196 |
Mark Loukko wrote: "The number of Work Units seems low for the M3000M card. Are '108' and '90' just recent work units?" Yes, all I have access to is the recent records on the website. |
Mark Loukko Joined: 7 Jun 99 Posts: 52 Credit: 40,406,567 RAC: 108 |
Hi Richard, Am I correct in thinking I can simply change the computer name to get a new HostID? Or, is it tied to my PC in some other fashion? Cheers, Mark |
Shaggie76 Joined: 9 Oct 09 Posts: 282 Credit: 271,858,118 RAC: 196 |
I've updated my GPU performance analysis scripts to try to discriminate which API was used. It would seem that although the overwhelming majority of tasks assigned to modern NVIDIA GPUs are using the OpenCL app there are a few oddballs like Mark's machine that seem to get CUDA tasks instead. Here's a slice of the GeForce 980 Ti scan: Host, API, Device, Credit, Seconds, Credit/Hour, Work Units; 7986065, opencl, NVIDIA GeForce GTX 980 Ti, 11750.3, 51689.85, 818.363373080014, 129; 4990970, opencl, NVIDIA GeForce GTX 1080, 2068.49, 9485.16, 785.075212226257, 22; 7937550, opencl, NVIDIA GeForce GTX 980 Ti, 162.25, 789.21, 740.107195803398, 2; 7629870, opencl, NVIDIA GeForce GTX 980 Ti, 102.28, 303.22, 1214.32623177891, 1; 7908796, opencl, NVIDIA GeForce GTX 980 Ti, 67.52, 301.30, 806.7441088616, 1; 7794896, cuda, NVIDIA GeForce GTX 980 Ti, 1864.9, 22185.08, 302.61959839676, 23; 7794896, opencl, NVIDIA GeForce GTX 980 Ti, 76.54, 301.27, 914.608158794437, 1; 7968342, opencl, NVIDIA GeForce GTX 980 Ti, 9617.5, 51285.39, 675.104547318447, 113; 7833637, opencl, NVIDIA GeForce GTX 980 Ti, 390.42, 2664.02, 527.590633703951, 3; 7997117, opencl, NVIDIA GeForce GTX 980 Ti, 298.38, 1753.93, 612.434931838785, 3; 8005939, cuda, NVIDIA GeForce GTX 980 Ti, 213.49, 3575.89, 214.929430155849, 2; 8005939, opencl, NVIDIA GeForce GTX 980 Ti, 636.61, 2987.03, 767.249073494407, 8; 7874878, opencl, NVIDIA GeForce GTX 980 Ti, 585.22, 3573.72, 589.523521708472, 9; 8007389, opencl, NVIDIA GeForce GTX 980 Ti, 8589.97, 47572.21, 650.041105931383, 125; 7907958, opencl, NVIDIA GeForce GTX 980 Ti, 3068.94, 15159.09, 728.815779839027, 29; 5526548, opencl, NVIDIA GeForce GTX 980 Ti, 9055.04, 48524.02, 671.793969254815, 101; 6966163, opencl, NVIDIA GeForce GTX 980 Ti, 9140.14, 57289.56, 574.354280256298, 120; 7960872, opencl, NVIDIA GeForce GTX 980 Ti, 6663.05, 29805.22, 804.791241265791, 83; 7989090, opencl, NVIDIA GeForce GTX 980 Ti, 9272.42, 49595.32, 673.061732437657, 110; 7453121, opencl, NVIDIA GeForce GTX 980 Ti, 39.88, 152.25, 942.975369458128, 1; 8035870, opencl, NVIDIA GeForce GTX 980 Ti, 7887.25, 37959.9, 748.002497372228, 95; 7614097, opencl, NVIDIA GeForce GTX 980 Ti, 7549.53000000001, 39482.1, 688.370375435958, 94; 7866760, opencl, NVIDIA GeForce GTX 980 Ti, 222.5, 2144.89, 373.445724489368, 3; 7943399, cuda, NVIDIA GeForce GTX 980 Ti, 64.50, 464.31, 500.096918007366, 1; 7943399, opencl, NVIDIA GeForce GTX 980 Ti, 141.84, 689.93, 740.109866218312, 3; 7461278, cuda, NVIDIA GeForce GTX 980 Ti, 99.82, 551.21, 651.93302008309, 1; 6194103, opencl, NVIDIA GeForce GTX 980 Ti, 2896.92, 11472.49, 909.036486412279, 37; 7814871, opencl, NVIDIA GeForce GTX 980 Ti, 1180.36, 5188.42, 818.996149116687, 20; 7855892, opencl, NVIDIA GeForce GTX 980 Ti, 185.89, 762.19, 878.001548170404, 4; 7414518, opencl, NVIDIA GeForce GTX 980 Ti, 258.78, 1433.69, 649.797376001786, 2; 8043806, cuda, NVIDIA GeForce GTX 980 Ti, 306.99, 3435.5, 321.689419298501, 4; 8047105, cuda, NVIDIA GeForce GTX 980 Ti, 370.23, 3986.89, 334.302677023946, 4; 7484329, opencl, NVIDIA GeForce GTX 980 Ti, 6366.68, 40734.14, 562.674159807964, 69; 7922345, opencl, NVIDIA GeForce GTX 980 Ti, 242.52, 1693.74, 515.469906833398, 3; 7856691, opencl, NVIDIA GeForce GTX 980 Ti, 15509.56, 72542.29, 769.680913023286, 173; 8050971, opencl, NVIDIA GeForce GTX 980 Ti, 10784.97, 40198.54, 965.853287208938, 133; 8037061, opencl, NVIDIA GeForce GTX 980 Ti, 2889.88, 25456.71, 408.676847872329, 39; 7405526, cuda, NVIDIA GeForce GTX 980 Ti, 165.37, 912.92, 652.118476974981, 2; 7818638, opencl, NVIDIA GeForce GTX 980 Ti, 10662.97, 51413.63, 746.624815248408, 131; 7905174, opencl, NVIDIA GeForce GTX 980 Ti, 141.69, 1036.1, 492.311552938906, 2; 7207870, opencl, NVIDIA GeForce GTX 980 Ti, 813.8, 2872.06, 1020.06225496682, 12; 7807344, opencl, NVIDIA GeForce GTX 980 Ti, 205.8, 1221.21, 606.676984302454, 2; 7958621, opencl, NVIDIA GeForce GTX 980 Ti, 99.02, 308.12, 1156.92587303648, 1; 7849689, opencl, NVIDIA GeForce GTX 980 Ti, 825.16, 5607.22, 529.776966125816, 7; 7938869, opencl, NVIDIA GeForce GTX 980 Ti, 612.35, 2569.12, 858.060347511988, 10; 7837291, opencl, NVIDIA GeForce GTX 980 Ti, 631.26, 2518.49, 902.340688269558, 9; 7485441, opencl, NVIDIA GeForce GTX 980 Ti, 2511.15, 17448.4, 518.107104376333, 24; 8033077, opencl, NVIDIA GeForce GTX 980 Ti, 645.88, 3436.8, 676.550279329609, 7; 7938928, opencl, NVIDIA GeForce GTX 980 Ti, 270.66, 1711.97, 569.154833320677, 4; 7921537, cuda, NVIDIA GeForce GTX 980 Ti, 3013.66, 32829.45, 330.470842490508, 38; 7921537, opencl, NVIDIA GeForce GTX 980 Ti, 133.95, 1137.55, 423.9110368775, 1; 7977308, opencl, NVIDIA GeForce GTX 980 Ti, 7496.2, 40299.75, 669.639886103512, 95; 7969336, opencl, NVIDIA GeForce GTX 980 Ti, 2810.61, 12935.99, 782.174074036854, 34; 8033637, cuda, NVIDIA GeForce GTX 980 Ti, 221.82, 1403.16, 569.109723766356, 3; 8033637, opencl, NVIDIA GeForce GTX 980 Ti, 2056.14, 12318.5, 600.89329057921, 27; 6611735, opencl, NVIDIA GeForce GTX 980 Ti, 10194.59, 63873.05, 574.585431570905, 132; 7986520, opencl, NVIDIA GeForce GTX 980 Ti, 147.24, 716.47, 739.827208396723, 3; 7369273, opencl, NVIDIA GeForce GTX 980 Ti, 5659.74, 30930.38, 658.739530519832, 69; 7957711, opencl, NVIDIA GeForce GTX 980 Ti, 106.24, 291.84, 1310.52631578947, 2; 7931722, opencl, NVIDIA GeForce GTX 980 Ti, 1576.81, 8933.55, 635.41548432594, 18; 8011009, cuda, NVIDIA GeForce GTX 980 Ti, 174.33, 790.21, 794.204072335202, 2; 8011009, opencl, NVIDIA GeForce GTX 980 Ti, 2266.14, 10527.25, 774.951103089601, 24; 7636964, opencl, NVIDIA GeForce GTX 980 Ti, 529.53, 3202.86, 595.189299563515, 8; 7906492, cuda, NVIDIA GeForce GTX 980 Ti, 8146.42, 77941.35, 376.271542640716, 103; 7718613, opencl, NVIDIA GeForce GTX 980 Ti, 109.51, 924.48, 426.440809968847, 1; 7430324, opencl, NVIDIA GeForce GTX 980 Ti, 6652.08, 31771.14, 753.749723805945, 93; 7268469, opencl, NVIDIA GeForce GTX 980 Ti, 42.06, 222.99, 679.025965289923, 1; 1575265, opencl, NVIDIA GeForce GTX 980 Ti, 3748.02, 35837.04, 376.506318602206, 53; 7089943, opencl, NVIDIA GeForce GTX 980 Ti, 9122.79000000001, 61469.7, 534.280206345566, 115; 7607650, opencl, NVIDIA GeForce GTX 980 Ti, 12585.62, 62583.56, 723.963801356139, 155; 5882028, opencl, NVIDIA GeForce GTX 980 Ti, 1412.12, 14816.78, 343.099647831715, 17; 7999832, opencl, NVIDIA GeForce GTX 980 Ti, 5580.89, 29353.48, 684.457311364785, 71; 8043229, opencl, NVIDIA GeForce GTX 980 Ti, 3703.73, 24721.84, 539.338010439352, 48; 6985099, opencl, NVIDIA GeForce GTX 980 Ti, 9992.17, 43178.8, 833.089664372331, 114; 8052542, cuda, NVIDIA GeForce GTX 980 Ti, 245.01, 3507.51, 251.470701437772, 2; 7989887, opencl, NVIDIA GeForce GTX 980 Ti, 200.2, 553.06, 1303.14974867103, 2. As you can see whenever a host with this card gets a CUDA task instead of an OpenCL task it yields less credit. This makes me wonder if I made a mistake when I installed Lunatics -- I seem to recall the question being framed "ATI ? OpenCL : NVidia ? CUDA ..." and so I only ended up with CUDA. |
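
For anyone who wants to check the Credit/Hour column in the scans above, here is a minimal Python sketch. It is not Shaggie76's script (which is not shown in the thread); it only assumes that Credit/Hour is credit divided by run time in seconds, times 3600, which matches the posted figures. The four sample rows are copied from the GTX 980 Ti scan; the function names `credit_per_hour`, `parse_rows` and `pooled_rate_by_api` are made up for illustration, and pooling total credit over total seconds per API is a simplification, not the winsorized mean mentioned earlier in the thread.

```python
import csv
import io
from collections import defaultdict

# Four rows copied from the GTX 980 Ti scan in the thread (hosts 7794896 and 7921537).
SAMPLE = """\
Host, API, Device, Credit, Seconds, Credit/Hour, Work Units
7794896, cuda, NVIDIA GeForce GTX 980 Ti, 1864.9, 22185.08, 302.61959839676, 23
7794896, opencl, NVIDIA GeForce GTX 980 Ti, 76.54, 301.27, 914.608158794437, 1
7921537, cuda, NVIDIA GeForce GTX 980 Ti, 3013.66, 32829.45, 330.470842490508, 38
7921537, opencl, NVIDIA GeForce GTX 980 Ti, 133.95, 1137.55, 423.9110368775, 1
"""


def credit_per_hour(credit, seconds):
    """Credit earned per hour of run time (assumed formula: credit / seconds * 3600)."""
    return credit / seconds * 3600.0


def parse_rows(text):
    """Parse the comma-separated scan output; skipinitialspace drops the space after each comma."""
    return list(csv.DictReader(io.StringIO(text), skipinitialspace=True))


def pooled_rate_by_api(rows):
    """Hypothetical summary: total credit over total seconds for each API."""
    totals = defaultdict(lambda: [0.0, 0.0])  # api -> [credit, seconds]
    for row in rows:
        totals[row["API"]][0] += float(row["Credit"])
        totals[row["API"]][1] += float(row["Seconds"])
    return {api: credit_per_hour(c, s) for api, (c, s) in totals.items()}


rows = parse_rows(SAMPLE)
rates = pooled_rate_by_api(rows)
for api in sorted(rates):
    print(f"{api}: {rates[api]:.1f} credit/hour")
```

On these four rows the pooled OpenCL rate comes out higher than the pooled CUDA rate, in line with what the scan suggests, though a single-task row (Work Units of 1) is obviously a thin sample.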
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.