Best tuning for 1080ti and Process Lasso use.
Author | Message |
---|---|
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
@Al Hey Al, off-topic for this thread, but ... how goes the RedHat experience in the new server? You haven't posted to your thread in a while. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours A proud member of the OFA (Old Farts Association) |
Wiggo Joined: 24 Jan 00 Posts: 34744 Credit: 261,360,520 RAC: 489 |
Quoted: "Huh, just went to add those commands, and did a search for the app_config.xml, because I couldn't seem to find it, and for good reason. It wasn't there. Shouldn't that have been created when the Lunatics installer was run?" The installer creates an app_info.xml and nothing else, but I use the app_info.xml the way others with more modern BOINC versions use an app_config.xml. ;-) Cheers. |
Al Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Keith Myers wrote: "@Al Hey Al, off-topic for this thread, but ... how goes the RedHat experience in the new server? You haven't posted to your thread in a while." Keith, those 2 are actually sitting right next to each other in the same rack setup. I had to get a few bits and pieces together to be able to do that, and wanted to get the 'easy' one out of the way first, which is now pretty much up and running other than the app_config file. It's going to be interesting having one of the most powerful crunching rigs munching away right next to one of the most wimpy Intel CPU based ones. But to answer your question, now that everything is pretty much assembled in its proper place, and the one is up and running, next is to dive back into the RedHat world and slay that dragon. I have the resources available, though they closed my 1st ticket because of inactivity, but I can re-open it if I run into any issues, which I'm sure I won't, as this will all be seamless, right? *rolleyes* lol |
Cruncher-American Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
I am thinking about changing my systems as mentioned above to HT off, and adopting the command line more or less as listed in these posts. Both my machines are dual E5-2680 v2s with 32 GB RAM (1333 MHz) and 2x GTX 1080s (NOT Tis) and are running just south of 70K RAC each at the moment, so I can compare them side by side. I will change one of them today to HT off, and see how much real-world difference it actually makes in RAC after about a week. (A question: is the difference between the 1080s and 1080 Tis such that the command line will have to be changed because of it?) Then I will change the command line and see what effect that has. BTW: given the vagaries of CreditNew, will the fact that CPU WUs will get done faster affect the credit granted by CN (I assume we all think it shouldn't)? IIRC, when I tested HT on/off in the dim, dark past (a couple of years ago), with HT off the individual WUs seemed about 40% faster, and CPU threads were a bit more than 50% fewer (because the GPUs' thread usage stayed the same while the total thread count halved). But I didn't try to track RAC at the time. So I would need somewhat more credit per WU to maintain my RAC in HT off mode. Any info or suggestions would be appreciated... |
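The HT-off trade-off described in this post can be put into rough numbers. What follows is only a back-of-the-envelope sketch in Python: the 40 and 20 thread counts follow from dual E5-2680 v2 CPUs (10 cores and 20 threads each), the reservation of one CPU thread per GPU is an assumption not stated in the post, and both readings of "about 40% faster" are shown because the post does not say which is meant. The function name is hypothetical.

```python
# Rough estimate of CPU-side throughput with HT off, relative to HT on.
# Assumption (not stated in the post): one CPU thread is reserved per GPU.

def relative_cpu_throughput(threads_ht_on: int, threads_ht_off: int,
                            per_task_rate_gain: float) -> float:
    """Return HT-off CPU throughput as a fraction of HT-on throughput.

    per_task_rate_gain is how many times faster one task completes
    with HT off (1.0 means no change).
    """
    return (threads_ht_off / threads_ht_on) * per_task_rate_gain


LOGICAL_THREADS_HT_ON = 40   # 2 CPUs x 10 cores x 2 threads
LOGICAL_THREADS_HT_OFF = 20  # 2 CPUs x 10 cores
GPU_RESERVED_THREADS = 2     # assumption: one thread per GTX 1080

cpu_threads_on = LOGICAL_THREADS_HT_ON - GPU_RESERVED_THREADS
cpu_threads_off = LOGICAL_THREADS_HT_OFF - GPU_RESERVED_THREADS

# Fraction of CPU crunching threads lost when HT is switched off.
thread_drop = 1 - cpu_threads_off / cpu_threads_on

# Reading 1: "40% faster" means each task completes at 1.4x the rate.
rate_reading = relative_cpu_throughput(cpu_threads_on, cpu_threads_off, 1.4)

# Reading 2: "40% faster" means each task's run time is 40% shorter.
time_reading = relative_cpu_throughput(cpu_threads_on, cpu_threads_off,
                                       1 / (1 - 0.4))

print(f"CPU threads lost with HT off: {thread_drop:.1%}")
print(f"Relative CPU throughput (1.4x rate): {rate_reading:.2f}")
print(f"Relative CPU throughput (40% shorter run time): {time_reading:.2f}")
```

Under either reading the CPU side completes fewer tasks per hour with HT off, which is consistent with the concern above about needing more credit per WU to hold RAC. GPU output is not modelled here.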
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I will be surprised if there is any observed change in GPU run time just because you turn off the HT threads. I would think there would be more of an impact on CPU run time, if any. The job of any thread, physical or HT, in supporting a GPU task is nothing more than that of a coal-tender. It just needs to shovel the data in and out of the card. At least for SETI tasks. Einstein actually uses a full CPU core at the very end of a GPU task, from the 90% to the 100% point, when it shifts all the computational work done so far exclusively on the GPU over to the CPU for final verification and formatting. The difference from running on physical cores only will come down to the improvement in efficiency of code execution in the core when one thread doesn't have to share common resources with another competing thread and risk cache misses or translation lookaside buffer dumps. Modern CPUs have been engineered to minimize that penalty to the utmost. My AMD FX processors are an example of how to maximize the HT penalty with its kludged design. It will be interesting to find out how you actually fare. Please post back your test results. I am very curious. With regard to the SoG app tuning, you can be just as aggressive with the 1080 as you would be with a 1080 Ti. The only difference would be the tuning parameter for the AP command line. The -unroll value for the 1080 is only 20 versus 28 for the Ti. |
Cruncher-American Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
Thanks for the info, Keith. I was aware that HT/no HT has minimal effect on the current GPU apps; that's why I am going to do this in the two phases I mentioned. The savings from each phase should be almost independent of each other... One thing I did see right away is that the CPUs are using 25-30 watts less each with HT off, so there's that. |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Makes sense with half the chip turned off. Probably will drop CPU temps too since you won't be spinning processes on the HT cores. |
Cruncher-American Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
Keith Myers wrote: "Makes sense with half the chip turned off. Probably will drop cpu temps too since you won't be spinning processes on the HT cores." Xactly what happened. Luckily, since I started the outage with exactly 100 CPU and 200 GPU WUs in each machine, and they ran everything down to 0, I got some interesting data from the two machines, which I will report a little later today, when my brain is working better... too early here near Boston as yet. |
Marco Vandebergh ( SETI orphan ) Joined: 27 Aug 10 Posts: 39 Credit: 12,630,994 RAC: 9 |
Tested once again, HT vs non-HT on my system. With HT on, a GUPPI WU on the GPU takes about 30 seconds longer than without HT. So the shovel-to-GPU theory is debunked for my system. The CPU is calculating something, and is faster without HT on my system. Good luck. :) |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Marco Vandebergh wrote: "Tested once again, HT vs non-HT on my system." A gain of 30 secs on a host that crunches a WU in around 170 secs is a big gain. Interesting. Could you elaborate on how you ran the test (beyond the HT or no-HT part, which is clear)? Did the WUs tested have similar AR? Can you show us the links for the crunched WUs before they are cleared from the DB? Was anything else running on the host? What was the GPU usage in both cases, etc.? Then we could replicate it to see the results on other host/GPU combinations. Thanks in advance. Edit: I ask because in the past I ran several HT and no-HT tests with several blc WUs and couldn't see any real difference in the GPU WU crunching times. If the difference exists, it is too small to notice. Maybe the Linux builds don't show that big a difference. Who knows? |
kittyman Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
I have not done any Lunatics style testing for many years now. Do we have a test crunching program with some canned WUs to test crunch times when adjusting various settings or application parameters? That is really the best way to determine the best settings under controlled conditions. Meow? "Freedom is just Chaos, with better lighting." Alan Dean Foster |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
I just reran some WUs with and without HT. Actually, no-HT takes a few seconds more, but this was an uncontrolled test, just run on this host, and only with blc16 WUs. Examples: Without HT: 156.52 secs, AR 0.018591 https://setiathome.berkeley.edu/result.php?resultid=6453066265 and 156.56 secs, AR 0.010406 https://setiathome.berkeley.edu/result.php?resultid=6453066797 With HT: 152.45 secs, AR 0.016646 https://setiathome.berkeley.edu/result.php?resultid=6453056594 and 152.43 secs, AR 0.011301 https://setiathome.berkeley.edu/result.php?resultid=6453056653 That of course is just a sample; to be more accurate it is necessary to run a lot more WUs in a controlled test set. But anyway, I do not see a big difference between HT and no-HT crunching times. The 4 secs difference in my uncontrolled test is not significant enough to be considered a real difference, and it points in the opposite direction. That is why I asked for some more info about the test done. He uses a Xeon @2.6 + 1080 Ti with Windows SoG and I use an i7 @3.6 + 1070 with Linux CUDA90. Maybe that is one of the big differences: the 1080 Ti could need more CPU assistance to keep it fed, or this could affect only Windows SoG builds. That is what I am trying to understand. Maybe later I could repeat the test on a Windows host, but the one I have available here has only a 1060. |
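The figures quoted in this post can be summarised with a few lines of Python. This is only a sketch over the four run times listed above, plus the roughly 30-second gap on a roughly 170-second task mentioned earlier in the thread. The variable names are illustrative, and two samples per configuration are far too few to support any statistical conclusion.

```python
from statistics import mean

# Run times in seconds, copied from the post above (blc16 WUs, Linux CUDA90).
no_ht_secs = [156.52, 156.56]
ht_secs = [152.45, 152.43]

no_ht_mean = mean(no_ht_secs)
ht_mean = mean(ht_secs)

# Positive delta means the no-HT runs were slower on this host.
delta = no_ht_mean - ht_mean
delta_pct = 100 * delta / ht_mean

# Gap reported earlier in the thread for the Windows SoG host:
# about 30 seconds on a task of about 170 seconds.
reported_gap_secs = 30.0
reported_task_secs = 170.0
reported_pct = 100 * reported_gap_secs / reported_task_secs

print(f"Mean without HT: {no_ht_mean:.2f} s, with HT: {ht_mean:.2f} s")
print(f"Difference on this host: {delta:.2f} s ({delta_pct:.1f}%)")
print(f"Difference reported for the other host: about {reported_pct:.0f}%")
```

The two hosts differ not only in size of the gap but in its direction, which is why a controlled test set with many more WUs is needed before drawing conclusions.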
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
kittyman wrote: "I have not done any Lunatics style testing for many years now." Yes, the Lunatics site has the test programs under the Test and Tools download directory. You can get the test WUs there too, but they are the old Arecibo work and not very useful for what we are crunching now. I would use the available tools to get current BLC tasks as the test subjects. I've never tried without HT since I only have 4 cores on my first machine builds. Could try it on the Ryzens though. |
kittyman Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
I should be able to just make copies of a few current WUs from my cache and use them in the testing program, yes? |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Yes. But it is handy to have a work unit downloader application for tasks that are already gone from your system and you want to look at them again or share them with others for testing. |