Best tuning for 1080ti and Process Lasso use.

Message boards : Number crunching : Best tuning for 1080ti and Process Lasso use.

AuthorMessage
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 4473
Credit: 278,039,852
RAC: 631,339
United States
Message 1921246 - Posted: 26 Feb 2018, 2:22:29 UTC

@Al Hey Al, off-topic for this thread but . . . . how goes the RedHat experience in the new server? You haven't posted to your thread in a while?
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 1921246
Profile Wiggo "Socialist"
Joined: 24 Jan 00
Posts: 14226
Credit: 185,236,790
RAC: 85,447
Australia
Message 1921257 - Posted: 26 Feb 2018, 3:42:02 UTC - in response to Message 1921230.  

Huh, just went to add those commands, and did a search for the app_config.xml, because I couldn't seem to find it, and for good reason. It wasn't there. Shouldn't that have been created when the Lunatics installer was run?

The installer creates an app_info.xml and nothing else; I use the app_info.xml the way others with more modern BOINC versions use an app_config.xml. ;-)
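For anyone who needs to create the missing app_config.xml by hand, here is a minimal sketch in Python that builds one. The tag names (app_config, app, name, gpu_versions, gpu_usage, cpu_usage) follow the BOINC client configuration format as I understand it; the app name "setiathome_v8" and the 1.0 usage values are placeholder assumptions, not settings recommended anywhere in this thread, and build_app_config is just a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, gpu_usage, cpu_usage):
    """Return the text of a minimal app_config.xml for one application."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    # Fraction of a GPU and of a CPU core budgeted per GPU task.
    ET.SubElement(gpu, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(gpu, "cpu_usage").text = str(cpu_usage)
    return ET.tostring(root, encoding="unicode")


# Assumed app name and placeholder values; adjust to your own setup.
xml_text = build_app_config("setiathome_v8", 1.0, 1.0)
print(xml_text)
```

The printed text would be saved as app_config.xml in the project's directory under the BOINC data folder, and the client told to re-read its config files. Check the BOINC documentation for the exact location on your system.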

Cheers.
ID: 1921257
Al Special Project $250 donor
Joined: 3 Apr 99
Posts: 1618
Credit: 342,831,087
RAC: 270,034
United States
Message 1921321 - Posted: 26 Feb 2018, 14:26:23 UTC - in response to Message 1921246.  

@Al Hey Al, off-topic for this thread but . . . . how goes the RedHat experience in the new server? You haven't posted to your thread in a while?

Keith, those 2 are actually sitting right next to each other in the same rack setup. I had to get a few bits and pieces together to be able to do that, and wanted to get the 'easy' one out of the way first, which is now pretty much up and running other than the app_config file.

It's going to be interesting having one of the most powerful crunching rigs munching away right next to one of the most wimpy Intel CPU based ones. But to answer your question, now that everything is pretty much assembled in its proper place, and the one is up and running, next is to dive back into the RedHat world and slay that dragon. I have the resources available, though they closed my 1st ticket because of inactivity, but can re-open it if I run into any issues, which I'm sure I won't, as this will all be seamless, right? *rolleyes* lol

ID: 1921321
Cruncher-American Special Project $75 donor

Joined: 25 Mar 02
Posts: 1448
Credit: 265,369,882
RAC: 151,315
United States
Message 1921386 - Posted: 26 Feb 2018, 19:06:20 UTC

I am thinking about changing my systems as mentioned above to HT off, and adopting the command line more or less as listed in these posts.

Both my machines are dual E5-2680 v2s with 32 GB RAM (1333 MHz) and 2x GTX 1080s (NOT Tis), and are running just south of 70K RAC each at the moment, so I can compare their results side by side.

I will change one of them today to HT off, and see how much real-world difference it actually makes in RAC after about a week.

(A question: are the 1080s and 1080 Tis different enough that the command line will have to be changed?)

Then I will change the command line and see what effect that has.

BTW: given the vagaries of Credit New, will the fact that CPU WUs will get done faster affect the credit granted by CN (I assume we all think it shouldn't)?

IIRC, when I tested HT on/off in the dim, dark past (a couple of years ago), with HT off the individual WUs seemed about 40% faster, and there were a bit more than 50% fewer CPU threads available (because the GPUs used the same number of threads out of half the total). But I didn't try to track RAC at the time.

So I would need somewhat more credit per WU to maintain my RAC in HT off mode.
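A rough way to put numbers on that, as a sketch only: the 20 physical cores follow from the dual E5-2680 v2s (10 cores each), but the 2 threads reserved for the GPUs is an assumption for illustration, and "40% faster" is ambiguous, so both readings are computed. This covers the CPU side only; GPU output is taken as unchanged.

```python
def cpu_throughput_ratio(cores, gpu_threads, per_wu_speedup):
    """CPU WU throughput with HT off, relative to HT on."""
    threads_ht_on = 2 * cores - gpu_threads   # logical threads left for CPU WUs
    threads_ht_off = cores - gpu_threads      # physical cores left for CPU WUs
    return (threads_ht_off / threads_ht_on) * per_wu_speedup


CORES = 20        # dual E5-2680 v2, 10 cores per chip
GPU_THREADS = 2   # assumption: one thread reserved per GPU

# Reading 1: each thread completes 1.4x as many WUs per hour.
rate_reading = cpu_throughput_ratio(CORES, GPU_THREADS, 1.4)
# Reading 2: each WU takes 40% less time, i.e. 1/0.6 times the rate.
time_reading = cpu_throughput_ratio(CORES, GPU_THREADS, 1 / 0.6)

for label, ratio in (("rate x1.4", rate_reading), ("time x0.6", time_reading)):
    print(f"{label}: CPU throughput x{ratio:.2f}, "
          f"credit per WU needed to break even x{1 / ratio:.2f}")
```

Under either reading the CPU side finishes fewer WUs per day with HT off, which is why the credit per WU would have to rise to hold the CPU share of RAC steady.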

Any info or suggestions would be appreciated...
ID: 1921386
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 4473
Credit: 278,039,852
RAC: 631,339
United States
Message 1921394 - Posted: 26 Feb 2018, 19:31:01 UTC - in response to Message 1921386.  

I will be surprised if there is any observed change in gpu run time just because you turn off the HT threads. I would think there would be more of an impact on cpu run time, if any.

The job of any thread, physical or HT, in supporting a gpu task is nothing more than that of a coal-tender. It just needs to shovel the data in and out of the card. At least for SETI tasks. Einstein actually uses a full cpu core at the very end of a gpu task's completion, from the 90% to 100% point, when it shifts all the computational work done so far exclusively on the gpu over to the cpu for final verification and formatting.

The difference from running on physical cores only will come down to the improvement in efficiency of code execution in the core when one thread doesn't have to share common resources with another competing thread, with the possible cache misses or translation lookaside buffer flushes that sharing brings. Modern cpus have been engineered to minimize that penalty to the utmost. My AMD FX processors, with their kludged design, are an example of how to maximize the HT penalty. It will be interesting to find out how you actually fare. Please post back your test results. I am very curious.

With regard to the SoG app tuning, you can be just as aggressive with the 1080 as you would be with a 1080Ti. The only difference would be the tuning parameter for the AP command line. The -unroll value for the 1080 is only 20, versus 28 for the Ti.
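As a small illustration of keeping that one per-card difference in one place, here is a sketch in Python. Only the two -unroll values come from this post; the lookup table, the helper name ap_unroll_option, and the exact "-unroll N" spelling of the option are assumptions. The values happen to match the number of SMs on each card (20 on the 1080, 28 on the 1080 Ti), which may be where they come from.

```python
# -unroll values quoted above for the AP command line, keyed by card.
UNROLL_BY_GPU = {
    "GTX 1080": 20,
    "GTX 1080 Ti": 28,
}


def ap_unroll_option(gpu_model):
    """Return the -unroll fragment of an AP command line for a known card."""
    return f"-unroll {UNROLL_BY_GPU[gpu_model]}"


for model in UNROLL_BY_GPU:
    print(model, "->", ap_unroll_option(model))
```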
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 1921394
Cruncher-American Special Project $75 donor

Joined: 25 Mar 02
Posts: 1448
Credit: 265,369,882
RAC: 151,315
United States
Message 1921442 - Posted: 26 Feb 2018, 22:17:01 UTC - in response to Message 1921394.  

Thanks for the info, Keith.
I was aware that HT/no HT has minimal effect on the current GPU apps; that's why I am going to do this in the two phases I mentioned. The savings from each phase should be almost independent...

One thing I did see right away is that the CPUs are using 25-30 watts less each with HT off, so there's that.
ID: 1921442
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 4473
Credit: 278,039,852
RAC: 631,339
United States
Message 1921446 - Posted: 26 Feb 2018, 22:22:25 UTC - in response to Message 1921442.  

Makes sense with half the chip turned off. Probably will drop cpu temps too since you won't be spinning processes on the HT cores.
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 1921446
Cruncher-American Special Project $75 donor

Joined: 25 Mar 02
Posts: 1448
Credit: 265,369,882
RAC: 151,315
United States
Message 1921626 - Posted: 28 Feb 2018, 12:05:11 UTC - in response to Message 1921446.  

Makes sense with half the chip turned off. Probably will drop cpu temps too since you won't be spinning processes on the HT cores.


Xactly what happened.

Luckily, since I started the outage with exactly 100 CPU and 200 GPU WUs in each machine, and they ran everything down to 0, I got some interesting data from the two machines, which I will report a little later today, when my brain is working better...too early here near Boston as yet.
ID: 1921626
Marco Vandebergh

Joined: 27 Aug 10
Posts: 26
Credit: 8,017,792
RAC: 2,556
Netherlands
Message 1922421 - Posted: 3 Mar 2018, 14:05:18 UTC

Tested once again, HT vs non-HT on my system.

With HT on, a GUPPI WU on the GPU takes about 30 seconds longer than without HT.

So, the shovel to GPU theory is debunked for my system. CPU is calculating something, and is faster without HT on my system.



Good luck. :)
ID: 1922421
juan BFP Special Project $75 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 6878
Credit: 386,068,821
RAC: 162,539
Panama
Message 1922424 - Posted: 3 Mar 2018, 14:19:10 UTC - in response to Message 1922421.  
Last modified: 3 Mar 2018, 15:16:04 UTC

Tested once again, HT vs non-HT on my system.

With HT on, a GUPPI WU on the GPU takes about 30 seconds longer than without HT.

So, the shovel to GPU theory is debunked for my system. CPU is calculating something, and is faster without HT on my system.

A gain of 30 secs on a host that crunches a WU in around 170 secs is a big gain. Interesting.
Could you elaborate on how you ran the test (not the HT or no-HT part, that is clear)?
Did the WUs tested have similar AR? Can you show us the links for the crunched WUs before they are cleared from the DB? Was anything else running on the host? What was the GPU usage in both cases, etc.?
That way we could replicate the test and see the results on other host/GPU combinations.

Thanks in advance.

<edit>I ask because in the past I ran several HT and no-HT tests with several blc WUs and couldn't see any real difference in the GPU WU crunching times. If a difference exists, it is too small to notice. Maybe the Linux builds don't show such a big difference. Who knows?
ID: 1922424
kittyman Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 49845
Credit: 914,363,746
RAC: 161,185
United States
Message 1922434 - Posted: 3 Mar 2018, 14:51:41 UTC

I have not done any Lunatics style testing for many years now.
Do we have a test crunching program with some canned WUs to test crunch times when adjusting various settings or application parameters?
That is really the best way to determine best settings under controlled conditions.

Meow?
What meowing lurks in the hearts of man? The kittyman knows....MEOWhahahahahahha!

Have made friends here.
Most were cats.
ID: 1922434
juan BFP Special Project $75 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 6878
Credit: 386,068,821
RAC: 162,539
Panama
Message 1922442 - Posted: 3 Mar 2018, 15:05:26 UTC
Last modified: 3 Mar 2018, 15:26:34 UTC

I just reran some WUs with and without HT. Actually the no-HT runs take a few seconds more, but this was an uncontrolled test, just put to run on this host, and only with blc16 WUs. Examples:

With No-HT
156.52 secs AR 0.018591 https://setiathome.berkeley.edu/result.php?resultid=6453066265
156.56 secs AR 0.010406 https://setiathome.berkeley.edu/result.php?resultid=6453066797

With HT
152.45 secs AR 0.016646 https://setiathome.berkeley.edu/result.php?resultid=6453056594
152.43 secs AR 0.011301 https://setiathome.berkeley.edu/result.php?resultid=6453056653

That of course is just a sample; to be more accurate it is necessary to run a lot more WUs in a controlled test set.
But anyway I do not see a big difference between HT and no-HT crunching times. The 4 secs difference in my uncontrolled test is too small to be considered a real difference, and it points in the opposite direction.
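To make the comparison explicit, here is a short Python sketch using only the four run times listed above plus the roughly 30 secs on roughly 170 secs reported earlier in the thread. Two samples per case is far too few for any statistics, so this only shows the size and direction of the gap, nothing more.

```python
from statistics import mean

# Run times in seconds from the four results listed above.
no_ht = [156.52, 156.56]
ht = [152.45, 152.43]

diff = mean(no_ht) - mean(ht)   # positive means no-HT was slower here
rel = diff / mean(ht)

# Gap reported earlier in the thread: about 30 secs on about 170 secs,
# in the opposite direction (HT on slower).
reported_rel = 30 / 170

print(f"no-HT mean {mean(no_ht):.2f} s, HT mean {mean(ht):.2f} s")
print(f"no-HT slower by {diff:.2f} s ({rel:.1%})")
print(f"reported gap elsewhere: {reported_rel:.1%} the other way")
```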

That is why I asked for some more info about the test done.
He uses a Xeon @2.6 + 1080Ti on Windows SoG and I use an i7 @3.6 + 1070 on Linux CUDA90. Maybe that is one of the big differences: the 1080Ti could need more CPU assistance to keep it fed, or the effect could be limited to Windows SoG builds. That is what I am trying to understand.

Maybe later I could repeat the test on a Windows host, but the one I have available here has only a 1060.
ID: 1922442
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 4473
Credit: 278,039,852
RAC: 631,339
United States
Message 1922453 - Posted: 3 Mar 2018, 15:58:17 UTC - in response to Message 1922434.  

I have not done any Lunatics style testing for many years now.
Do we have a test crunching program with some canned WUs to test crunch times when adjusting various settings or application parameters?
That is really the best way to determine best settings under controlled conditions.

Meow?

Yes, the Lunatics site has the test programs under the Test and Tools download directory. You can get the test WUs there too, but they are the old Arecibo work and not very useful for what we are crunching now. I would use the available tools to get current BLC tasks as the test subjects.

I've never tried without HT since I only have 4 cores on my first machine builds. Could try it on the Ryzens though.
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 1922453
kittyman Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 49845
Credit: 914,363,746
RAC: 161,185
United States
Message 1922472 - Posted: 3 Mar 2018, 17:22:27 UTC - in response to Message 1922453.  

I should be able to just make copies of a few current WUs from my cache and use them in the testing program, yes?
What meowing lurks in the hearts of man? The kittyman knows....MEOWhahahahahahha!

Have made friends here.
Most were cats.
ID: 1922472
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 4473
Credit: 278,039,852
RAC: 631,339
United States
Message 1922473 - Posted: 3 Mar 2018, 17:29:13 UTC - in response to Message 1922472.  

Yes. But it is handy to have a work unit downloader application for tasks that are already gone from your system and you want to look at them again or share them with others for testing.
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 1922473


 
©2018 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.