Titan V and GTX1060s

Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1967792 - Posted: 29 Nov 2018, 19:32:06 UTC - in response to Message 1967789.  
Last modified: 29 Nov 2018, 19:33:18 UTC

The Titan X (Maxwell) is slower than a 1080 Ti, but with comparable power consumption, I believe.

While average power consumption may not breach ~170 W, peak can certainly hit the max.

Unrestricted, my 1080 Tis draw fairly low power in the early stages of a WU, but ramp up to max power (~240 W) in the latter half.

Limiting the ceiling still impacts the job even if your average power use is lower: the card is now capped at that 200 W when it hits the latter half of the job, which slows it down just a bit.
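If you want to set or inspect the cap from the command line, nvidia-smi handles it; something like this should work (200 W is just the figure being discussed here, and each card reports its own supported range):

sudo nvidia-smi -pm 1          # persistence mode, so the setting survives the app exiting
nvidia-smi -q -d POWER         # shows current, default, and min/max power limits
sudo nvidia-smi -i 0 -pl 200   # cap GPU 0 at 200 W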

Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1967801 - Posted: 29 Nov 2018, 19:49:32 UTC - in response to Message 1967791.  

I've set the power limit at 200 W for now. I'll see where that gets me. As far as tweaking, I was more referring to the clocks. I might play with the GPU and memory clocks a little to find the sweet spot relative to power consumed (i.e., is the Seti app memory-bound or GPU-bound?).

Definitely looking forward to seeing numbers after a couple of days.

We've had lots of arguments about that in the past. It depends on the project and the applications. My contention is that the Seti special app responds best to memory overclocks. I see the built-in Nvidia GPUBoost 3.0 firmware doing the heavy lifting with respect to the core clocks: if the card has the temperature and TDP headroom, it will boost the core clocks on its own without any intervention on your part. The recent special app has made an effort to reduce the number of memory calls during computation, so the benefit of increasing the memory clocks is smaller than it was with the older CUDA8 zi3v app, for example.

Also, don't forget that Nvidia penalizes the card when the driver detects a compute load: it drops the power state to P2, which automatically inflicts both a core and, mainly, a memory clock reduction. You need to run a memory overclock to get the memory back up to what it would be running at in the normal P0 graphics power state.
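You can watch this happen yourself while a task runs; a generic nvidia-smi query (nothing Seti-specific) will show the P2 state and the reduced clocks:

nvidia-smi --query-gpu=name,pstate,clocks.sm,clocks.mem --format=csv -l 5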

Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1967803 - Posted: 29 Nov 2018, 19:56:50 UTC - in response to Message 1967801.  

+1. You'll see the most change by playing with card clocks, rather than any changes to the Seti app configs.

The app has almost all of the optimization built in for you. You really only have one knob left to turn as far as the app is concerned, and that's whether to run the command line argument "-nobs" or not. It doesn't make a huge difference, but it's noticeable. You're in a good spot either way.
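For anyone wondering where that flag actually lives: it goes in the <cmdline> element of the app_version block in app_info.xml (standard BOINC anonymous-platform syntax; the app name and version number below are placeholders, so match them to your existing file):

<app_version>
    <app_name>setiathome_v8</app_name>      <!-- match your existing entry -->
    <version_num>800</version_num>          <!-- match your existing entry -->
    <cmdline>-nobs</cmdline>                <!-- delete this line to run without it -->
    <!-- file_ref, coproc, etc. stay exactly as shipped -->
</app_version>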

Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1967805 - Posted: 29 Nov 2018, 20:02:53 UTC - in response to Message 1967803.  

+1. You'll see the most change by playing with card clocks, rather than any changes to the Seti app configs.

The app has almost all of the optimization built in for you. You really only have one knob left to turn as far as the app is concerned, and that's whether to run the command line argument "-nobs" or not. It doesn't make a huge difference, but it's noticeable. You're in a good spot either way.

I was running -nobs from the get-go, but I've changed to not using it because the improvement is minimal now with CUDA92 and CUDA10, given the recent change to inline the CUDA libraries into the app itself and the large reduction in calls outside the compute kernel and in memory calls. Also, there is a noticeable reduction in CPU core utilization, which favorably reduces CPU temps.

juan BFP Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1967811 - Posted: 29 Nov 2018, 20:20:43 UTC
Last modified: 29 Nov 2018, 20:39:15 UTC

@TOD

Your times look a lot better now, but still a little high on the Titan V.

Are you running too many CPU tasks at the same time?
Sometimes running fewer CPU tasks helps a lot. In my case, running 4 GPU + 6 CPU tasks produces more than 4 + 8 (my processor has 12 threads). Due to the hot season I've been forced to run even fewer CPU tasks for now (4 GPU + 2 CPU, with hyperthreading turned off). But honestly, the CPU production is very small compared with the GPU production.

Once you're comfortable with the temperature, you could try increasing the GPU and/or memory clocks, or keep the -nobs option turned on to keep your GPUs fed as fast as possible. How much you gain from each change depends on your particular host. From my limited experience, a gain on the order of 2-5% is possible from playing with each parameter.

But keep in mind that more crunching speed normally means a lot more heat and electricity cost. For example, on my host, with -nobs each WU is crunched 2-3 seconds faster, but the machine draws about 50 W more from the outlet and the CPU temperature rises about 5 °C.

As you've already discovered, top crunchers normally produce a lot of heat; that's why most of them use hybrid coolers (as I do) or dedicated water-cooling loops, and some use even more extreme liquid-cooling setups. Those who run on air normally use big fans to move the heat out of the host.

Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1967812 - Posted: 29 Nov 2018, 20:26:21 UTC - in response to Message 1967811.  

Looks like he's running the 415 Nvidia driver. I remember Keith said he saw a slowdown with that driver, but I don't know if he ever quantified by how much.

Tod, you might see some improvement dropping down to the 410 driver.

Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1967814 - Posted: 29 Nov 2018, 20:34:29 UTC - in response to Message 1967812.  
Last modified: 29 Nov 2018, 20:35:00 UTC

Looks like he's running the 415 Nvidia driver. I remember Keith said he saw a slowdown with that driver, but I don't know if he ever quantified by how much.

Tod, you might see some improvement dropping down to the 410 driver.

I quantified a 5% increase in compute times. Also, over in the Nvidia Linux forums there's lots of traffic about the 415 drivers: incompatibilities, slowdowns, etc. So it isn't just us with our Seti apps.

juan BFP Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1967818 - Posted: 29 Nov 2018, 20:41:29 UTC
Last modified: 29 Nov 2018, 20:58:31 UTC

Something interesting: TOD uses Fedora, not Ubuntu. It would be nice to know whether he had to do anything special or different in the installation process.

Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1967819 - Posted: 29 Nov 2018, 20:43:54 UTC - in response to Message 1967818.  
Last modified: 29 Nov 2018, 20:46:06 UTC

He seems OK, just more comfortable with Fedora, I guess. I ran Fedora wayyyy back in like 2005/2006 and it was neat to play with. Ubuntu really is Linux Easy Mode these days with how much adoption it has; that's pretty much the reason I use it, since it's easier to Google a solution when problems come up, lol.

Looks like he got it going with the repository version and just copied the app and app_info into the project folder. It works just fine that way too.

Tod

Joined: 17 Apr 99
Posts: 27
Credit: 143,685,603
RAC: 0
United States
Message 1967840 - Posted: 29 Nov 2018, 22:22:19 UTC - in response to Message 1967819.  

I have been a Fedora/CentOS user for years; old habits die hard. I did actually try to install Ubuntu 18 on this machine. The 'live' desktop wouldn't even boot: lots of PCI errors. I didn't want to spend a lot of time on it when the Fedora installer works on this hardware out of the box without messing around.

Installing the Nvidia driver is also pretty trivial on Fedora; I can just grab any of the drivers from Nvidia's site and use those. I figured 415 would be the best, but I may drop down to 410 if you suggest it.

Mind you, the base hardware isn't the newest: an old Socket 2011 system from around 6-7 years ago, so not a speed demon by any stretch. It's an 8-core i7, and I have it running only 2 cores for Seti.

1. Install Fedora 29.
2. Disable and remove the Nouveau driver.
3. Install the NVIDIA driver and DKMS to automatically rebuild the kernel modules when a new kernel is installed.
4. sudo yum install boinc-client boinc-manager
5. Choose SETI@home as the project, then install the CUDA app and app_info.xml.
6. Done.

The entire process took about an hour. I can certainly help, or document the process, for anyone wanting to use Fedora.
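For anyone following along, steps 2-4 boil down to something like this (a sketch; the exact package names and the .run file name depend on your Fedora release and which driver you download):

# blacklist nouveau and rebuild the initramfs so it stays out of the way
echo "blacklist nouveau" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
echo 'omit_drivers+=" nouveau "' | sudo tee /etc/dracut.conf.d/blacklist-nouveau.conf
sudo dracut --force

# build tools plus DKMS, so the NVIDIA module is rebuilt on kernel updates
sudo dnf install gcc make dkms kernel-devel kernel-headers

# run the installer downloaded from nvidia.com (from a text console, with X stopped)
sudo sh ./NVIDIA-Linux-x86_64-410.78.run --dkms

# BOINC client and manager from the Fedora repos
sudo dnf install boinc-client boinc-manager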

I have uninstalled the Titan Maxwell GPUs; they were just too power-hungry for their results. Currently it's running 1 Titan V and 1 1060. Next week I will pull the 1060 and install a couple of 2080 Tis and see how they perform.

Again, if you think I should downgrade to 410, I'll give that a go tonight.

Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1967844 - Posted: 29 Nov 2018, 22:48:23 UTC - in response to Message 1967840.  

I have uninstalled the Titan Maxwell GPUs; they were just too power-hungry for their results. Currently it's running 1 Titan V and 1 1060. Next week I will pull the 1060 and install a couple of 2080 Tis and see how they perform.

Again, if you think I should downgrade to 410, I'll give that a go tonight.

My tests were with 1070 Tis, 1080s, and 1080 Tis, so the result may have nothing in common with Turing hardware. I will eventually get to retest the 415 drivers with Turing. The card shows up tomorrow, but unless I want to pull a card out of an existing machine to install the air-cooled card, I plan to hold off until all the bits get here. It looks like the custom water-cooling parts haven't even made it out of the Southeast yet after a week; I don't know if weather is holding them up or what.

Tod

Joined: 17 Apr 99
Posts: 27
Credit: 143,685,603
RAC: 0
United States
Message 1967859 - Posted: 29 Nov 2018, 23:28:00 UTC - in response to Message 1967844.  

Also, don't forget that Nvidia penalizes the card when the driver detects a compute load: it drops the power state to P2, which automatically inflicts both a core and, mainly, a memory clock reduction. You need to run a memory overclock to get the memory back up to what it would be running at in the normal P0 graphics power state.

Wow, I had no idea this happened. Thanks for the heads-up. I'll check the clocks later tonight.

Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1967868 - Posted: 29 Nov 2018, 23:56:17 UTC - in response to Message 1967859.  

Also, don't forget that Nvidia penalizes the card when the driver detects a compute load: it drops the power state to P2, which automatically inflicts both a core and, mainly, a memory clock reduction. You need to run a memory overclock to get the memory back up to what it would be running at in the normal P0 graphics power state.

Wow, I had no idea this happened. Thanks for the heads-up. I'll check the clocks later tonight.

Yes, Nvidia penalizes all cards after the Maxwell architecture whenever the driver detects a compute load. We suspect that is to drive compute usage toward their premium compute lines like Quadro and Tesla. This applies to both Windows and Linux environments.

There is an aftermarket Windows tool called Nvidia Profile Inspector that can change the driver parameters to disallow CUDA compute downclocking, but there is no such tool for Linux. So the only option for Linux and Nvidia drivers is to use the nvidia-settings app and apply a memory overclock to raise the memory clock back to at least P0 clocks. It's up to you how far you want to push it, and whether you want to apply even more memory overclock past stock clocks. I have had good experience getting at least +2000 MHz of memory clock on top of stock with water-cooled cards with no problem. It's a little harder to keep errors out when the card is air-cooled, though; I stick with only around +800 MHz for GDDR5 air-cooled cards.
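With nvidia-settings the overclock itself is a one-liner per card. A minimal sketch, assuming Coolbits has already been enabled so the clock controls are exposed (the bracketed number is a performance level, which varies by card, so query GPUPerfModes to see yours):

sudo nvidia-xconfig --cool-bits=28                               # expose clock controls, then restart X
nvidia-settings -q "[gpu:0]/GPUPerfModes"                        # list the card's performance levels
nvidia-settings -a "[gpu:0]/GPUMemoryTransferRateOffset[3]=800"  # +800 MHz on perf level 3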

If you do run a memory overclock, it is VERY recommended to also run Petri's keepP2 utility (see the thread "NVIDIA P0, P2 states and overclocking 1080, 1080Ti and VOLTA in Linux") to keep the card in the P2 state at all times, so that it doesn't temporarily jump to P0 between one task unloading from the card and the next one starting, with the overclock applied on top of the stock P0 clocks. That can crash the app or the host, or corrupt the task.

Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1967880 - Posted: 30 Nov 2018, 0:21:15 UTC - in response to Message 1967840.  

I did actually try to install Ubuntu 18 on this machine. The 'live' desktop wouldn't even boot. Lots of PCI errors.

Do you by chance have an ASUS X79 board running that system?

I ran into this problem previously on my ASUS Z270 boards. It was so bad that, left alone, it spammed PCI errors heavily enough to fill the log files and eventually make the OS unusable once the whole hard drive filled up, LOL. I was managing it by limiting my cards to PCIe Gen2 speeds until I found the problem. I know you don't seem to be affected after switching to Fedora, but just an FYI in case you ever want to try Ubuntu again or see this problem on another system.

Here's what I found:

Some Google-fu found this: What causes this? pcieport 0000:00:03.0: PCIe Bus Error: AER / Bad TLP
I can give at least a few details, even though I cannot fully explain what happens.

As described for example here, the CPU communicates with the PCIe bus controller by transaction layer packets (TLPs). The hardware detects when there are faulty ones, and the Linux kernel reports that as messages.

The kernel option pci=nommconf disables Memory-Mapped PCI Configuration Space, which has been available in Linux since kernel 2.6. Very roughly, all PCI devices have an area that describes the device (which you see with lspci -vv); the original method of accessing this area involves going through I/O ports, while PCIe allows this space to be mapped to memory for simpler access.

That means in this particular case, something goes wrong when the PCIe controller uses this method to access the configuration space of a particular device. It may be a hardware bug in the device, in the PCIe root controller on the motherboard, in the specific interaction of those two, or something else.

By using pci=nommconf, the configuration space of all devices will be accessed in the original way, and changing the access method works around this problem. So if you want, it's both resolving and suppressing it.


I applied the kernel command line option described below to my grub file

1. cp /etc/default/grub ~/Desktop

2. Edit grub: add pci=nommconf to the end of GRUB_CMDLINE_LINUX_DEFAULT. The line will look like this:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=nommconf"

3. sudo cp ~/Desktop/grub /etc/default/

4. sudo update-grub

5. Reboot now


And voilà!
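If you want to double-check that the option stuck after the reboot:

cat /proc/cmdline    # should now include pci=nommconf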

TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1967893 - Posted: 30 Nov 2018, 0:46:57 UTC - in response to Message 1967840.  
Last modified: 30 Nov 2018, 0:47:38 UTC

I have been a Fedora/CentOS user for years; old habits die hard. I did actually try to install Ubuntu 18 on this machine. The 'live' desktop wouldn't even boot: lots of PCI errors. I didn't want to spend a lot of time on it when the Fedora installer works on this hardware out of the box without messing around.

I remember when it was the same with Ubuntu. Unfortunately, it seems major changes were made in Ubuntu 18.04. My first impression is that Ubuntu has blocked the drivers downloaded from nVidia from loading. I decided to try the beta OS a few months earlier than my usual one-year wait period, and the first thing I found is that you can't use the nVidia drivers... and the repository drivers are missing OpenCL. Such a deal. So far I've only been trying Lubuntu 18.04.1, but from the reports I've been hearing I'm assuming Ubuntu will be the same. All the more reason to stay with Ubuntu 16.04; you don't have this problem from 16.04 through 17.10. Well, let's see how the full Ubuntu 18.04 works; the download has finished.

TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1967925 - Posted: 30 Nov 2018, 2:39:01 UTC
Last modified: 30 Nov 2018, 2:54:24 UTC

Well, that about sums it up. Ubuntu 18.04.1 is blocking the nVidia driver downloaded from nVidia from running. The installer installs the files, but they won't run: NVIDIA Settings refuses to launch, and nvidia-smi says the driver isn't loaded, while Additional Drivers says a manually installed driver is in use. All this with the screen stuck at 640 x 480 and unchangeable. In Lubuntu I did finally get the screen to 1920 x 1080, but the driver never loaded. I'm not going to say what I think of an OS that won't allow the vendor's driver to load. Linux becomes more and more like Apple every day, while they are both quite a ways behind M$ at the moment.

At least I can cheat: I have 14.04 installed on another partition. Booting into 14.04 and placing an xorg.conf in X11 brings the resolution to 1920 x 1080 in 18.04, but the driver still won't load. So, why did 18.04 remove the option to install gksu anyway? It seems like just another move to annoy the user, to me.

ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20265
Credit: 7,508,002
RAC: 20
United Kingdom
Message 1967928 - Posted: 30 Nov 2018, 2:53:06 UTC - in response to Message 1967925.  

Is the nVidia driver compatible with your kernel version?
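A few generic checks should tell you whether the module even built and loaded:

dkms status                          # did the nvidia module build for the running kernel?
lsmod | grep nvidia                  # is the module actually loaded?
dmesg | grep -iE 'nvidia|nouveau'    # load errors, or nouveau still claiming the GPU?
cat /proc/driver/nvidia/version      # reports the driver version once it's loaded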

Happy fast crunchin'!
Martin

ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20265
Credit: 7,508,002
RAC: 20
United Kingdom
Message 1967929 - Posted: 30 Nov 2018, 2:57:03 UTC - in response to Message 1967925.  

Note that nVidia is notorious for locking down their GPUs...

I gave up on their silliness and went all AMD. Much easier, and I can compile my own custom kernels without any compatibility problems with the GPU driver...

Happy crunchin',
Martin

TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1967930 - Posted: 30 Nov 2018, 2:57:52 UTC - in response to Message 1967928.  
Last modified: 30 Nov 2018, 3:03:13 UTC

The driver loads just fine if you install it from the PPA; install the one(s) from nVidia and they refuse to run. Pick any of 390, 396, 410, or 415: they all refuse to run if installed by the official nVidia installer.

Of course, the same drivers work just fine on releases before 18.04. How do you think I installed the CUDA 10 Toolkit on 14.04 and built the current CUDA 10 special app? Yes, nVidia has a version of the CUDA 10 Toolkit for 14.04. Look it up: https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1404

ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20265
Credit: 7,508,002
RAC: 20
United Kingdom
Message 1967932 - Posted: 30 Nov 2018, 3:02:37 UTC - in response to Message 1967930.  

The driver loads just fine if you install it from the PPA; install the one(s) from nVidia and they refuse to run. Pick any of 390, 396, 410, or 415: they all refuse to run if installed by the official nVidia installer.

Check the README file(s)?

Or look at what the PPA does differently?

Good luck,
Martin