Getting the most bang for your buck from a GTX 1060

Message boards : Number crunching : Getting the most bang for your buck from a GTX 1060

Profile Zalster Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 27 May 99
Posts: 5516
Credit: 528,817,460
RAC: 242
United States
Message 1878488 - Posted: 16 Jul 2017, 1:56:50 UTC - in response to Message 1878487.  

Have you looked at SIV64X?

http://rh-software.com/

Some of us use that to get an overview of what is going on.
ID: 1878488 · Report as offensive
Profile Tom M
Volunteer tester

Send message
Joined: 28 Nov 02
Posts: 5121
Credit: 276,046,078
RAC: 462
Message 1878492 - Posted: 16 Jul 2017, 2:35:40 UTC - in response to Message 1878356.  

all explanations are there http://lunatics.kwsn.info/index.php/topic,1808.msg60931.html#msg60931


Thank you. I just got done reading the explanation. It makes it much clearer about the interaction between -tt N and -period_iterations_num N. I think adding that URL to the docs would help a number of people be a little less confused.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1878492 · Report as offensive
Profile Tom M
Volunteer tester

Send message
Joined: 28 Nov 02
Posts: 5121
Credit: 276,046,078
RAC: 462
Message 1878493 - Posted: 16 Jul 2017, 3:14:35 UTC - in response to Message 1870623.  

Wiggo,
I was looking at your 5712423 machine. When I took the total gpu tasks listed and divided it into 1440 minutes (24 X 60) I got an "average" of 3.5 minutes per Lunatics SOG task.

Since the "Cuda 80 / Secret sauce / Linux" thread was talking about gpu processing at that kind of speed I am wondering if you have been using WD-40 on those SOG work units to squirt them through or what? ;)

You appear to be using pretty stock i5 boxes ( i5-3570K CPU @ 3.40GHz / i5-2500K CPU @ 3.30GHz) for this. And the gtx 1060 3 GB cards. The parameters for mb*sog.txt are not all that exotic.

What is your secret then? The "south" is faster than the "north"?

Hmmmmm.

Or do cpu's with TSX-NI and/or AVX drive Gtx 1060's that much faster?

Do your motherboards allow you to overclock? Are you running at the 3.80GHz turbo boost all the time?

Thanks for some very impressive numbers to aspire to.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1878493 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13351
Credit: 208,696,464
RAC: 304
Australia
Message 1878495 - Posted: 16 Jul 2017, 3:30:20 UTC - in response to Message 1878493.  

When I took the total gpu tasks listed and divided it into 1440 minutes (24 X 60) I got an "average" of 3.5 minutes per Lunatics SOG task.

You need to look at actual run times.
10 min 21 sec blc05_2bit_guppi_57835_09875_HIP40693_0036.28456.0.23.46.208.vlar_0
10 min 32 sec blc05_2bit_guppi_57835_05639_HIP39896_0024.2021.0.24.47.82.vlar_2
6 min 23 sec 02mr08ad.2865.18068.5.32.74_1
4 min 15 sec 12mr08ah.27375.4571.3.30.62_1
Grant
Darwin NT
ID: 1878495 · Report as offensive
Iona
Avatar

Send message
Joined: 12 Jul 07
Posts: 790
Credit: 22,438,118
RAC: 0
United Kingdom
Message 1878504 - Posted: 16 Jul 2017, 6:12:44 UTC - in response to Message 1878495.  

They're similar times to what I get with my 970. In fact, the APR of those 1060s is frequently lower, in spite of the higher 'clock' of the 1060. There is definitely still life in the 970! True, it may not be as cheap to run, but it does mean I'm not 'pensioning off' a very capable piece of kit. I've kept my 3570K back to 3.6 GHz max (in the BIOS) since the higher ambient temperatures made themselves apparent last month. The CPU cooler might be able to deal with the extra heat, but I didn't leave the factory with one fitted!
Don't take life too seriously, as you'll never come out of it alive!
ID: 1878504 · Report as offensive
Profile Tom M
Volunteer tester

Send message
Joined: 28 Nov 02
Posts: 5121
Credit: 276,046,078
RAC: 462
Message 1878513 - Posted: 16 Jul 2017, 8:18:18 UTC - in response to Message 1878495.  
Last modified: 16 Jul 2017, 8:18:35 UTC

You need to look at actual run times.


Yes, but I am more interested in the total number of gpu SOG work units which is what I was pointing to. He had about 411 tasks listed for the combined 1060's gpus on one machine at that time.
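
A minimal sketch of the division being discussed: 411 tasks in a 1440-minute day works out to about 3.5 minutes per task, but that is throughput for the whole machine, not the run time of any single task. The function name and the `gpus` and `tasks_per_gpu` parameters are made up for illustration, and the two-card case is only an assumption, since the thread does not say how many cards or concurrent tasks that host runs.

```python
MINUTES_PER_DAY = 24 * 60  # 1440


def minutes_per_task(tasks_per_day, gpus=1, tasks_per_gpu=1):
    """Average wall-clock minutes that one task occupies one GPU slot.

    gpus and tasks_per_gpu are illustrative parameters: with more than
    one slot working in parallel, each task gets proportionally longer.
    """
    slots = gpus * tasks_per_gpu
    return MINUTES_PER_DAY * slots / tasks_per_day


# Whole-machine figure, as computed in the post above
print(round(minutes_per_task(411), 1))

# Same daily count, assuming (hypothetically) two cards running one task each
print(round(minutes_per_task(411, gpus=2), 1))
```

With two slots assumed, the same 411 tasks per day imply roughly 7 minutes per task, which is why the daily count alone cannot be compared with another host's run times.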

I just looked at mine and the seti website is claiming I have 234 for a lunatics setup. So I am just a little green with envy. Or more exactly I am wondering how to boost my task count higher.... I thought I had already set it to as fast as was possible.

Perhaps the 2nd most confusing thing is "when" is the daily count reset? At this moment while I have 234, that same machine of Wiggo's is showing 19?

Tom
A proud member of the OFA (Old Farts Association).
ID: 1878513 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13351
Credit: 208,696,464
RAC: 304
Australia
Message 1878516 - Posted: 16 Jul 2017, 8:51:49 UTC - in response to Message 1878513.  
Last modified: 16 Jul 2017, 9:10:38 UTC

He had about 411 tasks listed for the combined 1060's gpus on one machine at that time.

?
Under Application details?
About the only numbers there of any use are the APR- and that's only when running 1 WU at a time. Running 2 at a time might produce more work per hour (eg for CUDA50), but the APR is much, much lower than when processing only 1 WU at a time.
The other is the Average turnaround time, and that is only of any real meaning if Seti is the only project they run, and they run it 24/7 (and with slower machines don't run a multi-day cache).

The only values that are of any real use are the crunching times for each valid WU when running 1 WU at a time, or the number of valid WUs per hour if running more than 1 WU at a time.

EDIT- added the valids.
Grant
Darwin NT
ID: 1878516 · Report as offensive
Profile Tom M
Volunteer tester

Send message
Joined: 28 Nov 02
Posts: 5121
Credit: 276,046,078
RAC: 462
Message 1878608 - Posted: 16 Jul 2017, 22:08:24 UTC - in response to Message 1878516.  

The only values that are of any real use are the crunching times for each valid WU when running 1 WU at a time, or the number of valid WUs per hour if running more than 1 WU at a time.


And is there a summary total of those anywhere? I really don't like hand counting stuff that a computer should be able to count.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1878608 · Report as offensive
Profile Wiggo
Avatar

Send message
Joined: 24 Jan 00
Posts: 26915
Credit: 261,360,520
RAC: 489
Australia
Message 1878609 - Posted: 16 Jul 2017, 23:02:01 UTC - in response to Message 1878493.  

Wiggo,
I was looking at your 5712423 machine. When I took the total gpu tasks listed and divided it into 1440 minutes (24 X 60) I got an "average" of 3.5 minutes per Lunatics SOG task.

Since the "Cuda 80 / Secret sauce / Linux" thread was talking about gpu processing at that kind of speed I am wondering if you have been using WD-40 on those SOG work units to squirt them through or what? ;)

You appear to be using pretty stock i5 boxes ( i5-3570K CPU @ 3.40GHz / i5-2500K CPU @ 3.30GHz) for this. And the gtx 1060 3 GB cards. The parameters for mb*sog.txt are not all that exotic.

What is your secret then? The "south" is faster than the "north"?

Hmmmmm.

Or do cpu's with TSX-NI and/or AVX drive Gtx 1060's that much faster?

Do your motherboards allow you to overclock? Are you running at the 3.80GHz turbo boost all the time?

Thanks for some very impressive numbers to aspire to.

Tom

Tom, both i5 rigs have Speed Step turned off (there's nothing worse than having a CPU revving up and down while processing video) so both rigs are locked at a constant 3.4GHz (the Z68 chipset runs the 2500K at that speed when Speed Step is turned off). They can be overclocked, but they do well enough ATM without the need to. Memory speed is the only difference between them as the 3570K runs at a higher frequency. ;-)

Your average turnaround time (this is the number to watch) of 0.45 is slowly getting closer to my 0.42, but then my 1060's aren't carrying that extra 3GB of weight memory about with them either. :-)

BTW, 3.5mins is the time it takes to do an Arecibo VHAR and I had a lot of them here last week.

I think that should answer your questions.

Cheers.
ID: 1878609 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 27 May 99
Posts: 5516
Credit: 528,817,460
RAC: 242
United States
Message 1878610 - Posted: 16 Jul 2017, 23:03:40 UTC - in response to Message 1878608.  

That is going to be up to the end user to figure out for himself. It's going to vary based on your particular system. You will need to find work units with similar AR and get a consensus on what the average time is for each. Most crunchers get a general idea of how long it takes to crunch certain work units after a while of looking at their results. Others like to keep a record of returned work units and look through the results. Some of us use BoincTasks. It keeps a record of all reported work units. Eyeballing that can give you a general idea of how long your work units are taking to process.
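
For anyone who would rather not eyeball it, here is a rough sketch of that grouping and averaging. It uses the four run times Grant listed earlier in the thread. The real AR of each task is not in those names, so the `bucket` function is a hypothetical stand-in that only separates tasks with `.vlar` in the name from everything else; with a real log you would bucket on the AR value instead.

```python
from collections import defaultdict
from statistics import mean

# (task name, run time in seconds), copied from Grant's list earlier in the thread
RUNS = [
    ("blc05_2bit_guppi_57835_09875_HIP40693_0036.28456.0.23.46.208.vlar_0", 10 * 60 + 21),
    ("blc05_2bit_guppi_57835_05639_HIP39896_0024.2021.0.24.47.82.vlar_2", 10 * 60 + 32),
    ("02mr08ad.2865.18068.5.32.74_1", 6 * 60 + 23),
    ("12mr08ah.27375.4571.3.30.62_1", 4 * 60 + 15),
]


def bucket(name):
    """Hypothetical grouping key; a crude substitute for grouping by AR."""
    return "vlar" if ".vlar" in name else "other"


def average_minutes(runs):
    """Return {bucket: mean run time in minutes} for the given runs."""
    groups = defaultdict(list)
    for name, seconds in runs:
        groups[bucket(name)].append(seconds)
    return {kind: mean(times) / 60 for kind, times in groups.items()}


for kind, minutes in sorted(average_minutes(RUNS).items()):
    print(f"{kind}: {minutes:.1f} min average")
```

Four tasks is far too small a sample to mean much; the point is only that averages are worth comparing within a group of similar work units, not across all of them.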
ID: 1878610 · Report as offensive
Profile Tom M
Volunteer tester

Send message
Joined: 28 Nov 02
Posts: 5121
Credit: 276,046,078
RAC: 462
Message 1878685 - Posted: 17 Jul 2017, 4:53:38 UTC - in response to Message 1878609.  

Your average turnaround time (this is the number to watch) of 0.45 is slowly getting closer to my 0.42, but then my 1060's aren't carrying that extra 3GB of weight memory about with them either. :-)

BTW, 3.5mins is the time it takes to do an Arecibo VHAR and I had a lot of them here last week.

I think that should answer your questions.

Cheers.


I have one 6GB and one 3GB and they do seem to run a little bit different on the same parameters.

Thank you,
Tom
A proud member of the OFA (Old Farts Association).
ID: 1878685 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Avatar

Send message
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1878749 - Posted: 17 Jul 2017, 15:52:07 UTC - in response to Message 1878609.  

...Your average turnaround time (this is the number to watch) of 0.45 is slowly getting closer to my 0.42, but then my 1060's aren't carrying that extra 3GB of weight memory about with them either. :-)

Cheers.

So, 6GB is worse than 3GB for these cards? If running Lunatics, doesn't that allow you to run more than 1 task at a time, or is the rest of the card not up to it so it's irrelevant?

ID: 1878749 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13141
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1878765 - Posted: 17 Jul 2017, 17:06:22 UTC - in response to Message 1878749.  

I should think the extra 3GB of memory has zero effect on Lunatics performance since you can only use 1563 MB of memory anyway because of OpenCL limitations. The extra 128 CUDA cores of the 6GB card have the more direct effect.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1878765 · Report as offensive
Profile Tom M
Volunteer tester

Send message
Joined: 28 Nov 02
Posts: 5121
Credit: 276,046,078
RAC: 462
Message 1878872 - Posted: 18 Jul 2017, 2:42:24 UTC - in response to Message 1878765.  

I should think the extra 3GB of memory has zero effect on Lunatics performance since you can only use 1563 MB of memory anyway because of OpenCL limitations. The extra 128 CUDA cores of the 6GB card have the more direct effect.


I agree with the "should" but the Gpu list actually shows the 3GB card indexing higher than the 6GB card which was why I bought the 3GB card originally. If it hadn't been short enough to fit into one of my other machines I probably would have not bought another one.

And there does seem to be some difference in how they perform/work.

The confounding influences include running two different brands, two different form factors (full size 3GB vs. compact, single fan, 6GB) and probably slightly different bios versions.

Gpu-Z reports differences in memory usage, gpu loads etc under the same mb*sog.txt parameters. It's not enough to make me grumpy all the time but when one card shows itself at 95+% most of the time and the other one regularly dips below 89% it makes you wonder just what IS the difference...

As was pointed out, it's the actual production of tasks/wu's not how Gpu-Z reports this/that/or the other thing that counts. That is why I haven't been posting about those differences.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1878872 · Report as offensive
Profile Tom M
Volunteer tester

Send message
Joined: 28 Nov 02
Posts: 5121
Credit: 276,046,078
RAC: 462
Message 1881588 - Posted: 2 Aug 2017, 3:11:56 UTC

I have just started another experiment with a new-to-me (used) Gtx 1060 3GB "mini" card. I am running it as a "gpu only" task on my Dell OptiPlex 7010. I want to see what kind of RAC it supports after all the Seti cpu tasks have drained out. And it has to transfer its baseline from a gtx 750ti to the 1060. It's running Lunatics beta6.

So far, it looks like the "period" stuff is going to end up below 4. It was a tiny bit "laggy" at both 1 and 4 so I am trying it at 10 now. The time between when I type a key and it displays on the monitor has resumed being "instant".

I am also going to try to run my new-to-me (made from a used MB/Cpus but a new case) dual E5-2670 with 1 Gtx 1060 3GB as stock Seti to see how high it stabilizes. At least 1 other Setizen's similar system is running in the 30,000's RAC with no gpu and 1 non-Seti task. I want to see if I can replicate that.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1881588 · Report as offensive
Profile Tom M
Volunteer tester

Send message
Joined: 28 Nov 02
Posts: 5121
Credit: 276,046,078
RAC: 462
Message 1884434 - Posted: 17 Aug 2017, 11:49:50 UTC - in response to Message 1881588.  

I have just started another experiment with a new-to-me (used) Gtx 1060 3GB "mini" card. I am running it as a "gpu only" task on my Dell OptiPlex 7010. I want to see what kind of RAC it supports after all the Seti cpu tasks have drained out.


The RAC just went "flat" (instead of upwards) at 16,976.85 (the graph basically says about 17,000).

If it stays that way for a couple of days (or wobbles around that result) I am going to claim that for the Lunatics SOG app (whichever one I happen to be running off the beta6 installer) a Gtx 1060 3GB video card is "worth" about 17,000 credits for the RAC.

Tom


The 2nd quote is from my Dell 7010 experiment thread but I wanted it documented over here too.
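
On how long a RAC takes to go "flat": a simplified sketch, assuming RAC behaves as an exponentially weighted average with a one-week half-life (my understanding of how BOINC computes it) and that the card earns constant credit every day. The function names are made up for illustration, and real RAC is updated in steps as credit is granted, so treat the numbers as rough.

```python
RAC_HALF_LIFE_DAYS = 7.0  # assumption: one-week half-life for recent average credit


def gap_closed(days, half_life=RAC_HALF_LIFE_DAYS):
    """Fraction of the distance from the starting RAC to the steady RAC covered."""
    return 1.0 - 0.5 ** (days / half_life)


def rac_after(days, start_rac, steady_rac, half_life=RAC_HALF_LIFE_DAYS):
    """Idealised RAC after `days` of constant daily credit."""
    return start_rac + (steady_rac - start_rac) * gap_closed(days, half_life)


for days in (7, 15, 30):
    print(f"after {days:2d} days: {gap_closed(days):.0%} of the way to the steady value")
```

Under those assumptions, the 15 days between the two posts above would close a bit over three quarters of the gap between the old baseline and the new steady value, so a graph that looks flat at that point may still creep for a few more weeks.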

Tom
A proud member of the OFA (Old Farts Association).
ID: 1884434 · Report as offensive
Profile Tom M
Volunteer tester

Send message
Joined: 28 Nov 02
Posts: 5121
Credit: 276,046,078
RAC: 462
Message 1884514 - Posted: 17 Aug 2017, 17:48:13 UTC

I ran across a mention of power state 0 vs 2 in another thread. (presumably 0 is faster) That got me to wondering if that is available on the 1060 and how you would go about setting it under Windows?

Tom
A proud member of the OFA (Old Farts Association).
ID: 1884514 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 27 May 99
Posts: 5516
Credit: 528,817,460
RAC: 242
United States
Message 1884525 - Posted: 17 Aug 2017, 18:44:42 UTC - in response to Message 1884514.  

P0 state is a default power and memory setting from Nvidia for computational work on the GPU.

P2 state is the normal gaming setting, where memory and clock speeds are maximized.

Since errors in memory aren't as important in games as they are in computational work, the memory clock is higher for the P2 state. When the system determines that computational work is being done, it lowers the state to P0 so that errors related to memory are diminished.

You can override these settings in 2 places, in Precision X and in Nvidia Inspector, by changing the memory speed. But the caveat is that the error rate may go up and the GPU can be (and usually is) unstable, leading to lock-ups, crashes, etc. Use at your own risk.
ID: 1884525 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13141
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1884539 - Posted: 17 Aug 2017, 19:33:49 UTC - in response to Message 1884525.  

P0 state is a default power and memory setting from Nvidia for computational work on the GPU.

P2 state is the normal gaming setting, where memory and clock speeds are maximized.

Since errors in memory aren't as important in games as they are in computational work, the memory clock is higher for the P2 state. When the system determines that computational work is being done, it lowers the state to P0 so that errors related to memory are diminished.

You can override these settings in 2 places, in Precision X and in Nvidia Inspector, by changing the memory speed. But the caveat is that the error rate may go up and the GPU can be (and usually is) unstable, leading to lock-ups, crashes, etc. Use at your own risk.

Z, that is a nope. You have the states reversed. P0 state is the highest state for memory and core clock and is the default for running games. It is only when you are doing distributed computing that the cards are lowered to P2 state. There are even lower power states that the card is downclocked to when it is idle where P8 is the lowest. It is easy to verify by looking at GPU Info in SIV which lists all the memory and core clocks for each power state in a table.
GPU Info
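
If SIV is not installed, another way to see which P-state a card is sitting in while crunching is the `nvidia-smi` tool that ships with the Nvidia driver. The sketch below assumes `nvidia-smi` is on the PATH and that the driver reports these fields for the card (some GeForce cards return N/A for some of them). The helper names and the sample line are made up for illustration; the sample is not real output from a 1060.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi: card name, current P-state, core and memory clocks
QUERY = "name,pstate,clocks.sm,clocks.mem"


def parse_line(line):
    """Split one CSV line of nvidia-smi output into a small dict."""
    name, pstate, core, mem = [field.strip() for field in line.rsplit(",", 3)]
    return {"name": name, "pstate": pstate, "core_clock": core, "mem_clock": mem}


def read_pstates():
    """Return one dict per GPU, or an empty list if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []
    return [parse_line(line) for line in result.stdout.splitlines() if line.strip()]


# Made-up sample line, only to show the expected shape of the output
SAMPLE = "GeForce GTX 1060 3GB, P2, 1900 MHz, 3800 MHz"
print(parse_line(SAMPLE))

for gpu in read_pstates():
    print(gpu)
```

Run it once with a task crunching and once with the GPU idle to see the state change for yourself.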
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1884539 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 27 May 99
Posts: 5516
Credit: 528,817,460
RAC: 242
United States
Message 1884546 - Posted: 17 Aug 2017, 19:53:02 UTC - in response to Message 1884539.  

Yeah, I thought it might be reverse..lol I get those dang things confused
ID: 1884546 · Report as offensive


 