LotzaCores and a GTX 1080 FTW

Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1793935 - Posted: 6 Jun 2016, 11:35:20 UTC - in response to Message 1793926.  


(yes, the default Event Log display limit is 2,000 lines, and 0 removes the limit - it just falls over when you run out of memory)
(the value of N in the suggestion above is bytes, not MB)

Oooh.. Didn't know either of those things. Hmm, well, I'll have to rethink some of those settings then, as I don't want it to tip over now.. ;-) And thanks for the heads up on the stdout file, I thought it was MB, not bytes. I swear I learn something new here every day...

ID: 1793935 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1793971 - Posted: 6 Jun 2016, 14:19:27 UTC

Has anybody here got a GTX 1080 or 1070 active and running BOINC? If so, could you please post the GPU detection line from the startup log - including, specifically, the 'GFLOPs peak' value - I've been asked to check if that's accurate.

(and please don't just compare it with what SIV says - it's the author of SIV who's asked me, because both programs use the same formula - they'll either both be right or both be wrong.)
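
For anyone who wants to sanity-check the 'GFLOPs peak' line by hand, here is a minimal sketch of the usual theoretical-peak arithmetic. It assumes (this is not confirmed anywhere in this thread) that the peak is simply shader cores × clock × 2 FLOPs per clock, and it plugs in Nvidia's published reference GTX 1080 figures of 2560 cores with 1607 MHz base and 1733 MHz boost clocks. The function name and constants are illustrative only; this is not the actual BOINC or SIV code.

```python
def peak_gflops(shader_cores: int, clock_mhz: float, flops_per_clock: int = 2) -> float:
    """Theoretical single-precision peak in GFLOPS.

    Assumed formula: cores * clock * FLOPs per clock, where 2 FLOPs per
    clock corresponds to one fused multiply-add per core per cycle.
    """
    return shader_cores * clock_mhz * flops_per_clock / 1000.0


# Published reference-card figures (assumptions, not taken from this thread).
GTX_1080_CORES = 2560
GTX_1080_BASE_MHZ = 1607.0
GTX_1080_BOOST_MHZ = 1733.0

print(f"base : {peak_gflops(GTX_1080_CORES, GTX_1080_BASE_MHZ):.0f} GFLOPS")
print(f"boost: {peak_gflops(GTX_1080_CORES, GTX_1080_BOOST_MHZ):.0f} GFLOPS")
```

If the value in the startup log is far from either figure, the likely culprits are the core count or the clock the client picked up, which is exactly why the raw log line is the useful thing to post.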
ID: 1793971 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1794037 - Posted: 6 Jun 2016, 17:58:11 UTC - in response to Message 1793971.  

Well Richard, don't look to me anytime soon. I just got off the phone with EVGA about my new machine's video weirdness, so I gave him the order # to check the shipping status, and he said that it would be shipping later in the month: a couple of weeks at minimum, but almost certainly by the end of the month. I asked if there was a way to change the shipping method, because I had initially thought it would ship by the end of this week, so I chose 2-day and paid an extra $18 for it. But if it's coming that late, well, who cares about a couple or three more days? Of course, the only way to change it is to cancel the pre-order, and then there is no chance of re-ordering it at this time. Yippee. So, as I said, I'll make a note of it here when I get it installed and running, and maybe use it as a comparison of the Reference version vs. the FTW version. Not sure how much difference there will be, but at least the data will be out there for all to see.

ID: 1794037 · Report as offensive
Profile tullio
Volunteer tester

Joined: 9 Apr 04
Posts: 8797
Credit: 2,930,782
RAC: 1
Italy
Message 1794052 - Posted: 6 Jun 2016, 18:32:57 UTC
Last modified: 6 Jun 2016, 18:35:16 UTC

I don't have a GTX 1080 but I have downloaded its nVidia driver, 365.19 and it works on my GTX 750 OC.
Tullio
ID: 1794052 · Report as offensive
Rasputin42
Volunteer tester

Joined: 25 Jul 08
Posts: 412
Credit: 5,834,661
RAC: 0
United States
Message 1794056 - Posted: 6 Jun 2016, 18:48:59 UTC - in response to Message 1794052.  

Meaning?
ID: 1794056 · Report as offensive
archae86

Joined: 31 Aug 99
Posts: 909
Credit: 1,582,816
RAC: 0
United States
Message 1794065 - Posted: 6 Jun 2016, 19:32:48 UTC - in response to Message 1794052.  
Last modified: 6 Jun 2016, 19:34:16 UTC

I don't have a GTX 1080 but I have downloaded its nVidia driver, 365.19 and it works on my GTX 750 OC.
Tullio

Curious, this Nvidia driver download site currently advises using 368.25 for a Windows 10 64-bit 1080 host, and 368.22 for the same host with a 750.
ID: 1794065 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1794087 - Posted: 6 Jun 2016, 21:25:27 UTC - in response to Message 1794065.  
Last modified: 6 Jun 2016, 21:26:28 UTC

They are different, and apparently incompatible, drivers, as I found out because I didn't choose the correct OS (I needed Win 7 64-bit, but picked the default, which was Win 10). I tried installing it, and it popped up saying Wrong OS, Dummy. lol It lists the Win 10 drivers separately from the Vista, 7 & 8 drivers. Not sure what changed in 10, but it looks like they had to tweak something for 10 and gave it a different version #, so there you go.

ID: 1794087 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1794094 - Posted: 6 Jun 2016, 22:03:07 UTC

Comparing the information for 368.25 on nVidia's download page:
Supported Products: GeForce 10 Series:
GeForce GTX 1080

and regular 368.22:

Supported Products: GeForce 900 Series:
GeForce GTX TITAN X, GeForce GTX 980 Ti, GeForce GTX 980, GeForce GTX 970, GeForce GTX 960, GeForce GTX 950

GeForce 700 Series:
GeForce GTX TITAN Z, GeForce GTX TITAN Black, GeForce GTX TITAN, GeForce GTX 780 Ti, GeForce GTX 780, GeForce GTX 770, GeForce GTX 760, GeForce GTX 760 Ti (OEM), GeForce GTX 750 Ti, GeForce GTX 750, GeForce GTX 745, GeForce GT 740, GeForce GT 730, GeForce GT 720, GeForce GT 710, GeForce GT 705

GeForce 600 Series:
GeForce GTX 690, GeForce GTX 680, GeForce GTX 670, GeForce GTX 660 Ti, GeForce GTX 660, GeForce GTX 650 Ti BOOST, GeForce GTX 650 Ti, GeForce GTX 650, GeForce GTX 645, GeForce GT 645, GeForce GT 640, GeForce GT 635, GeForce GT 630, GeForce GT 620, GeForce GT 610, GeForce 605

GeForce 500 Series:
GeForce GTX 590, GeForce GTX 580, GeForce GTX 570, GeForce GTX 560 Ti, GeForce GTX 560 SE, GeForce GTX 560, GeForce GTX 555, GeForce GTX 550 Ti, GeForce GT 545, GeForce GT 530, GeForce GT 520, GeForce 510

GeForce 400 Series:
GeForce GTX 480, GeForce GTX 470, GeForce GTX 465, GeForce GTX 460 SE v2, GeForce GTX 460 SE, GeForce GTX 460, GeForce GTS 450, GeForce GT 440, GeForce GT 430, GeForce GT 420


Seems the Pascal branch isn't merged into the mainstream yet. Don't see a 1070 yet either. I wouldn't advise mixing GPU generations with a 1080 until it is (that usually happens one or a few driver releases after the initial one).
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1794094 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1794095 - Posted: 6 Jun 2016, 22:13:40 UTC - in response to Message 1794094.  

Well, then maybe my card being delayed for a few more weeks isn't necessarily a bad thing, it will allow them to work out some of the driver kinks? Although I hadn't planned initially to do a multi GPU setup with it, I am going to toss it into my 48 core machine and let it run for a month or so to see how well it works.

ID: 1794095 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1794101 - Posted: 6 Jun 2016, 22:24:51 UTC - in response to Message 1794095.  

Well, then maybe my card being delayed for a few more weeks isn't necessarily a bad thing, it will allow them to work out some of the driver kinks? Although I hadn't planned initially to do a multi GPU setup with it, I am going to toss it into my 48 core machine and let it run for a month or so to see how well it works.


Typically that'll be a shifting target for 3-4 months, after which the drivers settle down. Could be a bit more jostling than normal, due to DirectX 12, OpenCL (2.1?)/Vulkan, and CUDA 8.0 all attempting to mature at the same time.

At least there are Raistmer's OpenCL apps, the familiar CUDA ones, and multiple new avenues being explored on all fronts for when we know more about this architecture. One of those times when having plenty of eggs in different baskets could pay off (even if initially there are a few unwanted surprises).
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1794101 · Report as offensive
archae86

Joined: 31 Aug 99
Posts: 909
Credit: 1,582,816
RAC: 0
United States
Message 1794248 - Posted: 7 Jun 2016, 13:26:58 UTC

A thread on the Nvidia forums regarding a widely reported complaint of abnormal fan speed variation on the initial 1080 cards asserts that a driver fix has been found, tested, and scheduled for release in a driver update on June 7, 2016.

Checking the Nvidia driver download page, as of a couple of minutes ago, the currently recommended driver for the 1080 and 1070 is 368.39, and unlike the previous 1080 driver this one lists a very large set of supported products, including for example the 750, 980, and a fullish looking list back through the 400 series.
ID: 1794248 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1794291 - Posted: 7 Jun 2016, 23:48:59 UTC - in response to Message 1794248.  

Checking the Nvidia driver download page, as of a couple of minutes ago, the currently recommended driver for the 1080 and 1070 is 368.39, and unlike the previous 1080 driver this one lists a very large set of supported products, including for example the 750, 980, and a fullish looking list back through the 400 series.

Coming so soon after the initial release, if I had a GTX 10xx card I'd probably wait till the next release, or at least let others test it out for a while before trying it myself.
Grant
Darwin NT
ID: 1794291 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1794788 - Posted: 9 Jun 2016, 19:48:22 UTC
Last modified: 9 Jun 2016, 19:55:35 UTC

Hmm, so they finally released the specs on the FTW version I am getting, turns out it's an incremental improvement, but nothing earthshattering.



The biggest differences are that there are 2 BIOS chips on it, it's 35 watts higher, and it now takes two 8-pin power connectors instead of one. Overall, I think I'd probably go for the step down and do a slight overclock, and get about the same performance without needing dual 8-pin power connectors.

*edit* Unless of course with the dual 8 pins there is more overclocking headroom...

ID: 1794788 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1795354 - Posted: 11 Jun 2016, 12:50:08 UTC - in response to Message 1791668.  

Look in the Event Log again. Do you see "This computer has reached a limit of tasks in progress", or words to that effect?

That again is in place to protect the servers, and apportion the work more evenly among the ~150K volunteers. If there was no restriction, and everybody turned every knob up to 11 (as they did at one point), the fastest hosts would be caching thousands of tasks. Each task requires an entry in the database at Berkeley: thousands of tasks times thousands of computers times many days equals a hugely bloated and inefficient database.

I'll be shouted down for this, but no permanently-connected SETI host *needs* a cache longer than about six hours, to cover 'Maintenance Tuesdays'. Make that 12 hours, to cover the congestion period after the end of maintenance, but no more.


. . I tend to agree with that, except I was running with 0.9 days cached but the outage last week went over 24 hours and I ended up without work for several hours. I originally was using 0.5 days but maintenance outages persuaded me to go to 0.9. I have now succumbed to "insurance measures" and have increased to 1.5 days. That seems to be a reasonable balance, but there are still some anomalies that cause under-caching. Because I have had so few APs the system keeps telling me they will take 6 days to process and so starves my cache until the one AP task is completed, which when it comes up takes about 2 hours :(
ID: 1795354 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1795358 - Posted: 11 Jun 2016, 13:10:36 UTC - in response to Message 1795354.  

Because I have had so few APs the system keeps telling me they will take 6 days to process and so starves my cache until the one AP task is completed, which when it comes up takes about 2 hours :(

That's partly because of so few Astropulse tasks needing to be processed (most of the data in these old tapes has been processed already), and partly because the conditions set for considering an AP task to have a 'normal' runtime, and hence being worth taking into account when deciding the true estimate, have been set quite strictly. Your host 8012534 is showing:

Number of tasks completed	4
Consecutive valid tasks		7

You need to get that 'completed' line up to 11 before the estimates are corrected: sadly, that probably needs another fifteen or twenty tasks to run.
ID: 1795358 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1795364 - Posted: 11 Jun 2016, 13:51:30 UTC - in response to Message 1791681.  

The limit is NOT 100 per 24 hours, but 100 tasks reported as "in progress". This has been the case for a few years.
The only time you are capped per day is when a computer returns a number of error results, in which case the concurrent limit is reduced - as I had a couple of weeks back when I screwed up an update and had an invalid driver installed on one of my rigs. Thankfully I caught that quickly and the cap was removed within a few hours as the computer concerned returned enough valid results.



. . OK, I had to test that (Don't touch that, it's hot! :) ). And when I blew out the setting to 3.5 days it still only cached 200 tasks, so I guess that is 100 for the CPU and 100 for the GPU. As it turns out, that is about 1 day's work for the GPU and 1.4 for the CPU. Which I guess overrides the settings in BOINC Manager, making the current setting of 1.5 seem about the limit anyway.
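
A rough sketch of how the cap interacts with the cache setting, using only the numbers in this post: a limit of 100 tasks in progress per device type, with 100 tasks being about 1 day of GPU work and about 1.4 days of CPU work. The function and constant names are made up for illustration, and the real scheduler logic is certainly more involved than a simple `min()`.

```python
def cached_tasks(requested_days: float, tasks_per_day: float,
                 in_progress_limit: int = 100) -> int:
    """Tasks actually held for one device type (CPU or GPU).

    The cache setting asks for requested_days of work, but the server-side
    in-progress limit wins whenever it is the smaller number.
    """
    wanted = round(requested_days * tasks_per_day)
    return min(wanted, in_progress_limit)


# Throughput implied by the post: 100 tasks last ~1 day (GPU) or ~1.4 days (CPU).
GPU_TASKS_PER_DAY = 100 / 1.0
CPU_TASKS_PER_DAY = 100 / 1.4

for days in (0.5, 1.5, 3.5):
    gpu = cached_tasks(days, GPU_TASKS_PER_DAY)
    cpu = cached_tasks(days, CPU_TASKS_PER_DAY)
    print(f"{days} days requested -> GPU {gpu}, CPU {cpu}, total {gpu + cpu}")
```

With these rates the cap already binds for both devices at 1.5 days, which fits the observation that anything beyond 1.5 makes no practical difference on this host.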
ID: 1795364 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1795369 - Posted: 11 Jun 2016, 14:24:15 UTC - in response to Message 1792451.  

Thanks! My RAC in the software shows that it has rocketed from 0 to 4600 in those 2 short days. Who knows how high it just might go? :-)


Don't know for sure, but can do some ballparking.

IIRC 8 Xeon cores doing MB, back in cobblestone scale days, used to get about 20K RAC on PreAVX AKv8 code. Since then there's been two main credit drops amounting to x ~30%. You claw back a little for increased throughput with AVX (about 1.5x), so my guess with 48 CPU cores alone (AVX capable + fast memory), would be 20K*6*0.3*1.5 ~= 50K (1 significant digit). Lots of variability, especially if adding AP and weird work mixes and GPUs into the picture.



. . Should not that formula be 20K*6*0.7*1.5 ??

. . Or is that drop "to" 30% not "of" 30%.

. . I am curious to know if the drop has been that large?
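
. . To put numbers on the two readings, here is a quick sketch of the same ballpark formula evaluated both ways. The only inputs are the figures quoted above (20K RAC for 8 cores, 48 cores, a 1.5x AVX gain); the names are just for illustration.

```python
BASE_RAC_8_CORES = 20_000   # quoted figure for 8 Xeon cores
CORE_SCALE = 48 / 8         # scaling from 8 cores to 48 cores
AVX_SPEEDUP = 1.5           # quoted throughput gain from AVX


def estimate_rac(credit_factor: float) -> float:
    """Ballpark RAC for 48 cores given the fraction of credit that remains."""
    return BASE_RAC_8_CORES * CORE_SCALE * credit_factor * AVX_SPEEDUP


drop_to_30 = estimate_rac(0.3)  # credit fell TO 30% of its old value
drop_of_30 = estimate_rac(0.7)  # credit fell BY 30%, leaving 70%

print(f'drop "to" 30%: {drop_to_30:,.0f}')
print(f'drop "of" 30%: {drop_of_30:,.0f}')
```

. . The "to 30%" reading gives about 54K, which matches the ~50K in the quoted post, while the "of 30%" reading would give about 126K.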
ID: 1795369 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1795375 - Posted: 11 Jun 2016, 15:02:37 UTC - in response to Message 1795358.  

Because I have had so few APs the system keeps telling me they will take 6 days to process and so starves my cache until the one AP task is completed, which when it comes up takes about 2 hours :(

That's partly because of so few Astropulse tasks needing to be processed (most of the data in these old tapes has been processed already), and partly because the conditions set for considering an AP task to have a 'normal' runtime, and hence being worth taking into account when deciding the true estimate, have been set quite strictly. Your host 8012534 is showing:

Number of tasks completed	4
Consecutive valid tasks		7

You need to get that 'completed' line up to 11 before the estimates are corrected: sadly, that probably needs another fifteen or twenty tasks to run.


. . That could be 12 months away :)

. . It's taken 4 months to get to 4 ....
ID: 1795375 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1795376 - Posted: 11 Jun 2016, 15:03:36 UTC - in response to Message 1795369.  

Thanks! My RAC in the software shows that it has rocketed from 0 to 4600 in those 2 short days. Who knows how high it just might go? :-)


Don't know for sure, but can do some ballparking.

IIRC 8 Xeon cores doing MB, back in cobblestone scale days, used to get about 20K RAC on PreAVX AKv8 code. Since then there's been two main credit drops amounting to x ~30%. You claw back a little for increased throughput with AVX (about 1.5x), so my guess with 48 CPU cores alone (AVX capable + fast memory), would be 20K*6*0.3*1.5 ~= 50K (1 significant digit). Lots of variability, especially if adding AP and weird work mixes and GPUs into the picture.



. . Should not that formula be 20K*6*0.7*1.5 ??

. . Or is that drop "to" 30% not "of" 30%.

. . I am curious to know if the drop has been that large?

The change from MBv6->MBv7 was a drop of 40-50% for some.

I prefer to just use the actual run times to calculate the number of tasks a day & then guesstimate the credit. It looks like their normal AR tasks are running ~2.5 hours. So that gives us ~450 tasks/day. I like to figure 100 credit per task to give a max theoretical RAC value of 45,000. Then I figure 80% of the max for the low end of 36,000. Which would put a RAC of ~40.5K in the middle.
The daily credit values for the past few days on their 48 core host are: 34,506; 36,640; 46,096; 62,035; 35,202; 39,718; 35,010; 35,898; 44,786. That averages out to 41,099 for the past 9 days.
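
The same back-of-envelope estimate as a small sketch, for anyone who wants to plug in their own run times. It assumes one task per core on all 48 cores, and the 100 credits per task and the 80% low end are just the rule-of-thumb figures from this post, not anything the project publishes.

```python
def tasks_per_day(concurrent_tasks: int, hours_per_task: float) -> float:
    """Tasks finished per day when concurrent_tasks run side by side."""
    return concurrent_tasks * 24 / hours_per_task


raw_throughput = tasks_per_day(48, 2.5)   # comes out a little above the ~450 used in the post

ROUNDED_TASKS_PER_DAY = 450               # rounded figure used in the post
CREDIT_PER_TASK = 100                     # rule-of-thumb assumption

max_rac = ROUNDED_TASKS_PER_DAY * CREDIT_PER_TASK
low_rac = 0.8 * max_rac
mid_rac = (max_rac + low_rac) / 2

daily_credit = [34506, 36640, 46096, 62035, 35202, 39718, 35010, 35898, 44786]
observed_avg = sum(daily_credit) / len(daily_credit)

print(f"raw tasks/day : {raw_throughput:.1f}")
print(f"RAC range     : {low_rac:,.0f} to {max_rac:,.0f} (middle {mid_rac:,.0f})")
print(f"observed mean : {observed_avg:,.0f}")
```

The observed 9-day mean lands close to the middle of the estimated range, which is about as good as this kind of guesstimate gets.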
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1795376 · Report as offensive
Cruncher-American Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor

Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1795394 - Posted: 11 Jun 2016, 15:56:35 UTC

A data point for you:

On my 32 core dual E5-2670 (AVX) running 29 CPU threads, a typical WU runs about 2.5 hours, so ~275 there. I also have 2 x GTX 980s running 3/dev, at ~1 hour/WU. So that's another 144 there, for a total of about 420/day, and I am averaging a bit over 40K over the last few days.
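
The arithmetic behind those figures, as a quick sketch using only the numbers above (29 CPU threads at about 2.5 hours per WU, and 2 GPUs running 3 at a time at about 1 hour per WU). The implied credit per task at the end is just 40K divided by the total, so treat it as a rough indication only.

```python
def tasks_per_day(concurrent_tasks: int, hours_per_task: float) -> float:
    """Tasks finished per day when concurrent_tasks run side by side."""
    return concurrent_tasks * 24 / hours_per_task


cpu_tasks = tasks_per_day(29, 2.5)      # 29 CPU threads, ~2.5 h per WU
gpu_tasks = tasks_per_day(2 * 3, 1.0)   # 2 GPUs x 3 per device, ~1 h per WU
total_tasks = cpu_tasks + gpu_tasks

# Rough credit per task implied by averaging about 40K credits a day.
implied_credit_per_task = 40_000 / total_tasks

print(f"CPU {cpu_tasks:.0f}/day + GPU {gpu_tasks:.0f}/day = {total_tasks:.0f}/day")
print(f"implied credit per task: {implied_credit_per_task:.0f}")
```

That works out to somewhat under 100 credits per task, in the same ballpark as the 100-per-task rule of thumb used earlier in the thread.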
ID: 1795394 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.