LotzaCores and a GTX 1080 FTW

Message boards : Number crunching : LotzaCores and a GTX 1080 FTW

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1791794 - Posted: 29 May 2016, 19:42:15 UTC - in response to Message 1791787.  
Last modified: 29 May 2016, 19:43:39 UTC

As a matter of fact, regarding the database performance issues.. Matt recently said [...]we are making some huge advances in reducing the science database. All the database performance problems I've been moaning about for years are finally getting solved, or worked around, basically. This is really good news.

Which is great - but the recent discussion has been about the BOINC task/workunit processing database. The science database is the (huge) repository of all the signals found since the project started - it doesn't affect day-to-day matters like 'tasks in progress' limits at all.
ID: 1791794
Al Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1791796 - Posted: 29 May 2016, 19:50:07 UTC - in response to Message 1791746.  

Is the increase in temp you mentioned after using the AVX app? That would normally be reasonable behavior.

Yep, and just checked it a bit ago, still hanging in there around 50 after installing the AVX app, so I am pretty happy with the way it's been performing so far. And about defying logical behavior, well, that figures and is about par for me. lol

I am interested in seeing how the GPU affects it, and SuperMicro has a lot of information about memory and the different configs/speeds on their site and in their manual. It truly is a server board, and as such they have taken their documentation seriously. This is the first server board I've bought in probably 20-some years, and the last one I bought was to roll my own server and install Novell NetWare 4.01. I had an account with Tech Data, and 4.0 had just come out. All I'll say is that I thought I wanted it because of NDS, because that made sense to me logically as opposed to bindery, but I think they said I was one of the first 10 customers to get it in the country at that time. I had a direct support line to Novell, and talk about half baked... I was one of their unofficial beta testers, it turned out, because I had run into things that they had never seen before, even after buying an Intel-branded server (this was the '90s, remember) to try and get it to work. I think I still have that thing somewhere down in the basement; I should try firing it up one day for old times' sake.

Anywho, not sure if other brands of server boards have this thorough of docs, but I have to say these seem to cover it pretty well, and the couple times I called them before the purchase with questions, they seemed to have their stuff in a group, and I was off the phone in less than 5 mins both times.

ID: 1791796
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1791868 - Posted: 29 May 2016, 22:23:57 UTC - in response to Message 1791794.  

As a matter of fact, regarding the database performance issues.. Matt recently said [...]we are making some huge advances in reducing the science database. All the database performance problems I've been moaning about for years are finally getting solved, or worked around, basically. This is really good news.

Which is great - but the recent discussion has been about the BOINC task/workunit processing database. The science database is the (huge) repository of all the signals found since the project started - it doesn't affect day-to-day matters like 'tasks in progress' limits at all.

Mmm.. noted. I must have misread that. But it is quite likely that some of the solutions for the performance issues of the science DB can translate to the BOINC DB.

I remember the limits were put in place because of the lower-than-expected I/O performance of the DB, which was causing slowdowns to the point of outright crashing. I remember the thought being "if we get more disk spindles, we should be able to increase the I/O," but then it turned out that it was looking more like software/kernel limitations and not so much the hardware, but it was also suspected that it may have been the RAID controller itself and the drivers for it.

It's just been a long time since there were any details about it, so it's all a bit fuzzy now.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving up)
ID: 1791868
betreger Project Donor
Joined: 29 Jun 99
Posts: 11359
Credit: 29,581,041
RAC: 66
United States
Message 1791923 - Posted: 30 May 2016, 1:22:02 UTC - in response to Message 1791796.  

We really deserve some pictures of this machine.
ID: 1791923
Al Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1791962 - Posted: 30 May 2016, 3:49:26 UTC

The one I just built here with the 48 cores? I'll see if I can take a couple and find some way to get them on a photo hosting site, since I was told, when I wanted to post some here in the past, that our server isn't able to handle pics locally. But, to be perfectly honest, it's less than impressive, because as with most of my systems, it's an open board with a PSU and a (couple, in this case) disk drives.

Not much to see at all, but if you really want to check it out, I suppose I can do that to satisfy people's curiosity. Maybe I will also post another of my setups, the one that I built an extended video card rack for, which has all the video cards running about 8" above the motherboard. That one I feel is much more interesting, and actually took some effort to build. :-)

ID: 1791962
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1792004 - Posted: 30 May 2016, 7:05:45 UTC - in response to Message 1791962.  

... because as with most of my systems, it's an open board with a PSU and a (couple, in this case) disk drives.

Ah, that would explain why you've got such good CPU temperatures. Running in its proper server case, even with all the fans screaming along, I'd expect it to be hotter than it is.
Grant
Darwin NT
ID: 1792004
Al Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1792110 - Posted: 30 May 2016, 14:56:36 UTC - in response to Message 1792004.  

... because as with most of my systems, it's an open board with a PSU and a (couple, in this case) disk drives.

Ah, that would explain why you've got such good CPU temperatures. Running in its proper server case, even with all the fans screaming along, I'd expect it to be hotter than it is.

Oh Heck, Yeah. It's because of that accurate description that I am doing it this way. I don't have a datacenter to muffle the sound, and they can get unbearably loud, especially when you have to listen to them 24/7. I was unsure what to expect from the OEM coolers, but they appear to be working pretty well. Looking at the temps right now, room temp is 74°F, and about 1/2 are running between 43 and 46, the other half are running 47-51, with only 4 cores currently at 50 or above, but that probably depends on the WUs they are crunching?

ID: 1792110
archae86
Joined: 31 Aug 99
Posts: 909
Credit: 1,582,816
RAC: 0
United States
Message 1792118 - Posted: 30 May 2016, 15:19:45 UTC - in response to Message 1792110.  

and about 1/2 are running between 43 and 46, the other half are running 47-51, with only 4 cores currently at 50 or above, but that probably depends on the WU's they are crunching?

The on-chip sensors were not originally intended to be accurate thermometers, and both in my personal experience and as reported by others, sensors on the same die can be surprisingly mismatched.

For the specific differences you are reporting, I'd guess there may be some component of real cross-die temperature variation (probably your heat sink and thermal compound don't remove heat perfectly uniformly) with some component of "thermometer error".

As not even the slope of the devices is necessarily well matched, the ideal calibration method would involve setting the whole CPU die to near-zero power idle, but warming it up with external means (with HSF still on). If you warm it to near the actual operating point of interest, but there is very little power dissipation in the chip, then the real temperatures at all the sensors should be well matched, and you can take their reported differences as calibration offset errors. That gives you relative error, but still leaves the overall offset error uncontrolled.

Or you could say "looks pretty good to me" and leave it alone, which is what I'd do in your specific situation.
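For anyone who does want to try the calibration idea, here is a minimal sketch of the arithmetic only. It assumes you already have one reading per sensor taken with the die near zero power but externally warmed, so the true temperature is taken to be the same at every sensor. The function names and the sample numbers are made up for illustration and are not measurements from this thread.

```python
from statistics import mean

def calibration_offsets(idle_readings):
    """Return each sensor's offset from the mean of all sensors.

    idle_readings: hypothetical per-sensor temperatures taken while the die
    is at near-zero power but warmed externally, so every sensor is assumed
    to sit at the same true temperature.
    """
    reference = mean(idle_readings)
    return [reading - reference for reading in idle_readings]

def apply_offsets(load_readings, offsets):
    """Subtract the calibration offsets from readings taken under load."""
    return [reading - offset for reading, offset in zip(load_readings, offsets)]

# Made-up example values, not real data.
idle = [44.0, 46.0, 45.0, 49.0]
offsets = calibration_offsets(idle)
load = [45.0, 48.0, 47.0, 51.0]
corrected = apply_offsets(load, offsets)
print(offsets)
print(corrected)
```

This only removes the relative offsets between sensors at the temperature where the idle readings were taken. It does nothing about slope mismatch or about the overall offset shared by all sensors, which is the limitation described above.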
ID: 1792118
Al Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1792127 - Posted: 30 May 2016, 15:38:17 UTC - in response to Message 1792118.  

...Or you could say "looks pretty good to me" and leave it alone, which is what I'd do in your specific situation.

You nailed it! :-D

But thank you for the detailed reply, that is one that would fall under 'Good to know'!

ID: 1792127
Al Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1792327 - Posted: 30 May 2016, 22:52:55 UTC - in response to Message 1792304.  

Thanks! My RAC in the software shows that it has rocketed from 0 to 4600 in those 2 short days. Who knows how high it just might go? :-)

ID: 1792327
Al Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1792434 - Posted: 1 Jun 2016, 3:35:18 UTC

Well, I just found out exactly how long my cache will last before it runs out of work: about 4 or so hours. The server went down at about 11 my time, and I ran out of work at about 3:30. So, it looks like the system will be hanging around taking a break every Tuesday for 3-5 hours without much to do, though once the GPU is in, that should be fine, as it will probably only be running 3-4 tasks at a time. We'll have to see how it goes. I learned something today about my new system, so that is a good thing!

ID: 1792434
Al Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1792448 - Posted: 1 Jun 2016, 3:54:19 UTC - in response to Message 1792442.  

Yep, I clicked on Update just for giggles because I just got home (as it said it was a couple hours before it would auto attempt again). That was at 10:31, and shocker, it was up! Just before that, I had refreshed my screen for the threads here and my system stats page, and it still said system down for maint, so imagine my surprise when, less than 30 seconds later, it had started downloading 100 tasks! I was quite happy, as you'd expect.

I figured it out by going into my event log and looking at the time of the last comm, and then the time of the last task completing. So the 4-5 hour estimate given earlier was pretty accurate. It depends of course on where the 1st half of the cache is in the processing: if most of them have just started it might go a little longer, and if most have been crunching for a bit it will be shorter, but this is a decent average for how long it will take to run out of CPU tasks. Not complaining, just noting it is all. :-)
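The event-log arithmetic is just a difference of two timestamps. Here is a rough sketch using the approximate times from the earlier post (server down about 11, out of work about 3:30), assuming those are 11:00 and 15:30 local time on the same Tuesday. The function name is made up, and the exact date is only there so the timestamps are valid.

```python
from datetime import datetime

def cache_duration_hours(last_contact, last_task_done):
    """Hours between the last scheduler contact and the last task finishing."""
    return (last_task_done - last_contact).total_seconds() / 3600.0

# Approximate times from the earlier post; the AM/PM reading is an assumption.
last_contact = datetime(2016, 5, 31, 11, 0)
last_task_done = datetime(2016, 5, 31, 15, 30)

hours = cache_duration_hours(last_contact, last_task_done)
print(f"Cache lasted about {hours:.1f} hours")
```

With those inputs it comes out to 4.5 hours, which sits inside the 4-5 hour estimate.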

ID: 1792448
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1792451 - Posted: 1 Jun 2016, 4:26:22 UTC - in response to Message 1792327.  
Last modified: 1 Jun 2016, 4:27:29 UTC

Thanks! My RAC in the software shows that it has rocketed from 0 to 4600 in those 2 short days. Who knows how high it just might go? :-)


Don't know for sure, but can do some ballparking.

IIRC 8 Xeon cores doing MB, back in cobblestone scale days, used to get about 20K RAC on pre-AVX AKv8 code. Since then there have been two main credit drops, together amounting to a factor of about 0.3 (~30% of the old scale). You claw back a little for increased throughput with AVX (about 1.5x), so my guess with 48 CPU cores alone (AVX capable + fast memory), i.e. 6 times the 8-core baseline, would be 20K*6*0.3*1.5 ~= 50K (1 significant digit). Lots of variability, especially if adding AP and weird work mixes and GPUs into the picture.
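The same ballpark as a small sketch, in case anyone wants to plug in their own core count. The function and parameter names are made up for illustration; the default values are just the rough figures quoted above, not measured constants.

```python
def estimate_rac(cores, baseline_rac=20_000, baseline_cores=8,
                 credit_scale=0.3, avx_speedup=1.5):
    """Scale an old 8-core RAC figure to a different core count.

    credit_scale: rough combined effect of the credit drops (about 0.3).
    avx_speedup: rough throughput gain from the AVX app (about 1.5x).
    """
    return baseline_rac * (cores / baseline_cores) * credit_scale * avx_speedup

estimate = estimate_rac(48)
print(f"Estimated RAC for 48 cores: about {estimate:,.0f}")
```

For 48 cores this works out to roughly 54,000, which is 50K at one significant digit. Given how rough the inputs are, the extra digits don't mean anything.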
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1792451
Al Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1792555 - Posted: 1 Jun 2016, 12:21:18 UTC - in response to Message 1792451.  

Thanks for the guesstimate, 50k before any GPU input is pretty good I'd have to say! I chatted with EVGA yesterday about water cooling parts for a couple cards I am planning on putting into a system of mine, and found that the new 1080 I ordered almost certainly won't ship this week, and possibly won't even ship next week, but I am crossing my fingers.

ID: 1792555
betreger Project Donor
Joined: 29 Jun 99
Posts: 11359
Credit: 29,581,041
RAC: 66
United States
Message 1792626 - Posted: 1 Jun 2016, 17:49:11 UTC

Al, it looks like you could use a backup project for that monster. Einstein has a very cool CPU app if you like gravity waves.
ID: 1792626
Al Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1792642 - Posted: 1 Jun 2016, 18:20:27 UTC - in response to Message 1792626.  
Last modified: 1 Jun 2016, 18:20:45 UTC

I've thought about that from time to time, but I've been a pretty loyal and true blue SETI dude, going back 17 years, so I think I'll probably begin and end my crunching 'career' with them, so, good or bad, I'll take what they give me. Besides, this doesn't really draw _that_ much power when it is idled down not doing anything while waiting. I still need to figure out that site for storing my pics; I'd prefer something that is non-Google. There's Photobucket, Flickr, and a few others. Has anyone had good experiences over the longer term with any in particular?

ID: 1792642
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1792643 - Posted: 1 Jun 2016, 18:22:50 UTC - in response to Message 1792642.  
Last modified: 1 Jun 2016, 18:23:16 UTC

I found lightshot's printscreen utility pretty nifty in the past, a fair bit less cumbersome than negotiating photobucket's links and using the Windows snipping tool.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1792643
Brent Norman Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1792650 - Posted: 1 Jun 2016, 18:39:25 UTC

I use 50webs.net for my web server and like it. They also have a free web server (I tried it and didn't mind it).

File transfers are quite easy, and easy to link to.
ID: 1792650
Al Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1792652 - Posted: 1 Jun 2016, 18:41:53 UTC - in response to Message 1792643.  

Are they also good on the storage side? And not gathering all sorts of metrics and data on users? Jeez, since Snowden, I've just seemed to think about all that business a lot more, both on the govt side and on the corporate side. Everyone seems to want info/data, or monitoring, or both. If this was the '90s they'd say to put on a tinfoil hat and all that, but look where we are today.

And the above statement reflects just what we know; who knows what's going on that we don't? Well, one big solar flare later, like what fried a number of telegraphs around the world on September 1, 1859, and all this tracking and monitoring will be the least of our worries. ;-)

Inter-what?

ID: 1792652
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1792654 - Posted: 1 Jun 2016, 18:47:18 UTC - in response to Message 1792652.  

Meh, no idea how much (if any) snooping/datamining may or may not be in the lightshot tray app. Naturally, with anything supposedly 'free', you're the product. I only activate the application on the rare occasions that I need it, mostly since I don't like a lot of background applications running anyway. Probably anything/anyone building a digital profile of me would be pretty confused at this point, so I'm less paranoid than I once was.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1792654


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.