Message boards :
Number crunching :
20-core/40-thread Xeon E5-2630 v4 duo
Author | Message |
---|---|
AMDave Send message Joined: 9 Mar 01 Posts: 234 Credit: 11,671,730 RAC: 0 |
Building a 40-Thread Xeon Monster PC for less than a Broadwell-E Core i7 Let the discussion commence. |
Shaggie76 Send message Joined: 9 Oct 09 Posts: 282 Credit: 271,858,118 RAC: 196 |
I'd hazard a guess that a pair of 1070's would be a bit better throughput and about the same price. My 980 Ti is 20% faster than my Haswell-E, and I'd bet that beautiful dual Xeon would be about 60% faster than my CPU, so unless I've missed a factor, a simple rig with twin 1070's should be more productive. Of course, if you fail (like me) to get a pair of 1070's to stabilize, the point is moot. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
I'd hazard a guess that a pair of 1070's would be a bit better throughput. For guppies or for traditional work? |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
Building a 40-Thread Xeon Monster PC for less than a Broadwell-E Core i7 20 cores & 40 threads for around the same power consumption as an FX-8350. And when you consider how well Al's system performs, this would be a very capable cruncher. And depending on the motherboard you could add multiple high end video cards & have plenty of threads to support them each crunching multiple WUs. Grant Darwin NT |
AMDave Send message Joined: 9 Mar 01 Posts: 234 Credit: 11,671,730 RAC: 0 |
Given: ... a simple rig with twin 1070's should be more productive. also And when you consider how well Al's system performs, this would be a very capable cruncher. and, although this remains unanswered I'd hazard a guess that a pair of 1070's would be a bit better throughput. My interest is long-term (i.e. efficiency).  For example, referencing Shaggie's graphs (most recent one here), we can see that when it comes to production (Avg Credit/Hour), the 1070 spanks the 750 Ti.  However, when it comes to efficiency (Avg Credit/Watt-Hour), the 750 Ti more than holds its own, which means more bang for your buck (efficiency-wise). Having said that, if we were to compare a rig only crunching on two 1070s (with 1, 2 or 3 WUs concurrently), with the topic rig of this thread crunching on all threads concurrently, which one would wear the efficiency crown? |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
Since Credit is variable, I find comparing watt-hours per task to be much more helpful in finding how efficient a device is for crunching. With the Wh/task info, charts could be made ranking the devices by efficiency, and a column for the run time would be helpful to guesstimate credit. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url] |
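HAL9000's watt-hours-per-task metric is simple to compute. A minimal sketch (the power draw and runtime below are made-up placeholder figures, not measurements from this thread):

```python
def wh_per_task(power_watts: float, hours_per_task: float) -> float:
    """Energy consumed per completed task, in watt-hours."""
    return power_watts * hours_per_task

# Hypothetical example: a device drawing 120 W that finishes a
# task in 0.1 hours uses 12 Wh per task.
print(wh_per_task(120, 0.1))  # 12.0
```

Because the metric is independent of credit granted, it lets devices be ranked without the credit variability HAL9000 mentions.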
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
Having said that, if we were to compare a rig only crunching on two 1070s (with 1, 2 or 3 WUs concurrently), with the topic rig of this thread crunching on all threads concurrently, which one would wear the efficiency crown? Depends on what you consider efficiency to be: most WUs per hour, or most WUs per Watt? My GTX 750 Ti & GTX 1070 are doing just under 10 WUs per hour (all Guppies), and they're pulling up to 140W (on average around 120W). For the 40-thread Xeon it would be around 2.5 hrs per WU (depending on the WU, with the AVX application), so around 16 WUs per hour, requiring around 270W. So the dual Xeon system would lead on the number of WUs crunched per hour, but it takes more power per WU crunched to do so. Of course, if you get a motherboard that supports a dozen PCIe slots, use each of those slots to drive 1 current GPU, use a core for each WU being crunched, and use command lines to get the most from the video cards, the output would be staggering. As would the power bill. Grant Darwin NT |
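Grant's comparison can be written out explicitly. Using his rough figures (about 10 WUs/hour at ~120 W average for the GPU pair, about 16 WUs/hour at ~270 W for the dual Xeon), a sketch of both metrics:

```python
def wu_per_wh(wu_per_hour: float, watts: float) -> float:
    """Work units completed per watt-hour of energy consumed."""
    return wu_per_hour / watts

gpu_rig  = wu_per_wh(10, 120)  # GTX 750 Ti + GTX 1070, Grant's averages
xeon_rig = wu_per_wh(16, 270)  # 40 threads at ~2.5 h/WU -> ~16 WU/h

# The Xeon leads on raw throughput (16 vs 10 WU/h), but the GPU
# pair completes more work per unit of energy.
print(gpu_rig > xeon_rig)  # True
```

This matches Grant's conclusion: throughput crown to the Xeons, efficiency crown to the GPUs.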
Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13 |
I'm pretty sure there are ways to split PCIe slots into multiple x4s and so forth. Theoretically you should be able to turn one x16 into 16 x1's. But isn't there a hard limit in the drivers that can't handle more than 8 GPUs? I know with people using slot extenders and so forth that there is not a very large drop in performance (though it is noticeable) in running a GPU from an x1 connection, and at x4, there's basically no drop in performance since there doesn't need to be a ton of data transfer through the interface, as most of the work is done on the card itself. So it stands to reason that if you had a board with 4 actual x16s, you should theoretically be able to turn it into 16 x4s with basically no loss in performance. Though you would need at least 16 CPU threads to feed it all. But I think if that were possible, there would be people who have done that already. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) |
Al Send message Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
I'll let you know how it works out, I'm in the process of building what you're referring to. It has 10 PCI-E x8 slots, so the plan is to load it up with GTX 750Ti SC's on those USB risers, and see how it works. It is a dual Xeon v4 motherboard/CPU (56 cores), and all those cards will certainly put a strain on the system. In fact, I'm not yet sure if it will run them all, though it's a 1200w PSU, so that is at max 750w for the GPU's (I've heard they usually draw between 45-60w, which is helpful), and the system will get the rest. It will be an interesting experiment... :-) |
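Al's power budget holds up even in the worst case. Taking the figures from his post (a 1200 W PSU, 750 W earmarked for the GPUs, and a reported 45 to 60 W draw per GTX 750 Ti SC), a quick sanity check:

```python
psu_watts = 1200
gpu_budget = 750            # Al's allowance for the GPUs
cards = 10
draw_low, draw_high = 45, 60  # reported per-card draw range

worst_case = cards * draw_high      # 600 W across all ten cards
assert worst_case <= gpu_budget     # fits, with 150 W of headroom
print(psu_watts - worst_case)       # 600 W left for CPUs, board, drives
```

So even if every card sits at the top of its reported range, the GPUs stay well inside the 750 W allowance.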
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
I'm pretty sure there are ways to split PCIe slots into multiple x4s and so forth. Theoretically you should be able to turn one x16 into 16 x1's. I don't see how. Each slot has its own device number. The PCIe x1 section includes power, data and signalling connections; the other sections just include data & signalling to indicate they are being used. The slot, and the software drivers, are designed for a single device, not for running multiple different devices in the one slot. That's why there are multiple slots. Some slots share PCIe lanes, which is why a PCIe x16 slot can become a PCIe x8 slot when another slot on the motherboard is used. One device per slot; the PCIe lanes can be shared between slots, but they are only ever used by one device at a time. Grant Darwin NT |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
I'm pretty sure there are ways to split PCIe slots into multiple x4s and so forth. Theoretically you should be able to turn one x16 into 16 x1's. I believe the configuration of the PCIe slots on a MB may be limited by Intel or AMD. On Intel's site they often list the PCIe configuration options under the CPU specs: Supported Processor PCI Express Port Configurations: 1x16, 2x8, 1x8 and 2x4. PCIe switches or splitters can be used to allow more PCIe devices than originally intended. The easiest option is to use a PCIe expansion chassis. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url] |
Al Send message Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
HAL, any guesses as to my likelihood of success, before it's built? I'm going to assemble it regardless, but it'd be fun to hear your thoughts on what you think might happen. It has 10 physical and (I believe; they appear to have a full complement of contacts inside them) electrical slots, so I would have hoped that with that design they would have allowed a sufficient quantity of PCI-E lanes. I believe each CPU has a total of 40, is that correct, and if so would that make it a total of 80 lanes available on this board, which would cover the 10 slots with the proper 8 lanes per slot that the spec specifies? All guesswork at this time, but as this is the 2nd or 3rd build down the line right now, all I can do is speculate till I get it running. But sometimes that's half the fun! |
_heinz Send message Joined: 25 Feb 05 Posts: 744 Credit: 5,539,270 RAC: 0 |
I believe each CPU has a total of 40, is that correct, and if so would that make it a total of 80 lanes available on this board, which would cover the 10 slots with the proper 8 lanes per slot that the spec specifies? That is correct: each CPU has 40 lanes, so 80 together, giving you 8 lanes per slot. That's enough to support all your GPUs: 10 single-slot GPUs with 8 lanes each, or 5 double-slot GPUs with 16 lanes each. Perfect. D5400XS V8-Xeon |
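The lane arithmetic being confirmed here is straightforward division, using the figures as given in the thread (40 PCIe 3.0 lanes per Xeon E5 v4 CPU, two CPUs, ten slots):

```python
lanes_per_cpu = 40   # PCIe 3.0 lanes per Xeon E5 v4, per Al's question
cpus = 2
slots = 10

total_lanes = lanes_per_cpu * cpus     # 80 lanes board-wide
lanes_per_slot = total_lanes // slots  # 8 -> matches the board's x8 slots
print(total_lanes, lanes_per_slot)     # 80 8
```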
Al Send message Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
But of course, each one is going to be going thru an individual 1x-USB 3.0 adapter/extension, so I think that the bottleneck will be there and not in the PCI-E 8x slots... ;-) Will be interesting to see how it all works out. |
_heinz Send message Joined: 25 Feb 05 Posts: 744 Credit: 5,539,270 RAC: 0 |
But of course, each one is going to be going thru an individual 1x-USB 3.0 adapter/extension, so I think that the bottleneck will be there and not in the PCI-E 8x slots... ;-) Will be interesting to see how it all works out. Ideally you would run your GPUs directly in the x16 slots on your motherboard; from what I have seen there is enough space to run 5 double-slot cards or 10 single-slot cards. But it will be interesting to see how these USB riser cards work out. D5400XS V8-Xeon |
Rune Bjørge Send message Joined: 5 Feb 00 Posts: 45 Credit: 30,508,204 RAC: 5 |
You can get splitters for PCIe. I have never tried the x16-to-x4 version, but I had an x1-to-three-x1 splitter running in one system a while back (with the CUDA 5 app). I did not measure the performance of the system back then, but with 3 GTX 550 Ti cards in the splitter my system was happily crunching away. The splitters and 550 cards were later replaced by 4 GTX 770s for better performance. Those el cheapo splitters did not handle the 770 cards for some reason. I have not tried with any newer card, though. |
Rune Bjørge Send message Joined: 5 Feb 00 Posts: 45 Credit: 30,508,204 RAC: 5 |
I am thinking of something like your build. I want to buy a server MB with a bundle of cores, and I've got a couple of Nvidia S1070 and P084 PCIe GPU chassis in shipping. I've seen someone replacing the Quadro cards in those systems with more powerful cards. My plan is to put the GPU chassis to good use with a load of GTX Titans and 770s, and then hook it all up to a mobo/CPU combo with enough cores to feed that lot. At the moment I am searching for the right mobo for the build to get the number of cores way up, just to see what kind of "damage" a system like that can cause, RAC-wise, power-bill-wise and financially. |
Al Send message Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Yeah, that would ideally be the way to go: a system with as many real x16 PCI-E 3.0 slots as needed. I'm still a bit up in the air about the whole PCI-E x1-to-USB 3.0 link replacing a x16 PCI-E connection, as it's obviously missing a number of things, first of all a x16 connection, as well as getting its (potentially) normal 75 watts from a Molex connector on the x16 daughterboard instead of the slot. I mean, what could possibly go wrong? ;-) Oh well, if no one ever tried anything out of the ordinary, we'd still be living in caves, right? |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.