20-core/40-thread Xeon E5-2630 v4 duo

AMDave
Volunteer tester

Joined: 9 Mar 01
Posts: 234
Credit: 11,671,730
RAC: 0
United States
Message 1808003 - Posted: 8 Aug 2016, 19:44:51 UTC

ID: 1808003
Shaggie76

Joined: 9 Oct 09
Posts: 282
Credit: 271,858,118
RAC: 196
Canada
Message 1808008 - Posted: 8 Aug 2016, 20:24:37 UTC

I'd hazard a guess that a pair of 1070s would give a bit better throughput at about the same price. My 980 Ti is 20% faster than my Haswell-E, and I'd bet that beautiful dual Xeon would be about 60% faster than my CPU, so unless I've missed a factor, a simple rig with twin 1070s should be more productive.

Of course, if you fail (like me) to get a pair of 1070s to stabilize, the point is moot.
ID: 1808008
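
A rough back-of-envelope version of that estimate, using the ratios from the post above; treating a GTX 1070 as roughly 980 Ti class is an assumption, and the units are arbitrary multiples of one Haswell-E CPU:

```python
# Back-of-envelope throughput estimate, in multiples of one Haswell-E CPU.
# Ratios come from the post above; a GTX 1070 is assumed to be roughly 980 Ti class.
cpu       = 1.0          # Shaggie's Haswell-E as the baseline
gpu       = 1.2 * cpu    # "20% faster than my Haswell-E"
dual_xeon = 1.6 * cpu    # "about 60% faster than my CPU"

twin_gpus = 2 * gpu      # two 1070-class cards, ignoring whatever the host CPU adds
print(f"dual Xeon rig: {dual_xeon:.1f}x   twin-GPU rig: {twin_gpus:.1f}x")
```

On those numbers the twin-GPU box lands around 2.4x the baseline CPU against the dual Xeon's 1.6x, which is where the "more productive" claim comes from.
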
Richard Haselgrove (Project Donor)
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1808024 - Posted: 8 Aug 2016, 21:12:12 UTC - in response to Message 1808008.  

I'd hazard a guess that a pair of 1070s would give a bit better throughput.

For guppies or for traditional work?
ID: 1808024
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1808473 - Posted: 11 Aug 2016, 9:26:48 UTC - in response to Message 1808003.  

Building a 40-Thread Xeon Monster PC for less than a Broadwell-E Core i7

Let the discussion commence.

20 cores & 40 threads for around the same power consumption as an FX-8350.
And when you consider how well Al's system performs, this would be a very capable cruncher.
And depending on the motherboard, you could add multiple high-end video cards & have plenty of threads to support them, each crunching multiple WUs.
Grant
Darwin NT
ID: 1808473
AMDave
Volunteer tester

Joined: 9 Mar 01
Posts: 234
Credit: 11,671,730
RAC: 0
United States
Message 1808520 - Posted: 11 Aug 2016, 14:25:29 UTC - in response to Message 1808473.  
Last modified: 11 Aug 2016, 14:25:58 UTC

Given:

... a simple rig with twin 1070s should be more productive.

also

And when you consider how well Al's system performs, this would be a very capable cruncher.

and, although this remains unanswered

I'd hazard a guess that a pair of 1070s would give a bit better throughput.

For guppies or for traditional work?

My interest is long-term (i.e. efficiency). For example, referencing Shaggie's graphs (most recent one here), we can see that when it comes to production (Avg Credit/Hour), the 1070 spanks the 750 Ti. However, when it comes to efficiency (Avg Credit/Watt-Hour), the 750 Ti more than holds its own, which means more bang for your buck (efficiency-wise).

Having said that, if we were to compare a rig only crunching on two 1070s (with 1, 2 or 3 WUs concurrently), with the topic rig of this thread crunching on all threads concurrently, which one would wear the efficiency crown?
ID: 1808520
HAL9000
Volunteer tester

Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1808581 - Posted: 11 Aug 2016, 20:27:04 UTC - in response to Message 1808520.  
Last modified: 11 Aug 2016, 20:27:55 UTC

Given:

... a simple rig with twin 1070s should be more productive.

also

And when you consider how well Al's system performs, this would be a very capable cruncher.

and, although this remains unanswered

I'd hazard a guess that a pair of 1070s would give a bit better throughput.

For guppies or for traditional work?

My interest is long-term (i.e. efficiency). For example, referencing Shaggie's graphs (most recent one here), we can see that when it comes to production (Avg Credit/Hour), the 1070 spanks the 750 Ti. However, when it comes to efficiency (Avg Credit/Watt-Hour), the 750 Ti more than holds its own, which means more bang for your buck (efficiency-wise).

Having said that, if we were to compare a rig only crunching on two 1070s (with 1, 2 or 3 WUs concurrently), with the topic rig of this thread crunching on all threads concurrently, which one would wear the efficiency crown?

Since Credit is variable, I find comparing watt-hours per task much more helpful in working out how efficient a device is for crunching. With the Wh/task info, charts could be made ranking the devices by efficiency, and a column for the run time would be helpful for guesstimating credit.
SETI@home classic workunits: 93,865 · CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1808581
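
A minimal sketch of the ranking being described, assuming you already have an average power draw and a run time per task for each device; the figures below are placeholders, not measurements:

```python
# Rank devices by energy per task (Wh/task); lower is better.
# The power and run-time numbers here are placeholders, not measured values.
devices = {
    # name:       (average watts, hours per task)
    "GPU A":      (120.0, 0.10),
    "GPU B":      (40.0,  0.45),
    "CPU thread": (7.0,   2.50),
}

rows = sorted((watts * hours, hours, name)
              for name, (watts, hours) in devices.items())

for wh_per_task, hours, name in rows:
    print(f"{name:12s} {wh_per_task:6.1f} Wh/task   run time {hours:.2f} h")
```

The run-time column is what lets you translate the ranking back into an expected credit rate once you pick a credit-per-task estimate.
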
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1808794 - Posted: 12 Aug 2016, 22:56:50 UTC - in response to Message 1808520.  

Having said that, if we were to compare a rig only crunching on two 1070s (with 1, 2 or 3 WUs concurrently), with the topic rig of this thread crunching on all threads concurrently, which one would wear the efficiency crown?


Depends on what you consider efficiency to be. Most WUs per hour, or most WUs per watt-hour?
My GTX 750 Ti & GTX 1070 are doing just under 10 WUs per hour (all Guppies) and they're pulling up to 140W (on average around 120W).
For the 40-thread Xeon it would be around 2.5 hrs per WU (depending on the WU, with the AVX application), so around 16 WUs per hour, requiring around 270W.
So the dual Xeon system would lead on the number of WUs crunched per hour, but it takes more power per WU crunched to do so.
Of course, if you get a motherboard that supports a dozen PCIe slots, use each of those slots to drive one current GPU, dedicate a core to each WU being crunched & use command lines to get the most from the video cards, the output would be staggering. As would the power bill.
Grant
Darwin NT
ID: 1808794
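
Making the per-WU energy in those figures explicit (a quick check using the approximate numbers quoted above):

```python
# Energy per work unit, from the approximate figures in the post above.
gpu_wh_per_wu  = 120 / 10   # ~120 W average, just under 10 WU/hour -> ~12 Wh per WU
xeon_wh_per_wu = 270 / 16   # ~270 W, around 16 WU/hour             -> ~17 Wh per WU
print(f"GPUs:      {gpu_wh_per_wu:.0f} Wh/WU")
print(f"dual Xeon: {xeon_wh_per_wu:.1f} Wh/WU")
```

On these numbers the Xeon pair wins on throughput but spends roughly 40% more energy per WU.
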
Cosmic_Ocean

Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1808814 - Posted: 13 Aug 2016, 3:06:30 UTC

I'm pretty sure there are ways to split PCIe slots into multiple x4s and so forth. Theoretically you should be able to turn one x16 into 16 x1s. But isn't there a hard limit in the drivers, which can't handle more than 8 GPUs?

I know from people using slot extenders and so forth that there is not a very large drop in performance (though it is noticeable) when running a GPU from an x1 connection, and at x4 there's basically no drop in performance, since there doesn't need to be a ton of data transferred through the interface; most of the work is done on the card itself. So it stands to reason that if you had a board with 4 actual x16s, you should theoretically be able to turn it into 16 x4s with basically no loss in performance, though you would need at least 16 CPU threads to feed it all.

But I think if that were possible, there would be people who have done that already.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving up)
ID: 1808814
Al (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)

Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1809004 - Posted: 14 Aug 2016, 2:53:56 UTC - in response to Message 1808794.  

I'll let you know how it works out; I'm in the process of building what you're referring to. It has 10 PCI-E x8 slots, so the plan is to load it up with GTX 750 Ti SCs on those USB risers and see how it works. It is a dual Xeon v4 motherboard/CPU (56 cores), and all those cards will certainly put a strain on the system. In fact, I'm not yet sure if it will run them all, though it's a 1200W PSU, so that is at most 750W for the GPUs (I've heard they usually draw between 45-60W, which is helpful), and the system will get the rest. It will be an interesting experiment... :-)

ID: 1809004
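
A quick sanity check of that power budget, using the card count and per-card wattage range from the post above (the draw of the two Xeons, board and drives is not estimated here):

```python
# Rough PSU budget for a 10-card GTX 750 Ti build on a 1200 W supply.
psu_watts      = 1200
cards          = 10
watts_per_card = (45, 60)   # per-card range quoted in the post above

for w in watts_per_card:
    gpu_total = cards * w
    print(f"{w} W/card -> {gpu_total} W for the GPUs, {psu_watts - gpu_total} W left for everything else")
```

Even at the top of that range the ten cards stay inside the 750 W allowance, leaving roughly half the supply for the rest of the system.
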
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1809008 - Posted: 14 Aug 2016, 3:08:45 UTC - in response to Message 1808814.  

I'm pretty sure there are ways to split PCIe slots into multiple x4s and so forth. Theoretically you should be able to turn one x16 into 16 x1s.

I don't see how.

Each slot has its own device number. The PCIe x1 section includes the power, data & signalling connections. The other sections just include data & signalling to indicate they are being used.
The slot, and the software drivers, are designed for a single device- not for running multiple different devices in the one slot.
That's why there are multiple slots.

Some slots share PCIe lanes, which is why a PCIe x16 slot can become a PCIe x8 slot when another slot on the motherboard is used.
One device per slot; the PCIe lanes can be shared between slots, but they are only ever used by one device at a time.
Grant
Darwin NT
ID: 1809008
HAL9000
Volunteer tester

Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1809080 - Posted: 14 Aug 2016, 15:04:10 UTC - in response to Message 1809008.  

I'm pretty sure there are ways to split PCIe slots into multiple x4s and so forth. Theoretically you should be able to turn one x16 into 16 x1s.

I don't see how.

Each slot has its own device number. The PCIe x1 section includes the power, data & signalling connections. The other sections just include data & signalling to indicate they are being used.
The slot, and the software drivers, are designed for a single device- not for running multiple different devices in the one slot.
That's why there are multiple slots.

Some slots share PCIe lanes, which is why a PCIe x16 slot can become a PCIe x8 slot when another slot on the motherboard is used.
One device per slot; the PCIe lanes can be shared between slots, but they are only ever used by one device at a time.

I believe the configuration of the PCIe slots on a MB may be limited by Intel or AMD. On Intel's site they often list the PCIe configuration options under the CPU specs.
Supported Processor PCI Express Port Configurations: 1x16, 2x8, 1x8 and 2x4

PCIe switches or splitters can be used to allow more PCIe devices than originally intended. The easiest option is to use a PCIe expansion chassis.
SETI@home classic workunits: 93,865 · CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1809080
Al (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)

Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1809087 - Posted: 14 Aug 2016, 15:48:31 UTC

HAL, any guesses as to my likelihood of success, before it's built? I'm going to assemble it regardless, but it'd be fun to hear your thoughts on what you think might happen. It has 10 slots that are both physical and (I believe, as they appear to have a full complement of contacts inside them) electrical, so I would hope that with that design they have allowed a sufficient quantity of PCI-E lanes. I believe each CPU has a total of 40; is that correct? And if so, would that make a total of 80 lanes available on this board, which would cover the 10 slots with the 8 lanes per slot that the spec calls for? All guesswork at this time, but as this is the 2nd or 3rd build down the line right now, all I can do is speculate till I get it running. But sometimes that's half the fun!

ID: 1809087
_heinz
Volunteer tester

Joined: 25 Feb 05
Posts: 744
Credit: 5,539,270
RAC: 0
France
Message 1809095 - Posted: 14 Aug 2016, 16:23:32 UTC - in response to Message 1809087.  

I believe each CPU has a total of 40; is that correct? And if so, would that make a total of 80 lanes available on this board, which would cover the 10 slots with the 8 lanes per slot that the spec calls for?


That is correct: each CPU has 40 lanes, so 80 lanes together.
That gives you 8 lanes per slot.
That's enough to support all your GPUs:
10 single-slot GPUs with 8 lanes each,
or
5 double-slot GPUs with 16 lanes each.

Perfect.
D5400XS V8-Xeon
ID: 1809095
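
The lane arithmetic heinz describes, as a tiny check (40 PCIe 3.0 lanes per E5 v4 CPU is the nominal figure; how a particular board actually wires its slots is up to the board vendor):

```python
# Nominal PCIe 3.0 lane budget for a dual-socket Xeon E5 v4 board with 10 slots.
lanes_per_cpu = 40
cpus          = 2
slots         = 10

total_lanes    = lanes_per_cpu * cpus   # 80 lanes in total
lanes_per_slot = total_lanes // slots   # 8 lanes per slot if split evenly
print(f"{total_lanes} lanes total, x{lanes_per_slot} per slot across {slots} slots")
```
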
Al (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)

Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1809099 - Posted: 14 Aug 2016, 17:29:29 UTC

But of course, each one is going to be going through an individual x1-to-USB 3.0 adapter/extension, so I think the bottleneck will be there and not in the PCI-E x8 slots... ;-) It will be interesting to see how it all works out.

ID: 1809099
_heinz
Volunteer tester

Joined: 25 Feb 05
Posts: 744
Credit: 5,539,270
RAC: 0
France
Message 1809101 - Posted: 14 Aug 2016, 17:45:33 UTC - in response to Message 1809099.  

But of course, each one is going to be going through an individual x1-to-USB 3.0 adapter/extension, so I think the bottleneck will be there and not in the PCI-E x8 slots... ;-) It will be interesting to see how it all works out.

Ideally you'd run your GPUs directly in the x16 slots on your motherboard; from what I have seen, there is enough space to run 5 double-slot cards or 10 single-slot cards.

But it will be interesting to see how these USB riser cards work out.
D5400XS V8-Xeon
ID: 1809101
Rune Bjørge

Joined: 5 Feb 00
Posts: 45
Credit: 30,508,204
RAC: 5
Norway
Message 1809104 - Posted: 14 Aug 2016, 17:51:28 UTC - in response to Message 1808814.  

You can get splitters for PCIe. I have never tried the x16-to-x4 version, but I had an x1-to-three-x1 splitter running in one system a while back (with the CUDA 5 app).

I did not measure the performance of the system back then, but with 3 GTX 550 Tis in the splitter my system was happily crunching away. The splitters and 550 cards were later replaced by 4 GTX 770s for better performance.

Those el cheapo splitters did not handle the 770 cards for some reason. I have not tried with any newer card though.
ID: 1809104
Rune Bjørge

Joined: 5 Feb 00
Posts: 45
Credit: 30,508,204
RAC: 5
Norway
Message 1809110 - Posted: 14 Aug 2016, 18:03:15 UTC - in response to Message 1809087.  

I am thinking of something like your build. I want to buy a server motherboard with a bundle of cores, and I've got a couple of Nvidia S1070 and P084 PCIe GPU chassis in shipping.

I've seen someone replacing the Quadro cards in those systems with more powerful cards. My plan is to put the GPU chassis to good use with a load of GTX Titans and 770s, and then hook it all up to a mobo/CPU combo with enough cores to feed that lot.

At the moment I am searching for the right mobo for the build to get the number of cores way up, just to see what kind of "damage" a system like that can cause RAC-wise, power-bill-wise and financially.
ID: 1809110
Al (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)

Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1809112 - Posted: 14 Aug 2016, 18:14:50 UTC - in response to Message 1809110.  

Yeah, that would ideally be the way to go: a system with as many real x16 PCI-E 3.0 slots as needed. I'm still a bit up in the air about the whole PCI-E x1-over-USB 3.0 riser replacing an x16 PCI-E connection, as obviously it's missing a number of components, first of all being the x16 connection, as well as getting its normal (potential) 75 watts from a Molex connector on the x16 daughterboard. I mean, what could possibly go wrong? ;-) Oh well, if no one ever tried anything out of the ordinary, we'd still be living in caves, right?

ID: 1809112
