Xeon Phi (aka Knights Corner, MIC)


Author Message
Profile Ex
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 12 Mar 12
Posts: 2895
Credit: 1,797,699
RAC: 396
United States
Message 1248859 - Posted: 20 Jun 2012, 18:09:45 UTC
Last modified: 20 Jun 2012, 18:18:01 UTC

Xeon Phi (MIC is finally coming to fruition)

By the end of this year we should start having a real idea where this is going. I am looking forward to seeing a CPU solution to parallel processing.

One of the interesting points of the Xeon Phi, as opposed to GPUs, is that the Phi can be used as a dedicated cruncher or as a system co-processor (I'm assuming you'll need a Xeon CPU to pull this off).

It will run its own Linux, which I've heard you will even be able to SSH into. (I'm hoping for a webGUI ;-))

I'm waiting hopefully to see if it will be released in an x8 package (in addition to the x16 pictured below), for use in existing server systems that lack PCIe x16.



More Info.
____________
-Dave #2

3.2.0-33

.clair.
Volunteer moderator
Send message
Joined: 4 Nov 04
Posts: 1300
Credit: 23,080,292
RAC: 775
United Kingdom
Message 1248957 - Posted: 20 Jun 2012, 22:15:39 UTC

I will wait for real-world comparisons against the likes of the 680 and 7970 before I sell an arm and a leg to pay for Intel's Tesla.
Nice piece of kit, though.

Cosmic_Ocean
Avatar
Send message
Joined: 23 Dec 00
Posts: 2326
Credit: 8,868,033
RAC: 947
United States
Message 1248969 - Posted: 20 Jun 2012, 22:52:54 UTC

Well with the way PCI-e connectors work, if you didn't care about any warranties, you could take an x16 card and actually cut the connector to make it fit into an x8 slot. Conversely, you can cut the end of an x8 slot so you can put an x16 card in there. Either way will work. Throughput will be half of the theoretical max for x16, but as we've seen around here on the forums, GPUs don't slow down much at all when put on x1.
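
To put rough numbers on the "half of the theoretical max" point, here is a small sketch of per-direction PCIe bandwidth by lane count. The per-lane transfer rates and encoding overheads are the published figures for each PCIe generation; the function and table names are made up for illustration, and real-world throughput will be lower than these theoretical values.

```python
# Approximate theoretical PCIe bandwidth per direction, by generation and lane count.
# Each entry: (transfer rate per lane in GT/s, line-encoding efficiency).
PCIE_GENERATIONS = {
    "1.x": (2.5, 8 / 10),     # 8b/10b encoding
    "2.0": (5.0, 8 / 10),     # 8b/10b encoding
    "3.0": (8.0, 128 / 130),  # 128b/130b encoding
}


def bandwidth_gb_per_s(generation: str, lanes: int) -> float:
    """Return theoretical one-direction bandwidth in GB/s (1 GB = 1e9 bytes)."""
    gigatransfers, efficiency = PCIE_GENERATIONS[generation]
    # One transfer moves one bit per lane; divide by 8 to convert bits to bytes.
    return gigatransfers * efficiency * lanes / 8


if __name__ == "__main__":
    for lanes in (16, 8, 1):
        print(f"PCIe 2.0 x{lanes}: {bandwidth_gb_per_s('2.0', lanes):.2f} GB/s")
```

An x8 link comes out at exactly half of x16 for the same generation, and x1 at one sixteenth, which is why it only matters for workloads that actually keep the bus busy.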
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

Profile HAL9000
Volunteer tester
Avatar
Send message
Joined: 11 Sep 99
Posts: 4592
Credit: 121,555,540
RAC: 51,310
United States
Message 1248980 - Posted: 20 Jun 2012, 23:25:30 UTC - in response to Message 1248969.

Well with the way PCI-e connectors work, if you didn't care about any warranties, you could take an x16 card and actually cut the connector to make it fit into an x8 slot. Conversely, you can cut the end of an x8 slot so you can put an x16 card in there. Either way will work. Throughput will be half of the theoretical max for x16, but as we've seen around here on the forums, GPUs don't slow down much at all when put on x1.

I would probably feel safer screwing a heavy-duty paper clip into my soldering station iron and carving out a notch in the connector, rather than trying to Dremel off the connector edge. I've been thinking of trying that on my MoDT system, which only has an x1 connector. Since that system is starting to get wonky, I wouldn't feel bad if I killed it.

Once these come out and some make their way into the hands of developers for app testing it will be nice to see how they compare in performance/cost.
____________
SETI@home classic workunits: 93,865 CPU time: 863,447 hours

Join the BP6/VP6 User Group today!

fataldog187
Send message
Joined: 4 Nov 02
Posts: 42
Credit: 1,271,261
RAC: 0
United States
Message 1249010 - Posted: 21 Jun 2012, 0:40:32 UTC - in response to Message 1248980.

Why wouldn't you use one of those PCIe ribbon riser/extension things instead?
____________

Profile HAL9000
Volunteer tester
Avatar
Send message
Joined: 11 Sep 99
Posts: 4592
Credit: 121,555,540
RAC: 51,310
United States
Message 1249023 - Posted: 21 Jun 2012, 1:46:37 UTC - in response to Message 1249010.

After looking at my board, I wouldn't be able to fit an x16 card even if I modded the connector. So one of those ribbon solutions would have to be the answer for that system, if I really wanted to go for it. I don't know if I really want to spend the $4 though. lol

Really, I have been thinking about replacing the system with a newer board that has an x16 slot, such as a Super Micro X9SCV-Q, but I am hoping for a newer release using a 70 series chipset, and also that one of those Xeon Phi boards will work in it.

Profile Ex
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 12 Mar 12
Posts: 2895
Credit: 1,797,699
RAC: 396
United States
Message 1249072 - Posted: 21 Jun 2012, 5:02:00 UTC - in response to Message 1248957.
Last modified: 21 Jun 2012, 5:04:33 UTC

I will wait for real-world comparisons against the likes of the 680 and 7970 before I sell an arm and a leg to pay for Intel's Tesla.
Nice piece of kit, though.


Sounds like these bad boys are going to take some serious overhead within, to run properly:

Intel said its Xeon Phi boards will have at least 8GB of GDDR5 memory, which is a third more than current generation Nvidia Tesla cards. Hazra told The INQUIRER that local memory will be what determines the overall performance of Xeon Phi,

Source: The Inquirer (http://s.tt/1fenZ)


That tells me that, cost-wise, GPUs may continue to be the better option... For now ;-)


And as far as hacking my slots or buying adapters, nope not in my little server, I love her and respect her too much, not gonna happen. If it's not a stock fit it won't be used.

Profile HAL9000
Volunteer tester
Avatar
Send message
Joined: 11 Sep 99
Posts: 4592
Credit: 121,555,540
RAC: 51,310
United States
Message 1249384 - Posted: 21 Jun 2012, 12:43:22 UTC - in response to Message 1249072.

Let's hope they are priced better than the current NVIDIA Tesla boards, as the M2090 runs in the neighborhood of $2500.

-BeNt-
Avatar
Send message
Joined: 17 Oct 99
Posts: 1234
Credit: 10,116,112
RAC: 0
United States
Message 1249400 - Posted: 21 Jun 2012, 13:24:30 UTC

I doubt they will be priced better if at all than the Tesla cards. Depends on the speeds I would imagine. If they are on par or close to a Tesla performance wise they will be in the same neighborhood. If they are faster, get ready for $4-5K per.
____________
Traveling through space at ~67,000mph!

Profile Ex
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 12 Mar 12
Posts: 2895
Credit: 1,797,699
RAC: 396
United States
Message 1249422 - Posted: 21 Jun 2012, 14:59:47 UTC - in response to Message 1249400.
Last modified: 21 Jun 2012, 15:28:42 UTC

I doubt they will be priced better if at all than the Tesla cards. Depends on the speeds I would imagine. If they are on par or close to a Tesla performance wise they will be in the same neighborhood. If they are faster, get ready for $4-5K per.


Ouch... This is starting to sound more and more like a dream than a reality.

Even supposing $2K, that makes that compute card more expensive than my entire little Xeon server.

tbretProject donor
Volunteer tester
Avatar
Send message
Joined: 28 May 99
Posts: 2897
Credit: 218,381,374
RAC: 62,793
United States
Message 1249541 - Posted: 21 Jun 2012, 19:09:43 UTC - in response to Message 1249422.

I doubt they will be priced better if at all than the Tesla cards. Depends on the speeds I would imagine. If they are on par or close to a Tesla performance wise they will be in the same neighborhood. If they are faster, get ready for $4-5K per.


Ouch... This is starting to sound more and more like a dream than a reality.

Even supposing $2K, that makes that compute card more expensive than my entire little Xeon server.


That's been my major objection to the Tesla cards for crunching. I read somewhere that NVIDIA is having to take the cost/benefit ratio into consideration with regard to their Tesla offerings vs their GeForce offerings.

I doubt that we'll see inexpensive MICs as PCIe cards, maybe ever. It will be interesting to see how it evolves, though. Maybe something like PCIe (or its replacement) external connections to stand-alone MICs you stack next to your workstation?

Maybe something really wild I can't even imagine will come out of this.

eSATA seemed pretty esoteric to me after dealing with external drives on a parallel port, so I'm easily impressed, whatever happens.

Profile HAL9000
Volunteer tester
Avatar
Send message
Joined: 11 Sep 99
Posts: 4592
Credit: 121,555,540
RAC: 51,310
United States
Message 1249548 - Posted: 21 Jun 2012, 19:22:02 UTC - in response to Message 1249541.

You can already get external PCIe enclosures, and I have read that there will be some Thunderbolt enclosures soon, unless the Thunderbolt enclosures have already come out.

What I want now is some way to connect machines together so that a machine could be presented to the system as a coprocessor.

Horacio
Send message
Joined: 14 Jan 00
Posts: 536
Credit: 75,826,412
RAC: 22,652
Argentina
Message 1249569 - Posted: 21 Jun 2012, 19:55:31 UTC - in response to Message 1249548.

Ouch... This is starting to sound more and more like a dream than a reality.

Even supposing $2K, that makes that compute card more expensive than my entire little Xeon server.

I guess the gamers' segment will always have more customers than the high-performance computing segment... And also, if a GPU becomes too expensive, daddy will not buy it... If a business requires a MIC, the boss has to buy it...

What I want now is some way to connect machines together so that a machine could be presented to the system as a coprocessor.

Is not that what BOINC does? :D



____________

Profile HAL9000
Volunteer tester
Avatar
Send message
Joined: 11 Sep 99
Posts: 4592
Credit: 121,555,540
RAC: 51,310
United States
Message 1249572 - Posted: 21 Jun 2012, 19:59:06 UTC - in response to Message 1249569.

In the current universe, where I live, BOINC cannot tell the machine it is running on to use the one next to it as a coprocessor.

Horacio
Send message
Joined: 14 Jan 00
Posts: 536
Credit: 75,826,412
RAC: 22,652
Argentina
Message 1249575 - Posted: 21 Jun 2012, 20:10:31 UTC - in response to Message 1249572.

In the current universe, where I live, BOINC cannot tell the machine it is running on to use the one next to it as a coprocessor.

You need the BOINC server apps to do that... ;b
(All our hosts are a kind of coprocessor for the project's servers... anyway, I was not trying to be really serious... it was just a kind of joke...)

____________

Profile Alex Storey
Volunteer tester
Avatar
Send message
Joined: 14 Jun 04
Posts: 561
Credit: 1,684,169
RAC: 542
Greece
Message 1249624 - Posted: 21 Jun 2012, 21:21:59 UTC - in response to Message 1249422.

Ouch... This is starting to sound more and more like a dream than a reality.

Even supposing $2K, that makes that compute card more expensive than my entire little Xeon server.

I'm guessing your enthusiasm temporarily impaired your reading comprehension!;)
From the Inquirer article:
Intel said its Xeon Phi boards will have at least 8GB of GDDR5 memory, which is a third more than current generation Nvidia Tesla cards.

And from the AnandTech article that first got your attention.
...Xeon Phi co-processors won’t be available until the end of the year – if not next year – but regardless the timing is such that Intel will be going up against NVIDIA’s GK110-based Tesla K20, which is similarly expected by the end of the year.
So definitely not a GeForce series competitor...


Press releases are always full of fanfare and usually quite boring, but I thoroughly enjoyed the Xeon's:
Intel Xeon Processors E5 Achieve Fastest Adoption, Announcing Xeon Phi Co-Processors

Here's what I liked:

The "SuperMUC" supercomputer at LRZ in Germany, which ranked fourth on the list, delivers 2.9 PetaFLOPs of performance, making it the most powerful in Europe, as well as the largest installation based on the new Intel Xeon processors E5 family.

The Intel Xeon processor E5 family is powering exponential performance gains in high performance computing...

This is the next step of Intel's commitment to achieve exascale-level computation by 2018...

Did Intel just say that they are going from 3PFLOPs to 1000 in just six years!?
Exponential indeed... Even if they do it in eight!

Providing a two week weather forecast with the same level of accuracy as in a 48-hour forecast, or mapping the human genome within 12 hours at less than $1000 as compared to the current two weeks at $60,000, are two examples of the many challenges that HPC will address with more compute capacity. The insatiable need for performance has driven the tremendous growth of HPC processor shipments over the last five years, and Intel predicts this growth will reach more than 20 percent CAGR in the next 5 years. Intel forecasts2 [sic] that the most powerful supercomputer in 2013 will feature the amount of CPUs greater than 1 percent of its own server CPU shipments in 2011.

That's 28x faster (pretty much consistent with the exascale claim above) and, more importantly, 60x cheaper. Fun times:)
Sure, this stuff is out of our league, but it's reassuring to know the competition is fierce. Plus it'll trickle down to us setiheads one way or another.

(You know, it never crossed my mind that the "48hr weather precision" could be a computing bottleneck! For the past few years I just assumed it was a modeling/chaos problem.)
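
The ratios quoted in this post can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch using the figures from the press release as quoted (2.9 PFLOPs for SuperMUC, 1000 PFLOPs for exascale, two weeks and $60,000 versus 12 hours and $1000 for the genome); the function name is made up for illustration.

```python
# Genome mapping: two weeks at $60,000 versus 12 hours at $1000.
genome_speedup = (14 * 24) / 12       # hours before / hours after
genome_cost_ratio = 60_000 / 1_000    # dollars before / dollars after


def annual_growth(start_pflops: float, target_pflops: float, years: float) -> float:
    """Constant yearly growth factor needed to go from start to target in `years`."""
    return (target_pflops / start_pflops) ** (1 / years)


if __name__ == "__main__":
    print(f"Genome speedup: {genome_speedup:.0f}x, cost ratio: {genome_cost_ratio:.0f}x")
    for years in (6, 8):
        factor = annual_growth(2.9, 1000.0, years)
        print(f"2.9 PFLOPs -> 1 EFLOP in {years} years needs about {factor:.2f}x per year")
```

Starting from the 2.9 PFLOPs figure, six years works out to a bit more than 2.6x per year, and eight years to a bit more than doubling every year.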

Profile Fred J. Verster
Volunteer tester
Avatar
Send message
Joined: 21 Apr 04
Posts: 3252
Credit: 31,902,299
RAC: 284
Netherlands
Message 1249662 - Posted: 21 Jun 2012, 22:21:19 UTC - in response to Message 1249624.
Last modified: 21 Jun 2012, 22:25:59 UTC

No, it's just that the sheer amount of data per unit of time requires a powerful machine, maybe 10 TFLOPs, maybe even more. And modelling/chaos problems don't have to be calculated within 14 days. A weather forecast has to be.

I'll follow the story. It is the future, and I also find it interesting. I wonder what price tag this baby will have and what its power use is.
(I calculated that you can compare this to 256 GTX 580 GPUs!)
____________

Profile Ex
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 12 Mar 12
Posts: 2895
Credit: 1,797,699
RAC: 396
United States
Message 1249785 - Posted: 22 Jun 2012, 5:03:50 UTC - in response to Message 1249541.
Last modified: 22 Jun 2012, 5:05:42 UTC

external connections to stand-alone MICs you stack next to your workstation?


You can do that with Tesla cards, I know it's possible to literally build or buy a Tesla compute "box" that wires into your rig. Again we're talking multiple thousands of US dollars.


What I want now is some way to connect machines together so that a machine could be presented to the system as a coprocessor.

Oh how sweet it would be...

Shouldn't we be able to do this natively with *nix based systems??? Now I'm curious.

Profile tullioProject donor
Send message
Joined: 9 Apr 04
Posts: 3816
Credit: 393,242
RAC: 238
Italy
Message 1249851 - Posted: 22 Jun 2012, 7:51:18 UTC - in response to Message 1249785.

There was a software tool called Parallel Virtual Machine by the University of Tennessee which allowed this. Once I connected a Bull/Mips minicomputer and a SUN SparcStation using this tool.
Tullio
____________

Profile HAL9000
Volunteer tester
Avatar
Send message
Joined: 11 Sep 99
Posts: 4592
Credit: 121,555,540
RAC: 51,310
United States
Message 1249933 - Posted: 22 Jun 2012, 13:30:44 UTC - in response to Message 1249785.

Shouldn't we be able to do this natively with *nix based systems??? Now I'm curious.

If you mean a Beowulf cluster configuration, then yes, *nix systems can do that without too much effort. However, BOINC uses shared memory to talk between the client and the apps, which this cluster configuration can't give you across nodes. If you want shared memory access at the moment, you have to spend tens of thousands of dollars per node.

Perhaps someone could create an app that would work in this configuration. BOINC could run a 'controller' app, communicating with it via shared memory segments, and the 'controller' app could talk to a science app that runs on the node.
As BOINC doesn't have any way to set any kind of controls on each instance of an app it runs, there would probably need to be a layer in between. The 'controller' app would talk to it, and it would then have the job of working out which instance talks to which node and such.

Another approach, which has been brought up before, is a super client version of BOINC, where you have one client that runs a master client app, and all of the nodes run a client app that talks only to the master client app.

Either way would require quite a lot of coding I would imagine.
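
As a very rough illustration of the in-between layer described above, here is a toy sketch of the part that works out which app instance talks to which node. Everything in it is hypothetical: the `Controller` class, the `Transport` callable, and the node names are invented for the example, it does not use any real BOINC API, and the shared-memory side between the BOINC client and the controller is left out entirely.

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import Callable, Dict, List

# Hypothetical transport: deliver a payload to a named node and return its reply.
# A real one would use sockets, MPI, PVM, or similar; here it is just a callable.
Transport = Callable[[str, bytes], bytes]


@dataclass
class Controller:
    """Toy routing layer: pins each app instance to one node, round-robin."""

    nodes: List[str]
    transport: Transport
    assignments: Dict[int, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if not self.nodes:
            raise ValueError("at least one node is required")
        self._next_node = cycle(self.nodes)

    def node_for(self, instance_id: int) -> str:
        # Assign a node the first time an instance is seen, then keep it fixed.
        if instance_id not in self.assignments:
            self.assignments[instance_id] = next(self._next_node)
        return self.assignments[instance_id]

    def forward(self, instance_id: int, payload: bytes) -> bytes:
        # Relay the instance's message to its node and hand back the reply.
        return self.transport(self.node_for(instance_id), payload)


def fake_transport(node: str, payload: bytes) -> bytes:
    """Stand-in for a network call: echoes the payload tagged with the node name."""
    return node.encode() + b":" + payload


if __name__ == "__main__":
    controller = Controller(nodes=["node-a", "node-b"], transport=fake_transport)
    for instance_id in range(3):
        print(controller.forward(instance_id, b"workunit"))
```

The routing table is the easy part. The hard part, as noted above, is everything this sketch skips: bridging the client's shared memory segments to the network and keeping the science apps on the nodes in sync.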


Copyright © 2014 University of California