What about Intel's new Phi?



Message boards : Number crunching : What about Intel's new Phi?

Author Message
Nick
Joined: 17 May 99
Posts: 88
Credit: 9,045,714
RAC: 434
United States
Message 1306484 - Posted: 15 Nov 2012, 18:01:02 UTC

I see Intel is releasing a new coprocessor that offers extremely high performance and what seems to be a very easy port. How will Seti support this in the coming new year?

http://newsroom.intel.com/community/intel_newsroom/blog/2012/11/12/intel-delivers-new-architecture-for-discovery-with-intel-xeon-phi-coprocessors
____________

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4101
Credit: 33,140,782
RAC: 7,888
United Kingdom
Message 1306523 - Posted: 15 Nov 2012, 19:13:26 UTC - in response to Message 1306484.

I see Intel is releasing a new coprocessor that offers extremely high performance and what seems to be a very easy port. How will Seti support this in the coming new year?

http://newsroom.intel.com/community/intel_newsroom/blog/2012/11/12/intel-delivers-new-architecture-for-discovery-with-intel-xeon-phi-coprocessors

Ask at the Boinc Dev Forum. Until Boinc supports it, Seti can't support it.

Claggy

Nick
Joined: 17 May 99
Posts: 88
Credit: 9,045,714
RAC: 434
United States
Message 1306529 - Posted: 15 Nov 2012, 19:19:52 UTC - in response to Message 1306523.

Would you be so kind as to point me to the Boinc Dev Forum?

____________

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4101
Credit: 33,140,782
RAC: 7,888
United Kingdom
Message 1306539 - Posted: 15 Nov 2012, 19:32:30 UTC - in response to Message 1306529.

Would you be so kind as to point me to the Boinc Dev Forum?

http://boinc.berkeley.edu/dev/index.php

You'll need to create a new account to post. Best ask in this thread, since it has no answers: Xeon Phi (MIC): Will it crunch?

For further info, there is a thread here too: Xeon Phi (aka Knights Corner, MIC)

Claggy

Nick
Joined: 17 May 99
Posts: 88
Credit: 9,045,714
RAC: 434
United States
Message 1306550 - Posted: 15 Nov 2012, 20:03:55 UTC - in response to Message 1306539.

Thank you Claggy

This link is from someone who has traditionally not been very favorable to Intel, but this time he's about as optimistic as anyone can get:

http://semiaccurate.com/2012/11/13/what-will-intel-xeon-phi-do-to-the-gpgpu-market/


____________

Keith White
Joined: 29 May 99
Posts: 370
Credit: 2,838,555
RAC: 2,211
United States
Message 1306551 - Posted: 15 Nov 2012, 20:07:02 UTC
Last modified: 15 Nov 2012, 20:12:52 UTC

From what I read it's a cluster of roughly 60 Pentium (the original) class CPU cores but is devoid of the usual MMX, SSE and the new AVX instructions in favor of a specialized vector unit (512 bit SIMD design), all running at around 1GHz. In theory each core could do up to 4 threads but that rarely occurs in reality due to other issues.

The card itself is $2700 and peaks at 225 watts. It's only 2-2.5x faster than the top end Xeon CPU in highly parallelized tasks.
____________
"Life is just nature's way of keeping meat fresh." - The Doctor

Nick
Joined: 17 May 99
Posts: 88
Credit: 9,045,714
RAC: 434
United States
Message 1306603 - Posted: 15 Nov 2012, 23:24:34 UTC - in response to Message 1306551.

The card itself is $2700 and peaks at 225 watts. It's only 2-2.5x faster than the top end Xeon CPU in highly parallelized tasks.


I believe it is over 1TeraFlop double precision. That's more than 2.5X a Xeon.


____________

ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 612
Credit: 140,090,239
RAC: 153,060
United Kingdom
Message 1306613 - Posted: 16 Nov 2012, 0:05:51 UTC - in response to Message 1306484.

I see Intel is releasing a new coprocessor that offers extremely high performance and what seems to be a very easy port. How will Seti support this in the coming new year?

Ask me again when I get my hands on one. :-) Possibly before February if I made a good enough impression on our preferred HPC supplier last week...
____________

Keith White
Joined: 29 May 99
Posts: 370
Credit: 2,838,555
RAC: 2,211
United States
Message 1306697 - Posted: 16 Nov 2012, 7:03:35 UTC - in response to Message 1306603.
Last modified: 16 Nov 2012, 7:10:40 UTC

The card itself is $2700 and peaks at 225 watts. It's only 2-2.5x faster than the top end Xeon CPU in highly parallelized tasks.


I believe it is over 1TeraFlop double precision. That's more than 2.5X a Xeon.


The 2.5x is from Intel's own page on the Phi, comparing two E5-2687 (3.1GHz, 8 cores) to one Phi card (1GHz, 60 cores). (footnote 5).

They did see some Monte Carlo simulations run at 10.75x faster comparing two E5-2670 (2.6GHz, 8 cores) to a Phi SE10P (1.1GHz, 61 cores). (footnote 9)

The 1 TFlop DP is theoretical peak performance.
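
As a rough sanity check on that theoretical peak, here is a minimal sketch that rebuilds the figure from the core counts, clocks and 512 bit SIMD width quoted in this thread. The one assumption not stated anywhere above is that each vector lane retires a fused multiply-add per cycle, counted as two floating point operations; the function name `peak_gflops` is just for illustration.

```python
def peak_gflops(cores, clock_ghz, simd_bits, element_bits, flops_per_lane=2):
    """Theoretical peak in GFLOPS: cores x GHz x SIMD lanes x flops per lane per cycle."""
    lanes = simd_bits // element_bits  # e.g. 512-bit vector / 64-bit double = 8 lanes
    return cores * clock_ghz * lanes * flops_per_lane


# Core counts and clocks are the ones quoted in the thread.
# flops_per_lane=2 is an ASSUMPTION (one fused multiply-add = two operations).
phi_60_core = peak_gflops(cores=60, clock_ghz=1.0, simd_bits=512, element_bits=64)
phi_se10p = peak_gflops(cores=61, clock_ghz=1.1, simd_bits=512, element_bits=64)

print(f"60 cores @ 1.0 GHz: {phi_60_core:.1f} GFLOPS DP")
print(f"61 cores @ 1.1 GHz: {phi_se10p:.1f} GFLOPS DP")
```

Under that assumption the two configurations land at roughly 0.96 and 1.07 TFLOPS DP, which is consistent with "about 1 TFlop" being a peak number rather than something measured on real code.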
____________
"Life is just nature's way of keeping meat fresh." - The Doctor

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3422
Credit: 46,795,218
RAC: 20,980
Russia
Message 1306722 - Posted: 16 Nov 2012, 10:04:05 UTC

If it will be programmable via OpenCL we can make use of it even w/o BOINC support as it was done already with early CUDA and early ATi Brook+/OpenCL support.
SETI@home (at least in its optimized part) usually goes ahead of BOINC "official" support.
What we really need is hardware availability for test and free development tools for it.

____________

Nick
Joined: 17 May 99
Posts: 88
Credit: 9,045,714
RAC: 434
United States
Message 1306761 - Posted: 16 Nov 2012, 14:16:09 UTC - in response to Message 1306697.
Last modified: 16 Nov 2012, 14:18:15 UTC

The 2.5x is from Intel's own page on the Phi, comparing two E5-2687 (3.1GHz, 8 cores) to one Phi card (1GHz, 60 cores). (footnote 5)


Yes, you're right. That comparison is watt for watt. It would be 5X faster than a single 8 core Xeon but the power comparison seems more appropriate.
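
The normalization above can be sketched in a few lines. The 2.5x and 10.75x figures and the 225 watt card power come from this thread; the 150 watt per-CPU figure is an assumption added for illustration, and the whole thing assumes the two-socket baseline scales perfectly, which real workloads rarely do.

```python
def speedup_vs_single_cpu(speedup_vs_baseline, baseline_cpus):
    """Convert a speedup measured against N CPUs into one against a single CPU,
    assuming the N-CPU baseline scaled perfectly."""
    return speedup_vs_baseline * baseline_cpus


def perf_per_watt_ratio(speedup_vs_baseline, baseline_watts, card_watts):
    """How much more work per watt the card does than the baseline."""
    return speedup_vs_baseline * baseline_watts / card_watts


# Figures quoted in the thread.
vs_one_xeon = speedup_vs_single_cpu(2.5, baseline_cpus=2)
monte_carlo_vs_one = speedup_vs_single_cpu(10.75, baseline_cpus=2)

# ASSUMPTION: 150 W per Xeon is not stated in the thread; 225 W for the card is.
XEON_WATTS_ASSUMED = 150
per_watt = perf_per_watt_ratio(2.5, baseline_watts=2 * XEON_WATTS_ASSUMED, card_watts=225)

print(f"vs one Xeon: {vs_one_xeon:.1f}x")
print(f"Monte Carlo vs one Xeon: {monte_carlo_vs_one:.1f}x")
print(f"work per watt vs the two-Xeon baseline: {per_watt:.2f}x")
```

Swap in measured wall power if you have it; the per-watt ratio moves a lot with that one number.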
____________

HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 4241
Credit: 116,000,436
RAC: 143,449
United States
Message 1306833 - Posted: 16 Nov 2012, 17:49:46 UTC - in response to Message 1306722.

If it will be programmable via OpenCL we can make use of it even w/o BOINC support as it was done already with early CUDA and early ATi Brook+/OpenCL support.
SETI@home (at least in its optimized part) usually goes ahead of BOINC "official" support.
What we really need is hardware availability for test and free development tools for it.

I think standard x86 CPUs support OpenCL, so I would guess x86-based coprocessor cards would also.
____________
SETI@home classic workunits: 93,865 CPU time: 863,447 hours

Join the BP6/VP6 User Group today!

Mark Lybeck
Joined: 9 Aug 99
Posts: 209
Credit: 101,439,499
RAC: 97,144
Finland
Message 1306881 - Posted: 16 Nov 2012, 19:44:27 UTC - in response to Message 1306697.

The card itself is $2700 and peaks at 225 watts. It's only 2-2.5x faster than the top end Xeon CPU in highly parallelized tasks.


I believe it is over 1TeraFlop double precision. That's more than 2.5X a Xeon.


The 2.5x is from Intel's own page on the Phi, comparing two E5-2687 (3.1GHz, 8 cores) to one Phi card (1GHz, 60 cores). (footnote 5).

They did see some Monte Carlo simulations run at 10.75x faster comparing two E5-2670 (2.6GHz, 8 cores) to a Phi SE10P (1.1GHz, 61 cores). (footnote 9)

The 1 TFlop DP is theoretical peak performance.


Hello,

You get almost One teraflop with one GTX 500 series (5xx) card at a fraction of the price. What is the business case here?
____________

Josef W. Segur
Project donor
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4252
Credit: 1,050,569
RAC: 250
United States
Message 1306891 - Posted: 16 Nov 2012, 20:20:16 UTC - in response to Message 1306881.

...
The 1 TFlop DP is theoretical peak performance.

Hello,

You get almost One teraflop with one GTX 500 series (5xx) card at a fraction of the price. What is the business case here?

Double precision. For full-up supercomputers, that's an important consideration and the PHI 1 TFLOPS is a lot better than the direct competitor NVIDIA Tesla K10 at about 0.19 TFLOPS DP.

For SETI@home which uses mostly single precision, and users who are willing to go with consumer grade GPUs, top GPUs from either AMD or NVIDIA are likely much better than PHI on a cost/performance basis. But it would be good if a system with PHI bought for other purposes could also be used for SETI crunching. As Raistmer said, it depends on Intel providing the OpenCL drivers and low/no cost development software.

Hal is right that the high level OpenCL can be translated to x86, both AMD and Intel have drivers which do that for some range of their CPUs. But doing OpenCL on a Sandy Bridge chip is somewhat different than the same on 64 cores each with 512 bit SIMD but lacking some other refinements.
Joe
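
To make the OpenCL point a little more concrete: an OpenCL runtime reports each device with a type bitfield, and an application picks devices by testing those bits. The sketch below uses the bit values from the Khronos `cl.h` header but does not call any OpenCL library. The device list is entirely hypothetical, and which type a Phi driver would actually report is an assumption here, not something established in this thread.

```python
# Bit values of cl_device_type as defined in the Khronos OpenCL header (cl.h).
CL_DEVICE_TYPE_CPU = 1 << 1
CL_DEVICE_TYPE_GPU = 1 << 2
CL_DEVICE_TYPE_ACCELERATOR = 1 << 3

TYPE_NAMES = {
    CL_DEVICE_TYPE_CPU: "CPU",
    CL_DEVICE_TYPE_GPU: "GPU",
    CL_DEVICE_TYPE_ACCELERATOR: "ACCELERATOR",
}


def device_type_names(type_bits):
    """Decode a cl_device_type bitfield into readable names."""
    return [name for bit, name in TYPE_NAMES.items() if type_bits & bit]


def select_devices(devices, wanted_bits):
    """Return the names of devices whose type matches any of the wanted bits."""
    return [name for name, bits in devices if bits & wanted_bits]


# HYPOTHETICAL device list; a real one would come from the OpenCL runtime.
devices = [
    ("host Xeon (hypothetical)", CL_DEVICE_TYPE_CPU),
    ("consumer GPU (hypothetical)", CL_DEVICE_TYPE_GPU),
    ("coprocessor card (hypothetical)", CL_DEVICE_TYPE_ACCELERATOR),
]

for name, bits in devices:
    print(name, "->", device_type_names(bits))

non_gpu = select_devices(devices, CL_DEVICE_TYPE_CPU | CL_DEVICE_TYPE_ACCELERATOR)
print("non-GPU devices:", non_gpu)
```

An application that only asks for GPU devices would skip a card reporting some other type, so device selection is one of the first things that would need checking on new hardware.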

HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 4241
Credit: 116,000,436
RAC: 143,449
United States
Message 1306894 - Posted: 16 Nov 2012, 20:27:50 UTC - in response to Message 1306881.

The card itself is $2700 and peaks at 225 watts. It's only 2-2.5x faster than the top end Xeon CPU in highly parallelized tasks.


I believe it is over 1TeraFlop double precision. That's more than 2.5X a Xeon.


The 2.5x is from Intel's own page on the Phi, comparing two E5-2687 (3.1GHz, 8 cores) to one Phi card (1GHz, 60 cores). (footnote 5).

They did see some Monte Carlo simulations run at 10.75x faster comparing two E5-2670 (2.6GHz, 8 cores) to a Phi SE10P (1.1GHz, 61 cores). (footnote 9)

The 1 TFlop DP is theoretical peak performance.


Hello,

You get almost One teraflop with one GTX 500 series (5xx) card at a fraction of the price. What is the business case here?

The number you are looking at on those cards is probably the SP (single precision) rating. Intel is stating TF for DP (double precision), which is generally 1/4 of the SP rating or less. For instance, a Radeon HD 6990 is rated 5099 GF running SP and 1276.88 GF running DP.

SETI@home applications currently only use SP, so the SP performance is what you would want to compare. That isn't really helpful across manufacturers yet.
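
A quick way to check the SP versus DP relationship for any card is to divide the two ratings. This small sketch uses only the Radeon HD 6990 figures quoted above; the helper name `dp_to_sp_ratio` is made up for the example.

```python
def dp_to_sp_ratio(sp_gflops, dp_gflops):
    """Fraction of the single precision rating that is available in double precision."""
    return dp_gflops / sp_gflops


# Ratings quoted in this post, in GFLOPS.
hd6990_sp = 5099.0
hd6990_dp = 1276.88

ratio = dp_to_sp_ratio(hd6990_sp, hd6990_dp)
print(f"HD 6990 DP/SP ratio: {ratio:.3f}")
```

For this card the ratio comes out at about a quarter. Comparing a DP headline number against an SP headline number without this correction will make the DP part look far worse than it is.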
____________
SETI@home classic workunits: 93,865 CPU time: 863,447 hours

Join the BP6/VP6 User Group today!

Keith White
Joined: 29 May 99
Posts: 370
Credit: 2,838,555
RAC: 2,211
United States
Message 1306967 - Posted: 16 Nov 2012, 23:49:29 UTC - in response to Message 1306891.

...
The 1 TFlop DP is theoretical peak performance.

Hello,

You get almost One teraflop with one GTX 500 series (5xx) card at a fraction of the price. What is the business case here?

Double precision. For full-up supercomputers, that's an important consideration and the PHI 1 TFLOPS is a lot better than the direct competitor NVIDIA Tesla K10 at about 0.19 TFLOPS DP.

For SETI@home which uses mostly single precision, and users who are willing to go with consumer grade GPUs, top GPUs from either AMD or NVIDIA are likely much better than PHI on a cost/performance basis. But it would be good if a system with PHI bought for other purposes could also be used for SETI crunching. As Raistmer said, it depends on Intel providing the OpenCL drivers and low/no cost development software.

Hal is right that the high level OpenCL can be translated to x86, both AMD and Intel have drivers which do that for some range of their CPUs. But doing OpenCL on a Sandy Bridge chip is somewhat different than the same on 64 cores each with 512 bit SIMD but lacking some other refinements.
Joe

The Tesla K20 is 1.17 TFlop peak DP. Also a 225 watt card. The K20X is 1.31 TFlop peak DP at 235 watts. The K20X is used in the new #1 on the Top500 supercomputer list.

The AMD S10000 is 1.48 TFlop peak DP at 375 watts.

Everyone has a dog in the HPC hunt now.
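
For a side-by-side view, the peak DP and wattage numbers quoted in this thread can be turned into GFLOPS per watt. This is only a sketch on paper specs: the Phi entry rounds "over 1 TFlop" down to 1000 GFLOPS, which is an assumption, and peak ratings say nothing about sustained performance.

```python
# (peak DP GFLOPS, watts) as quoted in this thread.
# ASSUMPTION: the Phi's "over 1 TFlop" is rounded down to 1000 GFLOPS.
cards = {
    "Xeon Phi": (1000.0, 225),
    "Tesla K20": (1170.0, 225),
    "Tesla K20X": (1310.0, 235),
    "AMD S10000": (1480.0, 375),
}

# Peak double precision GFLOPS per watt for each card.
efficiency = {name: gflops / watts for name, (gflops, watts) in cards.items()}

# Most efficient first.
ranked = sorted(efficiency.items(), key=lambda item: item[1], reverse=True)

for name, gflops_per_watt in ranked:
    print(f"{name:12s} {gflops_per_watt:5.2f} GFLOPS/W")
```

On these paper figures the K20X comes out ahead and the S10000 last, with the Phi in between.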
____________
"Life is just nature's way of keeping meat fresh." - The Doctor

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5831
Credit: 59,447,461
RAC: 47,796
Australia
Message 1306980 - Posted: 17 Nov 2012, 0:12:58 UTC - in response to Message 1306722.

If it will be programmable via OpenCL we can make use of it even w/o BOINC support as it was done already with early CUDA and early ATi Brook+/OpenCL support.

Keep in mind these are fully compatible x86 CPU cores.
If the software runs on a Xeon system, it will run on a Phi card without any changes. Of course if you do work the code to take advantage of Phi then you'll get even greater improvements in performance than just by running it as is.
____________
Grant
Darwin NT.


Copyright © 2014 University of California