Best GPU performance

HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 925459 - Posted: 12 Aug 2009, 1:19:17 UTC - in response to Message 925391.  

[quote]
Maybe Gigabyte GA-8N-SLI Quad Royal Motherboard? That review is from over 3 years ago, though, so the board might be hard to find. Other boards using the same nVidia chipset might be available, too.
Joe

My god, it's true! They exist...

With the older motherboards only running the PCIe bus in x8 mode for SLI, does that impact CUDA performance in any way?

x8 should be perfectly fast enough for CUDA. SLI doesn't matter one way or the other with the latest drivers.

F.
[/quote]

If the PCIe bus doesn't factor in that much, then I would go with a 20-slot passive backplane system, like this one, with this dual Xeon single-board computer. 18 GTX 295s should be enough, right? :)
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 925459
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 925476 - Posted: 12 Aug 2009, 3:11:41 UTC - in response to Message 925459.  

[quote]
If the PCIe bus doesn't factor in that much, then I would go with a 20-slot passive backplane system, like this one, with this dual Xeon single-board computer. 18 GTX 295s should be enough, right? :)
[/quote]

There isn't a huge amount of data transfer going on. That's why the multiple PCI-e channels aren't an absolute need, though you'd lose some small percentage.

However, the latencies in the switched arrangement of that backplane system could detract a lot. While crunching a midrange ~0.42 angle range WU the CPU has to tell the GPU what to do next about 3 billion times. If there were a 1 microsecond latency on average, that would add 3000 seconds to the crunch time. Average latencies of 40 nanoseconds would still amount to 2 minutes added to the crunch time.
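
As a rough check of that arithmetic, here is a minimal C++ sketch. The helper name `added_seconds` and the constant `kCommands` are made up for illustration and do not come from BOINC or the SETI@home application; only the 3 billion command count and the two latency figures are taken from the post. Integer nanoseconds are used so that the results are exact.

```cpp
#include <cstdint>

// Hypothetical helper (illustration only): extra wall-clock time, in whole
// seconds, added by a fixed average latency on every CPU-to-GPU command.
constexpr std::int64_t added_seconds(std::int64_t commands,
                                     std::int64_t latency_ns) {
    // Total delay in nanoseconds, converted to seconds.
    return commands * latency_ns / 1000000000LL;
}

// Roughly 3 billion commands for a midrange (~0.42 angle range) work unit,
// as estimated in the post.
constexpr std::int64_t kCommands = 3000000000LL;

// 1 microsecond average latency: 3000 extra seconds.
static_assert(added_seconds(kCommands, 1000) == 3000, "1 us case");
// 40 nanosecond average latency: 120 extra seconds, i.e. 2 minutes.
static_assert(added_seconds(kCommands, 40) == 120, "40 ns case");
```

The two compile-time checks reproduce the 3000 second and 2 minute figures quoted above.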

One other detail would require some cooperation from the BOINC developers. The Scheduler won't believe any host has more than 8 CUDA GPUs, quoting from the sched_send.cpp source code:
const int MAX_CUDA_DEVS = 8;
// don't believe clients who claim they have more CUDA devices than this
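
To illustrate what such a cap would mean for the 18-card idea, here is a minimal C++ sketch. Only the constant `MAX_CUDA_DEVS` and its value come from the quoted source. The function `believed_cuda_devs` and the clamping behaviour are assumptions made for illustration; the real scheduler may handle an over-limit claim differently.

```cpp
// Quoted from sched_send.cpp in the post above.
const int MAX_CUDA_DEVS = 8;

// Hypothetical illustration (not the actual BOINC code): the number of CUDA
// devices the scheduler would accept, assuming it simply clamps the claim.
constexpr int believed_cuda_devs(int claimed) {
    return claimed < 0 ? 0
         : claimed > MAX_CUDA_DEVS ? MAX_CUDA_DEVS
         : claimed;
}

// 18 GTX 295 cards carry two GPUs each, so the host would claim 36 devices,
// of which only 8 would be believed under this assumption.
static_assert(believed_cuda_devs(18 * 2) == 8, "18 dual-GPU cards");
// Four dual-GPU cards already reach the limit.
static_assert(believed_cuda_devs(4 * 2) == 8, "4 dual-GPU cards");
```

Under that assumption, anything beyond four dual-GPU cards would go uncounted by the scheduler.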

                                                              Joe
ID: 925476
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 925506 - Posted: 12 Aug 2009, 5:11:17 UTC - in response to Message 925476.  


[quote]
One other detail would require some cooperation from the BOINC developers. The Scheduler won't believe any host has more than 8 CUDA GPUs, quoting from the sched_send.cpp source code:
const int MAX_CUDA_DEVS = 8;
// don't believe clients who claim they have more CUDA devices than this

Joe
[/quote]

Ah, no point in using more than 4 dual-GPU cards then :/ I wonder if this is an issue they will revisit with the new boards with 6 PCIe x16 slots, or when they start cramming more GPUs onto the cards.
ID: 925506
-= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 925526 - Posted: 12 Aug 2009, 9:29:45 UTC - in response to Message 925506.  
Last modified: 12 Aug 2009, 10:00:31 UTC

[quote]
Ah, no point in using more than 4 cards then :/ I wonder if this is an issue they will revisit with the new boards with 6 PCIe x16 slots, or when they start cramming more GPUs onto the cards.
[/quote]

There is no physical room for more than perhaps six single-slot GPUs, so the total number of CUDA boards would be less than eight anyhow.

//Vyper

_________________________________________________________________________
Addicted to SETI crunching!
Founder of GPU Users Group
ID: 925526
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14653
Credit: 200,643,578
RAC: 874
United Kingdom
Message 925527 - Posted: 12 Aug 2009, 9:48:43 UTC - in response to Message 925526.  

[quote]
There is no physical room for more than perhaps six single-slot GPUs, so the total number of CUDA boards would be less than eight anyhow.

//Vyper
[/quote]

But the question was about a 20-slot expansion backplane.
ID: 925527
-= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 925529 - Posted: 12 Aug 2009, 10:03:01 UTC - in response to Message 925527.  

[quote]
But the question was about a 20-slot expansion backplane.
[/quote]

Hm, in my mind he had already answered that question himself, concluding that it's no use, and had moved on to discussing the 6-slot PCIe boards, which you can't fit 6 dualies on.

If my assumptions about the question were wrong, then I'm sorry, my bad.

Kind regards Vyper

ID: 925529
ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20323
Credit: 7,508,002
RAC: 20
United Kingdom
Message 925530 - Posted: 12 Aug 2009, 10:38:29 UTC - in response to Message 925527.  

[quote]
But the question was about a 20-slot expansion backplane.
[/quote]

Interesting idea, but have you seen the price of that thing?!

Cheaper to buy multiple motherboards and CPUs! (Unless you really are going to populate all 20 slots!!!)


Happy crunchin',
Martin

See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 925530
Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 925539 - Posted: 12 Aug 2009, 11:52:06 UTC
Last modified: 12 Aug 2009, 11:57:20 UTC


17 x GTX295..!


[http://www.hardware-infos.com/news.php?news=2865]-[german]

Google translation:
23 GeForce GTX 295s for mankind

A video has recently been uploaded to YouTube showing a total of 17 GeForce GTX 295 graphics cards, and thus 34 GT200 chips with 240 1D shader units each, running simultaneously.
The author of the video originally planned to run 23 GeForce GTX 295 cards, but failed because of the power dissipation.

The campaign runs under the motto "A GPU Folding Farm" and is dedicated to the fight against Huntington's disease, an incurable disease of the human nervous system that is inherited in an autosomal dominant manner.
A single GeForce GTX 295 already offers a theoretical computing power of 1,788 gigaflops. The 17 GeForce GTX 295 cards installed in the rack therefore provide 30,396 gigaflops, or in other words, just over 30 teraflops.

According to the author, two of the six remaining GeForce GTX 295 cards are used in his private PC in SLI mode (Quad SLI).
The end of the video also shows that 14 of the GeForce GTX 295 cards were made by EVGA and 9 by MSI. A GeForce GTX 260 and a GeForce 9800 GT round out his personal Nvidia lineup.
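
The chip count and aggregate throughput quoted in the translation can be checked with a short C++ sketch. The constant and function names are made up for illustration; the figures (17 cards, two GT200 chips per card, 1,788 gigaflops per card) are the ones given in the article, and the result is a theoretical peak, not a measured rate.

```cpp
// Figures quoted in the translated article.
constexpr int kCards = 17;            // GTX 295 cards running in the rack
constexpr int kChipsPerCard = 2;      // each GTX 295 carries two GT200 chips
constexpr int kGflopsPerCard = 1788;  // theoretical peak per card

// Hypothetical helpers (illustration only).
constexpr int total_chips(int cards) { return cards * kChipsPerCard; }
constexpr int total_gflops(int cards) { return cards * kGflopsPerCard; }

// 17 cards: 34 chips and 30,396 gigaflops, just over 30 teraflops.
static_assert(total_chips(kCards) == 34, "chip count");
static_assert(total_gflops(kCards) == 30396, "aggregate gigaflops");
```

By the same arithmetic, the originally planned 23 cards would have come to 41,124 theoretical gigaflops.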


Atlas Folder - 23 nVidia GTX295 GPU Folding Farm

[http://www.youtube.com/watch?v=KjOW5iW7dJQ]

ID: 925539
Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 925541 - Posted: 12 Aug 2009, 12:03:38 UTC


Atlas Folder Phase 2 - 31 9800GX2s

[http://www.youtube.com/watch?v=mAcW3Y_IJJA]

ID: 925541
Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 925543 - Posted: 12 Aug 2009, 12:08:27 UTC


Atlas Folder - Phase 2 Build

[http://www.youtube.com/watch?v=UI9QzIwAXFg]

ID: 925543
Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 925544 - Posted: 12 Aug 2009, 12:17:02 UTC

ID: 925544
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 925621 - Posted: 12 Aug 2009, 18:10:15 UTC - in response to Message 925529.  
Last modified: 12 Aug 2009, 18:25:42 UTC

[quote]
Hm, in my mind he had already answered that question himself, concluding that it's no use, and had moved on to discussing the 6-slot PCIe boards, which you can't fit 6 dualies on.

If my assumptions about the question were wrong, then I'm sorry, my bad.

Kind regards Vyper
[/quote]

I was thinking of an Asus P6T6 WS Revolution motherboard with 6 BFG NVIDIA GeForce GTX 295 H2OC cards, as they only require one slot width. So a "standard" motherboard could in theory have 12 CUDA tasks running, if it were not otherwise limited and you were willing to spend $8000 on the beast of a machine. lol. Newegg has some nice photos of the card.
ID: 925621
-= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 925704 - Posted: 12 Aug 2009, 23:20:43 UTC - in response to Message 925621.  

[quote]
I was thinking of an Asus P6T6 WS Revolution motherboard with 6 BFG NVIDIA GeForce GTX 295 H2OC cards, as they only require one slot width. So a "standard" motherboard could in theory have 12 CUDA tasks running, if it were not otherwise limited and you were willing to spend $8000 on the beast of a machine. lol. Newegg has some nice photos of the card.
[/quote]

Ah, I thought of that one too, but it's a dual-slot card, as BFG's own homepage states:

"One vacant add-in card slot below the PCI Express® x16 slot. This graphics card physically occupies two slots"

From http://www.bfgtech.com/bfgegtx2951792h2ocwbe.aspx

Otherwise it could have been plausible, but not until a company actually creates a true single-slot GTX 295.

But I don't think it's far-fetched that we'll see a high-performance single-slot card soon, because nVidia is ramping up smaller process nodes (lower nm) for its parts.

Kind regards Vyper

ID: 925704
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 925751 - Posted: 13 Aug 2009, 2:40:10 UTC - in response to Message 925529.  

[quote]
Hm, in my mind he had already answered that question himself, concluding that it's no use, and had moved on to discussing the 6-slot PCIe boards, which you can't fit 6 dualies on.
[/quote]

The board in question has a bunch of PCIe slots, and a processor-board slot at the end. The processor board has the chipset, etc. -- just no slots.

These go in a case designed for the board, with room for 20 cards.

So, my only real question is: is it 20 counting the processor card, or 20 plus the processor card?

... and yes, Martin, they're incredibly expensive because we're all out buying motherboards and not investing in these gawdawful expensive things.

These types of boards used to come in "split" 10/10 versions, and even split 5/5/5/5, so you could put four complete systems in one 4U case.

ID: 925751
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 925762 - Posted: 13 Aug 2009, 4:09:54 UTC - in response to Message 925751.  

[quote]
The board in question has a bunch of PCIe slots, and a processor-board slot at the end. The processor board has the chipset, etc. -- just no slots.

These go in a case designed for the board, with room for 20 cards.

So, my only real question is: is it 20 counting the processor card, or 20 plus the processor card?

... and yes, Martin, they're incredibly expensive because we're all out buying motherboards and not investing in these gawdawful expensive things.

These types of boards used to come in "split" 10/10 versions, and even split 5/5/5/5, so you could put four complete systems in one 4U case.
[/quote]


I have only seen SBCs that use "2 slots", leaving you with at most 18 free slots. The 20-slot boards are also the whole width of a 19" chassis, so the PSUs get mounted either in front of or under the board, making it taller and heavier :/

I haven't found anyone who makes a 3- or 4-SBC backplane, which would be ideal for building CUDA crunchers. The best I have come across is a backplane with 6 segments, each with 1 PCIe x16 connector and 3 PCIe x8 connectors, so you could have 6 CUDA machines in 1 box. There might be a 14-slot split backplane out there with enough slots on it for 2 quad-video-card machines, but I haven't found it yet.
ID: 925762
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 925766 - Posted: 13 Aug 2009, 5:20:42 UTC - in response to Message 925762.  

[quote]
I haven't found anyone who makes a 3- or 4-SBC backplane, which would be ideal for building CUDA crunchers. The best I have come across is a backplane with 6 segments, each with 1 PCIe x16 connector and 3 PCIe x8 connectors, so you could have 6 CUDA machines in 1 box. There might be a 14-slot split backplane out there with enough slots on it for 2 quad-video-card machines, but I haven't found it yet.
[/quote]

I haven't looked. They're really expensive.

ID: 925766
ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20323
Credit: 7,508,002
RAC: 20
United Kingdom
Message 925803 - Posted: 13 Aug 2009, 13:39:16 UTC - in response to Message 925751.  

[quote]
The board in question has a bunch of PCIe slots, and a processor-board slot at the end. The processor board has the chipset, etc. -- just no slots.

These go in a case designed for the board, with room for 20 cards.

... These types of boards used to come in "split" 10/10 versions, and even split 5/5/5/5, so you could put four complete systems in one 4U case.
[/quote]

Has anyone seen prices anywhere, so we can price up a 4-CPU-card behemoth and see how it compares against a PC motherboard solution?

Happy crunchin',
Martin

ID: 925803
john deneer
Volunteer tester
Joined: 16 Nov 06
Posts: 331
Credit: 20,996,606
RAC: 0
Netherlands
Message 925808 - Posted: 13 Aug 2009, 14:15:40 UTC - in response to Message 925803.  

[quote]
Has anyone seen prices anywhere, so we can price up a 4-CPU-card behemoth and see how it compares against a PC motherboard solution?

Happy crunchin',
Martin
[/quote]

Hi Martin,

You mean something like this? I have no idea about the price; you have to fill in a form to ask for a quote. There is probably a reason for that: not scaring potential customers off and all that...

They say it is 'at a very affordable price', though :-)

Regards,
John.
ID: 925808
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65758
Credit: 55,293,173
RAC: 49
United States
Message 925829 - Posted: 13 Aug 2009, 16:51:14 UTC - in response to Message 925808.  

[quote]
You mean something like this? I have no idea about the price; you have to fill in a form to ask for a quote. There is probably a reason for that: not scaring potential customers off and all that...

They say it is 'at a very affordable price', though :-)

Regards,
John.
[/quote]

That's two slots, and of course in that type of case it's just one usable slot, as a GTX 295 or newer would cover both. I'd rather go for an ASRock X58 Deluxe, which as of BIOS 1.2 (it's at BIOS revision 2.0 right now) supports four GTX 295 cards. (Their site says GTX296; the support website is in Taiwan, so that's close enough.) I also saw that Fred W mentioned BOINC as not supporting more than 8 GPUs per install (4 cards), and good luck getting the devs to change that; he said he tried already.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 925829
Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 925836 - Posted: 13 Aug 2009, 17:35:10 UTC - in response to Message 925829.  

[quote]
...and I saw that Fred W mentioned BOINC as not supporting more than 8 GPUs per install (4 cards), and good luck getting the devs to change that; he said he tried already.
[/quote]

Not guilty, M'Lud. I saw that post too but it wasn't mine. I have (well, don't have ATM as it's on RMA with XFX) only one GTX295.

F.
ID: 925836
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.