if you think cuda is fast

tfp
Message 848044 - Posted: 2 Jan 2009, 5:23:33 UTC - in response to Message 845957.  
Last modified: 2 Jan 2009, 5:41:44 UTC

Wait till we are able to get ATi cards to work on this project.
The ATi Radeon HD 4870 X2 has 1600 shader processors,
the 4870 has 800,
even my 3870 has 320,
while the nVidia GeForce GTX 280 only has 240.

I believe it's the shaders that do the calculations.

When we can use ATi cards ... my guess is workunits will be done in less than 5 minutes ...


BTW, it might be 800 stream processors (ALUs) or 160 stream processor units (complex shaders?) per core, so 320 stream processor units total in the dual setup. That's only 40 more than Nvidia has in complex shaders on a single core. With the rumored 295 carrying two 280 cores, Nvidia might still have a large lead in "complex processing units".

So unless you know how each of the stream processors (the 800) or the stream processor units (the 160) compares with one of Nvidia's "shaders", you can't just take the numbers and compare them apples to apples. As I understand it, the data being processed on a 4870 can be broken into 160 parts, one for each stream processor unit, and then the 5 ALUs @ 750 MHz in each unit attack the work they're given. With Nvidia, the 280's shaders @ ~1500 MHz (of unknown internal makeup) are handed work and do their thing. The question is which setup is more effective. Is that everyone else's understanding?
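Here's a quick back-of-the-envelope sketch (Python) of the peak throughput those counts imply. The flops-per-clock and clock-speed figures are my own assumptions (MAD = 2 flops per ALU per clock, ~750 MHz for the 4870, ~1.3 GHz shader clock for the GTX 280 rather than the 1500 quoted above), so treat it as a rough sanity check rather than a benchmark:

    # Hedged peak-throughput sketch from the shader counts discussed above.
    def peak_gflops(alus, flops_per_alu_per_clock, clock_ghz):
        """Theoretical peak only; ignores scheduling, memory, and driver limits."""
        return alus * flops_per_alu_per_clock * clock_ghz

    # Radeon HD 4870: 160 stream processor units x 5 ALUs = 800 ALUs at ~0.75 GHz
    print("HD 4870 peak     :", peak_gflops(160 * 5, 2, 0.75), "GFLOPS")   # ~1200

    # GeForce GTX 280: 240 scalar SPs at ~1.296 GHz, counting the MAD only...
    print("GTX 280 (MAD)    :", peak_gflops(240, 2, 1.296), "GFLOPS")      # ~622

    # ...and with the co-issued MUL counted as well (3 flops per SP per clock)
    print("GTX 280 (MAD+MUL):", peak_gflops(240, 3, 1.296), "GFLOPS")      # ~933

Those peaks line up with the GFLOPS table quoted later in the thread, but they say nothing about how much of each chip real code can keep busy.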

So far, on things like F@H, the one place that seems to have both an ATI and an Nvidia crunching app, Nvidia looks much faster.

TechReport:
ATI http://www.techreport.com/articles.x/14990
The RV770 has 10 SIMD cores, as you can see, and each of them contains 16 stream processor units. You may not be able to see it above, but each of those SP units is a superscalar processing block comprised of five ALUs. Add it all up, and the RV770 has a grand total of 800 ALUs onboard, which AMD advertises as 800 "stream processors."

Nvidia http://www.techreport.com/articles.x/14934
Arranged in this way, the Chiclets have much to tell us. The 10 large groups across the upper portion of the diagram are what Nvidia calls thread processing clusters, or TPCs. TPCs are familiar from G80, which has eight of them onboard. The little green boxes inside of the TPCs are the chip's basic processing cores, known in Nvidia's parlance as stream processors or SPs. The SPs are arranged in groups of eight, as you can see, and these groups have earned their own name and acronym, for the trifecta: they're called SMs, or streaming multiprocessors.

Now, let's combine the power of all three terms. 10 TPCs multiplied by three SMs times eight SPs works out to a total of 240 processing cores on the GT200. That's an awful lot of green Chiclets and nearly twice the G80's 128 SPs, a substantial increase in processing potential—not to mention chewy, minty flavor.
tfp
Message 848045 - Posted: 2 Jan 2009, 5:24:17 UTC - in response to Message 848037.  

Pardon me if I missed your quote and got into the middle of something....

My bad, my apology....carry on then......


I didn't quote it; it was the post above mine and I didn't think about it. No worries.
OzzFan
Message 848066 - Posted: 2 Jan 2009, 6:12:29 UTC - in response to Message 848010.  

Sure, a number of reasons other than an actual example where ATI is faster.


Because there isn't one. I already stated that the F@H client isn't optimized for ATi, whereas the nVidia one is, so how is that really a fair comparison? And how can I come up with a S@H example when there isn't a S@H app?

You're asking for something that you know I cannot provide, but that doesn't indicate or suggest definitive proof that nVidia will be faster.

So other than a bunch of talk about something that might be, and assumptions that things aren't being done right elsewhere when it comes to the ATI GPU client, there isn't much substance here.


You're right, which is why this is a "gee, I'd like to see this" kind of thread. The entire thread is based on theory, not facts, at this point. So no, there's no real substance.

I thought that much was obvious.
tfp
Message 848067 - Posted: 2 Jan 2009, 6:12:35 UTC
Last modified: 2 Jan 2009, 6:14:46 UTC

More info on ATI and Nvidia GPU cores:


ATI: (These cores are the same on the newer 4870, just MANY more)
http://www.techreport.com/articles.x/12458/2
In its shader core, the R600's most basic unit is a stream processing block like the one depicted in the diagram on the right. This unit has five arithmetic logic units (ALUs), arranged together in superscalar fashion—that is, each of the ALUs can execute a different instruction, but the instructions must all be issued together at once. You'll notice that one of the five ALUs is "fat." That's because this ALU's capabilities are a superset of the others'; it can be called on to handle transcendental instructions (like sine and cosine), as well. All four of the others have the same capabilities. Optimally, each of the five ALUs can execute a single multiply-add (MAD) instruction per clock on 32-bit floating-point data. (Like G80, the R600 essentially meets IEEE 754 standards for precision.) The stream processor block also includes a dedicated unit for branch execution, so the stream processors themselves don't have to worry about flow control.

Nvidia:
http://www.techreport.com/articles.x/11211
The G80 has eight groups of 16 SPs, for a total of 128 stream processors. These aren't vertex or pixel shaders, but generalized floating-point processors capable of operating on vertices, pixels, or any manner of data. Most GPUs operate on pixel data in vector fashion, issuing instructions to operate concurrently on the multiple color components of a pixel (such as red, green, blue and alpha), but the G80's stream processors are scalar—each SP handles one component. SPs can also be retasked to handle vertex data (or other things) dynamically, according to demand.

Comparison:
http://www.techreport.com/articles.x/12458/2
So how does R600's shader power compare to G80? Both AMD and Nvidia like to throw around peak FLOPS numbers when talking about their chips. Mercifully, they both seem to have agreed to count programmable operations from the shader core, bracketing out fixed-function units for graphics-only operations. Nvidia has cited a peak FLOPS capacity for the GeForce 8800 GTX of 518.4 GFLOPS. The G80 can co-issue one MAD and one MUL instruction per clock to each of its 128 scalar SPs. That's three operations (multiply-add and multiply) per cycle at 1.35GHz, or 518.4 GFLOPS. However, the guys at B3D have shown that that extra MUL is not always available, which makes counting it questionable. If you simply count the MAD, you get a peak of 345.6 GFLOPS for G80.

By comparison, the R600's 320 stream processors running at 742MHz give it a peak capacity of 475 GFLOPS. Mike Houston, the GPGPU guru from Stanford, told us he had achieved an observed compute throughput of 470 GFLOPS on R600 with "just a giant MAD kernel." So R600 seems capable of hitting something very near its peak throughput in the right situation.
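For what it's worth, the peak figures in that quote do check out if you just multiply the pieces out (a quick sketch using only the numbers from the article):

    # G80: 128 scalar SPs at 1.35 GHz
    print(128 * 3 * 1.35)   # MAD + MUL = 3 flops/SP/clock -> 518.4 GFLOPS
    print(128 * 2 * 1.35)   # MAD only  = 2 flops/SP/clock -> 345.6 GFLOPS

    # R600: 320 ALUs (64 SP units x 5 ALUs) at 742 MHz, MAD only
    print(320 * 2 * 0.742)  # -> ~475 GFLOPS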


So if you're running MAD all the time, ATI will do well; however, start putting a lot of complex math in there and you cut ATI from all 5 ALUs per unit down to just the complex one, whereas Nvidia "should" keep the same number of higher-clocked functional units (the base 240) running all the time. It also sounds like the ALUs in the ATI chip cannot execute out of order, so if there is a dependent calculation, some ALUs will need to sit idle.
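As a rough illustration of that point, here's a toy model (my own simplifying assumptions, not vendor data) of how the one-complex-plus-four-simple grouping responds to the instruction mix:

    # Toy throughput model for one RV770-style stream processor unit:
    # five ALU slots per clock, but only the one "fat" ALU can run
    # transcendentals like sin/cos. f = fraction of the instruction
    # stream that is transcendental; perfect packing is assumed.
    def ops_per_clock_per_unit(f):
        if f == 0:
            return 5.0               # pure MAD-style code can fill all 5 slots
        return min(5.0, 1.0 / f)     # the 1-per-clock transcendental rate limits

    for f in (0.0, 0.1, 0.25, 0.5, 1.0):
        total = 160 * ops_per_clock_per_unit(f)   # 160 units on a 4870
        print(f"transcendental fraction {f:.2f}: ~{total:.0f} ops/clock")

Under this (optimistic) model a 4870 falls from 800 ops/clock on pure MAD code to 160 ops/clock on purely transcendental code; I've left the GT200 out of the model because its special-function units are organized differently.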

There are cases where ATI will be ahead, but the 800 number doesn't really mean what it sounds like, and that is probably why Nvidia is ahead on things like F@H.
OzzFan
Message 848073 - Posted: 2 Jan 2009, 6:17:12 UTC - in response to Message 848044.  

So far, on things like F@H, the one place that seems to have both an ATI and an Nvidia crunching app, Nvidia looks much faster.


Right, but you're still forgetting one important factor: nVidia has been active in helping F@H with their GPU client to work best with their GPUs, which means a highly optimized application.

The ATi app, on the other hand, has had little or no help from ATi/AMD, so the GPU code for their chips is not nearly as optimized. I'm even willing to admit that even if the code were optimized for ATi, most of the nVidia chips would probably still be faster, because the nVidia architecture has been better than ATi's - up to the Radeon HD 4xxx series, which is where I think ATi could shine. I think if an app were optimized for the Radeon HD 4xxx series, it could (yes, theoretically) be faster than the current nVidia chips.
tfp
Message 848078 - Posted: 2 Jan 2009, 6:23:54 UTC - in response to Message 848066.  
Last modified: 2 Jan 2009, 6:31:59 UTC

Sure, a number of reasons other than an actual example where ATI is faster.


Because there isn't one. I already stated that the F@H client isn't optimized for ATi, whereas the nVidia one is, so how is that really a fair comparison? And how can I come up with a S@H example when there isn't a S@H app?

You're asking for something that you know I cannot provide, but that doesn't indicate or suggest definitive proof that nVidia will be faster.


I would expect F@H has done some amount of optimization on the ATI app, so the app is fast enough to make the output worthwhile, or they wouldn't have released it at all. {edit: yeah, OK, that's what I thought; ATI/AMD isn't giving the support / doesn't have the manpower, etc.}

I wasn't talking about a SETI app for ATI; I'm talking about any crunching app that is faster. I don't have one example.

I have also posted some links to write-ups of how ATI's and Nvidia's stream processors actually work. I think ATI's output on complex calculations has been exaggerated or misunderstood in general, though for specific uses there is a good amount of power there. So maybe it would be faster, but it would depend on what the app is doing. Anything with very complex math or sequential dependencies and it seems like ATI would slow down. But, as they said, as the compiler improves, so will performance on sequential code.

Peak shader arithmetic (GFLOPS):
GeForce GTX 280 - single-issue: 622, dual-issue: 933
Radeon HD 4870  - single-issue: 1200
Sure, looking at raw numbers there isn't a reason the 4870 shouldn't be faster; however, I expect that in most cases you can't max out the ATI card because of the 4-simple-plus-1-complex ALU groupings.
OzzFan
Message 848083 - Posted: 2 Jan 2009, 6:29:44 UTC - in response to Message 848078.  
Last modified: 2 Jan 2009, 6:30:55 UTC

I would expect F@H has done some amount of optimization on the ATI app so the app is fast enough to make the output worthwhile, or they wouldn't have released it at all. Do you know what actual work has been done by Stanford on the F@H ATI client, or are you guessing / is it known that ATI/AMD doesn't give any support?


Funny thing about expectations. The best way to not be disappointed is to not have any.

Do you really think all software takes that kind of approach? Not release a product if the optimizations aren't level with all hardware? There are quite a few apps that optimize for a given architecture while ignoring others. Many EA games are optimized for nVidia (like The Sims 2 and Vampire: The Masquerade - Bloodlines [not by EA]) and don't have the same level of performance on ATi chips.

No, they have not optimized the ATi code, and it is known that ATi has not given much support/help to F@H.

I wasn't talking about a SETI app for ATI; I'm talking about any crunching app that is faster. I don't have one example.


Other than F@H and BOINC, what else is there?

I have also posted some links to write-ups of how ATI's and Nvidia's stream processors actually work. I think ATI's output on complex calculations has been exaggerated or misunderstood in general, though for specific uses there is a good amount of power there.


...and if an app can be properly coded to get the best use out of the ATi chips, it may very well outperform the nVidia chips based on more shader processors alone, with each one doing basic work - sort of like RISC vs. CISC.
tfp
Message 848090 - Posted: 2 Jan 2009, 6:36:44 UTC - in response to Message 848083.  
Last modified: 2 Jan 2009, 6:41:59 UTC

I would expect F@H has done some amount of optimization on the ATI app so the app is fast enough to make the output worthwhile, or they wouldn't have released it at all. Do you know what actual work has been done by Stanford on the F@H ATI client, or are you guessing / is it known that ATI/AMD doesn't give any support?


Funny thing about expectations. The best way to not be disappointed is to not have any.

Do you really think all software takes that kind of approach? Not release a product if the optimizations aren't level with all hardware? There are quite a few apps that optimize for a given architecture while ignoring others. Many EA games are optimized for nVidia (like The Sims 2 and Vampire: The Masquerade - Bloodlines [not by EA]) and don't have the same level of performance on ATi chips.


Well, I'm sure this isn't the first run of code that worked; they probably tweaked it some. They don't need equal, they need fast enough. Fast enough requires some work; do you really think software doesn't take the fast-enough approach, and that that doesn't require some effort? I expect they made sure the games played well enough on ATi cards too, just not with the same effort. Chances are they did some things to make that happen. Maybe the problem is what we each think "optimization" and the word "none" imply?

I wasn't talking about a SETI app for ATI; I'm talking about any crunching app that is faster. I don't have one example.


Other than F@H and BOINC, what else is there?


I was unaware that BOINC is just SETI, so I stand corrected. Is there no other BOINC project out there with a GPU client that isn't Nvidia-only? Also, I haven't looked, but there seem to be other things out there besides BOINC and SETI in the general sense. http://www.gpgpu.org/

I have also posted some links to write-ups of how ATI's and Nvidia's stream processors actually work. I think ATI's output on complex calculations has been exaggerated or misunderstood in general, though for specific uses there is a good amount of power there.


...and if an app can be properly coded to get the best use out of the ATi chips, it may very well outperform the nVidia chips based on more shader processors alone, with each one doing basic work - sort of like RISC vs. CISC.


Well, if you read the write-ups, there are some instructions that the simple ALUs just can't run, sin/cos for example. It isn't CISC vs. RISC; it's more like a normal CPU vs. a Cell processor for each processing unit.
Chris Oliver
Message 848092 - Posted: 2 Jan 2009, 6:39:43 UTC - in response to Message 848083.  

All you dudes with ATIs should have asked Santa for an Nvidia, because until somebody figures out how to port S@H to ATI, CUDA's the daddy.
OzzFan
Message 848096 - Posted: 2 Jan 2009, 6:47:02 UTC - in response to Message 848090.  
Last modified: 2 Jan 2009, 6:51:17 UTC

Well, I'm sure this isn't the first run of code that worked; they probably tweaked it some. They don't need equal, they need fast enough. Fast enough requires some work; do you really think software doesn't take the fast-enough approach, and that that doesn't require some effort? I expect they made sure the games played well enough on ATi cards too, just not with the same effort. Chances are they did some things to make that happen. Maybe the problem is what we each think "optimization" and the word "none" imply?


If it's not the same effort and the same level of performance, then it's not really a fair comparison, is it? If a game (or app) is optimized for one and merely tolerable on the other, then it's not really fair to say that the optimized one will always be faster than the unoptimized one under the same circumstances. If the playing field were level, then we could go from there.

I was unaware that BOINC is just SETI, so I stand corrected. Is there no other BOINC project out there with a GPU client that isn't Nvidia-only? Also, I haven't looked, but there seem to be other things out there besides BOINC and SETI in the general sense. http://www.gpgpu.org/


No, there are no other BOINC projects that support ATi, because it hasn't been built into the BOINC client yet. BOINC only supports CUDA, which is nVidia-only. Other "things" out there besides BOINC/F@H also have more cooperation with nVidia than ATi, so the playing field is still not level or fair.

Well, if you read the write-ups, there are some instructions that the simple ALUs just can't run, sin/cos for example. It isn't CISC vs. RISC; it's more like a normal CPU vs. a Cell processor for each processing unit.


You have to remember that most of those write-ups are actually guesses, as both nVidia and ATi protect their designs like trade secrets. Those write-ups go off publicly available info and try to fill in the rest, which means making assumptions.
OzzFan
Message 848099 - Posted: 2 Jan 2009, 6:48:13 UTC - in response to Message 848092.  

All you dudes with ATIs should have asked Santa for an Nvidia, because until somebody figures out how to port S@H to ATI, CUDA's the daddy.


But I don't want nVidia. I don't care if it doesn't run SETI; it's the graphics chip that I prefer to buy for my games. SETI would only be a bonus if it were available, but not a reason to buy.
Stephen R
Message 848176 - Posted: 2 Jan 2009, 13:45:08 UTC

mhouston, who does the work on the F@H ATI GPUv2 program, works for ATI. He has done optimizations to the program and is working on more; some have to wait on a newer release of the CAL drivers that is in the works. For the ATI HD 4xxx there are quite a few more optimizations that can be done (from what he has posted over at F@H, they would not work well, if at all, on the earlier models, so I don't think he is too interested in having multiple ATI GPU programs at the moment).

Yes, nVidia has a faster F@H GPU program than the ATI one, just nowhere near as much faster as some people seem to think. Some folding projects are better suited to nVidia and some are actually better suited to ATI. Also, you can't compare performance between different project series (which a lot of people seem to do).
tfp
Message 848217 - Posted: 2 Jan 2009, 15:48:47 UTC - in response to Message 848096.  
Last modified: 2 Jan 2009, 15:56:09 UTC

Well, I'm sure this isn't the first run of code that worked; they probably tweaked it some. They don't need equal, they need fast enough. Fast enough requires some work; do you really think software doesn't take the fast-enough approach, and that that doesn't require some effort? I expect they made sure the games played well enough on ATi cards too, just not with the same effort. Chances are they did some things to make that happen. Maybe the problem is what we each think "optimization" and the word "none" imply?


If it's not the same effort and the same level of performance, then it's not really a fair comparison, is it? If a game (or app) is optimized for one and merely tolerable on the other, then it's not really fair to say that the optimized one will always be faster than the unoptimized one under the same circumstances. If the playing field were level, then we could go from there.


Life isn't fair? The playing field is never level, so you use the best available for each, and they are and have been optimizing for ATi. What's a fair amount if one set of HW is harder to optimize for than the other? Do we wait until ATi wins out because some people think it should perform faster?

Even if ATi did provide some help, there would be at least one person claiming that it wasn't as much, or, if ATi won out, that Nvidia didn't do all it could. If an app is more optimized for one piece of hardware than the other more often than not, which is the better choice? How long do people buy based on maybes vs. something with proven performance and greater app support? (Though ATI drivers have been better lately.)

Who says the level of performance for ATi on those games wasn't the best they could do with the knowledge and time they had? If anything, it's a reason not to buy ATi, and a sign that the hardware is too complex to use correctly given the lack of support. This seems to cover the Cell port here at SETI as well, though the PPD is pretty OK for Cell over at F@H. Then again, the points there are just a mess and you can't really tell what work they are doing.

I was unaware that BOINC is just SETI, so I stand corrected. Is there no other BOINC project out there with a GPU client that isn't Nvidia-only? Also, I haven't looked, but there seem to be other things out there besides BOINC and SETI in the general sense. http://www.gpgpu.org/



Considering the images are from ATi and nVidia, I'm pretty sure a lot of it is correct; they also talk to the companies for the information they do have, and most of the time they state when something is an assumption. The 1 complex + 4 simple layout is correct. So maybe, because the hardware only excels at a smaller set of calculations, it's not going to outperform the competitor.

Really, this is all the info we have; I would expect the actual hardware to be discussed in detail if we want to assume a fair world for everything else.

Sometimes things lose because they are targeting a different data set; that really could be the case here. So does anyone know if most of the calcs in SETI are MADs, or are a number of other operations done a large percentage of the time?
OzzFan
Message 848232 - Posted: 2 Jan 2009, 16:13:12 UTC - in response to Message 848176.  

mhouston, who does the work on the F@H ATI GPUv2 program, works for ATI. He has done optimizations to the program and is working on more; some have to wait on a newer release of the CAL drivers that is in the works. For the ATI HD 4xxx there are quite a few more optimizations that can be done (from what he has posted over at F@H, they would not work well, if at all, on the earlier models, so I don't think he is too interested in having multiple ATI GPU programs at the moment).


One single guy is nothing compared to an entire cooperative company helping out. But at least someone from ATi is helping.
OzzFan
Message 848241 - Posted: 2 Jan 2009, 16:34:10 UTC - in response to Message 848217.  
Last modified: 2 Jan 2009, 16:53:47 UTC

Life isn't fair? The playing field is never level, so you use the best available for each, and they are and have been optimizing for ATi. What's a fair amount if one set of HW is harder to optimize for than the other? Do we wait until ATi wins out because some people think it should perform faster?


Not exactly what I was trying to get at. My entire point for making the argument is that you can't say "well, there's some 'acceptable' performance for ATi GPUs, and there's optimized code for nVidia, and nVidia always wipes the floor with ATi so therefore nVidia is better". Then you turn your argument and state "life isn't fair" as a defense, which isn't exactly a good debate tactic when trying to have a discussion.

No, you don't have to wait until ATi wins "just because some people think it should perform faster" - rather, we should wait before comparing performance until the same level of optimizations are made for both GPUs. Just like AMD is still better suited for some computing tasks and Intel others, I'm sure Stephen R is correct in stating that ATi will naturally be good at some tasks while nVidia others.

Even if ATi did provide some help, there would be at least one person claiming that it wasn't as much, or, if ATi won out, that Nvidia didn't do all it could. If an app is more optimized for one piece of hardware than the other more often than not, which is the better choice? How long do people buy based on maybes vs. something with proven performance and greater app support? (Though ATI drivers have been better lately.)


As long as one company is providing complete support and the other isn't, I wouldn't call that "proven performance", but certainly greater support.

If you want to base it off support, then yes, nVidia wins - but through support and not necessarily better performance because it hasn't reached a level of fair comparison.

I'm not trying to tell people what to buy for crunching, so I don't know where that argument came from, nor has it been in this entire thread. Fact is, people will buy whatever they choose to, not some silly thread here on SETI@Home. I buy ATi because I prefer their chips and have used them since my original ATI Mach32 2MB VRAM VLB graphics adapter.

In fact, I've never had a problem with ATi's drivers, though they seem to be a major source of commotion in the graphics card community. I have, however, had problems with nVidia's drivers on several occasions.

Who says the level of performance for ATi on those games wasn't the best they could do with the knowledge and time they had? If anything, it's a reason not to buy ATi, and a sign that the hardware is too complex to use correctly given the lack of support. This seems to cover the Cell port here at SETI as well, though the PPD is pretty OK for Cell over at F@H. Then again, the points there are just a mess and you can't really tell what work they are doing.


Or perhaps they signed a deal with nVidia to use the "Work best on nVidia" logo program, and in exchange nVidia offered complete support for their graphics cards. In my opinion, this is something a game manufacturer should never do, because you are going to alienate some of your users - and since graphics cards, CPUs and OSes (among many other hardware components) are purchased not because of what a game supports but because of what the user prefers or is more comfortable purchasing, they are shutting out those users by not giving them the performance their hardware is capable of.

If anything, it's a reason not to buy those games from those manufacturers - not a reason to bypass the preferred hardware people are comfortable with using.

Considering the images are from ATi and nVidia, I'm pretty sure a lot of it is correct; they also talk to the companies for the information they do have, and most of the time they state when something is an assumption. The 1 complex + 4 simple layout is correct. So maybe, because the hardware only excels at a smaller set of calculations, it's not going to outperform the competitor.


...and how do we know that the assumptions they make don't affect the overall picture? Yes, they get the images and info direct from the company, just like when there are write-ups about Intel or AMD, but the company is only going to let you know what they want you to know. Nothing more, nothing less.

Really, this is all the info we have; I would expect the actual hardware to be discussed in detail if we want to assume a fair world for everything else.


Hardware is only a portion of the discussion. Ask anyone at Lunatics.net what the other half is (or they might even argue 75%). In fact, I don't want to assume a fair world for everything else, because I know it is not.

It appears that you want to have a different discussion. You want to focus on hardware alone while I want to look at the entire picture, from top to bottom, from start to finish. Personally, when discussing performance, I don't see how you can look at anything other than the whole picture.

Sometimes things lose because they are targeting a different data set; that really could be the case here.


I completely agree with this. But I still maintain that without a) a working ATi app for SETI (if that's what we're discussing) and b) the same amount of support from ATi as nVidia offers for its chips, I cannot accept any comparison as balanced and fair.
tfp
Message 848247 - Posted: 2 Jan 2009, 16:56:20 UTC
Last modified: 2 Jan 2009, 16:59:11 UTC

If you don't want to talk about the hardware, which is all we can compare right now, what are we going to talk about? The thread started out as a generalization that the HW is better on ATi than on Nvidia.

You want ATi to help out with the app; let's say that happens. You want the same amount of optimization overall between ATi and Nvidia; fine, let's say that exists. Now that we have made the software side equal, what's left to talk about? Hardware.

Of course software is at least half of it. Every successful optimization gets performance nearer to the maximum computational output the manufacturer quotes for a processor. So, considering there isn't an app, there is no point in complaining about Nvidia support vs. ATi support that we can't change. Maybe your complaint would be better directed at ATi, because I don't think it will change things here.

All we have left to talk about is the difference in hardware, and whether the work done by SETI can really max out Nvidia's or ATi's hardware assuming the best level of optimization.

I don't see the point of talking about the level of optimization of apps that aren't here; all we can use as examples are existing apps. That isn't fair because ATi isn't helping out enough; fine, now what? Should we just accept that the ATi app will never be as good because they don't help, or what?

I'm also sure that the people over at Lunatics.net understand that optimizations are many times HW-specific. AMD was ahead at times because its HW had more general-purpose ALUs than what Intel had at the time, as well as a better system-level interface until the latest CPUs. Intel had better branch prediction, caching, etc. Those differences are what people optimize to.

I guess I don't know what you're getting at. People say ATi will be faster; well, say why. I have put up why I don't think any amount of optimization will make it faster, because of the hardware makeup, and all I have gotten back is "Nvidia helps with optimization more". Well, no kidding. What else are we going to talk about, then, unless we all want to just agree that ATi will never be as good because they don't support their GPUs the way Nvidia does? (And I don't think that's a correct thing to assume; there are people outside ATi who could probably do a great job with ATi-based GPU apps.)
OzzFan
Message 848253 - Posted: 2 Jan 2009, 17:13:17 UTC - in response to Message 848247.  
Last modified: 2 Jan 2009, 17:14:52 UTC

If you don't want to talk about the hardware, which is all we can compare right now, what are we going to talk about? The thread started out as a generalization that the HW is better on ATi than on Nvidia.


IMO, there isn't much of a discussion to be had that would be fair at this point in time. The thread did start out by suggesting that ATi's hardware would be better, but I added the whole picture to the discussion because that would be fair.

You want ATi to help out with the app; let's say that happens. You want the same amount of optimization overall between ATi and Nvidia; fine, let's say that exists. Now that we have made the software side equal, what's left to talk about? Hardware.


You can't just "say" that they exist because they do not. You can talk hardware all you want, and come up with some unfair, unbalanced conclusion that "nVidia's hardware is better because of X" when that X may not be the appropriate conclusion.

Of course software is at least half of it. Every successful optimization gets performance nearer to the maximum computational output the manufacturer quotes for a processor. So, considering there isn't an app, there is no point in complaining about Nvidia support vs. ATi support that we can't change. All we have left to talk about is the difference in hardware, and whether the work done by SETI can really max out Nvidia's or ATi's hardware assuming the best level of optimization.


I disagree. Yes, every successful optimization needs to be coded for a specific platform, which is why I am not willing to assume best-level optimization, because the true deciding factor in performance is the software. You can have the best functional piece of hardware, but without software support, the hardware means little to nothing.

I don't see the point of talking about the level of optimization of apps that aren't here; all we can use as examples are existing apps. That isn't fair because ATi isn't helping out enough; fine, now what? Should we just accept that the ATi app will never be as good because they don't help, or what?


I don't see the point in discussing hardware only, because you would need to make too many assumptions to form a conclusion, which would be flawed.

I'm also sure that the people over at Lunatics.net understand that optimizations are many times HW-specific. AMD was ahead at times because its HW had more general-purpose ALUs than what Intel had at the time, as well as a better system-level interface until the latest CPUs. Intel had better branch prediction, caching, etc. Those differences are what people optimize to.


Exactly. ...and one additional point is that AMD's 3DNow! was better than Intel's MMX, but almost no one coded for it, so it didn't matter. It matters completely to performance what people code for, regardless of hardware.

I guess I don't know what you're getting at. People say ATi will be faster; well, say why. I have put up why I don't think any amount of optimization will make it faster, because of the hardware makeup, and all I have gotten back is "Nvidia helps with optimization more". Well, no kidding. What else are we going to talk about, then, unless we all want to just agree that ATi will never be as good because they don't support their GPUs the way Nvidia does? (And I don't think that's a correct thing to assume; there are people outside ATi who could probably do a great job with ATi-based GPU apps.)


I never said that ATi "will be faster". I never gave a definitive answer like that. I stated that, for all we know, with proper coding an ATi processor may be better than nVidia's, but we won't know if that statement is true until there is equal support from both parties.

... and yes, I don't think that someone who isn't with ATi would be able to do a great job with ATi based GPUs, because, as I stated earlier, companies keep their designs as trade secrets, and only the company would know how best to code for a given hardware architecture. Someone outside the company has to guess, and even then, they can only test with what they have, so they end up "optimizing" (if you could call it that) for a single piece of hardware. Someone outside the company does not have the resources or the available knowledge to write the best software possible.


You stated that you don't know what I'm getting at, but that's because we're on two different wavelengths, and neither of us seems willing to meet in the middle (I know I'm not willing, because I still don't think it would be fair and balanced). You want to have a discussion based purely on hardware specs and want to assume everything is fair on the software side. My argument is that regardless of how great the hardware is, it's the software that actually makes the hardware "do things", and without proper support from the hardware manufacturer, the performance comparisons mean nothing.
tfp
Message 848590 - Posted: 3 Jan 2009, 9:07:33 UTC

Right, there is nothing to discuss.
Peter M. Ferrie
Message 848614 - Posted: 3 Jan 2009, 10:31:40 UTC

OK guys, have a Coke and a smile.
This was just my opinion.


Then why do nVidia cards always beat the ATI cards?
Oh wait...
Shader clock = core clock (ATI)
Shader clock = 2 x core clock (nVidia)



OK:
the GTX 280 has 240 stream processors,
the 4870 has 800 stream processors.

Even if Nvidia does them twice as fast,
the 4870 (in my opinion) wins by default:

GTX 280: 240 + 240 = 480 per cycle
4870: 800 per cycle

My opinion: the 4870 will be faster due to the raw number of stream processors;
it has almost twice as many.

Also, the power draw on a 4870 is lower than on a GTX 280, so more crunching power for less juice (rough perf-per-watt sketch below):
4870: 150-180 watts
GTX 280: 230-260 watts
4870 X2: 220-250 watts
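A quick GFLOPS-per-watt sketch from those board-power ranges and the peak shader arithmetic figures quoted earlier in the thread (theoretical peaks and rough power ranges only, with the X2 assumed to be two 4870s, so treat it as hand-waving rather than measurement):

    # Rough perf-per-watt comparison built only from numbers in this thread.
    cards = {
        "GTX 280 (MAD+MUL)": (933, (230 + 260) / 2),   # peak GFLOPS, midpoint watts
        "HD 4870":           (1200, (150 + 180) / 2),
        "HD 4870 X2":        (2400, (220 + 250) / 2),  # assuming 2 x 4870 peak
    }
    for name, (gflops, watts) in cards.items():
        print(f"{name:18s} ~{gflops / watts:.1f} peak GFLOPS per watt")

Of course, peak GFLOPS per watt only matters if the app can actually keep the shaders busy, which is the whole argument above.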

and this is just random hardware talk
