Message boards : Number crunching : Random Musings About the Value of CPUs vs CUDA
popandbob · Joined: 19 Mar 05 · Posts: 551 · Credit: 4,673,015 · RAC: 0
Well, I guess the stage is set... I am going to do a CUDA run... We shall see who wins. Keep your eyes on my RAC. Once my GPUGRID WU is done, let the SETI CUDA begin. Top 10 here I come :)

Do you GoodSearch for SETI@home? http://www.goodsearch.com/?charityid=888957 Or GoodShop? http://www.goodshop.com/?charityid=888957
Francois Piednoel · Joined: 14 Jun 00 · Posts: 898 · Credit: 5,969,361 · RAC: 0
This is a classic CUDA claim: take the super mega corner case and sell it as if it were the general case... As long as they don't show up with an impressive RAC average, it is all BS...
Josef W. Segur · Joined: 30 Oct 99 · Posts: 4504 · Credit: 1,414,761 · RAC: 0
> How would be the performance?

IMO, very little. The amount of CPU time the app uses is similar to the amount in a special build I did to just accumulate the FLOP count for any angle range without actually doing the signal-finding calculations. That suggests there's very little CPU time expended transferring data to/from the GPU.

> In future the SETI@home-CUDA-app will always need the CPU/Core for crunching?

For the known future, yes to both. However, the move toward heterogeneous computing is almost certain to converge on processors which contain both a general-purpose CPU suitable for running an operating system plus a huge number of supplementary processors for graphics or GPGPU uses.

> Larrabee?

Supposedly samples now or soon, retail in a few months. It will not run CUDA code; that's Nvidia only. But AMD/ATI, Intel, Nvidia, and several other important companies are supporting OpenCL as a common programming language. If that develops as expected, there might be a single S@H OpenCL implementation which would run on any system with OpenCL "drivers" for whichever GPUs are installed. OTOH, the different companies may add so many implementation differences that it will be a mess.

Joe
Gecko · Joined: 17 Nov 99 · Posts: 454 · Credit: 6,946,910 · RAC: 47
Re: Larrabee

For us mere mortals, it's hard to place 100% faith in the current info on Larrabee. Only Who? knows for sure... BUT... if it debuts this year on a 45nm process with 32- and 24-core versions, then I'm slightly underwhelmed. There appears to be much speculation that Larrabee will be a ~2 Tflop (single precision) card at launch; it's logical to assume this will be the 32-core variety. The HD4870X2 is already at 2.4 Tflops, and the GTX295 to be released in a few days will be at ~2 Tflops. I understand that Flops alone are not the end-all basis for performance expectations, but for the purpose of "general" comparison, this is the power available in the market today.

Larrabee will be available to consumers, when? Starting late '09? Assuming that the specific compilers, libraries & development tools are available at the same time, just maybe a SETI application could be available for the Gen-1 Larrabee by March 2010? I think we can expect to see AMD & Nvidia introduce at least 2, possibly 3 more generations of products of significantly greater processing power by then. We will certainly see CUDA mature more, and I'm confident that our gang of volunteer developers will have some surprises in store for GPU processing after 14-15 more months of development experience.

Lastly, CUDA may not be the only API implementation for SETI GPU processing for very long. Apple will be pushing OpenCL pretty heavily with the upcoming Snow Leopard OS, so programming tools & support should gain strong momentum. I think we may see some success with OpenCL-based SETI applications for Nvidia and AMD in the not too distant future.

IMO, it seems Larrabee is behind & will be playing catch-up when it debuts later this year. That said, there's no question it should still provide for a massive increase in SETI crunching power. This is only my personal opinion based on available, if not speculative, info & some interpolation.
Francois Piednoel · Joined: 14 Jun 00 · Posts: 898 · Credit: 5,969,361 · RAC: 0
Re: Larrabee

I can't say much about Larrabee. I don't understand why NV is trying to do CUDA enabling; it is giving Intel a home run on this. They know perfectly well that at doing computation Lrb will kick them big time, and if they don't know, they should think a little about what makes more than 1 billion transistors go fast: process technology!!!! Anyway, no need for Lrb to beat CUDA on SETI, the CPU will do it. (Correction: "The CPU already does it.")
Josef W. Segur · Joined: 30 Oct 99 · Posts: 4504 · Credit: 1,414,761 · RAC: 0
In another month or so, we may be in a position for a decent RAC competition. BOINC needs to have settings which will allow a host to do both CUDA and CPU work, the S@H app needs to have bugs fixed, the quota/day needs to be adjusted, etc. Until then it would take a lot of effort to get enough Astropulse work to occupy the CPUs while the GPU does setiathome_enhanced work. I can't imagine anyone investing that much effort over a long enough time to achieve a stable RAC.

Meanwhile, there have been enough independent posts here about actual speeds of GPU crunching that I'm convinced the top combination will be Core i7 plus Nvidia GPUs later this year. It won't be cheap, and it may not take top honors in energy efficiency, but it will surely be able to do huge numbers of WUs in very little time.

Joe
-= Vyper =- · Joined: 5 Sep 99 · Posts: 1652 · Credit: 1,065,191,981 · RAC: 2,537
Well, the SETI top 20 is a small pissing contest of course, and those who get there or near it are going to do what they can to stay in that position. Count me in on that p***ng contest too. But you're so right that the code will need polishing, and until that has occurred it is rather meaningless, as far as I'm concerned, to point out what the GPU app can't and could do.

All I say is that I know several people throughout the years who tried to do a S@H app conversion. Mimo and Hans Dorn are just some of them. The GPU app now matches 100% what I told them, back when they started developing a GPU version, about how a GPU app would benefit the most: by letting the BOINC app know about the workload and take care of sending it to the right .exe for that specific platform.

All in all, even though I think that the .exe is beta in regards to the specific VLAR issue, I'm also astonished how well the app offloads the CPU to do other work. I mean, it is truly awesome how they preprocess the WU to make it "streamable" for the GPU to do its work, and when that is done, let the ride begin.

One way to "cheat" is to let BOINC know which WUs are compatible with the present GPU app without freaking out, and when Boinc.exe later on can send the same tasks (S@H 605 or 606 apps) to both CPU and GPU, it could be implemented so that whichever device gets there first takes the appropriate WU if the queue isn't in high-priority mode. Otherwise BOINC knows that the GPU-compatible WUs are being assigned to the GPU queue, but if the CPU queue starts to drain, Boinc.exe automatically assigns them to the CPU .exe, or if the CPU .exe can automatically fetch work from the GPU queue, that is fine also.

Well, enough mumbling from me :) , it is going to be an exciting year. Wonder if ATI is promoting an AMD Stream version also ;) .. I hope that .exe would be bug-free first before it is released into the wild .. Lol

Kind Regards to all
Vyper
_________________________________________________________________________
Addicted to SETI crunching! Founder of GPU Users Group
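A minimal sketch of the queue idea in the post above, assuming hypothetical names throughout (the `Scheduler` class, its methods, and the work-unit strings are invented for illustration and are not real BOINC scheduler code): GPU-compatible work normally feeds the GPU queue, falls back to the CPU when the CPU queue runs dry, and the CPU can steal GPU-compatible work when it has nothing else to do.

```python
from collections import deque

class Scheduler:
    """Toy model of the CPU/GPU work-assignment idea described above."""

    def __init__(self):
        self.cpu_queue = deque()   # WUs waiting for the CPU app
        self.gpu_queue = deque()   # WUs waiting for the GPU (CUDA) app

    def assign(self, wu, gpu_compatible, high_priority=False):
        """Route a work unit to the CPU or GPU queue."""
        if not gpu_compatible:
            self.cpu_queue.append(wu)      # e.g. VLAR tasks stay on the CPU
        elif high_priority:
            self.gpu_queue.append(wu)      # deadline pressure: keep it on the fast device
        elif not self.cpu_queue:
            self.cpu_queue.append(wu)      # CPU queue is draining, so let the CPU help
        else:
            self.gpu_queue.append(wu)      # default: GPU-compatible work feeds the GPU

    def fetch_for_cpu(self):
        """The CPU app takes its own work first, then steals GPU-compatible work."""
        if self.cpu_queue:
            return self.cpu_queue.popleft()
        if self.gpu_queue:
            return self.gpu_queue.popleft()
        return None


# Example: a VLAR task stays on the CPU; a normal task goes to the GPU queue.
sched = Scheduler()
sched.assign("wu_vlar_001", gpu_compatible=False)
sched.assign("wu_midrange_002", gpu_compatible=True)
print(sched.fetch_for_cpu())   # -> wu_vlar_001
```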
Gecko · Joined: 17 Nov 99 · Posts: 454 · Credit: 6,946,910 · RAC: 47
I agree completely. Once things get sorted over the next couple of months, I think we'll see some pretty massive numbers from those running an i7 and 3x GTX260, 280, or even dual or tri GTX295, which would mean 4-6 GPUs. Point is: total RAC will be WAY higher than just an i7 alone, or even dual-socket i7 boards.

Today, GPU app performance is dependent upon the CPU. Who?'s point about the performance difference between Celeron + GPU vs. i7 + GPU is valid. However, if I run ONE i7 920, I can also potentially add 3x GTX295s for a total of 6 GPUs + 4 CPU cores. Hmmm... I don't think it's a question of "which" type of processor I prefer or which can do the most... I want ALL 10 cores crunching to their MAX ability in 1 rig. Like I said above, once app/BOINC/quota/scheduling issues are sorted, total RAC should be enormous with this kind of set-up.

It's a sobering thought considering that the current app and crunching "methodology" only scratch the surface of the potential. The GPU app still acts/crunches in many ways like the CPU app & is essentially running code & methodology designed for a CPU. It's more of a "port" to GPU than a GPU-designed application. There's a HUGE difference (read: handicap) in this approach. But... it WORKS and is a logical 1st step. The next couple of years will be VERY different from the past couple. It really is the beginning of a new era, so to say ; > )
Francois Piednoel · Joined: 14 Jun 00 · Posts: 898 · Credit: 5,969,361 · RAC: 0
It did not stop the NV VPs from claiming victory; it says a lot about how reckless they are. All CUDA claims are the same, if you look at them closely.
Francois Piednoel · Joined: 14 Jun 00 · Posts: 898 · Credit: 5,969,361 · RAC: 0
The interesting part is that you'll get a cheaper way to accelerate with the CPU than with the GPU... For example, going to dual Nehalem will give you better scaling than adding GPUs. It is already the case, and with the mainstream Nehalem coming, it will get even better. The super mega high-end card from NV will have to drop to $120 to be competitive in terms of RAC/$... I'll let you do the calculation. So, mister NV, can we get those NV G92 for $120? Remember, the way to calculate "how good the GPU is" forces you to include a Core i7 in the price; otherwise, your GPU is not competitive... hehehehhe
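For readers who want to follow the RAC-per-dollar reasoning above, here is the yardstick written out as a tiny Python sketch. Every number in it is a made-up placeholder, not a measured RAC or a real price; only the formula itself reflects the argument being made.

```python
# The "RAC per dollar" yardstick from the post above, written out.
# All figures are hypothetical placeholders used only to show the arithmetic.

def rac_per_dollar(rac, price_usd):
    """Credit per day earned for each dollar of hardware cost."""
    return rac / price_usd

def break_even_gpu_price(gpu_rac, baseline_rac_per_dollar):
    """Card price at which the added GPU merely matches the baseline RAC/$."""
    return gpu_rac / baseline_rac_per_dollar

# Example with invented figures: a host system assumed at 4000 RAC for $900.
baseline = rac_per_dollar(rac=4000, price_usd=900)
print(f"Baseline: {baseline:.2f} RAC/$")

# If a card were assumed to add 2000 RAC, it only beats the baseline metric
# below this price (and, as the post notes, the host's cost must still be
# counted because the GPU cannot crunch on its own):
print(f"Break-even card price: ${break_even_gpu_price(2000, baseline):.0f}")
```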
Westsail and *Pyxey* · Joined: 26 Jul 99 · Posts: 338 · Credit: 20,544,999 · RAC: 0
Started crunching the MB CUDA app on my GPUs. All my SETI main credit is from CUDA. My CPUs crunch for other projects. Also, SETI shares the GPU with ps3grid.

"The most exciting phrase to hear in science, the one that heralds new discoveries, is not Eureka! (I found it!) but rather, 'hmm... that's funny...'" -- Isaac Asimov
-= Vyper =- · Joined: 5 Sep 99 · Posts: 1652 · Credit: 1,065,191,981 · RAC: 2,537
"how good the GPU is" force you to get a Core i7 in the price, otherwise, your GPU is not competitive... hehehehhe Heh :) I agree to certain amount but you dont need such a powerful cpu to feed the gpu's a basic Q9300 would do just fine or perhaps a dual core.. I presume (speculation) that if you have a Core2 3ghz dual core i think that the cpu utilisation would be aprox 20% overall if you have 4 gpu's to feed with data.. But you're totally right that if you're going to spend time of the precalc on the cpu as it is today until the gpu cores get fed then a old celeron would hinder the gpu much. I know that my small wu's need 82 seconds in i7 power to feed the gpu, the math would be that a celeron would need perhaps 500 seconds before it feeds the gpu so the s@h code must be omptimised and sse X enhanced to lower it's cpu dependancy. Looking forward to see the forthcoming larabee ;) Kind regards Vyper _________________________________________________________________________ Addicted to SETI crunching! Founder of GPU Users Group |
tfp · Joined: 20 Feb 01 · Posts: 104 · Credit: 3,137,259 · RAC: 0
> The interesting part is that you'll get a cheaper way to accelerate with the CPU than with the GPU... For example, going to dual Nehalem will give you better scaling than adding GPUs. It is already the case, and with the mainstream Nehalem coming, it will get even better. The super mega high-end card from NV will have to drop to $120 to be competitive in terms of RAC/$... I'll let you do the calculation.

Well yeah, if you don't count the cost of the complete system, Nehalem will be cheaper, kind of... If you put as much effort into work as you do talking smack online, you must be the most productive person at Intel.

BTW, a PD 805 feeds a 9600GT I already have just fine. Guess what is the cheapest way for me to increase PPD? Buy a completely new system, or just use the GPU already installed...
Francois Piednoel · Joined: 14 Jun 00 · Posts: 898 · Credit: 5,969,361 · RAC: 0
> The interesting part is that you'll get a cheaper way to accelerate with the CPU than with the GPU... For example, going to dual Nehalem will give you better scaling than adding GPUs. It is already the case, and with the mainstream Nehalem coming, it will get even better. The super mega high-end card from NV will have to drop to $120 to be competitive in terms of RAC/$... I'll let you do the calculation.

My job is my hobby, I never have the feeling of working... the downside: I never stop working :)
Fred W · Joined: 13 Jun 99 · Posts: 2524 · Credit: 11,954,210 · RAC: 0
> My job is my hobby, I never have the feeling of working... the downside: I never stop working :)

I guess that puts all the posts into context?

F.
SATAN · Joined: 27 Aug 06 · Posts: 835 · Credit: 2,129,006 · RAC: 0
Yet again it appears that another marketing drive, full of lies, is about to start.

The CUDA app is buggy, yes, of that there is no doubt. When it works, however, it is clearly faster than a CPU. So far I have noticed it completes the VLAR units in 360-420 seconds of actual time and the longer units in approximately 1450 seconds. On the Mac Pro running at 3.2 GHz these units take 700 seconds and 3500 seconds respectively. And if you look at my RAC for SETI, you can see it has increased since I started running CUDA.

Obviously it will never be faster than a quad or octo chip, however it is faster than a single CPU.
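The per-task speedups implied by those timings (one GPU task versus one CPU core on the 3.2 GHz Mac Pro) work out as follows; the only assumption added here is taking the midpoint of the quoted 360-420 s range.

```python
# Per-task GPU vs single-CPU-core speedup from the timings quoted above.
# The 390 s GPU figure is the midpoint of the quoted 360-420 s range.

timings = {
    # task type: (gpu_seconds, cpu_seconds on the 3.2 GHz Mac Pro)
    "VLAR":   (390, 700),
    "longer": (1450, 3500),
}

for task, (gpu_s, cpu_s) in timings.items():
    print(f"{task}: {cpu_s / gpu_s:.1f}x faster per task on the GPU")

# As the post concedes, a quad- or octo-core CPU runs that many tasks at
# once, so the per-task speedup has to be weighed against core count.
```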
Francois Piednoel · Joined: 14 Jun 00 · Posts: 898 · Credit: 5,969,361 · RAC: 0
> Yet again it appears that another marketing drive, full of lies, is about to start.

I have the NV stuff running on another account... I don't see it going faster than a Core i7, sorry, it just does not.
Fred W · Joined: 13 Jun 99 · Posts: 2524 · Credit: 11,954,210 · RAC: 0
> I have the NV stuff running on another account... I don't see it going faster than a Core i7, sorry, it just does not.

So your Core i7 will knock out 8 (or even 4) VLAR WUs in 360 secs?

F.
Crunch3r · Joined: 15 Apr 99 · Posts: 1546 · Credit: 3,438,823 · RAC: 0
> I have the NV stuff running on another account... I don't see it going faster than a Core i7, sorry, it just does not.

This is a pretty stupid comparison, to say the least... why should the i7 crunch 4 or 8 tasks in 360 sec? Parallelize the MB app and compare those crunching times to the NV stuff... I'm pretty sure that an i7 running an SMP version of the MB app will outperform CUDA.

Join BOINC United now!
Fred W · Joined: 13 Jun 99 · Posts: 2524 · Credit: 11,954,210 · RAC: 0
> I have the NV stuff running on another account... I don't see it going faster than a Core i7, sorry, it just does not.

Just trying to clarify the baseline for the comparison. So far as I can see, a Core i7 using 7.96 cores + a CUDA card should crunch far more than a Core i7 without the CUDA card. And the CUDA card should be a cheaper way of upgrading my Q9450 rig than a complete new Core i7 rig. Is that right, Who?

F.