Message boards :
Number crunching :
Which is better ATI Apps or nvidia's cuda?
| Author | Message |
|---|---|
Fred J. Verster Send message Joined: 21 Apr 04 Posts: 3252 Credit: 31,903,643 RAC: 0
|
|
|
Crun-chi Send message Joined: 3 Apr 99 Posts: 174 Credit: 3,037,232 RAC: 0
|
my 5850 runs 2 WU's at a time and gets about the same completion time as a 260 or 275. Without a doubt the ATI cards are slower on Seti than other projects. However an 8800 is really some GPU; it's even embarrassing to think that card could be that fast. My 560Ti works on three WUs at a time: so what now :) I doubt that in the near future any ATI card will crunch nearly as fast as Nvidia (I think that OpenCL cannot be compared with the CUDA compiler) I am cruncher :) I LOVE SETI BOINC :) |
1fast6 Send message Joined: 24 May 99 Posts: 3 Credit: 17,198,983 RAC: 0
|
it appears my first attempt did not unpack all of the files... I am now up and running... thanks... one more question... I have a node with a single 5770 and a second node with a 5850 and a 5870... the 5770 is a dedicated DC node, the 5850/5870 node is my personal desktop.. what is a safe / ideal number of instances to run on each of those cards?? thank you all for your help... |
skildude Send message Joined: 4 Oct 00 Posts: 9541 Credit: 50,759,529 RAC: 60
|
and you missed the point. They were claiming an old 8800 was faster than a 5870. I'll grant you the nVidia cards work great on VHAR WU's but do very poorly on VLAR's. Unfortunately for science the VHAR WU's contain very little useable data, which is why the nVidia cards are able to blow through them. Let's also note that so far ATI cards are king on AP WU's. So given the limited value of the data collected I'd rather work VLAR on my ATI than waste time trolling through the trash. My 5850 runs 2 WU's at a time and gets about the same completion time as a 260 or 275. Without a doubt the ATI cards are slower on Seti than other projects. However an 8800 is really some GPU; it's even embarrassing to think that card could be that fast. In a rich man's house there is no place to spit but his face. Diogenes Of Sinope |
|
hbomber Send message Joined: 2 May 01 Posts: 437 Credit: 50,852,854 RAC: 0
|
Just one of my points: your 6900 does VLARs about as fast (well, a little bit faster) as one of the cores of my 2500K (at 4.5 GHz, it needs to be noted): 2900 vs 3200 seconds. The processor uses 100 watts with all 4 cores loaded with tasks. So having NVIDIA + Intel CPU is a win-win situation; the whole range of ARs is covered. I still claim an 8800GT (not even "S", full G92) is as fast as a 5870 on MAR units. Even a GT240 DDR5, using only 50 watts, can do a MAR unit in 20-22 minutes. |
Mike Send message Joined: 17 Feb 01 Posts: 34711 Credit: 79,922,639 RAC: 80
|
You can forget that if a 5850 runs only astropulses. See my 1090T and only 1 5850. With each crime and every kindness we birth our future. |
|
hbomber Send message Joined: 2 May 01 Posts: 437 Credit: 50,852,854 RAC: 0
|
I suspect, if a full CPU core is left to serve the ATI GPU, calculation times can rise significantly. I observed this in SETI beta with the NVIDIA OpenCL client. It seems OpenCL needs a lot more CPU time than CUDA. Mike, if you speak RAC-wise, my system is running only two CPU cores and its usual RAC is 37K+. It dropped because it was recently offline for a day and a half while I tested memory sticks for a review. |
Miep Send message Joined: 23 Jul 99 Posts: 2412 Credit: 351,996 RAC: 0 |
I suspect, if a full CPU core is left to serve the ATI GPU, calculation times can rise significantly. I observed this in SETI beta with the NVIDIA OpenCL client. It seems OpenCL needs a lot more CPU time than CUDA. If you mean r246 V7 MB - a) NVidia OpenCL MB is already significantly slower than CUDA MB [alpha testing results] b) r246 calculates the autocorrelation on the CPU, thereby significantly increasing CPU times. Carola ------- I'm multilingual - I can misunderstand people in several languages! |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874
|
b) r246 calculates the autocorrelation on the CPU, thereby significantly increasing CPU times. But autocorrelation applies only to v7 tasks, which aren't being issued here on the main project yet - so it won't affect main project comparison timings between ATI and nVidia, native and OpenCL modes. |
|
hbomber Send message Joined: 2 May 01 Posts: 437 Credit: 50,852,854 RAC: 0
|
Yes, it turned out to be r246. But I didn't see that ATI OpenCL CPU time is not that big, my bad. Thank you, good to know that the big CPU usage with r246 is not caused by the OpenCL platform itself, as I incorrectly thought. I don't follow beta forums. P.S. I made a mistake in my previous post: "calculation times can rise significantly" must be "calculation times can decrease significantly" |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874
|
I don't follow beta forums. Might I respectfully suggest that, as a volunteer tester, you reconsider that policy? The purpose of testing is to learn about the behaviour of new, unreleased applications before they make the transition to the main project, and hence to uncover - and then correct - problems. It's just confusing to the general readership to discuss these pre-release matters in this general forum. I have to admit that, owing to the general shortage of skilled testers, some beta applications have been announced here to recruit extra testers, and that adds to the confusion. But I would ask that, once people have become involved in a test, they follow it on the appropriate forum. There's an excellent introduction to testing, methods, benefits and drawbacks at http://setiweb.ssl.berkeley.edu/beta/forum_thread.php?id=1023&nowrap=true#22983. It deserves a wider readership - maybe a moderator could give that post, or the edited version later in the same thread, some added prominence? |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121
|
Wrong statement. Both V7 apps had almost the same speed, within the error bounds of natural variation, the last time I did a comparison [GSO9600 + Core Duo]. Unless something changed in the last 4 days?... |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121
|
b) r246 calculates the autocorrelation on the CPU, thereby significantly increasing CPU times. In general, the "native" (i.e. CUDA) version of the same kernels, launched with the same geometry, will be slightly faster; I see this on most of the already converted kernels. Maybe because the NV compiler for CUDA C is more mature than their OpenCL compiler... |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121
|
good to know that the big CPU usage with r246 is not caused by the OpenCL platform itself, as I incorrectly thought. Actually, the directly converted CUDA app, based on OpenCL code, experiences a very big increase in CPU time. But for now I would attribute this to a still imperfect translation, a subject for further tuning/debugging. In theory CPU consumption will be almost the same, maybe slightly in favour of the CUDA app. (I speak about the CUDA app based on OpenCL code, not about the already released ones; they use a different approach to split work between CPU and GPU. IMHO, to compare different languages/compilers one should compare a direct translation of the same algorithm into the different languages, and vice versa: to compare different algorithms, it is better to use the same language/compiler.) |
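
The timings quoted in this thread (2900 vs 3200 seconds per VLAR, or a card running two or three WUs at a time) are easier to compare once they are converted to throughput. The following is only a minimal sketch: the function name `tasks_per_hour` is hypothetical, and it assumes that every concurrent task takes the same quoted time, which the posts above do not confirm (for example, a CPU core may slow down when all four cores are loaded).

```python
def tasks_per_hour(seconds_per_task: float, concurrent_tasks: int = 1) -> float:
    """Tasks finished per hour when `concurrent_tasks` run side by side.

    Assumption: each concurrent task needs `seconds_per_task` of wall-clock time.
    """
    if seconds_per_task <= 0:
        raise ValueError("seconds_per_task must be positive")
    if concurrent_tasks < 1:
        raise ValueError("concurrent_tasks must be at least 1")
    return concurrent_tasks * 3600.0 / seconds_per_task


# Figures quoted by hbomber for VLAR tasks.
gpu_rate = tasks_per_hour(2900)        # 6900-series GPU, one task at a time
core_rate = tasks_per_hour(3200)       # one core of the 2500K at 4.5 GHz
cpu_rate = tasks_per_hour(3200, 4)     # all four cores, if per-core time is unchanged

print(f"GPU: {gpu_rate:.2f}/h, one core: {core_rate:.2f}/h, four cores: {cpu_rate:.2f}/h")
print(f"GPU vs one core speed ratio: {gpu_rate / core_rate:.2f}")
```

The same helper applies to a GPU running several WUs at once: pass the per-WU completion time observed while that many instances are running, together with the instance count.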
©2026 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.