Which is better ATI Apps or nvidia's cuda?

Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 1109575 - Posted: 25 May 2011, 9:29:21 UTC - in response to Message 1109573.  

If you look a few posts back, you'll find an app_info.xml; don't forget to change the name of the executable if you use rev.521.


AND the .cl file name for r521


Lol, was about to edit that ;)


ID: 1109575
Crun-chi
Volunteer tester
Joined: 3 Apr 99
Posts: 174
Credit: 3,037,232
RAC: 0
Croatia
Message 1109585 - Posted: 25 May 2011, 10:19:27 UTC - in response to Message 1102481.  

My 5850 runs 2 WUs at a time and gets about the same completion time as a 260 or 275. Without a doubt the ATI cards are slower on SETI than on other projects. However, an 8800 is really some GPU; it's even embarrassing to think that card could be that fast.

My 5850 vs a 295: http://setiathome.berkeley.edu/workunit.php?wuid=730526754
Note that he has 6 GPUs running and I am running 2 WUs at a time, which means I'm getting 2 WUs done in that time while he can only run 1 per GPU. The 4XX and 5XX nVidia cards can run more than 1 WU, so it's harder to tell how GPUs compare at that level. But to say an 8800 is even remotely close to being as fast as a 5870 is a joke.


My 560Ti runs three WUs at a time, so what now? :) I doubt that any ATI card will crunch nearly as fast as Nvidia in the near future (I don't think OpenCL can be compared with the CUDA compiler).

I am cruncher :)
I LOVE SETI BOINC :)
ID: 1109585
1fast6
Joined: 24 May 99
Posts: 3
Credit: 17,198,983
RAC: 0
United States
Message 1109809 - Posted: 26 May 2011, 2:02:25 UTC - in response to Message 1109585.  

it appears my first attempt did not unpack all of the files...
I am now up and running... thanks...

one more question...
I have a node with a single 5770 and a second node with a 5850 and a 5870...
the 5770 is a dedicated DC node, the 5850/5870 node is my personal desktop..
what is a safe / ideal number of instances to run on each of those cards??

thank you all for your help...
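One way to approach the instance-count question is to measure it: run each card at 1, 2 and 3 concurrent tasks, note the elapsed time per task, and compare tasks per hour. Below is a minimal Python sketch of that comparison. The function names and the example timings are made up for illustration; they are not measurements from any card mentioned in this thread.

```python
from typing import Dict


def tasks_per_hour(concurrent_tasks: int, seconds_per_task: float) -> float:
    """Throughput of one GPU that runs `concurrent_tasks` at once,
    where each task takes `seconds_per_task` of elapsed time."""
    if concurrent_tasks < 1 or seconds_per_task <= 0:
        raise ValueError("need at least one task and a positive elapsed time")
    return concurrent_tasks * 3600.0 / seconds_per_task


def best_instance_count(timings: Dict[int, float]) -> int:
    """Pick the instance count with the highest throughput.

    `timings` maps an instance count to the elapsed seconds per task
    observed while running that many tasks at once."""
    return max(timings, key=lambda n: tasks_per_hour(n, timings[n]))


# Hypothetical timings (seconds per task), for illustration only.
example_timings = {1: 1200.0, 2: 2000.0, 3: 3300.0}

if __name__ == "__main__":
    for n, secs in sorted(example_timings.items()):
        print(n, "instance(s):", round(tasks_per_hour(n, secs), 2), "tasks/hour")
    print("best instance count:", best_instance_count(example_timings))
```

Running more instances only pays off while per-task time grows more slowly than the instance count. On a desktop that is also used interactively, responsiveness may matter more than the last few percent of throughput.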
ID: 1109809
skildude
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 1109819 - Posted: 26 May 2011, 2:37:55 UTC - in response to Message 1109585.  

My 5850 runs 2 WUs at a time and gets about the same completion time as a 260 or 275. Without a doubt the ATI cards are slower on SETI than on other projects. However, an 8800 is really some GPU; it's even embarrassing to think that card could be that fast.

My 5850 vs a 295: http://setiathome.berkeley.edu/workunit.php?wuid=730526754
Note that he has 6 GPUs running and I am running 2 WUs at a time, which means I'm getting 2 WUs done in that time while he can only run 1 per GPU. The 4XX and 5XX nVidia cards can run more than 1 WU, so it's harder to tell how GPUs compare at that level. But to say an 8800 is even remotely close to being as fast as a 5870 is a joke.


My 560Ti runs three WUs at a time, so what now? :) I doubt that any ATI card will crunch nearly as fast as Nvidia in the near future (I don't think OpenCL can be compared with the CUDA compiler).

And you missed the point: they were claiming an old 8800 was faster than a 5870. I'll grant you that the nVidia cards work great on VHAR WUs but do very poorly on VLARs. Unfortunately for science, the VHAR WUs contain very little usable data, which is why the nVidia cards are able to blow through them. Let's also note that so far ATI cards are king on AP WUs. So, given the limited value of the data collected, I'd rather work VLARs on my ATI than waste time trolling through the trash.



In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 1109819
hbomber
Volunteer tester
Joined: 2 May 01
Posts: 437
Credit: 50,852,854
RAC: 0
Bulgaria
Message 1109912 - Posted: 26 May 2011, 9:31:45 UTC
Last modified: 26 May 2011, 9:37:19 UTC

Just one of my points:
Your 6900 does VLARs about as fast (well, a little bit faster) as one of the cores of my 2500K (at 4.5 GHz, it needs to be noted): 2900 vs 3200 seconds. The processor uses 100 watts with all 4 cores loaded with tasks.
So having NVIDIA + an Intel CPU is a win-win situation: the whole range of ARs is covered.
I still claim that an 8800GT (not even "S", full G92) is as fast as a 5870 on MAR units. Even a GT240 DDR5, using only 50 watts, can do a MAR unit in 20-22 minutes.
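The figures quoted above can be turned into a rough energy-per-task number. This is only a sketch of the arithmetic, under loud assumptions: the quoted wattage is the device's whole draw while crunching, host and idle power are ignored, the GT240 runs one task at a time, and 21 minutes is taken as the midpoint of "20-22 minutes". The function name is made up for illustration. The CPU figure is for VLAR units and the GT240 figure is for MAR units, so the two results are not a like-for-like comparison.

```python
def watt_hours_per_task(power_watts: float,
                        seconds_per_task: float,
                        concurrent_tasks: int = 1) -> float:
    """Energy spent per finished task, in watt-hours, for a device that
    draws `power_watts` while running `concurrent_tasks` tasks at once."""
    if concurrent_tasks < 1:
        raise ValueError("concurrent_tasks must be at least 1")
    return power_watts * seconds_per_task / 3600.0 / concurrent_tasks


# 2500K: 100 W with all 4 cores loaded, 3200 s per VLAR task per core.
cpu_vlar_wh = watt_hours_per_task(100.0, 3200.0, concurrent_tasks=4)

# GT240: 50 W, about 21 minutes per MAR task (assumed midpoint of 20-22).
gt240_mar_wh = watt_hours_per_task(50.0, 21 * 60.0)

if __name__ == "__main__":
    print("2500K, VLAR:", round(cpu_vlar_wh, 1), "Wh per task")
    print("GT240, MAR: ", round(gt240_mar_wh, 1), "Wh per task")
```

No power figure is given for the 6900 or the 5870 in this thread, so they are left out rather than guessed.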
ID: 1109912
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34711
Credit: 79,922,639
RAC: 80
Germany
Message 1109913 - Posted: 26 May 2011, 9:40:46 UTC


You can forget that if a 5850 runs only Astropulse tasks.

See my 1090T and only 1 5850.

With each crime and every kindness we birth our future.
ID: 1109913
hbomber
Volunteer tester
Joined: 2 May 01
Posts: 437
Credit: 50,852,854
RAC: 0
Bulgaria
Message 1109916 - Posted: 26 May 2011, 9:47:35 UTC
Last modified: 26 May 2011, 9:50:50 UTC

I suspect that if a full CPU core is left to serve the ATI GPU, calculation times can rise significantly. I observed this in SETI beta with the NVIDIA OpenCL client. It seems OpenCL needs a lot more CPU time than CUDA.

Mike, if you are speaking RAC-wise, my system is running only two CPU cores; its usual RAC is 37K+. It dropped because it was recently offline for a day and a half while I tested memory sticks for a review.
ID: 1109916
Miep
Volunteer moderator
Joined: 23 Jul 99
Posts: 2412
Credit: 351,996
RAC: 0
Message 1109923 - Posted: 26 May 2011, 10:55:05 UTC - in response to Message 1109916.  
Last modified: 26 May 2011, 10:55:35 UTC

I suspect that if a full CPU core is left to serve the ATI GPU, calculation times can rise significantly. I observed this in SETI beta with the NVIDIA OpenCL client. It seems OpenCL needs a lot more CPU time than CUDA.


If you mean r246 V7 MB -
a) NVidia OpenCL MB is already significantly slower than CUDA MB [alpha testing results]
b) r246 calculates the autocorrelation on the CPU, thereby significantly increasing CPU times.
Carola
-------
I'm multilingual - I can misunderstand people in several languages!
ID: 1109923
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1109924 - Posted: 26 May 2011, 11:00:56 UTC - in response to Message 1109923.  

b) r246 calculates the autocorrelation on the CPU, thereby significantly increasing CPU times.

But autocorrelation applies only to v7 tasks, which aren't being issued here on the main project yet - so it won't affect main project comparison timings between ATI and nVidia, native and OpenCL modes.
ID: 1109924
hbomber
Volunteer tester
Joined: 2 May 01
Posts: 437
Credit: 50,852,854
RAC: 0
Bulgaria
Message 1109925 - Posted: 26 May 2011, 11:06:07 UTC
Last modified: 26 May 2011, 11:10:45 UTC

Yes, it turned out to be r246.
But I hadn't noticed that ATI OpenCL CPU time is not that big, my bad.
Thank you, it's good to know that the big CPU usage with r246 is not caused by the OpenCL platform itself, as I incorrectly thought. I don't follow beta forums.

P.S. I made a mistake in my previous post:
"calculation times can rise significantly" must be
"calculation times can decrease significantly"
ID: 1109925
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1109928 - Posted: 26 May 2011, 11:27:50 UTC - in response to Message 1109925.  
Last modified: 26 May 2011, 11:51:45 UTC

I don't follow beta forums.

Might I respectfully suggest that, as a volunteer tester, you reconsider that policy?

The purpose of testing is to learn about the behaviour of new, unreleased applications before they make the transition to the main project, and hence to uncover - and then correct - problems. It's just confusing to the general readership to discuss these pre-release matters in this general forum.

I have to admit that, owing to the general shortage of skilled testers, some beta applications have been announced here to recruit extra testers, and that adds to the confusion. But I would ask that, once people have become involved in a test, they follow it on the appropriate forum.

There's an excellent introduction to testing, methods, benefits and drawbacks at http://setiweb.ssl.berkeley.edu/beta/forum_thread.php?id=1023&nowrap=true#22983. It deserves a wider readership - maybe a moderator could give that post, or the edited version later in the same thread, some added prominence?
ID: 1109928
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1110356 - Posted: 27 May 2011, 12:39:16 UTC - in response to Message 1109923.  
Last modified: 27 May 2011, 12:44:55 UTC


If you mean r246 V7 MB -
a) NVidia OpenCL MB is already significantly slower than CUDA MB [alpha testing results]

That statement is wrong. Both V7 apps ran at almost the same speed, within the error bounds of natural variation, the last time I compared them [GSO9600 + Core Duo]. Unless something has changed in the last 4 days?...
ID: 1110356
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1110358 - Posted: 27 May 2011, 12:42:58 UTC - in response to Message 1109924.  

b) r246 calculates the autocorrelation on the CPU, thereby significantly increasing CPU times.

But autocorrelation applies only to v7 tasks, which aren't being issued here on the main project yet - so it won't affect main project comparison timings between ATI and nVidia, native and OpenCL modes.


In general, the "native" (i.e. CUDA) version of the same kernels, launched with the same geometry, will be slightly faster; I see this on most of the already converted kernels.
Maybe that is because the NV compiler for CUDA C is more mature than their OpenCL compiler...
ID: 1110358
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1110360 - Posted: 27 May 2011, 12:54:56 UTC - in response to Message 1109925.  

good to know that the big CPU usage with r246 is not caused by the OpenCL platform itself, as I incorrectly thought.

Actually, the directly converted CUDA app, based on the OpenCL code, shows a very big increase in CPU time. But for now I would attribute this to an imperfect translation, a subject for further tuning/debugging. In theory CPU consumption should be almost the same, maybe slightly in favour of the CUDA app. (I am speaking about the CUDA app based on the OpenCL code, not about the already released ones; they use a different approach to splitting work between CPU and GPU.) IMHO, to compare different languages/compilers one should compare direct translations of the same algorithm into the different languages (and vice versa: to compare different algorithms it is better to use the same language/compiler).
ID: 1110360


 
©2026 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.