GPU crunching on G8x

Message boards : Number crunching : GPU crunching on G8x

Profile mimo
Volunteer tester
Joined: 7 Feb 03
Posts: 92
Credit: 14,957,404
RAC: 0
Slovakia
Message 691437 - Posted: 14 Dec 2007, 17:34:44 UTC

Hello boys!!!

I have now compiled an app, based on the latest SETI code, that runs on NVIDIA GPUs using CUDA technology.
Download it from this forum thread.

What is needed?
- graphics driver 169.21 or above
- a CUDA-capable card: 8xxx series, Tesla

Computed on the GPU for now:
- FFT
- power spectrum

Results are validated by knabench as "strongly similar".
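
For readers unfamiliar with the two stages listed above, here is a minimal CPU-side sketch in Python of what "FFT" and "power spectrum" compute. It is not mimo's CUDA code, which is not shown in this thread. The names `fft`, `power_spectrum` and `tone` are made up for illustration, and the recursive radix-2 algorithm is just a convenient textbook choice.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; the length must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])   # transform of the even-indexed samples
    odd = fft(samples[1::2])    # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def power_spectrum(samples):
    """Squared magnitude of every FFT bin."""
    return [z.real * z.real + z.imag * z.imag for z in fft(samples)]


def tone(n, k):
    """Hypothetical test signal: n samples holding exactly k cosine cycles."""
    return [math.cos(2 * math.pi * k * t / n) for t in range(n)]
```

For a pure tone that fits exactly k cycles into n samples, the power lands in bins k and n - k, which is the kind of narrow peak a signal search looks for.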

Test and crunch!!!

ID: 691437 · Report as offensive
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 20147
Credit: 7,508,002
RAC: 20
United Kingdom
Message 691453 - Posted: 14 Dec 2007, 18:43:34 UTC - in response to Message 691437.  

Wow indeed!!!

So, just to check...

I guess "Windows only"?

Is it worthwhile on 84xx, 85xx and 86xx cards?

(And do 87xx cards exist?)

And what does "tesla" refer to?


Looking very interesting!

Happy crunchin',
Martin

See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 691453 · Report as offensive
Profile UBT - NaRyan
Joined: 20 Oct 07
Posts: 89
Credit: 165,614
RAC: 0
United Kingdom
Message 691457 - Posted: 14 Dec 2007, 18:49:08 UTC - in response to Message 691453.  
Last modified: 14 Dec 2007, 18:51:18 UTC

Does it work on all Windows versions (XP, 2003, Vista... and both x86/x64)?
And are there any other download locations?
The Lunatics forum takes ages to load for me, and I don't see the point of having to register to download a file when I'm never going to post there.
ID: 691457 · Report as offensive
Profile -= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 691461 - Posted: 14 Dec 2007, 19:01:44 UTC
Last modified: 14 Dec 2007, 19:03:27 UTC

Got your message Mimo..

That's really sweet!

I'm sooo close to Ghosting my C: drive and reinstalling XP if I can't get things to work in Vista x64, just to test and see the speed..

Good work mimo..

Kind Regards Vyper

_________________________________________________________________________
Addicted to SETI crunching!
Founder of GPU Users Group
ID: 691461 · Report as offensive
Profile popandbob
Volunteer tester

Joined: 19 Mar 05
Posts: 551
Credit: 4,673,015
RAC: 0
Canada
Message 691470 - Posted: 14 Dec 2007, 19:59:29 UTC

Now if only I had a CUDA card... dang 7950GT!


Do you Good Search for Seti@Home? http://www.goodsearch.com/?charityid=888957
Or Good Shop? http://www.goodshop.com/?charityid=888957
ID: 691470 · Report as offensive
Profile mimo
Volunteer tester
Joined: 7 Feb 03
Posts: 92
Credit: 14,957,404
RAC: 0
Slovakia
Message 691472 - Posted: 14 Dec 2007, 20:09:47 UTC

Tesla is a new card from NVIDIA built primarily for GPU computing; see NVIDIA's pages.
It is not a graphics card.
This is the first, unoptimized version, so it is slower than the stock app.

ID: 691472 · Report as offensive
Profile -= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 691513 - Posted: 14 Dec 2007, 23:49:50 UTC - in response to Message 691472.  

Well, that's perfectly all right. It's better to get it to validate first and then start improving than to improve first and never get it to validate :)

So all in all, that seems to me to be the right order.

//Vyper

_________________________________________________________________________
Addicted to SETI crunching!
Founder of GPU Users Group
ID: 691513 · Report as offensive
Profile mimo
Volunteer tester
Joined: 7 Feb 03
Posts: 92
Credit: 14,957,404
RAC: 0
Slovakia
Message 691757 - Posted: 15 Dec 2007, 15:21:00 UTC
Last modified: 15 Dec 2007, 15:22:26 UTC

See here: http://setiathome.berkeley.edu/result.php?resultid=681495948
A real work unit crunched with my application.

ID: 691757 · Report as offensive
Profile Gecko
Volunteer tester
Joined: 17 Nov 99
Posts: 454
Credit: 6,946,910
RAC: 47
United States
Message 691787 - Posted: 15 Dec 2007, 16:32:00 UTC - in response to Message 691757.  

This is a FANTASTIC accomplishment and a true "first" w/ Seti.
Very exciting : > )
ID: 691787 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 691817 - Posted: 15 Dec 2007, 18:41:06 UTC
Last modified: 15 Dec 2007, 18:51:27 UTC

How does this work?

For example, say you have an AMD K8 with one CPU core.
Does BOINC then show 2 WUs in progress?

And would my quad-core then show 5 WUs in progress?

At the moment I have a GeForce 6200 LE, to save electricity.

Will your app be only for 8xxx cards?
Or will you extend your nice work to other cards too?

So which card should I buy in the future?

How do I install this app?

BTW:
Would the GeForce 8800 Ultra, for example, be faster than the GeForce 8400 GS? Because of the MHz?
ID: 691817 · Report as offensive
Profile mimo
Volunteer tester
Joined: 7 Feb 03
Posts: 92
Credit: 14,957,404
RAC: 0
Slovakia
Message 691820 - Posted: 15 Dec 2007, 19:13:45 UTC

It's only for NVIDIA cards with CUDA support: the 8xxx series, Tesla supercomputers, and some Quadros.

BOINC does not recognize the GPU as an additional core for now, so it is recommended to run only one instance of the app.
The application uses only one card for now, but in the future two instances might run on multiple cards without SLI.

Tesla will be faster than the 8800 GTX because it has two 8800 GPUs (2 × 16 multiprocessors, MP).
The 8800 is faster than the 8500 because the 8800 has 16 MP and the 8500 only 2. MHz matters less than the MP count.
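
The multiprocessor argument above can be put as rough arithmetic. The sketch below assumes, as a crude first-order model only, that throughput scales with MP count times clock. The MP counts are the ones quoted in the post; the clock figures are placeholders chosen for illustration, not specifications of any real card, and `relative_throughput` is a hypothetical helper.

```python
def relative_throughput(multiprocessors, clock_mhz):
    """Crude proxy: work per second scales with MP count times clock."""
    return multiprocessors * clock_mhz


# MP counts as quoted in the post; both clock values are placeholders.
big_card = relative_throughput(16, 1000)
small_card = relative_throughput(2, 1500)

# Even with a 50% higher (placeholder) clock, the 2-MP card stays far behind.
ratio = big_card / small_card
print(f"16 MP card is roughly {ratio:.1f}x the 2 MP card under this model")
```

Under this model a clock advantage has to be eight-fold before a 2-MP card catches a 16-MP one, which is why the MP count dominates the comparison.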

The app has been tested on Windows XP 32-bit, and only on that OS.

ID: 691820 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 691846 - Posted: 15 Dec 2007, 21:04:04 UTC



When will the BOINC client be able to support a GPU app?

Or could a BOINC client and a GPU client perhaps run at the same time?
After all, you can normally run two or more programs at the same time in Windows too, right?


ID: 691846 · Report as offensive
Profile -= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 691922 - Posted: 16 Dec 2007, 9:08:03 UTC - in response to Message 691846.  
Last modified: 16 Dec 2007, 9:13:23 UTC



As far as I can tell, mimo's plan is to let the GPU take over more and more of the S@H client's work so that the GPU is used to the max. I don't think he has yet experimented with running multiple WUs in parallel on the GPU, so that we can see the true benefit of it.

Mind you, if for instance an 8500GT GPU makes the most of itself running 6 WUs in parallel, and an 8800GTS+ running 10 WUs in parallel, then the people who develop the BOINC application itself would need to add a "Use GPU" button and a field for the number of virtual WUs you want to run in parallel with a supporting GPU app, in this case a CUDA-based one.

I really think we won't see GPUs being faster while running only one application, but if we add more parallelism, the benefit of a well-optimised GPU app will show. The trick is to have the S@H GPU app give up its unused CPU cycles, so that other applications get them while the CPU is just waiting for data to be fed. Perhaps by lowering that application's priority to below normal, so that you won't really notice it in the OS and the spare cycles are picked up by the normal S@H CPU client.

If this idea works, then perhaps you could run 8 GPU apps and 4 CPU apps in parallel. On a Core 2 Quad based system the CPU apps surely wouldn't finish in 1.5 hours of real time any longer, but perhaps in 3 hours each; the GPU would be fed with data, though, and you would get the most out of your system in terms of RAC anyway.

This is pure speculation about how I think we will get the most out of a GPU application overall, rather than in terms of how much faster than a CPU the app is, because I sincerely don't think we can compete on pure CPU-versus-GPU speed with today's bus protocols and speeds. This is where I think ATI/AMD will come in handy a few years from now, when they have developed a true high-speed bus to the GPU and a compiler that gets the most out of any application, games as well as FPU/SSE+ enhanced calculating applications.

My 2 cents for now.

Kind Regards Kevin Peterson

P.S. You will need a lot of memory if 12 S@H applications are to occupy system memory in Windows ;-) D.S

_________________________________________________________________________
Addicted to SETI crunching!
Founder of GPU Users Group
ID: 691922 · Report as offensive
Profile mimo
Volunteer tester
Joined: 7 Feb 03
Posts: 92
Credit: 14,957,404
RAC: 0
Slovakia
Message 691947 - Posted: 16 Dec 2007, 13:58:51 UTC

It is better to run one WU per core (for a GPU, the "core" is the whole card), as on a CPU.
On a GPU, "context" switching costs more time than task switching on a CPU, so running several WUs at once would only degrade performance.
So when running this client version, avoid running any application in OpenGL/Direct3D mode; the speed will degrade massively!!!
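
The reasoning above can be illustrated with a toy model. It assumes the GPU does the same total amount of compute either way, so interleaving several WUs only adds switching overhead. All numbers and both function names are hypothetical; nothing here was measured on real hardware.

```python
def sequential_time(n_wus, compute_s):
    """Run the work units one after another: no context switches."""
    return n_wus * compute_s


def interleaved_time(n_wus, compute_s, slices_per_wu, switch_cost_s):
    """Round-robin the work units in slices, paying for every switch.

    Toy assumption: a switch happens between every pair of consecutive
    slices when more than one work unit shares the GPU.
    """
    switches = n_wus * slices_per_wu - 1 if n_wus > 1 else 0
    return n_wus * compute_s + switches * switch_cost_s


# Hypothetical figures: 2 WUs of 100 s each, 10 slices per WU, 0.5 s per switch.
print(sequential_time(2, 100.0))
print(interleaved_time(2, 100.0, 10, 0.5))
```

In this model the interleaved run can never beat the sequential one; it would only come out ahead if a single WU left part of the GPU idle, which the model deliberately ignores.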

ID: 691947 · Report as offensive
Profile -= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 691972 - Posted: 16 Dec 2007, 17:16:50 UTC - in response to Message 691947.  

Oh, I didn't really know that! So the only way to speed things up is perhaps to divide a single WU into parallel "chunks", then?!
I sincerely thought the GPU would be better off running multiple apps, but hey, it was only a thought, as I said..

Better to be safe than sorry..

Kind regards Vyper

_________________________________________________________________________
Addicted to SETI crunching!
Founder of GPU Users Group
ID: 691972 · Report as offensive
Profile mimo
Volunteer tester
Joined: 7 Feb 03
Posts: 92
Credit: 14,957,404
RAC: 0
Slovakia
Message 692009 - Posted: 16 Dec 2007, 19:43:26 UTC

When you have several GPUs (absolutely identical: chip, memory size...), there is a way to run another instance of the app, which would need a slight modification: select the first unused GPU and run on it. Next, the app must run in async mode, to prevent the instances from blocking each other on the CPU. And last, the GPUs must not be in SLI mode; in SLI they appear as one.
You might say SLI is better, no? Because then their power is combined... For SETI it is not. SETI's math is not extensive enough for this mode; SLI would pay off if the number of points were about 10-20 million. Then it would be a good thing to use SLI mode to distribute the FFT and so on across the cards.
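
One way the "select the first unused GPU" step could be coordinated between instances is sketched below in Python. This is an assumption about how it might be done, not mimo's implementation: each instance claims a card by atomically creating a lock file, and the file naming, `claim_first_free_gpu` and `release_gpu` are all hypothetical. A real app would then pass the claimed index to its GPU runtime and would also need to clean up stale locks after a crash.

```python
import os
import tempfile


def claim_first_free_gpu(device_count, lock_dir):
    """Return the lowest device index not yet claimed, or None if all are taken."""
    for index in range(device_count):
        path = os.path.join(lock_dir, f"gpu{index}.lock")
        try:
            # O_EXCL makes creation fail if another instance got there first.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            continue
        os.write(fd, str(os.getpid()).encode())
        os.close(fd)
        return index
    return None


def release_gpu(index, lock_dir):
    """Give the device back by removing its lock file."""
    os.remove(os.path.join(lock_dir, f"gpu{index}.lock"))


# Simulate three instances starting on a machine with two cards.
with tempfile.TemporaryDirectory() as lock_dir:
    first = claim_first_free_gpu(2, lock_dir)
    second = claim_first_free_gpu(2, lock_dir)
    third = claim_first_free_gpu(2, lock_dir)
```

Here the first two instances get distinct cards and the third finds none free, so it can fall back to not starting at all rather than sharing a card.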

ID: 692009 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65709
Credit: 55,293,173
RAC: 49
United States
Message 692015 - Posted: 16 Dec 2007, 20:28:30 UTC

This is fantastic news. The only problem is I don't have an 8800 of any sort and it's going to be a while before I get one, so I can wait, as I have a 7800GTX right now.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 692015 · Report as offensive
Profile Rowe Family and Friends

Joined: 25 Dec 00
Posts: 17
Credit: 38,395,231
RAC: 67
New Zealand
Message 692416 - Posted: 18 Dec 2007, 0:26:53 UTC

Any way another download link can be added? Something you don't have to log in for?
ID: 692416 · Report as offensive
Profile mimo
Volunteer tester
Joined: 7 Feb 03
Posts: 92
Credit: 14,957,404
RAC: 0
Slovakia
Message 692565 - Posted: 18 Dec 2007, 12:39:21 UTC

I am sorry, but for now the application is for testing only, so you must register to download it.


ID: 692565 · Report as offensive
HDL
Volunteer tester

Joined: 20 Apr 05
Posts: 27
Credit: 11,577,352
RAC: 0
United Kingdom
Message 692764 - Posted: 19 Dec 2007, 10:14:20 UTC - in response to Message 692565.  
Last modified: 19 Dec 2007, 10:14:50 UTC

mimo, thanks for all the efforts.

In a system with 4 CPU cores and 1 GPU, how does BOINC know whether it should use the GPU application for the next work unit? In the present setup, how many units can run concurrently?

If there are 3 CPU applications and 1 GPU application, there will be no competition for the CPU, since the GPU application also needs CPU support. If there are 4 CPU applications and 1 GPU application, processor affinity could be ruined and performance would suffer. Am I right?
ID: 692764 · Report as offensive