Posts by mimo

81) Message boards : Number crunching : GPU crunching (Message 494063)
Posted 30 Dec 2006 by mimo
Post:
OK, after reading the info on this page, the Brook implementation is off the table, because its
speed cannot compete with CUDA ...
82) Message boards : Number crunching : GPU crunching (Message 493237)
Posted 29 Dec 2006 by mimo
Post:
A 256 x 256 matrix multiply has comparable speed on CPU and GPU ...
I will upload the binaries tomorrow evening.
128k (?) complex points is how many floats?
I ask because on many GPUs you can only upload a texture of up to 2048 x 2048 float4 ...
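
For the float count, here is a rough back-of-the-envelope check in Python. It assumes that "128k" means 128 * 1024 points and that each complex point is stored as two floats (real and imaginary part); neither assumption is confirmed in this thread, and the function names are made up for the example.

```python
# Assumption: "128k" = 128 * 1024 complex points, 2 floats per point.
COMPLEX_POINTS = 128 * 1024
FLOATS_PER_COMPLEX = 2

# Texture limit quoted in the post: 2048 x 2048 texels, 4 floats per float4 texel.
TEXTURE_SIDE = 2048
FLOATS_PER_TEXEL = 4


def floats_needed(points: int) -> int:
    """Number of floats needed to hold `points` complex values."""
    return points * FLOATS_PER_COMPLEX


def texture_capacity(side: int) -> int:
    """Number of floats that fit in a side x side float4 texture."""
    return side * side * FLOATS_PER_TEXEL


needed = floats_needed(COMPLEX_POINTS)
capacity = texture_capacity(TEXTURE_SIDE)
print(needed, capacity, needed <= capacity)  # 262144 16777216 True
```

Under those assumptions the data is 262,144 floats, which is well below the 16,777,216 floats a single 2048 x 2048 float4 texture can hold.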

Sorry for my stupid questions, but I have only been working with the SETI source for 5 hours ...
83) Message boards : Number crunching : GPU crunching (Message 493219)
Posted 29 Dec 2006 by mimo
Post:
OK, I took a look at the source code for the FFT in the SETI CVS.
Can someone please give me some extra explanation of the cdft routine's parameters?
If it really is a standard 1D DFT, then it is easy to implement ...
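
For reference, a "standard 1D DFT" in its textbook O(n^2) form can be sketched as below. This is only an illustration in Python: it does not model the cdft routine's parameters (which are the open question here), and the forward sign convention exp(-2*pi*i*k*t/n) is an assumption.

```python
import cmath


def dft(samples):
    """Naive O(n^2) discrete Fourier transform of a sequence of numbers.

    Assumes the forward convention X[k] = sum_t x[t] * exp(-2*pi*i*k*t/n).
    """
    n = len(samples)
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            # Each output bin is a dot product with one row of the DFT matrix.
            acc += samples[t] * cmath.exp(-2j * cmath.pi * k * t / n)
        out.append(acc)
    return out


# A constant signal puts all of its energy into bin 0.
spectrum = dft([1, 1, 1, 1])
print([round(abs(x), 6) for x in spectrum])
```

Each output bin is independent of the others, which is what makes this form straightforward to map onto a GPU, although a real FFT does far less work.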

And can someone please send me a working source tarball? ... Thanks
84) Message boards : Number crunching : GPU crunching (Message 493203)
Posted 29 Dec 2006 by mimo
Post:
brook compiler : SSE2 + max optimization with VS2005 + SP1 x86 compiled
brook runtime : SSE2 + max optimization with VS2005 + SP1 x86 compiled
selected dx9 brook backend
GPU : NV43 (6600 PCIE)
cpu : Athlon64 3000+ @3400, socket 939 (512 KB cache)
cpu multiply : standard math algorithm (3 loops)

test app : SSE2 + max optimization with VS2005 + SP1 x86 compiled (I think the SSE2 in the CPU multiply comes from the compiler, not from me)
OK, here are some numbers:

matrix multiply 1024*1024:
with brook : 2 sec
cpu only : 30 sec
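
The "standard math algorithm (3 loops)" on the CPU side is presumably the textbook triple loop. A minimal sketch in Python follows; the benchmarked test app was a VS2005 build, so this only illustrates the algorithm and is not the code that produced the timings above.

```python
def matmul(a, b):
    """Textbook triple-loop matrix multiply on lists of rows."""
    n, m, p = len(a), len(b), len(b[0])
    if any(len(row) != m for row in a):
        raise ValueError("inner dimensions do not match")
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):          # row of a
        for j in range(p):      # column of b
            total = 0.0
            for k in range(m):  # dot product along the shared dimension
                total += a[i][k] * b[k][j]
            c[i][j] = total
    return c


print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19.0, 22.0], [43.0, 50.0]]
```

For a 1024 x 1024 matrix this is about 1024^3 (roughly 1.07 billion) multiply-adds, and the quoted timings work out to a speedup of about 30 / 2 = 15x for the Brook version.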
For the FFT, send me a matrix representation of the algorithm, but I think it is similar to the DCT?


See the difference ...
85) Message boards : Number crunching : GPU crunching (Message 493174)
Posted 29 Dec 2006 by mimo
Post:
Hans, have you tried BrookGPU? I have implemented an (i)DCT algorithm in Brook and it is nice and fast (ffdshow tryout). Brook is very simple and nicely optimized.


Nope, I haven't looked at it yet.
Did you find any recent performance numbers for the 1D FFT?

The Core 2 gets up to 15 GFlops and is pretty tough to beat :o)


Regards Hans

Maybe I will compile a test program ...
86) Message boards : Number crunching : GPU crunching (Message 492152)
Posted 28 Dec 2006 by mimo
Post:
Hans, have you tried BrookGPU? I have implemented an (i)DCT algorithm in Brook and it is nice and fast (ffdshow tryout). Brook is very simple and nicely optimized.




 
©2020 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.