Hey Who, lets discuss code
Author | Message |
---|---|
Joined: 25 Jul 99 Posts: 517 Credit: 465,152 RAC: 0 |
Regarding your idea of splitting the chirped complex data into SIMD-size chunks... apparently the Intel IPP library is already doing something like that. This is from a profile disassembly of a WU run. The Intel library calls a function 'cFft_BlkSplit', which does the following:
w7_ipps_cFft_BlkSplit_32fc+200:
movaps xmm0,[edi]
movaps xmm1,[edi+10h]
movaps xmm2,[edi+20h]
movaps xmm3,[edi+30h]
add edi,40h
movaps xmm4,xmm0
unpckhps xmm4,xmm1
movaps xmm5,xmm2
unpcklps xmm2,xmm3
unpckhps xmm5,xmm3
movaps [edx+esi],xmm0
movaps [edx+esi+10h],xmm4
movaps [edx+esi+20h],xmm2
movaps [edx+esi+30h],xmm5
add edx,40h
sub eax,08h
jnle $-3dh (0x4adf78) |
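As a rough illustration of the "split complex data into SIMD-size chunks" idea discussed above, here is a minimal Python sketch. It is only a model of the data layout, not the IPP routine: the function name `split_into_simd_blocks`, the block width of 4 floats (one 128-bit SSE register), and the exact output ordering are all assumptions made for illustration. The disassembly above shuffles data with `unpcklps`/`unpckhps`, and its real output layout may differ.

```python
SIMD_WIDTH = 4  # assumption: 4 single-precision floats per 128-bit SSE register


def split_into_simd_blocks(interleaved, width=SIMD_WIDTH):
    """Regroup interleaved complex data [re0, im0, re1, im1, ...] so that
    each block holds `width` real parts followed by `width` imaginary parts.

    Hypothetical helper: it models the layout change only.
    """
    block = 2 * width  # floats consumed per block (width complex values)
    if len(interleaved) % block != 0:
        raise ValueError("input length must be a multiple of 2 * width")

    out = []
    for base in range(0, len(interleaved), block):
        chunk = interleaved[base:base + block]
        out.extend(chunk[0::2])  # the real parts of this block
        out.extend(chunk[1::2])  # the imaginary parts of this block
    return out


if __name__ == "__main__":
    # Four complex values: (0+10j), (1+11j), (2+12j), (3+13j)
    data = [0, 10, 1, 11, 2, 12, 3, 13]
    print(split_into_simd_blocks(data))
```

With this layout, one register load yields four real parts (or four imaginary parts) at once, which is what makes later SIMD arithmetic on the chunks straightforward.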
Joined: 17 Dec 99 Posts: 4215 Credit: 3,474,603 RAC: 0 |
"Regarding your idea of splitting the chirped complex data into simd size chunks...apparently the Intel IPP library is already doing something like that." (Your post shortened to save space.) You might want to find Chicken and discuss the code with him/her. They are doing the current 'optimized' versions of the software and should be able to discuss this with you. |
Joined: 9 Jul 99 Posts: 1199 Credit: 6,615,780 RAC: 0 |
Yeah well, we are ;) Ben has been part of that work for a while, mikey. Let him be :o) Regards, Simon. Donate to SETI@Home via PayPal! Optimized SETI@Home apps + Information |
Joined: 2 Aug 00 Posts: 1851 Credit: 5,955,047 RAC: 0 |
I don't think Chicken is a girl. Now if his name were Simona or Simone that would probably be a different story. I think he is a young (and very important to this project) man. Maybe about the age of one of my nephews, if I remember correctly. |
Joined: 9 Jul 99 Posts: 1199 Credit: 6,615,780 RAC: 0 |
31, to be exact, and male :o) Regards, Simon. |
Joined: 25 Jul 99 Posts: 517 Credit: 465,152 RAC: 0 |
For me, the di ranges from 3900 to 5500; I guess it depends on the workload. I wrote a few sampler lines of code in find_pulse, let it crunch WU #2 for about 2 minutes, then captured the output. The number inside the [brackets] is the number of times find_pulse was called with a given length. The number outside the brackets is the length. So as you can see, find_pulse was called 15 million times with a length of 17, 4 million times with a length of 33, and so on. This was only a 2 minute run, so scale these counts up to a run of a few hours. Short lengths are FAR more common. The new release of Chicken, due soon, should be around 25-30% faster.
Typical calling length values taken from a WU run:
LENGTH[ times used ]
-- 16913[ 105]
-- 8456[ 225]
-- 4228[ 465]
-- 2114[ 945]
-- 1057[ 5715]
-- 529[19125]
-- 264[68,985]
-- 132[260,865]
-- 66[1,043,970]
-- 33[4,118,610]
-- 17[15,480,990]
Which lengths cause the following di length values in sum2 and sum3:
--- Length = 132
pulse tbl3: (di: 44) [ 132]= 0, 44, 88
pulse tbl2: (di: 22) [ 198]= 132, 154,
pulse tbl2: (di: 11) [ 231]= 198, 209,
pulse tbl2: (di: 5) [ 247]= 231, 237,
--- Length = 66
pulse tbl3: (di: 22) [ 66]= 0, 22, 44
pulse tbl2: (di: 11) [ 99]= 66, 77,
pulse tbl2: (di: 5) [ 115]= 99, 105,
--- Length = 33
pulse tbl3: (di: 11) [ 33]= 0, 11, 22
pulse tbl2: (di: 5) [ 49]= 33, 39,
--- Length = 17
pulse tbl3: (di: 5) [ 17]= 0, 6, 11 |
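The di values and input offsets in the table above follow a simple pattern, which can be sketched in Python. This is an inference from the posted numbers only, not code taken from the SETI@home source: the helper name `fold_offsets`, its parameters, and the rounding rule (di is the fold period truncated to an integer, and each input offset is the period multiple rounded to nearest) are assumptions that happen to reproduce the rows shown. Where the output indices (132, 198, 231, 247, ...) come from is not modelled here.

```python
def fold_offsets(period, parts, base=0):
    """Return (di, input_indices) for one folding step.

    period: fold period in samples, possibly fractional (e.g. 17 / 3)
    parts:  3 for a "tbl3" (sum3) row, 2 for a "tbl2" (sum2) row
    base:   index where the data being folded starts

    Assumed rule, inferred from the posted table:
      di     = period truncated to an integer
      offset = k * period rounded to nearest, for k = 0 .. parts - 1
    """
    di = int(period)
    indices = [base + int(k * period + 0.5) for k in range(parts)]
    return di, indices


if __name__ == "__main__":
    # Length = 17: a single three-way fold with period 17/3
    print(fold_offsets(17 / 3, 3))

    # Length = 132: three-way fold, then the period is halved each step
    print(fold_offsets(132 / 3, 3))           # tbl3 row
    print(fold_offsets(22, 2, base=132))      # tbl2 row, di 22
    print(fold_offsets(11, 2, base=198))      # tbl2 row, di 11
    print(fold_offsets(5.5, 2, base=231))     # tbl2 row, di 5
```

Under this reading, a length of 17 gives a fractional period of 17/3, which would explain the uneven offsets 0, 6, 11 alongside di = 5. Likewise the period of 5.5 in the last tbl2 step would give the 231, 237 pair.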
Joined: 25 Nov 01 Posts: 21731 Credit: 7,508,002 RAC: 20 |
OK, I'm jumping in 99% blind here 'cos I ain't looked at the code!... Are you rewriting the chirp routines to make better use of SIMD for small chunks? Could that be extended so that you also make better use of L1 cache and then L2 cache for more of the range?? The 30% speedup sounds very interesting!... :-) Happy hackings, Regards, Martin See new freedom: Mageia Linux Take a look for yourself: Linux Format The Future is what We all make IT (GPLv3) |
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.