compiling seti cruncher
Author | Message |
---|---|
Metod, S56RKO Joined: 27 Sep 02 Posts: 309 Credit: 113,221,277 RAC: 9 |
I'm compiling seti cruncher (eventually to give some work to my linux/alpha machines) and am going through optimization switches. To verify correct crunching, I'm comparing 'my' results to the reference results. There's a nice table about how close different numbers have to be to the reference result to consider it OK. Nothing, however, is written about the pot values. Should they be identical to the reference result or what? I guess no gaussians should be completely missing, right? Metod ... |
Metod, S56RKO Joined: 27 Sep 02 Posts: 309 Credit: 113,221,277 RAC: 9 |
Bump. Metod ... |
Chris Bosshard Joined: 5 Jun 99 Posts: 86 Credit: 3,474,583 RAC: 0 |
Hi Metod, did you check out this site? http://www.pperry.f2s.com/ Chris Bosshard Visit my homepage astroinfo SETI page |
Ned Slider Joined: 12 Oct 01 Posts: 668 Credit: 4,375,315 RAC: 0 |
These are the validation limits I believe seti uses:

ra 0.00066 hours absolute
dec 0.01 degrees absolute
time 0.000011574 days absolute
frequency 0.01 Hz absolute
chirp_rate 0.01 Hz/s absolute
fft_len exact
powers 1% relative
chisqrs 1% relative
triplet period 0.01 second absolute
pulse period 0.1% relative
pulse threshold 1% relative
pulse snr 1% relative

I don't have any more info than this.

Ned *** My Guide to Compiling Optimised BOINC and SETI Clients *** *** Download Optimised BOINC and SETI Clients for Linux Here *** |
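As a rough illustration of how limits like these could be applied, here is a minimal C++ sketch. It assumes that "absolute" means a plain difference and "relative" means the difference as a fraction of the larger magnitude. The names `Tolerance`, `ToleranceKind`, `within_tolerance`, `kFrequencyHz` and `kPower` are made up for this example and are not taken from the validator source.

```cpp
#include <cassert>
#include <cmath>

// How a limit is interpreted (illustrative names, not from sah_validate).
enum class ToleranceKind { Absolute, Relative };

struct Tolerance {
    ToleranceKind kind;
    double limit;  // absolute: same unit as the value; relative: fraction (0.01 = 1%)
};

// Two of the limits quoted in the post above.
constexpr Tolerance kFrequencyHz{ToleranceKind::Absolute, 0.01};
constexpr Tolerance kPower{ToleranceKind::Relative, 0.01};

// Returns true when `mine` is close enough to `reference` under `tol`.
bool within_tolerance(double reference, double mine, const Tolerance& tol) {
    assert(tol.limit >= 0.0);
    const double diff = std::fabs(reference - mine);
    if (tol.kind == ToleranceKind::Absolute) {
        return diff <= tol.limit;
    }
    const double largest = std::fmax(std::fabs(reference), std::fabs(mine));
    if (largest == 0.0) {
        return true;  // both values are zero, so they agree exactly
    }
    return diff / largest <= tol.limit;
}
```

With these definitions, `within_tolerance(100.0, 100.5, kPower)` would accept a power that is about 0.5% off, while a frequency that is 0.02 Hz off would be rejected under `kFrequencyHz`.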
Metod, S56RKO Joined: 27 Sep 02 Posts: 309 Credit: 113,221,277 RAC: 9 |
> These are the validation limits I believe seti uses: [ snip ]

I've seen these limits (on your site, thank you). I still don't know how to treat the differences in pot lists. Additionally, I can't figure out which result data some of the result.xml entries correspond to. I hope to get some info from project developers ;) Metod ... |
Benher Joined: 25 Jul 99 Posts: 517 Credit: 465,152 RAC: 0 |
Metod,

There should be a one-to-one match for every signal in the baseline result file and the result file produced by your compiled client. For every <spike> in one file there should be a <spike> in yours. The same goes for every <triplet>, <pulse>, <gaussian>, etc. The inside parts of these signals (period, frequency, ascension, etc.) should use the rules Ned listed to see if there is a close enough match.

From looking at the source for sah_validate I don't see any references to <pot>, so I'm going to suggest they don't compare them.

Here is how to compute absolute and relative differences, as mentioned.

----------
// the difference between two numbers,
// as a fraction of the largest in absolute value
//
double rel_diff(double x1, double x2) {
    if (x1 == 0 && x2 == 0) return 0;
    if (x1 == 0) return 1;
    if (x2 == 0) return 1;

    double d1 = fabs(x1);
    double d2 = fabs(x2);
    double d = fabs(x1-x2);
    if (d1 > d2) return d/d1;
    return d/d2;
}

double abs_diff(double x1, double x2) {
    return fabs(x1-x2);
} |
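The one-to-one matching described in the post above could be sketched roughly as follows. This is not how sah_validate does it. The `Spike` struct is a simplified stand-in with only two fields, `SpikeLimits` and all function names are hypothetical, and the pairing is a simple greedy search rather than an optimal assignment.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Simplified stand-in for a <spike> entry; real result files carry more fields.
struct Spike {
    double frequency_hz;
    double power;
};

// Hypothetical limits: absolute for frequency, relative for power.
struct SpikeLimits {
    double frequency_abs_hz;
    double power_rel;
};

// Difference as a fraction of the larger magnitude (same idea as rel_diff).
double relative_difference(double x1, double x2) {
    const double largest = std::fmax(std::fabs(x1), std::fabs(x2));
    if (largest == 0.0) return 0.0;
    return std::fabs(x1 - x2) / largest;
}

bool spikes_match(const Spike& a, const Spike& b, const SpikeLimits& limits) {
    assert(limits.frequency_abs_hz >= 0.0 && limits.power_rel >= 0.0);
    return std::fabs(a.frequency_hz - b.frequency_hz) <= limits.frequency_abs_hz &&
           relative_difference(a.power, b.power) <= limits.power_rel;
}

// True when both lists have the same length and every reference spike can be
// paired with a distinct, not yet used spike from the candidate result.
bool one_to_one_match(const std::vector<Spike>& reference,
                      const std::vector<Spike>& candidate,
                      const SpikeLimits& limits) {
    if (reference.size() != candidate.size()) return false;
    std::vector<bool> used(candidate.size(), false);
    for (const Spike& ref : reference) {
        bool found = false;
        for (std::size_t i = 0; i < candidate.size(); ++i) {
            if (!used[i] && spikes_match(ref, candidate[i], limits)) {
                used[i] = true;
                found = true;
                break;
            }
        }
        if (!found) return false;  // a reference signal is missing or too far off
    }
    return true;
}
```

A missing gaussian, as in the case discussed in this thread, would show up in a check of this kind as a length mismatch between the two lists.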
Metod, S56RKO Joined: 27 Sep 02 Posts: 309 Credit: 113,221,277 RAC: 9 |
Would you guys with custom-compiled seti crunchers enlighten me on how to do it properly? I'm kinda desperate. For a week I've been trying to get a decent working executable using two different C/C++ compilers (gcc 3.4.4 and Intel icc 8.1) with different levels of optimization. I tried two different snapshots (2005-03-23 and 2005-03-31), both declaring themselves as 4.07, and could not produce a binary that wouldn't miss a certain gaussian which is found by the official 4.02 client. Even if I compile the thing using gcc with only '-O1', the result is worthless.

[ed] I'm trying this on my linux/x86 FWIW [/ed]

[ed 2] The difference between the results of the gcc -O1 and icc -fast compiled crunchers is negligible precision-wise and 25% CPU-wise. The official cruncher is in the middle time-wise. [/ed 2]

My (a bit wild) guess is that I'm trying to compile sources that are too fresh. Is there some smoke here? Metod ... |
Benher Joined: 25 Jul 99 Posts: 517 Credit: 465,152 RAC: 0 |
Congratulations on getting it to compile. That's no mean feat sometimes ;)

Eric Korpela has been adding some math to the v_chirpData function recently, in line with some discussions he and other mathematicians have been having on the boinc_opt or boinc_dev mailing list. I believe it was on or about Feb 17th-24th. If you get a version from before those dates of "analyseFuncs.cpp, seti.cpp and seti.h", that would, I believe, bring you back to where the signal detection code is currently.

I can't recall if the CVS for setiboinc is available as a navigable website yet (like the BOINC source is). If not, you would need to find a tarball with an earlier copy of those files, or do a CVS checkout of those files only with the date-specific command line argument.

--------
If I read your earlier message correctly, the Intel Linux compiler produces seti code that is 12.5% faster than the current "live" seti cruncher? |
Metod, S56RKO Joined: 27 Sep 02 Posts: 309 Credit: 113,221,277 RAC: 9 |
> If you get a version from before those dates of "analyseFuncs.cpp, seti.cpp and seti.h", that would, I believe bring you back to where the signal detection code is currently.

Benher, you're my man. Now my client produces a result which would pass the validation. I'm going to test it on another host, but here's the speedup:

official: 12424 seconds
my: 8183 seconds

Both results on dual X @ 3.06 GHz HT (a quarter of a machine ;) )

My cruncher is optimized for P4 and I am going to test it on a P4 @ 1.7GHz. And the best thing is: it's completely statically linked.

Now I can build a P2-optimized client and check my older boxes if they benefit the same rate as this beast. Metod ... |
Ned Slider Joined: 12 Oct 01 Posts: 668 Credit: 4,375,315 RAC: 0 |
> I'm going to test it on another host, but here's the speedup:
> official: 12424 seconds
> my: 8183 seconds

WOW - care to elaborate exactly how you got this much improvement over the official 4.02 seti/linux client??

Ned *** My Guide to Compiling Optimised BOINC and SETI Clients *** *** Download Optimised BOINC and SETI Clients for Linux Here *** |
Metod, S56RKO Joined: 27 Sep 02 Posts: 309 Credit: 113,221,277 RAC: 9 |
> WOW - care to elaborate exactly how you got this much improvement over the official 4.02 seti/linux client??

No big deal. I took the seti_boinc-client-cvs-2005-03-31.tar.gz, boinc_public-cvs-2005-03-31.tar.gz, took the files benher mentioned from seti_boinc-client-cvs-2005-02-16.tar.gz and then followed the instructions from your pages. The key here is the following:

export CFLAGS="-fast -mcpu=pentium4 -march=pentium4 -tpp7 -parallel -ipo -ipo_obj"
export CXXFLAGS=${CFLAGS}
CC=icc CXX=icpc ./configure --disable-server

The above is basically the full optimization available for P4 in Intel CC.

As for the static executable: I noticed that configure (I will look into it if I have time during the weekend) creates a Makefile that explicitly links with the shared version of libjpeg and thus with the shared version of libc (and another one which I don't remember now). So I removed client/seti_boinc and re-ran the command which creates seti_boinc. I did replace '-l/usr/lib/libjpeg.so.62' with '-ljpeg' and ... voilà, a statically linked executable.

[edit] Just finished compilation with PII optimizations. Ned, if you care to get the two binaries for testing, you can mail me at metodk_at_gmail_dot_com. [/edit] Metod ... |
N/A Joined: 18 May 01 Posts: 3718 Credit: 93,649 RAC: 0 |
> I took the seti_boinc-client-cvs-2005-03-31.tar.gz, boinc_public-cvs-2005-03-31.tar.gz, took the files benher mentioned from seti_boinc-client-cvs-2005-02-16.tar.gz and then followed the instructions from your pages.

What's the URL?

> export CFLAGS="-fast -mcpu=pentium4 -march=pentium4 -tpp7 -parallel -ipo -ipo_obj"
> export CXXFLAGS=${CFLAGS}
> CC=icc CXX=icpc ./configure --disable-server

GCC3, right? I'm thinking export CFLAGS="-mcpu=7450 -fast"... |
Ned Slider Joined: 12 Oct 01 Posts: 668 Credit: 4,375,315 RAC: 0 |
I could test a client optimized for athlon-xp if you cared to do one. Well, it seems your improvements are in large part coming from the use of the intel compiler (icc) over the gnu compiler (gcc). Very interesting :) Ned *** My Guide to Compiling Optimised BOINC and SETI Clients *** *** Download Optimised BOINC and SETI Clients for Linux Here *** |
Metod, S56RKO Joined: 27 Sep 02 Posts: 309 Credit: 113,221,277 RAC: 9 |
> I took the seti_boinc-client-cvs-2005-03-31.tar.gz, boinc_public-cvs-2005-03-31.tar.gz, took the files benher mentioned from seti_boinc-client-cvs-2005-02-16.tar.gz and then followed the instructions from your pages.
> What's the URL?

Ned's pages have it all: seti and boinc.

> export CFLAGS="-fast -mcpu=pentium4 -march=pentium4 -tpp7 -parallel -ipo -ipo_obj"
> export CXXFLAGS=${CFLAGS}
> CC=icc CXX=icpc ./configure --disable-server
> GCC3, right? I'm thinking export CFLAGS="-mcpu=7450 -fast"...

No, Intel CC v8.1.

@Ned: Intel CC only performs optimizations for Intel CPUs. You can still try P4-optimized on Athlon (I've heard that recent ICC versions produce quite AMD-friendly code). Metod ... |
N/A Joined: 18 May 01 Posts: 3718 Credit: 93,649 RAC: 0 |
Thanks for the link.

> Intel CC v8.1.

[shrug] Oh, well. Once I get my slow box crunching properly I'll try using export with gcc - it's worth a try. |
Ned Slider Joined: 12 Oct 01 Posts: 668 Credit: 4,375,315 RAC: 0 |
> @Ned: Intel CC only performs optimizations for Intel CPUs. You can still try P4-optimized on Athlon (I've heard that recent ICC versions produce quite AMD-friendly code).

LOL - well that's entering into the spirit of openness!!

I know PIII and P4 optimized code from gcc won't run on an athlon so I can't believe that code generated by icc would be any different, unless it's retaining full i686 compatibility (like using -mcpu instead of -march).

One other thing we (Chris and I) had previously discussed was the possibility of making a generic i686 optimized client. Most of our improvements with gcc came from -ffast-math rather than any specific processor architecture optimizations, so I wondered if a static -mcpu=i686 generic client may be the way to go for broad compatibility with improved performance. Would it be possible for you to create such a client using the icc compiler and compare performance to your P4 optimized client? We are unable to produce static builds atm due to limitations in gcc passing the -static flags properly.

Ned *** My Guide to Compiling Optimised BOINC and SETI Clients *** *** Download Optimised BOINC and SETI Clients for Linux Here *** |
Metod, S56RKO Joined: 27 Sep 02 Posts: 309 Credit: 113,221,277 RAC: 9 |
> > @Ned: Intel CC only performs optimizations for Intel CPUs. You can still try P4-optimized on Athlon (I've heard that recent ICC versions produce quite AMD-friendly code).
>
> LOL - well that's entering into the spirit of openness!!
>
> I know PIII and P4 optimized code from gcc won't run on an athlon so I can't believe that code generated by icc would be any different, unless it's retaining full i686 compatibility (like using -mcpu instead of -march).

Well, my first step was to compile boinc CC using icc (just to optimize benchmark ;) ). A friend of mine is successfully running my icc P4 optimized boinc CC on (at least one) AMD box. So it might be worth a try (I might succeed in persuading him to kick in ...). Metod ... |
Benher Joined: 25 Jul 99 Posts: 517 Credit: 465,152 RAC: 0 |
> Now I can build a P2-optimized client and check my older boxes if they benefit the same rate as this beast.

Don't expect great results from "P2 optimized", as the original seti with 686 flags is probably as good as that gets (maybe -O3 gives a slight improvement). A Pentium II has big issues with L2 cache size, RAM access speed, and the FPU speed of the CPU itself.

----------
Intel "may" know a thing or two about Pentium 4 optimizations ;) I know that on the Windows version of their compiler they can detect C++ code that can be compiled into SIMD FPU code. They call it "vectorized". So if they saw a loop of code that was adding 4 single precision floats, their compiler could turn it into a SIMD 4 add instruction. It would probably find lots of opportunities inside of the Fast Fourier Transform code (which is about 36% of execution time).

----------
athlon-xp...
Regular Athlon has 3DNow+ abilities, so a Pentium II optimized client should work there.
Athlon-XP adds SSE abilities, so a Pentium III optimized client might well work there.
Athlon 64 or Athlon XP-M add SSE2 abilities, so Pentium IV optimized might work. |
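To make the "vectorized" remark concrete, here is a minimal C++ sketch of the kind of loop an auto-vectorizing compiler may be able to turn into packed SIMD adds (four single precision floats per SSE instruction). The function name `add_arrays` is made up for this example and is not from the seti source. Whether a given compiler actually vectorizes the loop depends on the compiler, its flags, and what it can prove about pointer aliasing.

```cpp
#include <cassert>
#include <cstddef>

// Element-wise sum of two float arrays. Every iteration is independent of the
// others, which is what makes the loop a candidate for SIMD code generation.
void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
    assert(a != nullptr && b != nullptr && out != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

The scalar and vectorized versions of such a loop compute the same sums. Only the number of elements handled per instruction differs.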
Chris Bosshard Joined: 5 Jun 99 Posts: 86 Credit: 3,474,583 RAC: 0 |
Very interesting indeed.... ;-) Metod, let us know if you have compiled an i686 version. I would be interested to try that on some of my boxes, comparing it to the clients I have compiled with GCC. Chris Chris Bosshard Visit my homepage astroinfo SETI page |
Metod, S56RKO Joined: 27 Sep 02 Posts: 309 Credit: 113,221,277 RAC: 9 |
Life is full of surprises. My PII-optimized client on a P3 (dual 1GHz coppermine 256kB cache) did gain some speed-up:

official: 38325 seconds
optimized: 34549 seconds (10% speedup)

Test run on my 'straight' P4 (non-HT, 1.7GHz, 256kB cache) using the P4-optimized client:

official: 25040 seconds
optimized: 16242 seconds (54% speedup)

This speedup is quite significant compared to the one for Xeon HT (see this post).

[edit] I rechecked the meaning of icc optimization flags and it turned out that my P4-optimized client was indeed using the P4 instruction set but not SSE3 instructions (probably only SSE2/MMX). What this means for Athlons ... I don't know. [/edit]

I will compile an i686-optimized client without CPU-specific optimizations and let you know the results. Metod ... |
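The percentages quoted in this thread can be read in two ways, and the sketch below spells out both. It assumes that "speedup" means the ratio of run times minus one, which reproduces the roughly 54% figure for the P4 run (by the same measure the P3 run comes out at about 11%). The function names are made up for this example.

```cpp
#include <cassert>

// Speedup in percent: how much more work per unit of time the optimized
// client does compared with the official one.
double speedup_percent(double official_seconds, double optimized_seconds) {
    assert(official_seconds > 0.0 && optimized_seconds > 0.0);
    return (official_seconds / optimized_seconds - 1.0) * 100.0;
}

// Share of the official run time that the optimized client saves, in percent.
double time_saved_percent(double official_seconds, double optimized_seconds) {
    assert(official_seconds > 0.0 && optimized_seconds > 0.0);
    return (1.0 - optimized_seconds / official_seconds) * 100.0;
}
```

For the P4 run, `speedup_percent(25040, 16242)` gives roughly 54, while `time_saved_percent(25040, 16242)` gives roughly 35, so it matters which of the two figures is being compared between posts.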