Version 3.4 of Faster SETI cruncher for Linux
Author | Message |
---|---|
Joined: 23 Jul 99 Posts: 311 Credit: 6,955,447 RAC: 0 |
Part II: To HT or not to HT: that is the question. Test machine: HT testing on a single P4 3.2GHz Prescott with sse, sse2 and pni support, 2GB RAM. Result: time spent on a reference workunit, in minutes. 2 processes in parallel: setiathome_SSE3-naparst-r3.4: 55; setiathome_SSE2-naparst-r3.4: 56. 1 process: setiathome_SSE3-naparst-r3.4: 29. So, the answer is: it depends. I recommend leaving "Number of processors" at 2 in the Boinc preferences. Harold indicated that the FSB is the bottleneck. For the dual-Xeon, the shared FSB seems to be much more of a limitation than for a single P4. |
Ned Slider Joined: 12 Oct 01 Posts: 668 Credit: 4,375,315 RAC: 0 |
Yes, because you now have four simultaneous processes sharing that bandwidth instead of two. Interesting that even the single HT processor running two processes gains very little: 2 WUs in HT mode take 56 mins, versus 2 x 29 = 58 mins for 2 single WUs in non-HT mode. It would appear that hyperthreading actually gains you very little (~3%). Ned *** My Guide to Compiling Optimised BOINC and SETI Clients *** *** Download Optimised BOINC and SETI Clients for Linux Here *** |
Joined: 23 Jul 99 Posts: 311 Credit: 6,955,447 RAC: 0 |
Indeed. Also note that Boinc benchmarking on HT computers is nearly meaningless. The Boinc benchmark uses one process and symmetric multi-processing across the specified number of processors, while the Seti application runs as a single process with a single thread. I found that running a single P4/HT with 2 "Boinc CPUs" produces better claimed credit for a nearly equal (within 3%) amount of work. A workunit claims about 15-18 credit with Harold's ICC-for-P4 boinc. Running the dual-Xeon/HT with 4 "Boinc CPUs" is so much slower that it is not worth the credit. A work unit typically claims about 10 credit when running with 2 "Boinc CPUs". Same boinc client. |
Joined: 4 Jun 99 Posts: 112 Credit: 471,529 RAC: 0 |
I do not dispute that 3% is very little. But in the context of machines crunching 24/7, it adds up. If I offer you $100 or $103 today, it may not seem like a big difference. But $100 every day versus $103 every day, and the propositions do not compare. |
HDL Joined: 20 Apr 05 Posts: 27 Credit: 11,577,352 RAC: 0 |
Which is the best SETI client for a Pentium with HT? Hans Dorn, using the TMR 4th Nov client, can crunch two work units in around 2100s (35 mins at 3.2GHz). From what I saw below, Harold's client can crunch 2 work units in 56 mins. |
Hans Dorn Joined: 3 Apr 99 Posts: 2262 Credit: 26,448,570 RAC: 0 |
Quote: "Which is the best SETI client for Pentium with HT? Hans Dorn using TMR 4th Nov client can crunch two work units at around 2100s (35 mins at 3.2GHz). From what I saw below, Harold's client can crunch 2 work unit at 56 mins." Whoops. That's just me being too lazy to fix the credits. The client I'm using is similar to Harold's. I compiled my own version. Regards Hans |
Joined: 15 Apr 99 Posts: 1546 Credit: 3,438,823 RAC: 0 |
Quote: "Which is the best SETI client for Pentium with HT? Hans Dorn using TMR 4th Nov client can crunch two work units at around 2100s (35 mins at 3.2GHz). From what I saw below, Harold's client can crunch 2 work unit at 56 mins." Harold's 3.4 client is the fastest setiathome app for the P4. Join BOINC United now! |
HDL Joined: 20 Apr 05 Posts: 27 Credit: 11,577,352 RAC: 0 |
Quote: "Which is the best SETI client for Pentium with HT? Hans Dorn using TMR 4th Nov client can crunch two work units at around 2100s (35 mins at 3.2GHz). From what I saw below, Harold's client can crunch 2 work unit at 56 mins." Then why is there such a big difference between Hans' work unit time and Michael37's? 35 mins vs 56 mins? Is Hans' 3.2GHz computer overclocked? |
Hans Dorn Joined: 3 Apr 99 Posts: 2262 Credit: 26,448,570 RAC: 0 |
Quote: "Which is the best SETI client for Pentium with HT? Hans Dorn using TMR 4th Nov client can crunch two work units at around 2100s (35 mins at 3.2GHz). From what I saw below, Harold's client can crunch 2 work unit at 56 mins." Hi, I'm not sure which host you're referring to :o) I just used Andy's Page to get some averages: P4 3.2 (1MB L2): 3200s; P4 3.2 (2MB L2): 2550s; P4D 3.2@3.6: 1810s; Pentium-M 2.0@2.1: 2100s. Regards Hans P.S.: The P4D absolutely rocks. It overclocks remarkably well, too :o) |
HDL Joined: 20 Apr 05 Posts: 27 Credit: 11,577,352 RAC: 0 |
Quote: "Which is the best SETI client for Pentium with HT? Hans Dorn using TMR 4th Nov client can crunch two work units at around 2100s (35 mins at 3.2GHz). From what I saw below, Harold's client can crunch 2 work unit at 56 mins." The P4 3.2 (2MB L2) figure of 2550s is about 42.5 mins (for 2 work units). The P4 3.2 (1MB L2) figure of 3200s is about 53.3 mins (for 2 work units). That is still better than Michael37's 56 mins (by 5%). How do you achieve that? I hope the figures you give are averaged over many units, so that they are comparable to the reference unit. The P4D is excellent. I have a P4D XE 3.2GHz, overclocked to 3.9GHz: 47 mins for 4 work units. RAC is 2354 at the moment. I am thinking of buying a P4D 3.0. It is 1/3 the price of the P4D XE 3.2, and I can overclock it to around 3.6GHz. That will be 60 mins for 4 work units (according to your figures above). The performance/price ratio is much better. Best regards |
Joined: 3 Apr 99 Posts: 98 Credit: 2,667,122 RAC: 1 |
My P4 3.0 Prescott, overclocked to 3.35, has achieved 58 min for 2 WUs... it seems a bit slow... |
Joined: 23 Jul 99 Posts: 311 Credit: 6,955,447 RAC: 0 |
To continue my topic, the next question is: Eh? What is Hyperthreading? Test machine: dual AMD Opteron(tm) Processor 242, 1.6GHz, 1MB L2 cache, 1GB RAM. Result: time spent on a reference workunit, in minutes. Kernel 2.4.21-37.ELsmp, 32-bit for athlon, 2 processes in parallel: AthlonXP_SSE_FFTW3_Caching_V2.09s w/wisdom: 68; setiathome_SSE2-naparst-r3.4: 58. Kernel 2.6.9-22.0.1.ELsmp, 64-bit for Opteron (not the default "Generic x86_64"), 2 processes in parallel: AthlonXP_SSE_FFTW3_Caching_V2.09s w/wisdom: 65; setiathome_SSE2-naparst-r3.4: 59. Has anyone made an FFTW/SSE2 version of the client? I would love to benchmark it too. While we are at it, a couple of boinc benchmarks. Version 4.72: boinc_amd64_immoral (naparst): 2177/6002; boinc_naparst_p4: 2101/4061; Boinc_4.72_AthlonXP: 1799/3319. Version 4.19: ned's AMD64 64-bit: 1452/5524; ned's AMD64 32-bit: 1699/4641. |
Joined: 15 Apr 99 Posts: 1546 Credit: 3,438,823 RAC: 0 |
Quote: "To continue my topic, the next questions is Eh? What is Hyperthreading?" To answer your question about FFTW/SSE2: it doesn't work because we are using FFTW3 in single precision (SSE); SSE2 is double precision, and we don't need that ;) P.S. You should also take a look at my page for boinc clients. I guess they are the fastest on Linux (except the immoral amd64). |
Joined: 23 Jul 99 Posts: 311 Credit: 6,955,447 RAC: 0 |
I see. Well, the AthlonXP FFTW/SSE build was about 15% slower. For boinc v4 clients, Harold's ICC/IPP build beat the GCC AthlonXP one. I bet SSE2 definitely helps with Whetstone. I'll try boinc v5 clients next. |
Ned Slider Joined: 12 Oct 01 Posts: 668 Credit: 4,375,315 RAC: 0 |
I built a 32-bit SSE2-enabled seti client using FFTW3 for athlon64, based on Harold's source tree, and benchmarked it on one of Harold's AMD X2 systems. It was slower than Harold's SSE2 clients. Ned |
Joined: 23 Jul 99 Posts: 311 Credit: 6,955,447 RAC: 0 |
@Crunch3r First of all, thanks for putting together optimized boinc and seti for Itanium. Even though I think that Itanium was Intel's biggest mistake in recent years, I have one system with an Itanium 2 running RHEL4. I tried your builds for Itanium, and they work. Again, thank you. However, the seti application looks quite dated. Two questions: Was it built with ICC or GCC? Considering I can't even build my own boinc on EM64T with icc, I won't even try building it on IA64. Do you want to try building Harold's client on Itanium? Do you want to give me some pointers? michael37 |
Joined: 15 Apr 99 Posts: 1546 Credit: 3,438,823 RAC: 0 |
@Crunch3r First of all, the ia64 client was built with gcc, and sure, it is a bit outdated. If you want a new client, I can build one for you. P.S. It would be interesting if you could post processing times on your Itanium (your computers are hidden, so I can't look myself). |
Joined: 23 Jul 99 Posts: 311 Credit: 6,955,447 RAC: 0 |
That would be nice. Is there a reason why you haven't tried icc and ipp? Both are available for Itaniums. Here it is. The average time is 9,757s. |
Joined: 15 Apr 99 Posts: 1546 Credit: 3,438,823 RAC: 0 |
I think I can cut processing times at least by half with a new client :-) My Alpha ev67@600MHz went down from 30000 sec. per WU to 3h30min running the 3.5 client! That should be possible with the Itanium too. I guess 2h per WU or less is possible. I'll post a link to the new client when I've built it. |
Joined: 14 Apr 01 Posts: 435 Credit: 842,179 RAC: 0 |
@Crunch3r Hi! I thought this morning that it wouldn't be a bad idea to test your optimized seti clients against the reference work unit on my Athlon XP 3200+. So I tested your clients V2.07s, V2.08s and V2.09s on this host. The tests finished a few minutes ago, but the result is somewhat confusing to me. Here are the results: AthlonXP_SSE_FFTW3_Caching_V2.07s: 5808.850943; AthlonXP_SSE_FFTW3_Caching_V2.08s: 5508.440612; AthlonXP_SSE_FFTW3_Caching_V2.09s: 8814.406029. Do you know why 2.09 is slower than 2.08 or 2.07 (on my host)? The hardware is an Athlon XP 3200+, no overclocking, 512 MB RAM. The OS is Debian 3.1, kernel 2.6.8-2-386. It would be nice if you could give me some enlightenment ;-) Regards rattelschneck |
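
The comparisons in this thread come down to simple throughput arithmetic. Here is a minimal Python sketch of it, using only figures quoted in the posts above. The function names `ht_gain` and `slowdown` are illustrative assumptions and are not part of any BOINC or SETI tool.

```python
def ht_gain(single_wu_min, pair_wu_min):
    """Fraction of time saved by running two work units in parallel (HT)
    instead of running the same two work units one after the other."""
    sequential = 2 * single_wu_min
    return (sequential - pair_wu_min) / sequential


def slowdown(new_time, old_time):
    """Ratio of new run time to old run time; above 1.0 means slower."""
    return new_time / old_time


if __name__ == "__main__":
    # P4 3.2GHz Prescott: 29 min for one WU, 56 min for two in parallel.
    print(f"HT gain: {ht_gain(29, 56):.1%}")
    # Athlon XP 3200+: V2.09s versus V2.08s on the reference work unit.
    print(f"V2.09s vs V2.08s: {slowdown(8814.406029, 5508.440612):.2f}x")
```

With the 55-minute SSE3 figure in place of 56, the same calculation gives a somewhat larger gain, so the "~3%" estimate depends on which pair of runs is compared.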