Are there any site providing optimized clients?

Profile KWSN - Chicken of Angnor
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 9 Jul 99
Posts: 1199
Credit: 6,615,780
RAC: 0
Austria
Message 335374 - Posted: 13 Jun 2006, 4:28:43 UTC
Last modified: 13 Jun 2006, 4:33:45 UTC

Since I'm using Eric's sources to compile, I'd hope the multiplier is how it should be.

As stated, I did not modify the source other than removing some double declarations; the multiplier did not factor into that equation... no idea where exactly it even is :)

Where exactly in the result.sah file does it state what sort of credit it's claiming? Haven't studied it long enough to figure that out yet.
The only thing I can find that seems related is "flops" which also appear in the result stats when you check online.

I've seen people post formulae for claiming credit, but how does it really work?
Donate to SETI@Home via PayPal!

Optimized SETI@Home apps + Information
ID: 335374 · Report as offensive
Profile KWSN - Chicken of Angnor
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 9 Jul 99
Posts: 1199
Credit: 6,615,780
RAC: 0
Austria
Message 335423 - Posted: 13 Jun 2006, 5:23:56 UTC - in response to Message 335363.  

Stock 5.12 client as distributed to everyone
700 seconds

GCC SSE2-optimized 5.15 client
625 seconds

ICC/IPP SSE2-optimized 5.15 client (first compile)
520 seconds

So:

GCC SSE2 takes ~10% less time than stock,
ICC/IPP SSE2 takes ~25% less time than stock.


Or the other way around, so the figures look bigger ;)

Stock takes ~35% longer than ICC/IPP SSE2.
Stock takes ~12% longer than GCC SSE2.
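For anyone who wants to check the percentages, here is a minimal sketch of the arithmetic in Python. The run times are the ones quoted above; the function names are made up for illustration.

```python
# Run times in seconds, taken from the figures posted above.
STOCK = 700.0
TIMINGS = {"GCC SSE2": 625.0, "ICC/IPP SSE2": 520.0}


def percent_less(baseline_s, optimized_s):
    """How much less time the optimized build takes, relative to the baseline."""
    return (baseline_s - optimized_s) / baseline_s * 100.0


def percent_longer(baseline_s, optimized_s):
    """How much longer the baseline takes, relative to the optimized build."""
    return (baseline_s - optimized_s) / optimized_s * 100.0


for name, seconds in TIMINGS.items():
    print(
        f"{name}: {percent_less(STOCK, seconds):.1f}% less time than stock; "
        f"stock takes {percent_longer(STOCK, seconds):.1f}% longer"
    )
```

The two views use different denominators (stock time vs. optimized time), which is why the same 180-second gap reads as roughly 26% one way and roughly 35% the other.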



ICC/IPP SSE only 5.15 client
512 seconds

Even a bit quicker than my SSE2 one :) Using the same CFLAGS, I believe. This is all on the same machine, with no other load on it, but I think an 8-second difference is within normal run-to-run variation.

More results upcoming, I think I found some more aggressive compiler flags for ICC.
ID: 335423 · Report as offensive
Hans Dorn
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 3 Apr 99
Posts: 2262
Credit: 26,448,570
RAC: 0
Germany
Message 335424 - Posted: 13 Jun 2006, 5:28:32 UTC
Last modified: 13 Jun 2006, 5:39:25 UTC

Looks VERY good so far.

Simon, could you drop me a line at hans.dorn@gmail.com ?

Regards Hans

P.S: Don't forget to post your compiler flags :o)

P.P.S:

The infamous Credit multiplier can be found in "seti.h":

#define LOAD_STORE_ADJUSTMENT 3.51


Change this to 3.35 if you want to run your code "in the wild", or wait until 5.17 is out.
ID: 335424 · Report as offensive
Profile KWSN - Chicken of Angnor
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 9 Jul 99
Posts: 1199
Credit: 6,615,780
RAC: 0
Austria
Message 335466 - Posted: 13 Jun 2006, 6:52:53 UTC
Last modified: 13 Jun 2006, 7:17:50 UTC

Well, I figured out something else with regard to sincos.h - it's there for systems that don't provide these math functions themselves. However, that doesn't seem to be enough for ICC. Since I know that my system offers these functions, I didn't fiddle around with sincos.h and instead removed the "#include <sincos.h>" lines from fft8g.cpp and analyzeFuncs.cpp. This resulted in stable builds from ICC without any further annoyances.

It has to be noted that it compiles fine with those includes when you use GCC. ICC is a fickle mistress.


---note
When you do profiling, the first build has *all* optimizations disabled. With that build, a WU takes ~22 minutes. Just for comparison. Now for the second profiling compile stage :)
ID: 335466 · Report as offensive
Hans Dorn
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 3 Apr 99
Posts: 2262
Credit: 26,448,570
RAC: 0
Germany
Message 335470 - Posted: 13 Jun 2006, 6:58:03 UTC

Eric's source compiled without further changes on my setup.
(Besides the boinc_worker_timer issue)

Debian Stable (Sarge) with icc 9.0.30 and ipp5.0

Regards Hans
ID: 335470 · Report as offensive
Profile KWSN - Chicken of Angnor
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 9 Jul 99
Posts: 1199
Credit: 6,615,780
RAC: 0
Austria
Message 335498 - Posted: 13 Jun 2006, 8:04:15 UTC

Hm,
you're using 9.0 and 5.0, which probably have fewer issues. I'm on 9.1/5.1 here.

What flags did you use?
ID: 335498 · Report as offensive
Profile KWSN - Chicken of Angnor
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 9 Jul 99
Posts: 1199
Credit: 6,615,780
RAC: 0
Austria
Message 335603 - Posted: 13 Jun 2006, 12:34:36 UTC - in response to Message 335470.  
Last modified: 13 Jun 2006, 13:02:04 UTC

Eric's source compiled without further changes on my setup.
(Besides the boinc_worker_timer issue)

Debian Stable (Sarge) with icc 9.0.30 and ipp5.0

Regards Hans

You are correct, Sir!

ICC 9.0.30 and ipp5.0 behave a lot nicer than 9.1 and 5.1. No errors thrown on sincos.h, so the only source edit necessary is now the SETIERROR one inside s_util.h.

I'm currently compiling to compare my results with Hans' config. Looks much less error-prone so far.

Also, I've discarded Fedora Core 3 for a Debian Sarge system. Much more comfortable. App now builds fine, although ICC really needs a lot of memory to compile with optimizations. Had to up my VM to allocate 512MB memory instead of 256 because it was stalling on building. The ICC docs do state you should have at least 512, so I guess they mean it, too.
ICC 9.1 didn't seem to need as much memory. Even with 512MB set, the compile still uses all available memory (and swap), so it's hellishly slow: it only uses about 10-20% CPU because the rest of the time it's waiting on disk swaps.

Hans, how much memory does your compile host have?

Wondering whether I should finally swap my 2x512 for 2x1GB.
ID: 335603 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Send message
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 335748 - Posted: 13 Jun 2006, 16:10:12 UTC - in response to Message 335374.  

I've seen people post formulae for claiming credit, but how does it really work?

The app multiplies its internal FLOP_counter value by LOAD_STORE_ADJUSTMENT and reports it to BOINC. BOINC (if version 5.2.6 or better) puts it in client_state.xml <boinc_fpops_cumulative>, and that is included when the WU is reported to the Scheduler. The Scheduler divides it by 1e9 to get equivalent seconds on the "reference computer", then multiplies by 100/86400 to convert to credits.
                                                              Joe
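
The chain described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic as stated in this post, not code from the app or from BOINC; the function name and the sample FLOP count are hypothetical, and the two multiplier values are the ones Hans quoted from seti.h earlier in the thread.

```python
# Multiplier values quoted earlier in the thread from seti.h.
LOAD_STORE_ADJUSTMENT_SOURCE = 3.51   # value in the posted #define
LOAD_STORE_ADJUSTMENT_WILD = 3.35     # value suggested for running "in the wild"

REFERENCE_FLOPS_PER_SECOND = 1e9      # the "reference computer" from the post
CREDITS_PER_DAY = 100.0
SECONDS_PER_DAY = 86400.0


def claimed_credit(flop_counter, adjustment):
    """Follow the steps in the post: scale, convert to seconds, convert to credits."""
    fpops_cumulative = flop_counter * adjustment               # reported to BOINC
    reference_seconds = fpops_cumulative / REFERENCE_FLOPS_PER_SECOND
    return reference_seconds * CREDITS_PER_DAY / SECONDS_PER_DAY


# Hypothetical FLOP count, chosen only to make the numbers easy to follow.
sample_flops = 8.64e12
for label, adj in (("3.51", LOAD_STORE_ADJUSTMENT_SOURCE),
                   ("3.35", LOAD_STORE_ADJUSTMENT_WILD)):
    print(f"adjustment {label}: {claimed_credit(sample_flops, adj):.2f} credits")
```

Because the multiplier enters linearly, switching from 3.51 to 3.35 lowers the claim by the ratio 3.35/3.51, i.e. a bit under 5%.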
ID: 335748 · Report as offensive
Pepperammi

Send message
Joined: 3 Apr 99
Posts: 200
Credit: 737,775
RAC: 0
United Kingdom
Message 335902 - Posted: 13 Jun 2006, 17:31:12 UTC

Sorry to be so blunt, but has anyone been successful in creating a viable optimized or just streamlined app yet?

I created a thread for anyone who has, so they can post links for everyone and people wouldn't have to do what I've been trying - even after trying to read the whole last third or so of this thread, my eyeballs are sore and my head hurts from all the techy talk.


Wish I could say it was all just going over my head, but it's not quite making it over and instead is pounding against my forehead :)
There's more brainpower than crunching power on this thread, I think :)
ID: 335902 · Report as offensive
Profile KWSN - Chicken of Angnor
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 9 Jul 99
Posts: 1199
Credit: 6,615,780
RAC: 0
Austria
Message 336278 - Posted: 14 Jun 2006, 4:00:49 UTC

Pepperammi, in a word,

yes.

However, not releasing anything yet.

Should I release anything, a lot of testing will have gone into it beforehand. I'm sure others working on optimized clients are also making very sure that everything is as it should be before even posting about it.

I've posted some preliminary times for 3 different optimization strategies below - these were clients I compiled myself.
I've also explained how you can do it, too - the tools are free for personal and non-commercial use.

I'm in contact with Hans, and things are definitely looking good so far.

You will know if anything gets released, trust me :)

Regards,
Simon.
ID: 336278 · Report as offensive
Profile KWSN - Chicken of Angnor
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 9 Jul 99
Posts: 1199
Credit: 6,615,780
RAC: 0
Austria
Message 336399 - Posted: 14 Jun 2006, 6:58:57 UTC - in response to Message 335748.  
Last modified: 14 Jun 2006, 6:59:38 UTC

I've seen people post formulae for claiming credit, but how does it really work?

The app multiplies its internal FLOP_counter value by LOAD_STORE_ADJUSTMENT and reports it to BOINC. BOINC (if version 5.2.6 or better) puts it in client_state.xml <boinc_fpops_cumulative>, and that is included when the WU is reported to the Scheduler. The Scheduler divides it by 1e9 to get equivalent seconds on the "reference computer", then multiplies by 100/86400 to convert to credits.
                                                              Joe

Thanks!

That should allow me to test credit claims vs. WU's I've already run with working Apps. I don't want to run my test clients on BOINC yet, so that's a good thing for comparison.
ID: 336399 · Report as offensive
Profile Jakesnake5
Volunteer tester

Send message
Joined: 14 May 99
Posts: 65
Credit: 18,370,396
RAC: 0
United States
Message 336430 - Posted: 14 Jun 2006, 8:41:31 UTC

I've got a 1 GHz Duron that needs an SSE Windows (98-capable) SETI client. It's currently munching with the standard distribution.

With the Crunch3r clients, it was outperforming a 1.4 GHz Thunderbird, which has only MMX (no SSE). But this was before Enhanced.
ID: 336430 · Report as offensive
Pepperammi

Send message
Joined: 3 Apr 99
Posts: 200
Credit: 737,775
RAC: 0
United Kingdom
Message 336451 - Posted: 14 Jun 2006, 9:54:57 UTC - in response to Message 336278.  
Last modified: 14 Jun 2006, 10:14:06 UTC

Pepperammi, in a word,

yes.

However, not releasing anything yet.

Should I release anything, a lot of testing will have gone into it beforehand. I'm sure others working on optimized clients are also making very sure that everything is as it should be before even posting about it.

I've posted some preliminary times for 3 different optimization strategies below - these were clients I compiled myself.
I've also explained how you can do it, too - the tools are free for personal and non-commercial use.

I'm in contact with Hans, and things are definitely looking good so far.

You will know if anything gets released, trust me :)

Regards,
Simon.


Thanks. I fully understand that you and others will want to be sure it's all working (and that you won't get into trouble and replay past events) before releasing.

I will look at having a bash at the code with some of the strategies you and others have posted, but I won't use it to do the work and report results, in case I mess it up. I'm just trying to get some coding experience myself for other reasons.

This'll probably seem way out there, but I started this thread over on Einstein about maybe starting to use GPUs. It's come up in the past, I know. Since a lot of you are already looking at the SETI code, and SETI is more likely to get the huge boost, maybe you could look and see what you think. Uncharted territory:
http://einstein.phys.uwm.edu/forum_thread.php?id=4312#38341
ID: 336451 · Report as offensive
Pepperammi

Send message
Joined: 3 Apr 99
Posts: 200
Credit: 737,775
RAC: 0
United Kingdom
Message 336467 - Posted: 14 Jun 2006, 10:33:27 UTC - in response to Message 336451.  
Last modified: 14 Jun 2006, 10:35:32 UTC

Sorry if this has come up already.

For those pointing out that you can download the evaluation Intel compiler, math libraries, IPP, etc.: there's also something called VTune. It's supposed to look through and optimise/streamline your code for you with a few clicks, for the best performance on your architecture. Would this be of any relevance here?
ID: 336467 · Report as offensive
Profile KWSN - Chicken of Angnor
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 9 Jul 99
Posts: 1199
Credit: 6,615,780
RAC: 0
Austria
Message 336476 - Posted: 14 Jun 2006, 10:51:04 UTC

Yes,

VTune is a part of Intel's compiler and optimizer offering.
It looks to be useful in profiling the application, which until now I did by running an unoptimized profiling app (which took forever).

Will see whether I can work with it when it finishes downloading. That makes about 1 GB of stuff I've grabbed in the quest for the faster binary :o)

I'd recommend using Tetsuji's flags to compile your application. What platform you compile on also seems to change things - I've had the best results using a plain vanilla Debian Sarge installation. The Fedora Core version I was using didn't work out so well.
You can get a Debian netsetup ISO and have it running inside an hour (if your internet connection is fast enough, anyway...). I would recommend using something like VMWare instead of dedicating an entire PC to playing around. Also, to be able to test and compile all sorts of different stuff, your CPU should be SSE2 or SSE3 capable.

Good luck!
ID: 336476 · Report as offensive
Pepperammi

Send message
Joined: 3 Apr 99
Posts: 200
Credit: 737,775
RAC: 0
United Kingdom
Message 336504 - Posted: 14 Jun 2006, 11:58:24 UTC - in response to Message 336476.  

Yes,

Vtune is a part of Intel's compiler and optimizer offering.
It looks to be useful in profiling the application, which until now I did with running an unoptimized profiling app (which took forever).

Will see whether I can work with it when it finishes downloading. That makes about 1 GB of stuff I've grabbed in the quest for the faster binary :o)

I'd recommend using Tetsuji's flags to compile your application. What platform you compile on also seems to change things - I've had the best results using a plain vanilla Debian Sarge installation. The Fedora Core version I was using didn't work out so well.
You can get a Debian netsetup ISO and have it running inside an hour (if your internet connection is fast enough, anyway...). I would recommend using something like VMWare instead of dedicating an entire PC to playing around. Also, to be able to test and compile all sorts of different stuff, your CPU should be SSE2 or SSE3 capable.

Good luck!




uuuuhhh yeah, I understood everything you said.. :)
Sorry, I'm just a beginner with only the extreme basics of C++ and Visual Basic from a long time ago. I'll figure out most of what you said slowly. I'm on Windows, so I'll have a mooch for something comparable to Debian.
Thanks for your info
ID: 336504 · Report as offensive
Profile KWSN - Chicken of Angnor
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 9 Jul 99
Posts: 1199
Credit: 6,615,780
RAC: 0
Austria
Message 336538 - Posted: 14 Jun 2006, 12:36:11 UTC
Last modified: 14 Jun 2006, 12:39:49 UTC

Actually, it's easier on Windows because the necessary libs are included.

However, you'll need wxWidgets (www.wxwidgets.org) to successfully compile the BOINC client (not seti@home, which is called seti_boinc in the sources).

Also, you will need Visual Studio .NET/2003 (not 2005!) to be able to use the included project (.vcproj) and solution (.sln) files.

A "solution" is a collection of "projects" - i.e. the solution is the whole client, the various projects are its parts. You'll see when you open it up in Visual Studio.

You should also define an environment variable called WXWIN (in control panel/system/advanced configuration, should be the bottom button) and set it to the path of your wxWidgets installation - this will make Visual Studio automatically find it for compiling/linking.
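
Before opening the solution, it can help to confirm the variable is actually visible. Here is a minimal Python sketch for that check; the function name is hypothetical, and it only tests that WXWIN is set and points at an existing directory, not that the directory holds a usable wxWidgets build.

```python
import os


def find_wxwidgets(env=None):
    """Return the WXWIN path if it is set and is an existing directory, else None."""
    env = os.environ if env is None else env
    path = env.get("WXWIN")
    if not path:
        return None          # variable missing or empty
    return path if os.path.isdir(path) else None


if __name__ == "__main__":
    location = find_wxwidgets()
    if location is None:
        print("WXWIN is not set, or does not point to a directory")
    else:
        print(f"WXWIN points to {location}")
```

Note that programs started before you set the variable will not see it, so run the check from a freshly opened console.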

I haven't played with building on Windows for a few days - so far, the Linux version is giving my head enough to crunch ;o)

Regards,
Simon

P.S.: Profiling (with ICC, but probably in general) means monitoring what exactly the application does and whether this can be optimized. I.e. when you run what you compiled, it will try and figure out whether there are things that can still be improved afterwards. It records that info and you can then use it to recompile and hopefully squeeze out a few more percent. VTune does this a bit more efficiently, or so Intel says.
ID: 336538 · Report as offensive
Profile KWSN - Chicken of Angnor
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 9 Jul 99
Posts: 1199
Credit: 6,615,780
RAC: 0
Austria
Message 336599 - Posted: 14 Jun 2006, 14:24:35 UTC

To keep you up to date,

I've finally managed to compile a version that is as quick as Hans'. Won't say how quick that is just yet :o)

I've also figured out how to make a static binary, at least I think so. Remains to be tested on different hosts (which I'm currently doing).

So - Whoever would like to be part of a small test community, pipe up now. Linux-only, for now, too.

I'm looking for about 5 people, max.

Email me at savant <insert funny looking sign here> lunabyte.net if you'd like to be part of it.

Regards,
Simon.
ID: 336599 · Report as offensive
1mp0£173
Volunteer tester

Send message
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 336759 - Posted: 14 Jun 2006, 17:19:15 UTC - in response to Message 336504.  


uuuuhhh yeah, I understood everything you said.. :)
Sorry, I'm just a beginner with only the extreme basics of C++ and Visual Basic from a long time ago. I'll figure out most of what you said slowly. I'm on Windows, so I'll have a mooch for something comparable to Debian.
Thanks for your info

The idea behind profiling is pretty simple:

As a rule of thumb, 10% of a program's code is responsible for 90% of the execution time.

If you're hand-optimizing the code (trying to replace inefficient C with more efficient equivalents) you don't want to spend time making initialization faster, because that happens only once. Same with any housekeeping on exit -- the speed improvements only happen once.

If you can make the actual crunching part faster, that will show up big-time.
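
To put rough numbers on that, here is a small Python sketch using the standard Amdahl's law formula (my choice of tool; the rule of thumb above doesn't name it). The 90% figure comes from the rule of thumb; the 1% share for initialization is an assumed value for illustration only.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: whole-program speedup when `fraction` of the run time
    is made `local_speedup` times faster and the rest is unchanged."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# Doubling the speed of the hot code (90% of run time, per the rule of thumb).
print(f"crunching 2x faster: {overall_speedup(0.90, 2.0):.2f}x overall")

# Doubling the speed of initialization (assumed here to be 1% of run time).
print(f"init 2x faster:      {overall_speedup(0.01, 2.0):.3f}x overall")
```

Under these assumptions the first change makes the whole run about 1.8 times faster, while the second is barely measurable, which is the point of profiling before optimizing.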

That said, it's gritty work.

It's also the kind of thing that helps every processor and every OS -- like the trig caching that made the last optimized app so much faster (and is now part of the standard enhanced app).
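
As a rough idea of what caching trig means, here is a Python sketch. It is not taken from the SETI@home sources; the function names and the rotation example are assumptions chosen only to show the pattern of computing sine/cosine values once and reusing them inside the hot loop.

```python
import math


def make_trig_table(n):
    """Precompute (sin, cos) for n evenly spaced angles around the circle."""
    step = 2.0 * math.pi / n
    return [(math.sin(i * step), math.cos(i * step)) for i in range(n)]


def rotate_all(samples, table):
    """Rotate complex sample i, given as (re, im), by the cached angle i mod n.

    The loop only does multiplies and adds; no sin/cos calls happen here.
    """
    n = len(table)
    rotated = []
    for i, (re, im) in enumerate(samples):
        s, c = table[i % n]
        rotated.append((re * c - im * s, re * s + im * c))
    return rotated


table = make_trig_table(4)                        # built once, up front
print(rotate_all([(1.0, 0.0), (1.0, 0.0)], table))
```

The table costs a little memory and a one-time setup, and the saving only matters when the same angles recur many times, which is why this kind of change belongs in the crunching loop rather than in setup code.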
ID: 336759 · Report as offensive
Profile KWSN - Chicken of Angnor
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 9 Jul 99
Posts: 1199
Credit: 6,615,780
RAC: 0
Austria
Message 337079 - Posted: 14 Jun 2006, 22:06:12 UTC

So far I haven't received any email - I asked for a small test-group for optimized clients. Thought there would be takers, actually :)

So let me ask again:

Are you interested in test-driving a new optimized application?

Email me at savant<you know what goes here>lunabyte.net if you would like to.

I'm almost at the point where I want to distribute the app - it's quick, the credit it claims is okay, and it runs on several machines without complaining (as it should, since it's now a fully static binary).

All I'm missing is a test group :o)

Regards,
Simon.
ID: 337079 · Report as offensive
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.