Are there any sites providing optimized clients? -- PART II

Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 351906 - Posted: 29 Jun 2006, 14:20:24 UTC - in response to Message 351168.  

Following up my observations:
Windows 5.15 can be run with a -nographics argument in standalone, and that gives another data point to consider. Unfortunately the -nographics argument causes a non-graphic build to quit with an error, which makes automating tests more complex.

One thing I discovered in the last two days is that running standalone with graphics but minimizing the graphics window as soon as it appears also reduces the run time to very near the -nographics case. What I don't know yet is whether running with BOINC but not showing graphics is similar to the minimized window situation.

Running the stock 5.15 with BOINC, I checked the amount of CPU time each of the setiathome threads was using. With graphics off, after an hour of run time the main worker thread had almost all the time, and another thread had about 2 seconds. Turning graphics on for the WU started accumulating time in another thread at about 5 to 10 percent CPU usage; after turning them off again, that thread stopped using any appreciable time.

My take is that a graphics build adds considerably less than 1% to crunch time if graphics are not turned on. Using the -nographics argument when doing standalone testing to compare against optimized builds seems justified.
Joe
ID: 351906 · Report as offensive
Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 351914 - Posted: 29 Jun 2006, 14:26:16 UTC
Last modified: 29 Jun 2006, 14:27:42 UTC

Quick question Joe,

So starting the CC from the command line like:

path\boinc.exe -allow_remote_gui_rpc -nographics

will kill the graphics loop for win boxes?

Alinator

PS: Talking about the stock build here.
ID: 351914 · Report as offensive
kevint
Volunteer tester

Joined: 17 May 99
Posts: 414
Credit: 11,680,240
RAC: 0
United States
Message 351928 - Posted: 29 Jun 2006, 14:47:34 UTC - in response to Message 351356.  

KWSN- Chicken of Angnor wrote:

So I bought a P-D 805 system last week which I'm using to benchmark on now because it has 4 operating systems (Windows/Linux 32/64 Bit) installed on it. The times it was pulling with my optimized builds were quite okay even though it was running single channel until today (finally got my second DDR2 module), but they were really nowhere close to what I expected given my first test runs on the VM.

???

If you can afford to buy all this HW on a whim, why not just pay $600 for an Intel license?


Erik,

Why should this matter? If you want it, you go buy it. It is his money, and he can do with it whatever he likes; this includes flushing it down the toilet or spending it on ice cream and cake. He is in no way obligated to pay for anything for this project.
ID: 351928 · Report as offensive
kevint
Volunteer tester

Joined: 17 May 99
Posts: 414
Credit: 11,680,240
RAC: 0
United States
Message 351933 - Posted: 29 Jun 2006, 14:54:16 UTC - in response to Message 350181.  

Around $900 AFAIK, but I haven't checked closely. Sadly, the licenses are time-limited, I believe that's for one year and includes ICC, IPP and MKL (the latter two being library packages).

Intel pricing page

Also don't forget both Linux and Windows (and maybe even OS X, for X86-based Macs) are required, so that doubles/triples normal license costs.

--edit
Intel Compiler for Linux - $399
Intel Compiler for Windows - $399
Intel Compiler for OS X - $399

IPP - $199
MKL - $399 (maybe not necessary, but probably faster than fftw).

From the pricing page, it seems that all supported OS flavours are included when you buy one license for the library packages. The compiler has to be licensed once per OS.

There's a promotional MacOS package available that includes Compiler, IPP and MKL and costs $549. Since the libraries seem to license cross-OS, that might be the cheapest path overall.

Regards,
Simon.


I find it hard to believe that Crunch3r was paying $900 US every year just so he could make BOINC and s@h optimizations!




Erik,
Why do you find it so difficult to believe that Crunch3r was paying this? It is not so hard to understand. Why do you think he got so upset and left the project?
ID: 351933 · Report as offensive
KWSN - Chicken of Angnor
Volunteer developer
Volunteer tester
Joined: 9 Jul 99
Posts: 1199
Credit: 6,615,780
RAC: 0
Austria
Message 351985 - Posted: 29 Jun 2006, 16:04:43 UTC
Last modified: 29 Jun 2006, 16:05:00 UTC

Kevin,

I have a somewhat indiscreet question: since you bought the licenses, will you be releasing any clients to the public?

Or will you keep it Team-only (or just to yourself)? Please, I'm not trying to offend you, just want to know.

You buying the license was the reason I put in some extra hours for the Windows How-To - not that this obligates you in any way, obviously :o)

Regards,
Simon.
Donate to SETI@Home via PayPal!

Optimized SETI@Home apps + Information
ID: 351985 · Report as offensive
KWSN - Chicken of Angnor
Volunteer developer
Volunteer tester
Joined: 9 Jul 99
Posts: 1199
Credit: 6,615,780
RAC: 0
Austria
Message 351989 - Posted: 29 Jun 2006, 16:17:58 UTC - in response to Message 351763.  

Hi again. I've temporarily got hold of VS .NET 2003 :-)
I've come across the same problem as Bluesilvegreen: the ipp_t7.h file doesn't exist on my computer :(. I think it may be because the only version I could find on Intel's website is IPP 5.1, whilst yours seems to be 8.0.2 the first time and version 5.0 next. Oh, and I don't have a staticlib folder in tools. I have a stublib in the main directory, though?

Will this cause me problems? I'm not worried by all the warnings, as you say.

Whilst mooching around I noticed the 'Optimization' tab in the C/C++ folder icon when you right-click into Properties. Is this redundant since you've used the command-line route? I ask because the commands are slightly different for processor-specific builds: QaxP for Pentium 4 SSE3, so maybe that might get me better performance in my setup?
Also, is the MKL math stuff automatically used when you've included it as per your instructions? I ask because in the preprocessor commands you added USE_IPP. Does it need USE_MKL?

[EDIT]
I went through it more thoroughly and it seems it's all to do with trying to implement SSE etc. You probably knew that already. Looking through that part of the code, I can see I don't have any of those #include files except ipp.h:

// In order to use IPP, set -DUSE_IPP and one of -DUSE_SSE3, -DUSE_SSE2,
// -DUSE_SSE or nothing(generic), IPP precedes FFTW, ooura // TMR
#if defined(USE_IPP)
  #pragma message ("-----IPP-----")
  #if defined(USE_SSE3)
    #define T7 1
    #pragma message ("-----sse3-----")
    #include <ipp_t7.h>
  #elif defined(USE_SSE2)
    #define W7 1
    #pragma message ("-----sse2-----")
    #include <ipp_w7.h>
  #elif defined(USE_SSE)
    #define A6 1
    #pragma message ("-----sse-----")
    #include <ipp_a6.h>
  #else
    #pragma message ("-----mmx-----")
    #include <ipp_px.h>
  #endif // T7
  #include <ipp.h>
#elif defined(USE_FFTWF)
  #pragma message ("----FFTW----")
  #include "fftw3.h"
#else
  #pragma message ("----ooura----")
  #include "fft8g.h"
#endif // USE_IPP

These are the files I do have. Any chance one of them is the right one and I can edit the code to use it instead?
ipp.h - ippac.h - ippalign.h - ippcc.h - ippch.h - ippcore.h - ippcv.h - ippdefs.h - ippi.h - ippj.h - ippm.h - ipps.h - ippsc.h - ippsr.h - ippvc.h - ippvm.h

Thanks very much again. You've made this all extremely interesting. I learnt more going through your tutorial than I did in the last week with a couple of old C++ and VB books.

Pepperammi, check the "How to make your own optimized Windows Seti@Home" thread for an answer, please :o)

Regards,
Simon.
Donate to SETI@Home via PayPal!

Optimized SETI@Home apps + Information
ID: 351989 · Report as offensive
H.
Volunteer tester

Joined: 26 Jun 06
Posts: 63
Credit: 1,192
RAC: 0
Message 352026 - Posted: 29 Jun 2006, 17:26:58 UTC

I have put both of Simon's HOW-TOs up for sticky. They will stay here for at least a week, or until the next moderator passes through and finds them too cluttering.

H.
ID: 352026 · Report as offensive
KWSN - Chicken of Angnor
Volunteer developer
Volunteer tester
Joined: 9 Jul 99
Posts: 1199
Credit: 6,615,780
RAC: 0
Austria
Message 352038 - Posted: 29 Jun 2006, 18:07:41 UTC
Last modified: 29 Jun 2006, 18:07:49 UTC

ID: 352038 · Report as offensive
KWSN - Chicken of Angnor
Volunteer developer
Volunteer tester
Joined: 9 Jul 99
Posts: 1199
Credit: 6,615,780
RAC: 0
Austria
Message 352115 - Posted: 29 Jun 2006, 19:50:30 UTC - in response to Message 347982.  

As Josef stated, testing an optimized build without graphics against the standard build with graphics is not quite fair. So here are some revised numbers:
Default 5.15 with -nographics
9m 36s (576 seconds)
That compares to 646 seconds before, a sizable difference of 70 seconds, or 10.8%, versus the run with graphics.

So the revised speedup for my clients (on this WU, but it should hold true elsewhere):
Crunch3r 5.12 SSE2
4m 19s (259 seconds) - 55.0% quicker

My 5.15 SSE2
4m 17s (257 seconds) - 55.38% quicker

I'd expect 5-10% less speedup in the scores posted before.
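
The percentages above follow directly from the run times in seconds. As a minimal sketch (the function name `percent_quicker` is made up for illustration and is not part of any SETI@home or BOINC tool), the arithmetic can be reproduced like this:

```python
def percent_quicker(baseline_s: float, candidate_s: float) -> float:
    """Return how much shorter candidate_s is than baseline_s, in percent."""
    if baseline_s <= 0:
        raise ValueError("baseline must be positive")
    return (baseline_s - candidate_s) / baseline_s * 100.0


# Run times in seconds, copied from the post (testWU-4, Windows 32-bit).
STOCK_WITH_GRAPHICS = 646
STOCK_NOGRAPHICS = 576
CRUNCH3R_SSE2 = 259
SIMON_SSE2 = 257

if __name__ == "__main__":
    gain = percent_quicker(STOCK_WITH_GRAPHICS, STOCK_NOGRAPHICS)
    print(f"-nographics vs. graphics: {gain:.1f}% quicker")
    for name, seconds in (("Crunch3r 5.12 SSE2", CRUNCH3R_SSE2),
                          ("Simon 5.15 SSE2", SIMON_SSE2)):
        # Compare each optimized build against the fairer -nographics baseline.
        gain = percent_quicker(STOCK_NOGRAPHICS, seconds)
        print(f"{name}: {gain:.2f}% quicker")
```

Swapping the baseline from 646 to 576 seconds is the whole difference between the earlier figures and the revised ones.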

Regards,
Simon.

Stats for third WU:

testWU-4 (AR: 1.2796485198966)

Windows 32-Bit

Default 5.15 with graphics
10m 46s (646 seconds)

Crunch3r 5.12 SSE2
4m 19s (259 seconds) - 59.9% quicker

My 5.15 SSE2
4m 17s (257 seconds) - 60.2% quicker

Those 2 seconds difference are again well within standard result variance, so I'd call it a draw at this AR on Windows (between optimized clients), too.

Linux 32-Bit

Default 5.12 (no X-Win installed, so no graphics? not sure)
8m 00s (480 seconds)

Crunch3r 5.12 SSE2
5m 02s (302 seconds) - 37.1% quicker

My 5.15 SSE2
4m 45s (285 seconds) - 40.6% quicker

--------------------------------

Linux vs. Windows

Default client
Linux is 25.7% quicker.

Crunch3r's 5.12
Windows is 14.2% quicker.

My 5.15 builds
Windows is 9.8% quicker.

The same sort of scaling seems to apply, roughly.

Regards,
Simon.


Donate to SETI@Home via PayPal!

Optimized SETI@Home apps + Information
ID: 352115 · Report as offensive
Pepperammi

Joined: 3 Apr 99
Posts: 200
Credit: 737,775
RAC: 0
United Kingdom
Message 352128 - Posted: 29 Jun 2006, 20:08:33 UTC - in response to Message 352038.  
Last modified: 29 Jun 2006, 20:11:04 UTC

Great stuff, KWSN! I just finished the first test unit and the results are strongly similar (great). I tested it against Crunch3r's 5.12 too:
Pentium D 830 at 3.21 GHz, dual-channel memory, Windows XP

Default standard: 20:40 (1240 seconds)

My SSE3 build from your instructions: 9:23 (563 seconds), 54.6% quicker

Crunch3r's 5.12 SSE2: 9:17 (557 seconds), 55% quicker

I'm going to test an SSE2-compiled version too, as sometimes I find them faster on this machine. I'm also going to try the other test units for the different ARs.
Eventually I'll get round to having a go on my HT machine ;)

Also, I'm going to turn off my blank screensaver for the next test, because it may have given Crunch3r's build a very slight advantage: it blanked the screen halfway through the default run and for all of Crunch3r's.

[EDIT]
Sorry, my numbers were with the default with graphics.
ID: 352128 · Report as offensive
KWSN - Chicken of Angnor
Volunteer developer
Volunteer tester
Joined: 9 Jul 99
Posts: 1199
Credit: 6,615,780
RAC: 0
Austria
Message 352173 - Posted: 29 Jun 2006, 21:10:10 UTC
Last modified: 29 Jun 2006, 21:11:18 UTC

Great results :)

I've heard from a couple of people who have successfully made their own Windows crunchers now. Their benchmark times - or rather, the relative speedup - are very close to what you're getting (and me as well).

So it seems as if it's reproducible across different hardware (and by different people ;o) ) - I had tested it on several installations here, but it's great to see the instructions work for you guys, too.

Thanks for all the supportive posts, a definite ego-booster there!

I'm still waiting to hear back from Kevin - since he bought the licenses, right now he seems one of the few people around here who could actually release a binary to the public.

Oh, and Pepperammi, when you repeat benchmark runs you will usually notice some slight variance in times. So I would think that your client and Crunch3r's are very much on equal terms. Time will tell :o)
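
One rough way to judge whether two clients are "on equal terms" is to repeat each benchmark a few times and compare the gap between the averages with the run-to-run spread. The sketch below is only an illustration: the helper names and all repeat timings are hypothetical, and the rule it applies is a crude rule of thumb rather than a proper statistical test.

```python
from statistics import mean, stdev


def summarize(times_s):
    """Return (mean, sample standard deviation) for a list of run times."""
    return mean(times_s), stdev(times_s)


def within_noise(a_times_s, b_times_s):
    """True when the gap between the two means is no larger than the
    bigger of the two run-to-run spreads (a rule of thumb, not a test)."""
    mean_a, spread_a = summarize(a_times_s)
    mean_b, spread_b = summarize(b_times_s)
    return abs(mean_a - mean_b) <= max(spread_a, spread_b)


if __name__ == "__main__":
    # Hypothetical repeat timings in seconds, invented for this example.
    own_build = [563, 559, 566]
    other_build = [557, 561, 560]
    verdict = "a draw" if within_noise(own_build, other_build) else "a real gap"
    print(f"Looks like {verdict} on these runs")
```

With only a handful of runs per client this can at best flag obvious cases; more repeats give a steadier picture.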

Regards,
Simon.
Donate to SETI@Home via PayPal!

Optimized SETI@Home apps + Information
ID: 352173 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 352460 - Posted: 30 Jun 2006, 2:18:37 UTC - in response to Message 351914.  

Quick question Joe,

So starting the CC from the command line like:

path\boinc.exe -allow_remote_gui_rpc -nographics

will kill the graphics loop for win boxes?

Alinator

PS: Talking about the stock build here.

Sorry I wasn't clear: the -nographics argument is only available when testing the setiathome app standalone (without BOINC):
path\setiathome_5.15_windows_intelx86.exe -nographics
Joe
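
For anyone scripting such standalone comparisons, a minimal timing wrapper might look like the sketch below. It assumes only that the app can be started from a command line as shown above; the helper name `time_command` and the script name in the usage text are invented here. It measures wall-clock time, which may not be the same figure as the run times quoted in this thread. The exit code is returned as well because, as noted earlier in the thread, a non-graphic build quits with an error when given -nographics.

```python
import subprocess
import sys
import time


def time_command(argv, cwd=None):
    """Run argv to completion; return (wall-clock seconds, exit code)."""
    start = time.perf_counter()
    completed = subprocess.run(argv, cwd=cwd, check=False)
    elapsed = time.perf_counter() - start
    return elapsed, completed.returncode


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Everything after the script name is treated as the command to time.
        elapsed, code = time_command(sys.argv[1:])
        print(f"exit code {code}, {elapsed:.1f} s wall clock")
    else:
        print("usage: time_run.py <app> [app arguments, e.g. -nographics]")
```

A non-zero exit code is a hint that the timing should be discarded rather than compared.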
ID: 352460 · Report as offensive
Bart Barenbrug

Joined: 7 Jul 04
Posts: 52
Credit: 337,401
RAC: 0
Netherlands
Message 352643 - Posted: 30 Jun 2006, 5:23:13 UTC - in response to Message 352460.  


Sorry I wasn't clear: the -nographics argument is only available when testing the setiathome app standalone (without BOINC):
path\setiathome_5.15_windows_intelx86.exe -nographics

To me that sounds like a useful BOINC feature: being able to spawn the app with the -nographics option (or any other, maybe project-specific, options). Would such a feature give everybody who uses it (I personally never use graphics) a 5-10% speedup?

ID: 352643 · Report as offensive
Pepperammi

Joined: 3 Apr 99
Posts: 200
Credit: 737,775
RAC: 0
United Kingdom
Message 352769 - Posted: 30 Jun 2006, 8:26:48 UTC - in response to Message 352643.  
Last modified: 30 Jun 2006, 8:30:19 UTC


Sorry I wasn't clear: the -nographics argument is only available when testing the setiathome app standalone (without BOINC):
path\setiathome_5.15_windows_intelx86.exe -nographics

To me that sounds like a useful BOINC feature: being able to spawn the app with the -nographics option (or any other, maybe project-specific, options). Would such a feature give everybody who uses it (I personally never use graphics) a 5-10% speedup?


Maybe it's possible to code this into BOINC or the app. Maybe not as an option to choose, but so that graphics are disabled when the graphics window is closed and initialised again when you open the graphics.
It's not a very good way of doing it, but the simplest I can think of is to reprogram BOINC so it automatically loads the SETI app with '-nographics', and when you press the show graphics button it just quickly stops the app, loads it again without '-nographics' and continues where it left off. And again, when you close the window it stops the app and restarts it with '-nographics'. It may have problems with being preempted etc. I'll have a look. I'm not saying I'll come up with something, as I don't know enough, but I find it interesting.
You'd have to consider the screensaver part too, though, and ensure it does the same there.

@ KWSN
Is there any way to change how '1D FFTs' are handled? Is it possible (without getting extremely complicated)? Any way to single or sort them out to work on them differently? Or to offload them to a different part of the app to be done elsewhere, then bring the results back to continue as they would normally? I might run into trouble doing that because it's using IPP now?
Hope I made sense there. It doesn't to me :)

I'll have some more complete benchmark numbers after a few more runs. I started again because I remembered I was tinkering around when I couldn't get it to work because of that critical error (thanks again), so I recompiled to see if I had maybe missed a change I made. To sum up: doing it again with your exact instructions for SSE3 knocked about five more seconds off. Thanks for the info on repeat benchmarks. I spotted that. Luckily it only fluctuates by about 2 seconds when it does.
ID: 352769 · Report as offensive
Pepperammi

Joined: 3 Apr 99
Posts: 200
Credit: 737,775
RAC: 0
United Kingdom
Message 352843 - Posted: 30 Jun 2006, 10:12:06 UTC

Is there any kind of (free) software to monitor the work load on the GPU? I can't find one. It sounds nuts, but I've had a really basic go at including GPU use in the app using http://gamma.cs.unc.edu/GPUFFTW/documentation.html (got bored of benchmarks ;-) ). Their instructions are so overly simple that I thought, what the hell, even if I don't believe it.
Anyway, I tried those instructions and compiled with USE_FFTWF, so it'll probably be slower, but I was just interested in whether it'd work. I can't tell... all I've got at the moment is temperature levels to go by, and they do pop up 2-4 °C now and then...?

The times I got were on test WU 3:
12 min 01 secs with this 'bodgit' app
11 min 28 secs with the SSE3 compile
Not as huge a difference as I thought it would make, as there's no SSE or IPP. Or I got it wrong.
ID: 352843 · Report as offensive
Hans Dorn
Volunteer developer
Volunteer tester
Joined: 3 Apr 99
Posts: 2262
Credit: 26,448,570
RAC: 0
Germany
Message 352880 - Posted: 30 Jun 2006, 11:19:28 UTC - in response to Message 352843.  

Is there any kind of (free) software to monitor the work load on the GPU? I can't find one. It sounds nuts, but I've had a really basic go at including GPU use in the app using http://gamma.cs.unc.edu/GPUFFTW/documentation.html (got bored of benchmarks ;-) ). Their instructions are so overly simple that I thought, what the hell, even if I don't believe it.
Anyway, I tried those instructions and compiled with USE_FFTWF, so it'll probably be slower, but I was just interested in whether it'd work. I can't tell... all I've got at the moment is temperature levels to go by, and they do pop up 2-4 °C now and then...?

The times I got were on test WU 3:
12 min 01 secs with this 'bodgit' app
11 min 28 secs with the SSE3 compile
Not as huge a difference as I thought it would make, as there's no SSE or IPP. Or I got it wrong.


Hi Pepperammi,

GPUFFTW does not have the same API as fftw3f, so it won't work as a drop-in replacement.
Have you tried compiling the GPUFFTW example?

Regards Hans
ID: 352880 · Report as offensive
Pepperammi

Joined: 3 Apr 99
Posts: 200
Credit: 737,775
RAC: 0
United Kingdom
Message 352884 - Posted: 30 Jun 2006, 11:35:28 UTC - in response to Message 352880.  


Hi Pepperammi,

GPUFFTW does not have the same API as fftw3f, so it won't work as a drop-in replacement.
Have you tried compiling the GPUFFTW example?

Regards Hans

Hi Hans,
I guessed it shouldn't replace the fftw, so I just added the gpufftw include as well. Like I said, a very basic attempt (not very good) :). I was wondering if the compiler would be able to make a little sense out of it.

I'm surprised all that messing hasn't seriously affected the speed and that the result is still valid.

I did have a quick look through the example and it certainly shows a lot more setup is needed. It's interesting to go through anyway. I need to learn a lot more.

Thank you.
ID: 352884 · Report as offensive
Diego -=Mav3rik=-
Joined: 1 Jun 99
Posts: 333
Credit: 3,587,148
RAC: 0
Message 352895 - Posted: 30 Jun 2006, 11:54:31 UTC - in response to Message 352884.  

Guys, can someone link me to a seti_enhanced shortened reference unit?
The one you are using maybe? ;)

Thanks and regards.
/Mav

We have lingered long enough on the shores of the cosmic ocean.
We are ready at last to set sail for the stars.

(Carl Sagan)
ID: 352895 · Report as offensive
Pepperammi

Joined: 3 Apr 99
Posts: 200
Credit: 737,775
RAC: 0
United Kingdom
Message 352914 - Posted: 30 Jun 2006, 12:26:46 UTC - in response to Message 352895.  
Last modified: 30 Jun 2006, 12:28:38 UTC

Guys, can someone link me to a seti_enhanced shortened reference unit?
The one you are using maybe? ;)

Thanks and regards.

There are five of them in KWSN's source download from the Windows How-To, plus an invaluable benchmark script to compare the speed and another invaluable tool to test your results against the default reference unit.
I'm about to finish the fifth WU and I'll have average times to put up.
ID: 352914 · Report as offensive
KWSN - Chicken of Angnor
Volunteer developer
Volunteer tester
Joined: 9 Jul 99
Posts: 1199
Credit: 6,615,780
RAC: 0
Austria
Message 353090 - Posted: 30 Jun 2006, 15:48:01 UTC
Last modified: 30 Jun 2006, 15:48:12 UTC

Also, I put them up for download before for Josef Segur.

So here's the URL to just the WUs:
http://www.zadra.org/seti_enhanced/testWUs.zip

Cheers,
Simon.
Donate to SETI@Home via PayPal!

Optimized SETI@Home apps + Information
ID: 353090 · Report as offensive


 