SSE2

nemesis
Joined: 12 Oct 99
Posts: 1408
Credit: 35,074,350
RAC: 0
Message 752339 - Posted: 13 May 2008, 1:07:07 UTC

darn it, i'm always too late for the alcohol....

thanks for the great work, JD!
ID: 752339
_heinz
Volunteer tester
Joined: 25 Feb 05
Posts: 744
Credit: 5,539,270
RAC: 0
France
Message 752505 - Posted: 13 May 2008, 10:12:47 UTC

Hi all,
P4 Northwood 2.6 GHz with the new SSE2 app. Look at my wingman, who runs the standard app.

9,886.92 versus 29,973.44 seconds: what progress for a 72-credit unit!
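
For reference, the two times quoted above can be turned into a speedup figure. This is a minimal sketch: the function and variable names are made up for illustration, and since the two results come from different hosts, the ratio mixes hardware differences with the difference between the apps.

```python
def speedup(old_seconds: float, new_seconds: float) -> float:
    """Return how many times faster the new run is than the old one."""
    if old_seconds <= 0 or new_seconds <= 0:
        raise ValueError("run times must be positive")
    return old_seconds / new_seconds


# CPU times quoted in the post, in seconds.
standard_app = 29973.44  # wingman, standard app
sse2_app = 9886.92       # P4 Northwood, SSE2 app

ratio = speedup(standard_app, sse2_app)
saved = 1 - sse2_app / standard_app  # fraction of CPU time saved

print(f"{ratio:.2f}x faster, {saved:.0%} less CPU time")
```

On these numbers the SSE2 host finishes roughly three times faster than the wingman.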

regards heinz
ID: 752505
hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 752544 - Posted: 13 May 2008, 14:13:32 UTC

The SSE2 app makes a huge difference on most of my old machines, but on the 2.6 Celeron it has cut 6000 seconds... the slowest got the biggest boost. Thanks to everyone who worked on this, and thank you, Phud, for complaining....
Official Abuser of Boinc Buttons...
And no good credit hound!
ID: 752544
Mumps [MM]
Volunteer tester
Joined: 11 Feb 08
Posts: 4454
Credit: 100,893,853
RAC: 30
United States
Message 752554 - Posted: 13 May 2008, 14:53:30 UTC

Well, I've grabbed the SSE2 flavor for my middle-aged crunchers and it's doing a bang-up job. I've just passed hiamps in over-all RAC! :-)

So that only leaves one really ancient P3 based 4-way cruncher stuck with SSE flavored Chicken soup. Here's to hoping the Whale soup eventually hits the shelves for that venerable ancient too. :-)
ID: 752554
Eirik
Volunteer tester
Joined: 25 Mar 01
Posts: 45
Credit: 2,173,371
RAC: 0
Norway
Message 752583 - Posted: 13 May 2008, 15:46:10 UTC
Last modified: 13 May 2008, 15:47:04 UTC

I just grabbed SSE2 for this one. Since it is still crunching, it's kind of hard for me to tell if the AK made any difference.
But hopefully it will :-D
ID: 752583
hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 752602 - Posted: 13 May 2008, 16:20:14 UTC - in response to Message 752554.  

Well, I've grabbed the SSE2 flavor for my middle-aged crunchers and it's doing a bang-up job. I've just passed hiamps in over-all RAC! :-)

Hmmmm, I smell an upgrade coming......Have a great day!
Official Abuser of Boinc Buttons...
And no good credit hound!
ID: 752602
JDWhale
Volunteer tester
Joined: 6 Apr 99
Posts: 921
Credit: 21,935,817
RAC: 3
United States
Message 752605 - Posted: 13 May 2008, 16:23:56 UTC - in response to Message 752583.  

I just grabbed SSE2 for this one since it is still crunching its kinda hard for me to tell if the AK made any difference from it.
But hopefully it will :-D


I do hope that you meant the SSE2 client for this hostid=3762048.

Your Q6600 would be better served by running the SSSE3x client.

Kind regards,
JDWhale


ID: 752605
Karsten Vinding
Volunteer tester
Joined: 18 May 99
Posts: 239
Credit: 25,201,931
RAC: 11
Denmark
Message 752630 - Posted: 13 May 2008, 21:07:18 UTC

OK.

I have been testing the SSE2 client against the SSE3 client on my AMD Phenom, and my AMD Athlon64-X2, with the knabench tool.

I wasn't expecting any speedups, but knabench tells me that the SSE2 version is 2 to 5.5% faster on the Athlon64, and -1 to 4% faster on my Phenom.

This is a bit of a surprise to me, but I have switched to the SSE2 client on both machines, and will be watching carefully to see if it is indeed faster.
ID: 752630
Eirik
Volunteer tester
Joined: 25 Mar 01
Posts: 45
Credit: 2,173,371
RAC: 0
Norway
Message 752632 - Posted: 13 May 2008, 21:14:30 UTC - in response to Message 752605.  

I just grabbed SSE2 for this one since it is still crunching its kinda hard for me to tell if the AK made any difference from it.
But hopefully it will :-D


I do hope that you meant the SSE2 client for this hostid=3762048.

Your Q6600 would be better served running SSSE3x client.

Kind regards,
JDWhale



My bad, I meant the hostid you posted, yes.
ID: 752632
Mumps [MM]
Volunteer tester
Joined: 11 Feb 08
Posts: 4454
Credit: 100,893,853
RAC: 30
United States
Message 752634 - Posted: 13 May 2008, 21:25:00 UTC - in response to Message 752630.  

OK.

I have been testing the SSE2 client against the SSE3 client on my AMD Phenom, and my AMD Athlon64-X2, with the knabench tool.

I wasn't expecting any speedups, but knabench tells me that the SSE2 version i from 2 - 5,5% faster on the Athlon64, and from -1 - 4% faster on my Phenom.

This is a bit of a surprise to me, but I have switched to the SSE2 client on both machines, and will be watching carefully to see if it is indeed faster.

I've noticed something similar, although maybe a bit unrelated as well.

I have two crunchers:

Xeon Dual-Core/Dual CPU with HT
Opteron Dual-Core/Dual CPU

Which were previously running the Crunch3r op app. In that configuration, the version on the Opteron was running 7% faster than the version on the Xeon. But since switching over to the AKv8.0, the Xeon has flat out run away from the Opteron. 13% faster and still climbing more rapidly.

Just an observation that I felt was worth mentioning. It's most likely primarily driven by the difference in Cache between the processors, but is also somewhat affected by the SSE3 implementations.

Regardless of the cause, it seems the AKv8.0 will see a better increase in speed on Intel processors.
ID: 752634
Mumps [MM]
Volunteer tester
Joined: 11 Feb 08
Posts: 4454
Credit: 100,893,853
RAC: 30
United States
Message 752635 - Posted: 13 May 2008, 21:29:54 UTC - in response to Message 752602.  
Last modified: 13 May 2008, 21:30:30 UTC

Well, I've grabbed the SSE2 flavor for my middle-aged crunchers and it's doing a bang-up job. I've just passed hiamps in over-all RAC! :-)

Hmmmm, I smell an upgrade comming......Have a great day!


Well, it better be a big 'un. 'Cause it looks like this climb won't settle down until well over 16,000 RAC, maybe around 18,000. :-)

And in an ever-expanding arms race, I may end up having to replace that SSE-only host with a quad-core sooner than I was contemplating. ;-) (It eats up how much power for a measly 250 RAC?!?!?)

Crunch On!
ID: 752635
The Gas Giant
Volunteer tester
Joined: 22 Nov 01
Posts: 1904
Credit: 2,646,654
RAC: 0
Australia
Message 752645 - Posted: 13 May 2008, 21:50:07 UTC
Last modified: 13 May 2008, 21:50:35 UTC

So my question is: does anyone know yet whether the SSE2 version or the SSE3 version of the AK port is faster on a Prescott chip? I'll check it out in a few weeks, but I want my RAC to settle out first.

Live long and BOINC!
Paul
(S@H1 8888)
And proud of it!
ID: 752645
_heinz
Volunteer tester
Joined: 25 Feb 05
Posts: 744
Credit: 5,539,270
RAC: 0
France
Message 752671 - Posted: 13 May 2008, 22:51:01 UTC

P4 Northwood 2.66 GHz
---------------------
CPU time 15360.40625
stderr out <core_client_version>5.8.11</core_client_version>
<![CDATA[
<stderr_txt>
Optimized SETI@Home Enhanced application
Optimizers: Ben Herndon, Josef Segur, Alex Kan, Simon Zadra
Version: Windows SSE2 32-bit based on S@H V5.15 'Noo? No - Ni!'
Revision: R-2.4V|xB|FFT:IPP_SSE2|Ben-Joe
CPUID: Intel(R) Pentium(R) 4 CPU 2.66GHz
Speed: 1 x 2694 MHz
Features: MMX SSE SSE2

Work Unit Info
WU Credit multi. is: 2.85
WU True angle range: 0.394113

Spikes Pulses Triplets Gaussians Flops
5 0 0 0 22065615942717

</stderr_txt>
]]>

Validate state Initial
Claimed credit 72.8023185534521
-----------------------------------------------------------

CPU time 10432.66
stderr out <core_client_version>5.8.11</core_client_version>
<![CDATA[
<stderr_txt>
Windows optimized S@H Enhanced application by Alex Kan
Version info: SSE2x (AMD/Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSE2x Win32 Build 44 , Ported by : Jason G, Raistmer, JDWhale

CPUID: Intel(R) Pentium(R) 4 CPU 2.66GHz
Speed: 1 x 2662 MHz
Work Unit Info:
...............
Credit multiplier is : 2.85
WU true angle range is : 0.394257

Flopcounter: 22061248049434.602000

Spike count: 5
Pulse count: 1
Triplet count: 0
Gaussian count: 3
called boinc_finish

</stderr_txt>
]]>

Validate state Valid
Claimed credit 72.7879050925926
-------------------------------------------
duration is now 67.9% of the old time for a 72-credit WU
-------------------------------------------
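
The 67.9% figure can be reproduced from the two CPU times above. The sketch below also converts it into a throughput gain and an average rate based on the apps' own flop counters. The variable and function names are assumptions made for illustration. Note that the two results have slightly different angle ranges (0.394113 vs 0.394257), so they appear to be similar rather than identical work units.

```python
def flop_rate(flops: float, cpu_seconds: float) -> float:
    """Average floating-point operations per CPU second."""
    return flops / cpu_seconds


# Figures copied from the two stderr outputs above.
old = {"cpu_s": 15360.40625, "flops": 22065615942717}    # R-2.4V SSE2 build
new = {"cpu_s": 10432.66, "flops": 22061248049434.602}   # AK v8 SSE2x build

duration_ratio = new["cpu_s"] / old["cpu_s"]  # new time as a fraction of old
throughput_gain = 1 / duration_ratio - 1      # extra work done per unit time

print(f"duration: {duration_ratio:.1%} of the old time")
print(f"throughput gain: {throughput_gain:.1%}")
for label, run in (("old", old), ("new", new)):
    rate = flop_rate(run["flops"], run["cpu_s"]) / 1e9
    print(f"{label} rate: {rate:.2f} GFLOP/s (as counted by the app)")
```

A run that takes 67.9% of the time gets about 47% more work done per hour, which is why "time saved" and "performance gain" percentages quoted by different people are not directly comparable.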

heinz
ID: 752671
JDWhale
Volunteer tester
Joined: 6 Apr 99
Posts: 921
Credit: 21,935,817
RAC: 3
United States
Message 752697 - Posted: 13 May 2008, 23:50:42 UTC - in response to Message 752635.  
Last modified: 14 May 2008, 0:04:51 UTC


(It eats up how much power for a measley 250 RAC?!?!?)

Crunch On!


I agree, but for testing the new clients I've pulled 2 reserves back into active duty. With outside temps already reaching 30C I'll have to retire both Ginger and MaryAnn very soon... I've identified a couple more SSE2 tweaks I want to try. Since MaryAnn is my 2.66GHz Northwood, she will have to stay online a bit longer to run the time trials for evaluation.

@Gas Giant - I'll switch Ginger (3.2GHz Prescott) over to SSE2 clients... I'll do the release client first, but I also want to try two others for comparison. Maybe ~24 hours each, maybe longer if needed for similar ARs. Keep in mind that Jason sprinkles magic dust on his computer and recites ancient incantations while building the releases to give them a little extra power ;-) (So my personal builds would be slightly faster if Jason had built them.)

[edit] Wait!!! I already ran those tests on Ginger, about 6 days ago. SSE3 is faster on Prescott. That was before I called MaryAnn back into service.[/edit]

Best regards,
JDWhale
ID: 752697
hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 752705 - Posted: 14 May 2008, 0:07:42 UTC - in response to Message 752697.  
Last modified: 14 May 2008, 0:08:46 UTC


(It eats up how much power for a measley 250 RAC?!?!?)

Crunch On!


I agree, but for testing the new clients I've pulled 2 reserves back into active duty. With outside temps already reaching 30C I'll have to retire both Ginger and MaryAnn very soon... I've identified a couple more SSE2 tweaks I want to try, since MaryAnn is my 2.66GHz Northwood, she will have to stay online a bit longer to run the time trials for evaluation.

@Gas Giant - I'll switch Ginger (3.2Ghz Prescott) over to SSE2 clients... I'll do the release client first, but I also want to try two others for comparison. Maybe ~24 hours each, maybe longer if needed for similar ARs. Keep in mind that Jason sprinkles magic dust on his computer and recites ancient incantations while building the releases to give them a little extra power ;-) (So my personal builds would be slightly faster if Jason had build them.)

[edit] Wait!!! I already ran those tests on Ginger, about 6 days ago. SSE3 is faster on Prescott. That was before I called MaryAnn back into service.[/edit]

Best regards,
JDWhale

So what does the professor have in it? Too bad they don't show computer names...I bet that would be interesting...
Official Abuser of Boinc Buttons...
And no good credit hound!
ID: 752705
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65763
Credit: 55,293,173
RAC: 49
United States
Message 752721 - Posted: 14 May 2008, 1:01:50 UTC - in response to Message 752705.  


(It eats up how much power for a measly 250 RAC?!?!?)

Crunch On!


I agree, but for testing the new clients I've pulled 2 reserves back into active duty. With outside temps already reaching 30C I'll have to retire both Ginger and MaryAnn very soon... I've identified a couple more SSE2 tweaks I want to try, since MaryAnn is my 2.66GHz Northwood, she will have to stay online a bit longer to run the time trials for evaluation.

@Gas Giant - I'll switch Ginger (3.2Ghz Prescott) over to SSE2 clients... I'll do the release client first, but I also want to try two others for comparison. Maybe ~24 hours each, maybe longer if needed for similar ARs. Keep in mind that Jason sprinkles magic dust on his computer and recites ancient incantations while building the releases to give them a little extra power ;-) (So my personal builds would be slightly faster if Jason had build them.)

[edit] Wait!!! I already ran those tests on Ginger, about 6 days ago. SSE3 is faster on Prescott. That was before I called MaryAnn back into service.[/edit]

Best regards,
JDWhale

So what does the professor have in it? Too bad they don't show computer names...I bet that would be interesting...

No comment, it's a '60s TV show for me. ;)
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 752721
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 752797 - Posted: 14 May 2008, 4:21:06 UTC - in response to Message 752630.  

OK.

I have been testing the SSE2 client against the SSE3 client on my AMD Phenom, and my AMD Athlon64-X2, with the knabench tool.

I wasn't expecting any speedups, but knabench tells me that the SSE2 version i from 2 - 5,5% faster on the Athlon64, and from -1 - 4% faster on my Phenom.

This is a bit of a surprise to me, but I have switched to the SSE2 client on both machines, and will be watching carefully to see if it is indeed faster.


In testing there was ~8% natural variation in processing times for the AMD chips with either of the apps running the same test WU. I'd probably attribute that mostly to the state of the cache at any given time, and what context switching is going on, so an apparent -1 to 4% difference would be considered 'the same app' on these machines. Full-size WUs may increase the difference one way or another.

Processing time on the Intel chips doesn't *seem to* exhibit the same variation on test WUs, not sure about full-size live WUs though.

Either way I'd rely on full-size WUs rather than the synthetic shortened test ones, and compare same ARs to figure out which app is really faster on a given machine (if any). The apps may have different performance characteristics at different ARs too.
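
The comparison described above could be sketched roughly as follows. This is only an illustration: the function name, the AR bin width, and the sample run times are hypothetical, and the 8% noise threshold is borrowed from the variation seen on AMD test WUs, which may not hold for full-size WUs or for other chips.

```python
from collections import defaultdict
from statistics import mean


def compare_by_angle_range(results, bin_width=0.05, noise=0.08):
    """Compare two apps on work units with similar angle range (AR).

    results: iterable of (app_name, angle_range, cpu_seconds) tuples.
    bin_width: ARs falling in the same bin are treated as comparable.
    noise: relative difference at or below which the apps count as equal.
    Returns {bin_lower_edge: (faster_app_or_None, relative_difference)}.
    """
    bins = defaultdict(lambda: defaultdict(list))
    for app, ar, secs in results:
        bins[int(ar / bin_width)][app].append(secs)

    verdicts = {}
    for index, by_app in sorted(bins.items()):
        if len(by_app) != 2:
            continue  # need results from exactly two apps in this bin
        (app_a, times_a), (app_b, times_b) = sorted(by_app.items())
        mean_a, mean_b = mean(times_a), mean(times_b)
        diff = abs(mean_a - mean_b) / max(mean_a, mean_b)
        if diff <= noise:
            winner = None  # within natural variation: call it a tie
        else:
            winner = app_a if mean_a < mean_b else app_b
        verdicts[round(index * bin_width, 4)] = (winner, diff)
    return verdicts


# Made-up run times, for illustration only.
runs = [
    ("SSE2", 0.39, 10400.0), ("SSE2", 0.38, 10500.0),
    ("SSE3", 0.39, 10700.0), ("SSE3", 0.37, 10600.0),
    ("SSE2", 1.21, 4000.0), ("SSE3", 1.22, 5000.0),
]
for ar_bin, (winner, diff) in compare_by_angle_range(runs).items():
    print(f"AR >= {ar_bin}: {winner or 'no clear winner'} ({diff:.1%} apart)")
```

With more runs per bin, the fixed threshold could be replaced by the spread actually measured on the host in question.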

Jason

"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 752797
NewtonianRefractor
Volunteer tester
Joined: 19 Sep 04
Posts: 495
Credit: 225,412
RAC: 0
United States
Message 752814 - Posted: 14 May 2008, 5:05:51 UTC - in response to Message 752797.  
Last modified: 14 May 2008, 5:46:22 UTC

OK.

I have been testing the SSE2 client against the SSE3 client on my AMD Phenom, and my AMD Athlon64-X2, with the knabench tool.

I wasn't expecting any speedups, but knabench tells me that the SSE2 version i from 2 - 5,5% faster on the Athlon64, and from -1 - 4% faster on my Phenom.

This is a bit of a surprise to me, but I have switched to the SSE2 client on both machines, and will be watching carefully to see if it is indeed faster.


In testing there was around ~8% natural variation in procesing times for the AMD chips with either of the apps running the same test WU. I'd probably attribute that mostly to the state of the cache at any given time, and what context switching is going on, so apparent -1 to 4% difference would be considered 'the same app' on these machines. Full-size WUs may increase the difference one way or another.

processing time on the Intel chips doesn't *seem to* exibit the same variation on test WUs , not sure about full-size live WUs though.

Either way I'd rely on full size-WUs though, rather than the synthetic shortened test ones, and compare same ARs to figure out which App is really faster on a given machine (if any). The apps may have different performance characteristics at different ARs too.

Jason


I can try to run a test on my Turion64 X2. I currently have about 180 WUs analyzed using the SSE3 app, and can switch to SSE2. Here is an Excel worksheet for what I have so far: http://abrau.durso.googlepages.com/sah_boinc07.xls

P.S. I will check in on Friday with a comparison graph...
ID: 752814
Kim Vater
Volunteer tester
Joined: 27 May 99
Posts: 227
Credit: 22,743,307
RAC: 0
Norway
Message 752850 - Posted: 14 May 2008, 6:11:34 UTC

A quick glance at my two P4 Northwoods and a Pentium-M @ 1.7 GHz (with AK-V8 SSE2x) shows a performance gain of about 35% over 2.4V ;-)

A big thanks to the guys that made it possible :thumbsup:

Regards
kiva
Greetings from Norway

Crunch3er & AK-V8 Inside
ID: 752850
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 752856 - Posted: 14 May 2008, 6:14:48 UTC - in response to Message 752850.  

A quick glance at my two P4 Northwoods and a Pentium-M @1,7Ghz (with AK-V8 SSE2x) shows a performance gain of about 35% over 2.4V ;-)

A big thanks to the guys that made it possible :thumbsup:

Regards
kiva

Hiya Kim.....
Have you noticed that AndyK's stuff has been boinked by the Boinc webpage 'improvements'?

"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 752856


 