Windows port of Alex v8 code

John Clark
Volunteer tester
Joined: 29 Sep 99
Posts: 16515
Credit: 4,418,829
RAC: 0
United Kingdom
Message 732140 - Posted: 29 Mar 2008, 23:32:35 UTC

You will get it all credited in the end as wingmen report.

I seem to be stable at a low pending ATM


It's good to be back amongst friends and colleagues



ID: 732140
Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 732189 - Posted: 30 Mar 2008, 1:03:38 UTC

Well, lookee here then. I am rapidly coming to the conclusion that the big advantage that the Macs have had IS Alex's incredible code.

First off, Mark has been running a pre-release SSE4 port, put together by the lunatics.at crew fronted by Jason_Gee, on his supercharged Penny rig. He has run 2 versions (identified as "ms JasonGee W32" and "ms JasonGee2 W32" on the following chart). He also trialled a Crunch3r SSE4 App (not Alex-based, as I understand it) and I have included results from the same machine running Crunch3r SSSE3 for comparison:


Direct Link

The differences between the Crunch3r SSSE3 and Crunch3r SSE4 Apps seem to indicate that there may be small benefits from the inclusion of the SSE4 features. However, as I have said in other contexts, the Mean value for each App falls between the Max and Min values for the other in any given Angle Range band, and there are still very few results in some AR bands, so I don't see these results as conclusive.
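
A rough way to apply that caution to one's own numbers is sketched below in Python. It assumes each App's results for one AR band are simply a list of CPU times; `means_overlap` is a hypothetical helper and the sample values are invented, not read off the chart. A band is flagged as inconclusive when each App's mean sits inside the other App's Min-Max range.

```python
from statistics import mean

def means_overlap(times_a, times_b):
    """True if each app's mean lies within the other's min..max range."""
    a_in_b = min(times_b) <= mean(times_a) <= max(times_b)
    b_in_a = min(times_a) <= mean(times_b) <= max(times_a)
    return a_in_b and b_in_a

# Hypothetical CPU times in seconds for one AR band (not real results).
close_a = [4900.0, 5100.0, 5300.0]
close_b = [4800.0, 5000.0, 5200.0]
apart_a = [4000.0, 4100.0]
apart_b = [5000.0, 5200.0]

print(means_overlap(close_a, close_b))  # ranges interleave
print(means_overlap(apart_a, apart_b))  # ranges clearly separated
```

This is only a sanity check, not a significance test; with very few results in a band, even a clear separation deserves caution.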

Looking at the difference between the SSSE3 App and the latest version of the Alex-based SSE4 App is a different kettle of fish. The latest version has managed to process results in only 3 AR bands but the improvement over the SSSE3 App is very marked: 15.7% for VLAR, 16.4% in the 0.22 - 0.4 range, and a massive 28.4% in the 0.4 - 0.5 range.
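
For anyone wanting to run the same sort of comparison on their own results, here is a minimal sketch in Python. It assumes "improvement" means the percentage reduction in mean CPU time relative to the reference App within one AR band, which may not be exactly how the chart figures were derived. The function name and the sample times are made up for illustration.

```python
from statistics import mean

def improvement_pct(reference_times, candidate_times):
    """Percentage reduction in mean CPU time of candidate vs. reference."""
    ref = mean(reference_times)
    cand = mean(candidate_times)
    return (ref - cand) / ref * 100.0

# Hypothetical CPU times in seconds for one AR band (not real results).
ssse3_times = [5000.0, 5200.0, 4800.0]  # reference app, mean 5000 s
sse4_times = [4000.0, 4100.0, 3900.0]   # candidate app, mean 4000 s

print(f"{improvement_pct(ssse3_times, sse4_times):.1f}% faster")
```

Computing the figure per AR band, rather than over all results at once, keeps long VLAR units from swamping the shorter ones.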

Of course, this is of most interest to those who have invested in SSE4 capable hardware. But what of the rest of us?

Here is my latest take on JDWhale's machines (SSSE3 port of the Alex code):


Direct Link

Again, John has gone through 2 pre-release versions and the progress is quite evident, though at VHAR version 0.2 does not improve on version 0.1 (again, there is a very limited number of results in this AR band, so treat with caution). And, again, the latest version can be referenced against the Crunch3r SSSE3 App. Improvements are even more marked than for SSE4: 39% in the 0.22 - 0.4 band, 40.6% in the 0.4 - 0.5 band, and 24.9% for VHAR units.

What most of us didn't know when JDWhale started on his adventure was that the lunatics were active in this area too. I have monitored a test run by Mark on one of his Q6600's of the lunatics port of Alex's code for SSSE3:


Direct Link

This has many fewer results to work with, and the improvement of the ported code over the Crunch3r SSSE3 App is "only" 21 - 23% in the mid Angle Range bands where there are results from both Apps.

All in all we must commend JDWhale and the Lunatics crew for what has been achieved so far and I am sure that those of us with SSSE3 and SSE4 capable hardware are chomping at the bit for general release of this code. I am sure that there will be increased pressure on the Lunatics for SSE2 and SSE3 variants but remember, guys, that this all takes time out of people's private lives.

F.
ID: 732189
Karsten Vinding
Volunteer tester
Joined: 18 May 99
Posts: 239
Credit: 25,201,931
RAC: 11
Denmark
Message 732237 - Posted: 30 Mar 2008, 2:12:51 UTC - in response to Message 732189.  

I don't have much to add, except that Lunatics already have SSE3 and SSE3-for-AMD versions that work and validate.

It's the porting to SSE2 and older that is going to cause problems, as it doesn't seem as easy to do (because AK's V8 improvements are made with SSE3 and newer instructions in mind).


ID: 732237
David
Volunteer tester
Joined: 19 May 99
Posts: 411
Credit: 1,426,457
RAC: 0
Australia
Message 732242 - Posted: 30 Mar 2008, 2:17:51 UTC

Thanks for the post, Fred. Those results, even though they are only early and limited, are brilliant. I'm definitely after the SSE3 / SSSE3 version as, alas, the Q6600 does not support SSE4 :(
ID: 732242
JDWhale
Volunteer tester
Joined: 6 Apr 99
Posts: 921
Credit: 21,935,817
RAC: 3
United States
Message 732256 - Posted: 30 Mar 2008, 2:51:10 UTC
Last modified: 30 Mar 2008, 2:54:41 UTC

Very nice presentation Fred.

I would not expect any performance improvement on VHAR WUs; my understanding is that there is no Gaussian search performed on those WUs. The main (only?) difference between v0.1 and v0.2 was to turn on the part of Alex's code that implements vectorized Gaussian detection (I don't mean to sound smart, I'm just reading the comments/descriptions in the code. For all I know that part of the code walks your dog for you ;-).

The vectorized Gaussian code seems to knock ~20 minutes off the CPU times for Intel Core2 hosts running at ~2.4GHz over the non-vectorized code regardless of AR. Thus we see the 1200 second step between v0.1 and v0.2 for the ARs presenting so far. It's been documented that I ran out of time before leaving on holiday to debug this part of the code, thus the intermediate (v0.1) version. It's interesting to see how improvements in different parts of the code impact run times.
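
A quick sketch in Python of why a fixed ~20 minute saving shows up as a different percentage depending on how long a unit takes to begin with. Only the 1200 second step comes from the figures above; the baseline times and the function name are invented for illustration.

```python
SAVING_S = 20 * 60  # ~20 minutes expressed in seconds (1200)

def relative_gain_pct(baseline_s, saving_s=SAVING_S):
    """Percentage of the baseline run time removed by a fixed saving."""
    return saving_s / baseline_s * 100.0

# Hypothetical v0.1 CPU times in seconds (not measured values).
for baseline in (4000, 6000, 8000):
    print(baseline, f"{relative_gain_pct(baseline):.1f}%")
```

The shorter the baseline run, the larger the same absolute saving looks when expressed as a percentage.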

=================================================================
On another note, my Smithfield has posted enough results to be included for those that will benefit from SSE3 capabilities... hostid=2938512&offset=60 is showing similar performance gains as already presented for the SSSE3 capable chips. This host is a Dell XPS400 running a P4D at 2.8GHz. (~30% faster than KWSN_2.4_SSE3-Intel-P4_MB)

This is my HTPC hooked up to a HiDef front projector and Dolby digital 5.1 receiver. It rarely gets used, but I'm heading in there right now... My son just brought over a digital remastered HiDef version of "Blade Runner" !!! (ie: "Do Androids Dream of Electric Sheep?")


BOINC On...On...
ID: 732256
Geek@Play
Volunteer tester
Joined: 31 Jul 01
Posts: 2467
Credit: 86,146,931
RAC: 0
United States
Message 732268 - Posted: 30 Mar 2008, 3:38:45 UTC
Last modified: 30 Mar 2008, 3:38:57 UTC

Good job, guys...........

I'm wondering if any of you are seeing increased temps on the CPU while running this V8 stuff.
ID: 732268
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65736
Credit: 55,293,173
RAC: 49
United States
Message 732284 - Posted: 30 Mar 2008, 4:29:36 UTC - in response to Message 732268.  

Good job, guys...........

I'm wondering if any of you are seeing increased temps on the CPU while running this V8 stuff.

Good point, I'm curious about that aspect of the new apps code too.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 732284
JDWhale
Volunteer tester
Joined: 6 Apr 99
Posts: 921
Credit: 21,935,817
RAC: 3
United States
Message 732289 - Posted: 30 Mar 2008, 4:49:03 UTC - in response to Message 732268.  
Last modified: 30 Mar 2008, 5:10:28 UTC

Good job, guys...........

I'm wondering if any of you are seeing increased temps on the CPU while running this V8 stuff.


I, too, was curious about that, but am happy to report the same temps as when running SSSE3 V2.4V on Q6600.

Q6600 running at 3200MHz (9x356)... CoreTemp reports 63-57-57-61 with room temp at 26C. HSF is KingWIN Revolution RVT-9225 in Antec P190 case.
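
The clock figure is just multiplier times FSB. A tiny sketch in Python, assuming the Q6600's stock settings of 9 x 266 MHz (nominally 2.4GHz) for the comparison; the function name is made up.

```python
def core_clock_mhz(multiplier, fsb_mhz):
    """Core clock in MHz from the CPU multiplier and front-side bus clock."""
    return multiplier * fsb_mhz

overclocked = core_clock_mhz(9, 356)  # 3204 MHz, the ~3200MHz quoted above
stock = core_clock_mhz(9, 266)        # 2394 MHz, assumed stock setting

print(overclocked, f"{(overclocked / stock - 1) * 100:.0f}% over stock")
```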

Note HSF is rotated 90 degrees so that the fan blows toward the top of the case... this aligns the "Heatpipes" perpendicular to the bead of AS5 and cools much better than when fan is pointed toward the back of the case. Not a problem with the Antec P180 which has exhaust fans both on top and on rear.


HTH and YMMV,
JDWhale
ID: 732289
John Clark
Volunteer tester
Joined: 29 Sep 99
Posts: 16515
Credit: 4,418,829
RAC: 0
United Kingdom
Message 732341 - Posted: 30 Mar 2008, 10:00:41 UTC

What can I say but very very interesting. The early indications point to the efficiency of Alex's code, and the efficacy of the port by Jason and JD, and others.

Keep it up lads, and I look forwards to the confirmed release.
It's good to be back amongst friends and colleagues



ID: 732341
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 732342 - Posted: 30 Mar 2008, 10:02:24 UTC

@JDWhale, Thanks for sharing details on the IPP correlation issue, works a charm.

Jason

"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 732342
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 732343 - Posted: 30 Mar 2008, 10:03:53 UTC - in response to Message 732341.  
Last modified: 30 Mar 2008, 10:07:14 UTC

What can I say but very very interesting. The early indications point to the efficiency of Alex's code, and the efficacy of the port by Jason and JD, and others.

Keep it up lads, and I look forwards to the confirmed release.

It just keeps getting better......Jason has just done a new build with some more small tweaks.....and got another few percent improvement on my test runs on some ARs......
My oh my......
Hang in there folks......this is gonna be good.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 732343
JDWhale
Volunteer tester
Joined: 6 Apr 99
Posts: 921
Credit: 21,935,817
RAC: 3
United States
Message 732358 - Posted: 30 Mar 2008, 10:42:03 UTC - in response to Message 732342.  

@JDWhale, Thanks for sharing details on the IPP correlation issue, works a charm.

Jason


You're quite welcome. I'd never have chased down that issue if you hadn't pointed out the problem with the early app hanging in that spot. This is a collaboration, not a competition ;-)

Regards,
JDWhale
ID: 732358
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 732361 - Posted: 30 Mar 2008, 10:46:09 UTC - in response to Message 732358.  

@JDWhale, Thanks for sharing details on the IPP correlation issue, works a charm.

Jason


You're quite welcome. I'd never have chased down that issue if you hadn't pointed out the problem with the early app hanging in that spot. This is a collaboration, not a competition ;-)

Regards,
JDWhale

And a grand collaboration it's turning out to be......you guys are just great...
The rest of you stay tuned......this is gonna be grand, my friends......
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 732361
archae86
Joined: 31 Aug 99
Posts: 909
Credit: 1,582,816
RAC: 0
United States
Message 732403 - Posted: 30 Mar 2008, 13:10:45 UTC - in response to Message 732343.  

Jason has just done a new build with some more small tweaks.....and got another few percent improvement on my test runs on some ARs......

Yes, indeed. I've updated the graphs posted in message 731773 in this thread. The relevant lines in stderr out read:
Windows optimized S@H Enhanced application by Alex Kan
Version info: SSE4.1 (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSE4.1 Win32 rev 25 Pre-Release, Ported by : Jason G, Joe Segur, Alex Kan, Raistmer
I've given this one the label KanJasG25, as distinct from KanJasG20 for the previous one.

At least some of the points plotted on this first reading are almost certainly mixed units (started on 20, finished on 25). The picture should clarify, at least for the .398 Angle Range, within a few hours.

Even if you click on my message link, your browser may show you the older graph from its cache, but if you click the update button, you should get fresh. Just check the legend for KanJasG25.
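
For anyone tagging results by build in a similar way, here is a minimal Python sketch that pulls the rev number out of stderr text and forms a label in the KanJasG style. How the labels above were actually assigned isn't stated, so `build_label`, its `prefix` parameter, and the tag-stripping step are assumptions for illustration; the sample text is the banner quoted above.

```python
import re

# Strips forum BBCode such as [b] or [color=blue] if the text was copied from a post.
BBCODE_TAG = re.compile(r"\[/?[a-z]+(?:=[^\]]*)?\]", re.IGNORECASE)
REV = re.compile(r"\brev\s+(\d+)\b")

def build_label(stderr_text, prefix="KanJasG"):
    """Return a chart label such as 'KanJasG25', or None if no rev is found."""
    plain = BBCODE_TAG.sub("", stderr_text)
    match = REV.search(plain)  # only the first rev number is considered
    return f"{prefix}{match.group(1)}" if match else None

sample = (
    "Windows optimized S@H Enhanced application by Alex Kan\n"
    "Version info: SSE4.1 (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan\n"
    "SSE4.1 Win32 rev 25 Pre-Release, Ported by : Jason G, Joe Segur, Alex Kan, Raistmer\n"
)
print(build_label(sample))
```

Because only the first match is read, a unit started on one build and finished on another would still get a single label, which is the mixed-unit caveat mentioned above.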

ID: 732403
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 732406 - Posted: 30 Mar 2008, 13:16:48 UTC - in response to Message 732403.  
Last modified: 30 Mar 2008, 13:19:51 UTC

Jason has just done a new build with some more small tweaks.....and got another few percent improvement on my test runs on some ARs......

Yes, indeed. I've updated the graphs posted in message 731773 in this thread. The relevant lines in stderr out read:
Windows optimized S@H Enhanced application by Alex Kan
Version info: SSE4.1 (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSE4.1 Win32 rev 25 Pre-Release, Ported by : Jason G, Joe Segur, Alex Kan, Raistmer
I've given this one the label KanJasG25, as distinct from KanJasG20 for the previous one.

At least some of the points plotted on this first reading are almost certainly mixed units (started on 20, finished on 25). The picture should clarify, at least for the .398 Angle Range, within a few hours.

Even if you click on my message link, your browser may show you the older graph from its cache, but if you click the update button, you should get fresh. Just check the legend for KanJasG25.

Still running for a few hours.....please do update it....this is amazing stuff.....
And this is still not the fully polished version.........a few more percent may be in the offing.......
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 732406
Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 732411 - Posted: 30 Mar 2008, 13:42:36 UTC - in response to Message 732406.  

Still running for a few hours.....please do update it....this is amazing stuff.....
And this is still not the fully polished version.........a few more percent may be in the offing.......

Only mid-AR results so far (0.22 - 0.4 AR) but these show a further 5% improvement over the previous version (R20) i.e. up from a 16% improvement over the Crunch3r SSE4 to a 23.4% improvement. I'll post updated charts later on today.

Keep up the good work folks. It's so-o-o-o good to see collaborative development in action :)

F.
ID: 732411
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 732421 - Posted: 30 Mar 2008, 13:56:02 UTC - in response to Message 732411.  

Still running for a few hours.....please do update it....this is amazing stuff.....
And this is still not the fully polished version.........a few more percent may be in the offing.......

Only mid-AR results so far (0.22 - 0.4 AR) but these show a further 5% improvement over the previous version (R20) i.e. up from a 16% improvement over the Crunch3r SSE4 to a 23.4% improvement. I'll post updated charts later on today.

Keep up the good work folks. It's so-o-o-o good to see collaborative development in action :)

F.

Thanks and good morning Fred.....
I tried to get some sleep myself, but this stuff has me so all-fired up that I just couldn't......
The kitties are just dancing....LOL.
Look forward to your new graphs later today.....
Doubt that you will see much range in the ARs....just the way my 10 day cache works.....tends to download new work in bunches.....
But it should be an eye-opener regardless.

"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 732421
JDWhale
Volunteer tester
Joined: 6 Apr 99
Posts: 921
Credit: 21,935,817
RAC: 3
United States
Message 732442 - Posted: 30 Mar 2008, 14:34:43 UTC - in response to Message 732421.  
Last modified: 30 Mar 2008, 14:42:51 UTC


I tried to get some sleep myself, but this stuff has me so all-fired up that I just couldn't......
The kitties are just dancing....LOL.


Glad to hear that I'm not the only one losing sleep... That whole week before holiday, from the day this thread started, I was getting by on ~2 hours sleep a night.

I kept telling myself that I was in "training" for the Vegas scene ;-) Truth be told, I had to go to Las Vegas just to get some rest. Try telling that one to the folks you're gambling with at 3:00AM. I had to resort to "catnaps" in the afternoons just to keep alert at the tables. While driving up into Death Valley, I pulled onto the shoulder just to fit in a 45 minute nap when I felt myself nodding at the wheel.

Don't know if you pulled an all-nighter last night or not, but at least I got a couple hours sleep while you were running benchmarks early this morning.

Keep with it Mark, you know the saying... "You can rest when you're dead" ;-)

BOINC Onward, through the fog!
ID: 732442
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 732454 - Posted: 30 Mar 2008, 14:45:44 UTC - in response to Message 732442.  


I tried to get some sleep myself, but this stuff has me so all-fired up that I just couldn't......
The kitties are just dancing....LOL.


Glad to hear that I'm not the only one losing sleep... That whole week before holiday, from the day this thread started, I was getting by on ~2 hours sleep a night.

I kept telling myself that I was in "training" for the Vegas scene ;-) Truth be told, I had to go to Las Vegas just to get some rest. Try telling that one to the folks you're gambling with at 3:00AM. I had to resort to "catnaps" in the afternoons just to keep alert at the tables. While driving up into Death Valley, I pulled onto the shoulder just to fit in a 45 minute nap when I felt myself nodding at the wheel.

Don't know if you pulled an all-nighter last night or not, but at least I got a couple hours sleep while you were running benchmarks early this morning.

Keep with it Mark, you know the saying... "You can rest when you're dead" ;-)

BOINC Onward, through the fog!

LOL, John........
The GF is not gonna understand lightly that I stayed up all night tugging on the whiskey bottle and playing Seti monster all night......
Somehow she just does not get it.......
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 732454
JDWhale
Volunteer tester
Joined: 6 Apr 99
Posts: 921
Credit: 21,935,817
RAC: 3
United States
Message 732473 - Posted: 30 Mar 2008, 15:17:05 UTC
Last modified: 30 Mar 2008, 15:21:46 UTC

If someone[you know W..?] could loan me a fully loaded Skulltrail system (for a year or two), I'd love to benchmark it with the WhalePort of Alex' V8 code.

Hint, Hint...

[edit]Maybe just lift the restrictions on my "Evaluation Licenses", or grant a "Not-For-Profit" license, for ICC & IPP so that I can distribute the app.[/edit]
ID: 732473
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.