Windows port of Alex v8 code
Author | Message |
---|---|
John Clark Joined: 29 Sep 99 Posts: 16515 Credit: 4,418,829 RAC: 0 |
You will get it all credited in the end as wingmen report. I seem to be stable at a low pending ATM. It's good to be back amongst friends and colleagues |
Fred W Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
Well, lookee here then. I am rapidly coming to the conclusion that the big advantage that the Macs have had IS Alex's incredible code. First off, Mark has been running, on his supercharged Penny rig, a pre-release SSE4 port put together by the lunatics.at crew fronted by Jason_Gee. He has run 2 versions (identified as "ms JasonGee W32" and "ms JasonGee2 W32" on the following chart). He also trialled a Crunch3r SSE4 App (not Alex-based as I understand it) and I have included results from the same machine running Crunch3r SSSE3 for comparison: Direct Link. The differences between the Crunch3r SSSE3 and Crunch3r SSE4 Apps seem to indicate that there may be small benefits from the inclusion of the SSE4 features, though, as I have said in other contexts, the Mean values for both Apps fall between the Max and Min values for the other in any given Angle Range band and there are still very few results in some AR bands, so I don't see these results as being conclusive. The difference between the SSSE3 App and the latest version of the Alex-based SSE4 App is a different kettle of fish. The latest version has managed to process results in only 3 AR bands but the improvement over the SSSE3 App is very marked: 15.7% for VLAR, 16.4% in the 0.22 - 0.4 range, and a massive 28.4% in the 0.4 - 0.5 range. Of course, this is of most interest to those who have invested in SSE4 capable hardware. But what of the rest of us? Here is my latest take on JDWhale's machines (SSSE3 port of the Alex code): Direct Link. Again, John has gone through 2 pre-release versions and the progress is quite evident, though at VHAR version 0.2 does not improve on version 0.1 (again there are a very limited number of results in this AR band so treat with caution). And, again, the latest version can be referenced against the Crunch3r SSSE3 App. Improvements are even more marked than for SSE4: 39% in the 0.22 - 0.4 band, 40.6% in the 0.4 - 0.5 band, and 24.9% for VHAR units. 
What most of us didn't know when JDWhale started on his adventure was that the lunatics were active in this area too. I have monitored a test run by Mark on one of his Q6600's of the lunatics port of Alex's code for SSSE3: Direct Link. This has many fewer results to work with and the improvement of the ported code over the Crunch3r SSSE3 App is "only" 21 - 23% in the mid Angle Range bands where there are results from both Apps. All in all we must commend JDWhale and the Lunatics crew for what has been achieved so far and I am sure that those of us with SSSE3 and SSE4 capable hardware are chomping at the bit for general release of this code. I am sure that there will be increased pressure on the Lunatics for SSE2 and SSE3 variants but remember, guys, that this all takes time out of people's private lives. F. |
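Editor's sketch of the comparison described in the post above: results are grouped by Angle Range band, each band is summarised as min/mean/max, and the percentage improvement is taken on the means. The "inconclusive" flag mirrors the caveat that each App's Mean can fall between the other's Min and Max. All CPU times below are made-up placeholders, not the data behind the charts, and the function names are hypothetical.

```python
from statistics import mean

# Hypothetical CPU times in seconds per Angle Range band (NOT the thread's data).
baseline = {  # stands in for the reference App
    "VLAR": [5200.0, 5400.0, 5300.0],
    "0.22-0.4": [4000.0, 4200.0, 4100.0],
}
candidate = {  # stands in for a ported build
    "VLAR": [4400.0, 4500.0, 4600.0],
    "0.22-0.4": [3400.0, 3500.0, 3300.0],
}


def band_summary(times):
    """Min, mean and max CPU time for one band."""
    return {"min": min(times), "mean": mean(times), "max": max(times)}


def percent_improvement(base_mean, new_mean):
    """Reduction in mean CPU time, as a percentage of the baseline mean."""
    return 100.0 * (base_mean - new_mean) / base_mean


def means_overlap(a, b):
    """True when each App's mean lies inside the other's min..max range."""
    return b["min"] <= a["mean"] <= b["max"] and a["min"] <= b["mean"] <= a["max"]


def compare(base, cand):
    """Per-band report, restricted to bands where both Apps have results."""
    report = {}
    for band in sorted(set(base) & set(cand)):
        b, c = band_summary(base[band]), band_summary(cand[band])
        report[band] = {
            "improvement_pct": percent_improvement(b["mean"], c["mean"]),
            "inconclusive": means_overlap(b, c),
            "n": (len(base[band]), len(cand[band])),
        }
    return report


for band, row in compare(baseline, candidate).items():
    print(f"{band}: {row['improvement_pct']:.1f}% faster, "
          f"n={row['n']}, inconclusive={row['inconclusive']}")
```

Reporting the result count per band alongside the percentage keeps the "very few results in some AR bands" warning visible next to each figure.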
Karsten Vinding Joined: 18 May 99 Posts: 239 Credit: 25,201,931 RAC: 11 |
I don't have much to add, except that Lunatics already have SSE3 and SSE3 for AMD versions that work, and validate. It's the porting to SSE2 and older that is going to cause problems, as it doesn't seem as easy to do (because AK's V8 improvements are made with SSE3 and newer instructions in mind). |
David Joined: 19 May 99 Posts: 411 Credit: 1,426,457 RAC: 0 |
Thanks for the post Fred. Those results, even though they are only early and limited results, are brilliant. I'm definitely after the SSE3 / SSSE3 version as alas the Q6600 does not support SSE4 :( |
JDWhale Joined: 6 Apr 99 Posts: 921 Credit: 21,935,817 RAC: 3 |
Very nice presentation Fred. I would not expect any performance improvement on VHAR WUs; my understanding is that there is no Gaussian search performed on those WUs. The main (only?) difference between v0.1 and v0.2 was to turn on the part of Alex's code that implements vectorized Gaussian detection (I don't mean to sound smart, I'm just reading the comments/descriptions in the code. For all I know that part of the code walks your dog for you ;-). The vectorized Gaussian code seems to knock ~20 minutes off the CPU times for Intel Core2 hosts running at ~2.4GHz over the non-vectorized code regardless of AR. Thus we see the 1200 second step between v0.1 and v0.2 for the ARs presenting so far. It's been documented that I ran out of time before leaving on holiday to debug this part of the code, thus the intermediate (v0.1) version. It's interesting to see how improvements in different parts of the code impact run times. ================================================================= On another note, my Smithfield has posted enough results to be included for those that will benefit from SSE3 capabilities... hostid=2938512&offset=60 is showing similar performance gains as already presented for the SSSE3 capable chips. This host is a Dell XPS400 running a P4D at 2.8GHz. (~30% faster than KWSN_2.4_SSE3-Intel-P4_MB) This is my HTPC hooked up to a HiDef front projector and Dolby digital 5.1 receiver. It rarely gets used, but I'm heading in there right now... My son just brought over a digital remastered HiDef version of "Blade Runner" !!! (i.e. "Do Androids Dream of Electric Sheep?") BOINC On...On... |
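Editor's sketch of the arithmetic in the post above: a roughly constant saving of 20 minutes (1200 seconds) per work unit shows up as a different percentage depending on how long the unit took before. The baseline runtimes are hypothetical, chosen only to illustrate the effect, and the sketch assumes the saving applies at all (the post expects none at VHAR, where no Gaussian search is run).

```python
SAVING_S = 20 * 60  # ~20 minutes from the post, expressed in seconds (1200)


def percent_gain(baseline_s, saving_s=SAVING_S):
    """Percentage of CPU time removed by a fixed per-unit saving."""
    if baseline_s <= saving_s:
        raise ValueError("baseline must exceed the saving")
    return 100.0 * saving_s / baseline_s


# Hypothetical non-vectorized runtimes in seconds (not measured values).
for baseline_s in (4000, 6000, 8000):
    print(f"{baseline_s} s -> {baseline_s - SAVING_S} s "
          f"({percent_gain(baseline_s):.1f}% faster)")
```

This is one reason a flat step in seconds can look like very different percentage gains from one Angle Range band to another.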
Geek@Play Joined: 31 Jul 01 Posts: 2467 Credit: 86,146,931 RAC: 0 |
Good Job guys........... I'm wondering if any of you are seeing increased temps on the CPU while running this V8 stuff. |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65736 Credit: 55,293,173 RAC: 49 |
Good Job guys........... Good point, I'm curious about that aspect of the new apps code too. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
JDWhale Joined: 6 Apr 99 Posts: 921 Credit: 21,935,817 RAC: 3 |
Good Job guys........... I too was curious about that, but am happy to report the same temps as when running SSSE3 V2.4V on Q6600. Q6600 running at 3200MHz (9x356)... CoreTemp reports 63-57-57-61 with room temp at 26C. HSF is KingWIN Revolution RVT-9225 in Antec P190 case. Note HSF is rotated 90 degrees so that the fan blows toward the top of the case... this aligns the "Heatpipes" perpendicular to the bead of AS5 and cools much better than when fan is pointed toward the back of the case. Not a problem with the Antec P180 which has exhaust fans both on top and on rear. HTH and YMMV, JDWhale |
John Clark Joined: 29 Sep 99 Posts: 16515 Credit: 4,418,829 RAC: 0 |
What can I say but very very interesting. The early indications point to the efficiency of Alex's code, and the efficacy of the port by Jason and JD, and others. Keep it up lads, and I look forward to the confirmed release. It's good to be back amongst friends and colleagues |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
@JDWhale, Thanks for sharing details on the IPP correlation issue, works a charm. Jason "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
kittyman Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
What can I say but very very interesting. The early indications point to the efficiency of Alex's code, and the efficacy of the port by Jason and JD, and others. It just keeps getting better......Jason has just done a new build with some more small tweaks.....and got another few percent improvement on my test runs on some ARs...... My oh my...... Hang in there folks......this is gonna be good. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
JDWhale Joined: 6 Apr 99 Posts: 921 Credit: 21,935,817 RAC: 3 |
@JDWhale, Thanks for sharing details on the IPP correlation issue, works a charm. You're quite welcome, I'd never have chased down that issue if you hadn't pointed out the problem with the early app hanging in that spot. This is a collaboration, not a competition ;-) Regards, JDWhale |
kittyman Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
@JDWhale, Thanks for sharing details on the IPP correlation issue, works a charm. And a grand collaboration it's turning out to be......you guys are just great... The rest of you stay tuned......this is gonna be grand, my friends...... "Freedom is just Chaos, with better lighting." Alan Dean Foster |
archae86 Joined: 31 Aug 99 Posts: 909 Credit: 1,582,816 RAC: 0 |
Jason has just done a new build with some more small tweaks.....and got another few percent improvement on my test runs on some ARs...... Yes, indeed. I've updated the graphs posted in message 731773 in this thread. As a line in stderr out reads "Windows optimized S@H Enhanced application by Alex Kan Version info: SSE4.1 (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan SSE4.1 Win32 rev 25 Pre-Release, Ported by : Jason G, Joe Segur, Alex Kan, Raistmer", I have given this one the label KanJasG25, as distinct from KanJasG20 for the previous one. At least some of the points plotted on this first reading are almost certainly mixed units (started on 20, finished on 25). The picture should clarify, at least for the .398 Angle Range, within a few hours. Even if you click on my message link, your browser may show you the older graph from its cache, but if you click the update button, you should get a fresh one. Just check the legend for KanJasG25. |
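Editor's sketch of how a chart label such as KanJasG25 could be derived from the revision string quoted in the post above. It assumes that a work unit restarted under a newer build carries one "rev N" header per start in its stderr text, so that two different revisions in one result mark a mixed unit; that behaviour, the line breaks in the sample, and the helper names are assumptions, not something the thread confirms.

```python
import re

# Matches the "rev 25" fragment seen in the quoted stderr line.
REV_PATTERN = re.compile(r"\brev\s+(\d+)\b")


def revisions_in_stderr(stderr_text):
    """Distinct revision numbers, in order of first appearance."""
    seen = []
    for match in REV_PATTERN.finditer(stderr_text):
        rev = int(match.group(1))
        if rev not in seen:
            seen.append(rev)
    return seen


def label_for(stderr_text, prefix="KanJasG"):
    """Chart label for one result: prefix plus revision, or a mixed marker."""
    revs = revisions_in_stderr(stderr_text)
    if not revs:
        return "unknown"
    if len(revs) > 1:
        return "mixed(" + "+".join(str(r) for r in revs) + ")"
    return f"{prefix}{revs[0]}"


# Sample built from the wording quoted in the post; line breaks are assumed.
sample = (
    "Windows optimized S@H Enhanced application by Alex Kan\n"
    "Version info: SSE4.1 (Intel, Core 2-optimized v8-nographics) "
    "V5.13 by Alex Kan\n"
    "SSE4.1 Win32 rev 25 Pre-Release\n"
)
print(label_for(sample))
```

Results flagged as mixed could then be left off the per-revision plots until clean results for that Angle Range arrive.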
kittyman Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Jason has just done a new build with some more small tweaks.....and got another few percent improvement on my test runs on some ARs...... Still running for a few hours.....please do update it....this is amazing stuff..... And this is still not the fully polished version.........a few more percent may be in the offing....... "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Fred W Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
Still running for a few hours.....please do update it....this is amazing stuff..... Only mid-AR results so far (0.22 - 0.4 AR) but these show a further 5% improvement over the previous version (R20) i.e. up from a 16% improvement over the Crunch3r SSE4 to a 23.4% improvement. I'll post updated charts later on today. Keep up the good work folks. It's so-o-o-o good to see collaborative development in action :) F. |
kittyman Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Still running for a few hours.....please do update it....this is amazing stuff..... Thanks and good morning Fred..... I tried to get some sleep myself, but this stuff has me so all-fired up that I just couldn't...... The kitties are just dancing....LOL. Look forward to your new graphs later today..... Doubt that you will see much range in the ARs....just the way my 10 day cache works.....tends to download new work in bunches..... But it should be an eye-opener regardless. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
JDWhale Joined: 6 Apr 99 Posts: 921 Credit: 21,935,817 RAC: 3 |
Glad to hear that I'm not the only one losing sleep... That whole week before holiday, from the day this thread started, I was getting by on ~2 hours sleep a night. I kept telling myself that I was in "training" for the Vegas scene ;-) Truth be told, I had to go to Las Vegas just to get some rest. Try telling that one to the folks you're gambling with at 3:00AM. I had to resort to "catnaps" in the afternoons just to keep alert at the tables. While driving up into Death Valley, I pulled onto the shoulder just to fit in a 45 minute nap when I felt myself nodding at the wheel. Don't know if you pulled an all-nighter last night or not, but at least I got a couple hours sleep while you were running benchmarks early this morning. Keep with it Mark, you know the saying... "You can rest when you're dead" ;-) BOINC Onward, through the fog! |
kittyman Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
LOL, John........ The GF is not gonna understand lightly that I stayed up all night tugging on the whiskey bottle and playing Seti monster all night...... Somehow she just does not get it....... "Freedom is just Chaos, with better lighting." Alan Dean Foster |
JDWhale Joined: 6 Apr 99 Posts: 921 Credit: 21,935,817 RAC: 3 |
If someone [you know W..?] could loan me a fully loaded Skulltrail system (for a year or two), I'd love to benchmark it with the WhalePort of Alex's V8 code. Hint, Hint... [edit]Maybe just lift the restrictions on my "Evaluation Licenses", or grant a "Not-For-Profit" license, for ICC & IPP so that I can distribute the app.[/edit] |