Windows port of Alex v8 code
Author | Message |
---|---|
Sutaru Tsureku Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
Noo.. no speculation? I'm soo hungry! |
archae86 Joined: 31 Aug 99 Posts: 909 Credit: 1,582,816 RAC: 0 |
First look says that the "SSE4.1 Win64 rev 32" is a bit slower than the "SSSE3 Xeon Win64 rev 32 Pre-Release" in three batches of close comparisons, all at mid Angle Range: With about a day of running the SSE4.1 rev 32 build, there is now substantial comparison data for very low Angle Range and for mid Angle Range. The disadvantage of the SSE4.1 build compared to the "SSSE3 Xeon" flavor, both of rev 32, is quite marked in the very low Angle Range: it is probably about 7% slower, and only slightly faster in this range than SSSE3 rev 27. The percentage disadvantage to the same-rev SSSE3 Xeon seems smaller in mid Angle Range, but the case against this version is stronger still there, as it is clearly slower in this range than SSSE3 rev 27. |
kittyman Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
[quote]First look says that the "SSE4.1 Win64 rev 32" is a bit slower than the "SSSE3 Xeon Win64 rev 32 Pre-Release" in three batches of close comparisons, all at mid Angle Range:[/quote] I just cut over to the x64 SSSE3 xeon ipp u8 pgo Ox r32 version.....don't know what the stderr output says yet, but that's what I am running for test. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
JDWhale Joined: 6 Apr 99 Posts: 921 Credit: 21,935,817 RAC: 3 |
[quote]I just cut over to the x64 SSSE3 xeon ipp u8 pgo Ox r32 version.....don't know what the stderr output says yet, but that's what I am running for test.[/quote] This is turning into Baskin Robbins... Think I'll have a triple scoop with BubbleGum, Chocolate Almond Fudge, and Strawberry Cheesecake on top in a 64-bit waffle cone....LOL [edit] No, wait... make it Tiramisu on top [/edit] @Mark - Is this going to be the first "u8" live run? |
kittyman Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
[quote]I just cut over to the x64 SSSE3 xeon ipp u8 pgo Ox r32 version.....don't know what the stderr output says yet, but that's what I am running for test.[/quote] LOL......to paraphrase Clint Eastwood.....'In all of this excitement I've kinda forgotten myself'..... There have been so many new builds of the new apps I have tested offline, with many incarnations closely related, that I have been having a little trouble keeping them sorted...... This is very involved, as I am sure that you know, and many things are tried that simply do not work as planned..... And I have been finding (not to any of the optimizers' surprise) that what seems to be the fastest in short offline tests does not always hold true when crunching full length WUs of many different ARs... By the short tests, that last SSE4.1 build looked pretty good, but this SSSE3 variant may actually perform better on real Seti work.....we shall see. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
[quote]...Ox r32 version...[/quote] Hmmm, that one escaped the lab huh? we pulled that build ... so Murphy says it'll be best... "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
kittyman Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
[quote]...Ox r32 version...Hmmm, that one escaped the lab huh? we pulled that build ... so Murphy says it'll be best...[/quote] Well....I'll run it for a bit to see what pans out pending your next round of builds.....keep up the great work there Jason.....standing by for the next batch of apps to knabench for ya..... "Freedom is just Chaos, with better lighting." Alan Dean Foster |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
[quote]Well....I'll run it for a bit to see what pans out pending your next round of builds.....keep up the great work there Jason.....standing by for the next batch of apps to knabench for ya.....[/quote] Thanks for giving things a careful run. At this point, some builds will be better than others and will indicate various possible avenues for further investigation. Thanks for your patience everyone. May 1st is target initial release date. Jason "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
David Joined: 19 May 99 Posts: 411 Credit: 1,426,457 RAC: 0 |
[quote]Thanks for your patience everyone. May 1st is target initial release date.[/quote] Looking great so far Jason, but more than 2 more weeks for the initial public release? ARGH!!!!! ;) That's like 1100 WUs (on each PC) away! |
John Clark Joined: 29 Sep 99 Posts: 16515 Credit: 4,418,829 RAC: 0 |
Whow! I wonder how many hits the Lunatics site (Jason and the others porting Alex's V8 code) will get on 1st May, when the first code is released to the wild? I bet, after a week, there will be major upgrades to Win-cruncher RACs, and movement in the table of the top 40. Many Q6600s will climb the charts as well. A new order will settle down soon. It's good to be back amongst friends and colleagues |
JDWhale Joined: 6 Apr 99 Posts: 921 Credit: 21,935,817 RAC: 3 |
[quote]Whow![/quote] Stay tuned to this NC thread.... I will predict the future... (Of course we know how that's turned out from past experience.... ;-) @Curmudgeon - Watch your back... You've got a whale on your tail (for now...) ;-) BOINC... BOINC... BOINC... ON...ON... JDWhale |
John Clark Joined: 29 Sep 99 Posts: 16515 Credit: 4,418,829 RAC: 0 |
[quote]@Curmudgeon - Watch your back... You've got a whale on your tail (for now...) ;-)[/quote] Yes, I know! But my Penny has borked twice over the last 2 nights, and the RAC is suffering. I don't know why, but suspect it is heat! The RAC should be just short of or just over 5,600, which would only delay Thurston a little bit. It's good to be back amongst friends and colleagues |
archae86 Joined: 31 Aug 99 Posts: 909 Credit: 1,582,816 RAC: 0 |
[quote]First look says that the "SSE4.1 Win64 rev 32" is a bit slower than the "SSSE3 Xeon Win64 rev 32 Pre-Release" in three batches of close comparisons, all at mid Angle Range:[/quote] What is most recently returned as stderr out looks like this: Windows optimized S@H Enhanced application by Alex Kan Version info: SSSE3 Xeon (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan SSSE3 Xeon Win64 rev 32 Pre-Release, Ported by : Jason G, Joe Segur, Alex Kan, Raistmer. But before the SSE4.1 ver 32 run you were running a version 32 that said this: Windows optimized S@H Enhanced application by Alex Kan Version info: SSSE3 Xeon (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan SSSE3 Xeon Win64 rev 32 Pre-Release, Ported by : Jason G, Joe Segur, Alex Kan, Raistmer (the same, in other words). So combining your and Jason's input, I'm currently assuming that new results posted after the SSE4.1 group with this stderr differ, and am using "Ox" as a suffix for distinction. At first look, there are two results returned at very low angle range, which appear very slightly slower than the Xeon Rev 32 results, but are quite a bit better than the SSE4.1 rev 32 in that range. There are a number of results returned in medium angle range of 0.38 to 0.45. Here the Ox variant appears a bit slower than Xeon Rev 32, but a bit faster than SSE4.1 rev 32. First look, small sample, subjective comments only. |
KB7RZF Joined: 15 Aug 99 Posts: 9549 Credit: 3,308,926 RAC: 2 |
Just 1 quick question, will these apps also work on a P4 2.8GHz w/HT, I believe the highest set is SSE3, or are there plans to make an app that will work? I may have skipped over it if you already posted. Thanks |
JDWhale Joined: 6 Apr 99 Posts: 921 Credit: 21,935,817 RAC: 3 |
[quote]Just 1 quick question, will these apps also work on a P4 2.8GHz w/HT, I believe the highest set is SSE3, or are there plans to make an app that will work? I may have skipped over it if you already posted. Thanks[/quote] While I can't speak for the Lunatics release, I brought Ginger (3.2GHz Prescott H/T) back online to evaluate AK_WhalePort (SSE3). Keep a watch out for the pending performance report. Cheers, JDWhale |
Fred W Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
[quote]Just 1 quick question, will these apps also work on a P4 2.8GHz w/HT, I believe the highest set is SSE3, or are there plans to make an app that will work? I may have skipped over it if you already posted. Thanks[/quote] Here is what I have gleaned on SSE3 from "Ginger". Direct Link Direct Link Again, we are limited by the spread of ARs, but this represents just over 30% speed up in the mid-AR bands and nearly 15% for VHAR. None too shabby, I would say. Nice one guys. More on the Frozen Penny a bit later. I have the data but I need some sleep before I try to do anything with it :) F. |
JDWhale Joined: 6 Apr 99 Posts: 921 Credit: 21,935,817 RAC: 3 |
Time for a weekly update on all 4 hosts I have running AK_WhalePort v0.2. Yes, it's been two weeks since I posted these charts the first time. The new thumbnails are scaled 30x50% of the fullsize images compared to 30x30% on prior screen caps. Hope this is still not too confusing.
Host = Skipper: P4D-820 @ 2800MHz
JDWhale on Apr 6 wrote: JDWhale on Mar 30 wrote: Lastly we see the P4D-820 SSE3 chart; note it's very erratic and, with its just losing its 10 day cache, will probably remain that way as Pending Credits will be rising. Thus, this chart bears no witness to the WhalePort effect as the others do. I'm really upset about blowing the cache away yesterday; I think this host could have shown the largest relative benefit, judging from what Fred W. has charted so far. Skipper's RAC continues upward, though showing signs of leveling off. As one of only two of my hosts which had established RAC history, Skipper's chart demonstrates different behavior than most others with their steady ramp upwards. The RAC performance vs. time exhibited here is more typical of what most hosts will exhibit. Who am I trying to fool, they are all going through the roof! My revised estimated RAC = 1342. This marks a 48% increase since switching to AK_WhalePort V0.2S (SSE3).
Host = Lovey: E4500 @ 2420MHz
JDWhale on Apr 6 wrote: JDWhale on Mar 30 wrote: This is the mighty E4500 @ 2420MHz, left running WhalePort v0.1 while I was in Las Vegas Mar 23-27 and switched to v0.2 on Mar 28. This chip continues to amaze me. Purchased for $118 (including an ECS mobo), the E4500 now coupled with the Gigabyte GA-P35-DS3L is a real performer as long as you do not allow 2 VHARs to run simultaneously. Here is a 113CR (AR=0.055) WU, wuid=242481897: Lovey has been called in as tie breaker with a couple of comparable hosts, both running optimized clients... the results should prove interesting pitted against AK_WhalePort. Still about 17 WUs from being crunched; result should be posted at about 14 Apr 12:00 UDT...
I'm excited, don't see WUs worth 113 credits very often. My revised estimated RAC = 2720
Host = Wrongway: Q6600 @ 2520MHz
JDWhale on Apr 6 wrote: JDWhale on Mar 30 wrote: This host is Q6600 @ 2520MHz, switched from SSSE3 R2.4V to WhalePort v0.2 on Mar 28. Current graph shows some bouncing after Wrongway's cache hit a low of ~20 WUs during the project server issues last week. I've been working to rebuild his cache, DLing in batches of 100 WUs a couple times daily to mix it up a bit. Current cache ~475 WUs or ~7 days worth. My revised estimated RAC = 5040
Host = Thurston: Q6600 @ 3200MHz (for now)
Which brings us to Thurston, the star of my lineup. JDWhale on Mar 28 wrote: Thanks Crunch3r, do you mind if I hold off 'till my Q6600 makes it into the top 20 before I hand over the ported source? I figure it's got a chance to reach RAC ~6500 running at 3200MHz with the AK codes. JDWhale on Apr 6 wrote: JDWhale on Mar 30 wrote: This host is Q6600 @ 3200MHz. I guess it didn't report results on Mar 27, explaining the deep V before switching to WhalePort v0.2 on Mar 28. If you've been following Thurston's plight, you know that he suffered a seizure after raising his CPU clock to 3375MHz and memory to 1200MHz. This resulted in Boinc crashing and wiping out his established cache of ~1100 WUs. Server and DB problems made recovery of the cache difficult, especially when combined with the flood of VHAR WUs; Thurston had back to back days returning 200 WUs each. Given the ~1100 lost during the seizure and the inability to build ample cache to pursue Skulltrail in "high performance" fashion, I made the decision to detach all WUs and start fresh after giving Thurston the night off. This released the ~1100 WUs to be rereleased without them having to "timeout". Well, Thurston is back online; cache is now ~580 or about 4-5 days. Will be adding to it in lots of 100 WUs 3-4 times daily until we reach 10 days reserve.
Since the Skulltrail hunt, "The Race", is over, I've relaxed scheduling from "extreme" to "aggressive" mode. My revised estimated RAC = 7080
Host = Ginger: P4-540 (Prescott H/T) @ 3200MHz
I'm not posting a graph for Ginger because she is being switched between R-2.4 SSE3 and AK_WhalePort V0.2S (SSE3) clients as different ARs are downloaded in order to "fill the gaps" in the charts. Couple that with her being offline for the past 3 weeks and you can imagine the graph: "steady drop for the past 3 weeks, up the past 4 days". I agree with Fred W.'s assessment of ~30% improvement in CPU times at mid-range ARs and 15-17% at VHARs. The way I do the switching does not create "hybrid" results; each result is processed entirely using the same client. (Until Murphy steps in and causes a power outage or Ginger is restarted during transition from one client to the other.)
==============================================
Earlier in this thread I offered up a teaser that I'd be posting predictions later. My first prediction was that Trevor Immelman will shoot "lights out" 66 and win "The Masters" by 5 strokes. Well, I didn't post in time and I would have been wrong on both counts. Next up.... Mac Pros will continue to be the top RAC hosts on the project even after the Windows port of the AK client goes public; there is room for one exception that involves overclocking. RAC 10,000 and above will only be seen by the same boxes that are currently capable. The gap will narrow, but the song remains the same. (There might be an exception to this prediction if the owners of the Windows based 8-core rigs can figure out how to feed their hosts without getting bit.) I could very well be wrong here also... time will tell. Sorry for being somewhat cryptic.... NOT!!! Regards, JDWhale |
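The cache sizes quoted in this post can be turned into rough daily throughput figures. Below is a minimal Python sketch of that arithmetic, assuming only the round numbers stated above (~475 WUs for ~7 days on Wrongway, ~580 WUs for 4-5 days on Thurston, a 10-day target). The helper names are made up for illustration, and none of this is something BOINC itself reports.

```python
def wus_per_day(cache_wus, cache_days):
    """Average work units crunched per day implied by a cache estimate."""
    return cache_wus / cache_days


def wus_needed(cache_wus, cache_days, target_days):
    """Extra WUs needed to stretch the cache to target_days at the same rate."""
    rate = wus_per_day(cache_wus, cache_days)
    return max(0.0, rate * target_days - cache_wus)


# Wrongway: ~475 WUs is quoted as ~7 days of work.
wrongway_rate = wus_per_day(475, 7)

# Thurston: ~580 WUs is quoted as 4-5 days, so use both ends of that range.
thurston_fast = wus_per_day(580, 4)
thurston_slow = wus_per_day(580, 5)

# WUs still to download before Thurston holds a 10-day reserve.
need_low = wus_needed(580, 5, 10)
need_high = wus_needed(580, 4, 10)

print(f"Wrongway: ~{wrongway_rate:.0f} WUs/day")
print(f"Thurston: ~{thurston_slow:.0f}-{thurston_fast:.0f} WUs/day")
print(f"Thurston needs ~{need_low:.0f}-{need_high:.0f} more WUs for 10 days")
```

The estimate ignores the WUs Thurston crunches while the cache is being built, so the real number of downloads needed would be higher.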
Speedy Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
Where do we download a copy of the v8 app from? I'd be keen to run it on This Beast. Thanks, Speedy |
KB7RZF Joined: 15 Aug 99 Posts: 9549 Credit: 3,308,926 RAC: 2 |
Well, comparing the 0.385032 AR on This work unit and mine @ 0.382598 here, you shaved about 7000 seconds off with Windows Port: JDWhale V0.2S version. So that's pretty darn good. I'm looking forward to that release, hopefully about the same time as the others, but willing to wait; SETI isn't my main project, but I'd love to have a faster app once there is one. Good work JD, Jason, and all that are helping at the Lunatics site. Jeremy [quote]Just 1 quick question, will these apps also work on a P4 2.8GHz w/HT, I believe the highest set is SSE3, or are there plans to make an app that will work? I may have skipped over it if you already posted. Thanks[/quote] |
JDWhale Joined: 6 Apr 99 Posts: 921 Credit: 21,935,817 RAC: 3 |
600th post @KB7RZF - You can browse Ginger's results; I'm switching her between KWSN R2.4-SSE3 and AK_WhalePort V0.2S daily to run same-AR WUs on both clients for comparisons and analysis. While this is not going to compare exactly the same as the pending Lunatics release, it should give you a pretty good idea of what to expect. Regards, JDWhale |
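The same-AR comparison described here boils down to a percentage per angle-range band. Below is a minimal Python sketch, assuming CPU times in seconds have been copied by hand from the result pages. The function names are hypothetical and the sample times are invented placeholders, not measurements from Ginger or any other host.

```python
def cpu_time_saving_percent(old_seconds, new_seconds):
    """Percent reduction in CPU time going from the old client to the new one."""
    return 100.0 * (old_seconds - new_seconds) / old_seconds


def throughput_gain_percent(old_seconds, new_seconds):
    """Percent more WUs finished per day implied by the same two CPU times."""
    return 100.0 * (old_seconds / new_seconds - 1.0)


# Placeholder CPU times (seconds) per band: (old client, new client).
# These numbers are invented; substitute times read from real results.
samples = {
    "mid AR": (20000.0, 14000.0),
    "VHAR": (4000.0, 3400.0),
}

for band, (old, new) in samples.items():
    saving = cpu_time_saving_percent(old, new)
    gain = throughput_gain_percent(old, new)
    print(f"{band}: {saving:.1f}% less CPU time, {gain:.1f}% more WUs per day")
```

The two percentages are not interchangeable: a given cut in CPU time always corresponds to a larger percentage gain in WUs per day, so it helps to say which one a quoted "speed up" refers to.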