AVX Extensions - Ongoing development?
Author | Message |
---|---|
B-Man Send message Joined: 11 Feb 01 Posts: 253 Credit: 147,366 RAC: 0 |
How is the dev going? |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
How is the dev going? FYI: On the Lunatics side of things, focus has temporarily shifted to getting Raistmer's OpenCL apps into the installer, getting the recently optimised Cuda code into the X Branch, and, perhaps most importantly, assisting Berkeley with SaH_v7 stock refinement. Fortunately there is quite a lot of crossover between stock SaH_v7 refinement and the additions to AK_v8b required to support AVX hand optimisation (as opposed to compiler rebuilds). That means steady development through stock, leading into a major AK_v8b update that supports V7 & AVX at the same time, seems the logical path at the moment. It's likely to continue to be slow going, though, since the places where AVX would likely benefit the most are some tough code. Jason "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
-BeNt- Send message Joined: 17 Oct 99 Posts: 1234 Credit: 10,116,112 RAC: 0 |
How is the dev going? Forget AVX and OpenCL, FERMI FERMI FERMI FERMI FERMI...... Eck....shake it off....shake it off....good job guys can't wait to see what's added in the next version! Take your time. ;) Traveling through space at ~67,000mph! |
skildude Send message Joined: 4 Oct 00 Posts: 9541 Credit: 50,759,529 RAC: 60 |
There is work on a Fermi OpenCL app at Lunatics, IIRC. In a rich man's house there is no place to spit but his face. Diogenes Of Sinope |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
there is work on a fermi OpenCL app IIRC at Lunatics LoL. Sadly, although some OpenCL apps can, & will, run on nVidia cards (even some ATi SDK builds, as well as nVidia ones), that doesn't make them as fast on Cuda Cores as even relatively unoptimised Cuda code yet. The tools and libraries have quite a way to go to catch up (though they're getting there slowly), so I wouldn't hold your breath for Cuda to be superseded on these cards. Forget AVX and OpenCL, FERMI FERMI FERMI FERMI FERMI...... :D. A few other people are getting anxious for an improved Fermi app too. I understand that completely. Some description of how I work might be in order just to fill in the picture a bit, bearing in mind I don't work for the public service, don't have time to keep a blog to give constant updates to the curious & the well wishers, and do have other responsibilities. I work in highly productive 'bursts', with thorough investigation, repeated analysis & proving/disproving different methods & techniques (sometimes seemingly unconnected) that tend to converge & combine on a final outcome. Much of that time is spent reading & experimenting behind the scenes. Some of the in-between is sometimes also spent recovering from those unstoppable bursts, like right now taking a break to answer questions here :) The main publicly visible artefacts of that process are either the 'final product' or viable looking intermediate stages toward the goal. In the case of the first phase optimisations to the Cuda codebase, the experimentation & refinement of certain heuristics & optimisations occurred very publicly, though it isn't immediately recognisable to the casual observer that this particular strain of development amounts to a rewrite of nearly 60% of the Cuda application, applicable to all cards (inc Fermi).
Of course, integrating a 60% rewrite of what is, after all, prototype code complicates things a lot: the code uses higher level engineering practices not well documented in the GPGPU field, and the approach itself tends to expose limitations/bugs in preceding work. I first have to analyse the new work to make sure it is 'right', then decide whether the old work should be improved or the 'original' accepted as reference, etc., and then come up with fixes for the 'other stuff' if that's thought appropriate. Make no mistake about it, I am far enough along toward the skillset needed to complete the first total rewrite that I've drawn a line under the opt1 60% rewrite and said 'good enough for phase one'. How that goes over the coming weeks will likely determine the priorities for 'Opt2', and how both are to mesh with the coming SaH_v7. I hope that clarifies a bit for the curious, as I see a lot of questions floating about. One thing I can say is that someone that claims they can churn out faster Cuda code than nVidia Engineers, that doesn't take the time to do all the reading & experiment ... is either a genius or a liar. Jason "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13854 Credit: 208,696,464 RAC: 304 |
One thing I can say is that someone that claims they can churn out faster Cuda code than nVidia Engineers, that doesn't take the time to do all the reading & experiment ... is either a genius or a liar. Or just a deluded fool. I've met enough of them over the years. *rolls eyes* Grant Darwin NT |
-BeNt- Send message Joined: 17 Oct 99 Posts: 1234 Credit: 10,116,112 RAC: 0 |
:D. A few other people are getting anxious for an improved Fermi app too. I understand that completely. Some description of how I work might be in order just to fill in the picture a bit, bearing in mind I don't work for the public service, don't have time to keep a blog to give constant updates to the curious & the well wishers, and do have other responsibilities. Thanks for that Jason. We know better apps are coming along for us, no doubt about that. It just gets a bit aggravating knowing my equipment could possibly be doing a lot more with the proper optimizations. With that being said, I understand it takes time, and making every step you are taking public may not help keep the curious at bay, as it would probably only bring more questions about what you are doing. However, it would be nice to see a public check list of what you are working on. Is that something that's available or could be made available? IE a road map on the work you are doing etc. Would be interesting and kind of keep people in the right lane on what's going on with development on new apps etc. Thanks again! Traveling through space at ~67,000mph! |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
... IE a road map on the work you are doing etc. Would be interesting and kind of keep people in the right lane on what's going on with development on new apps etc. That's not a bad idea at all. I'll see what I can do. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
perryjay Send message Joined: 20 Aug 02 Posts: 3377 Credit: 20,676,751 RAC: 0 |
Hmmm, so that means it will be a few more days then? Should we send more coffee? :-) PROUD MEMBER OF Team Starfire World BOINC |
Helli_retiered Send message Joined: 15 Dec 99 Posts: 707 Credit: 108,785,585 RAC: 0 |
Hmmm, so that means it will be a few more days then :-) Phew... thankfully I did not say that... ;-) Helli A loooong time ago: First Credits after SETI@home Restart |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Hmmm, so that means it will be a few more days then :-) Hmmm, so that means it will be a few more days then? Should we send more coffee? :-) Right now a cloning device would be more appreciated. (i.e. 1am on Monday morning :) ) "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
Hmmm, so that means it will be a few more days then :-) Sorry, the cloning device never got out of alpha testing. Currently it is turning everyone into clowns. There seems to have been a typo in the design specs. :) SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Sorry, the cloning device never got out of alpha testing. Currently it is turning everyone into clowns. There seems to have been a typo in the design specs. :) Ahhh, perfect. That would give Raistmer plenty of help with the Ati stuff :P Rough roadmap presented at Lunatics, trying to place the Cuda work in context of surrounding development, and all the collaboration & overlap that goes on. http://lunatics.kwsn.net/1-discussion-forum/lunatics-experimental-development-roadmap.msg36185.html#msg36185 "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
ML1 Send message Joined: 25 Nov 01 Posts: 21209 Credit: 7,508,002 RAC: 20 |
Rough roadmap presented at Lunatics, trying to place the Cuda work in context of surrounding development, and all the collaboration & overlap that goes on. OK, very good! And a good summary. Just one small omission... You've missed off my mil-spec spiky hair! :-) Happy fast crunchin' very soon, Martin See new freedom: Mageia Linux Take a look for yourself: Linux Format The Future is what We all make IT (GPLv3) |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
You've missed off my mil-spec spiky hair! LoL. Missed out Richard's beard as well, so don't worry, your do will be on the long term chart. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
You've missed off my mil-spec spiky hair! See, spec. changes already! lol I'm sure you guys will bang out some good stuff as always. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu |
aad Send message Joined: 3 Apr 99 Posts: 101 Credit: 204,131,099 RAC: 26 |
On a serious note: don't you mean 'installer 0.38'??? Or was that the older version of the roadmap? |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
On a serious side; Don't you mean 'installer 0.38'??? Blocks in green are the 'done stuff', yellow 'near term', and orange 'medium term'. I decided to show existing, recently completed work in order to give context to current efforts/direction. Jason "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
-BeNt- Send message Joined: 17 Oct 99 Posts: 1234 Credit: 10,116,112 RAC: 0 |
Very nice Jason, it appears my plan is coming together nicely.....so what are we going to do tomorrow you ask? We are going to......TAKE OVER THE WORLD!!!!! Sorry, got side tracked lol ;) Traveling through space at ~67,000mph! |
aad Send message Joined: 3 Apr 99 Posts: 101 Credit: 204,131,099 RAC: 26 |
Blocks in green are the 'done stuff' Argh......feelin' stupid now....... That's why you are a developer and I'm just a dumb-ass-cruncher...... |