Anything relating to AstroPulse tasks

Author	Message
betreger Send message Joined: 29 Jun 99 Posts: 11362 Credit: 29,581,041 RAC: 66	Message 1765766 - Posted: 17 Feb 2016, 18:35:52 UTC don't look now but a few are being split ID: 1765766 ·

TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1765790 - Posted: 17 Feb 2016, 20:30:24 UTC - in response to Message 1765787. Do the math on this machine: http://setiathome.berkeley.edu/result.php?resultid=4733535830 On that machine an AP takes about 35 minutes on the 750Ti; a shorty takes 2.7 minutes and pays ~45 credits. 35 minutes divided by 2.7 = ~13 shorties, x 45 = ~583 credits in 35 minutes for shorties versus around 550 credits for an AP. APs are irrelevant on that machine. ID: 1765790 ·
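
The arithmetic in the post above can be checked with a short sketch. The figures (35 minutes per AP, 2.7 minutes and ~45 credits per shorty, ~550 credits per AP) come from the post; the function and constant names are only illustrative.

```python
def credits_in_window(window_min, task_min, credits_per_task):
    """Credits earned by running tasks back to back for window_min minutes."""
    return window_min / task_min * credits_per_task


# Figures quoted in the post (750Ti host).
AP_MINUTES = 35.0
AP_CREDITS = 550.0
SHORTY_MINUTES = 2.7
SHORTY_CREDITS = 45.0

shorties_per_ap_slot = AP_MINUTES / SHORTY_MINUTES
shorty_total = credits_in_window(AP_MINUTES, SHORTY_MINUTES, SHORTY_CREDITS)

print(f"shorties per AP slot: {shorties_per_ap_slot:.1f}")
print(f"shorty credits in {AP_MINUTES:.0f} min: {shorty_total:.0f}")
print(f"AP credits in {AP_MINUTES:.0f} min: {AP_CREDITS:.0f}")
```

On these numbers the shorties come out slightly ahead, which is the point being made; the margin is small enough that a modest change in either run time would reverse it.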

petri33 Volunteer tester Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156	Message 1765806 - Posted: 17 Feb 2016, 21:10:47 UTC - in response to Message 1765790. Last modified: 17 Feb 2016, 21:20:16 UTC Well, not completely obsolete. The various AP bin files can be edited to use NVIDIA special cache modifiers for GPU RAM loads and stores. ld.global.cs.nc.f32 %f1, [%rd7]; // streaming read There are other variations for reads that are random, random but closely spread, last read, ... And the same applies to writes too. YMMV (see PM for testing) To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones ID: 1765806 ·

TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1765807 - Posted: 17 Feb 2016, 21:12:21 UTC - in response to Message 1765803. Looks pretty relevant to me, http://setiathome.berkeley.edu/top_hosts.php?sort_by=expavg_credit&offset=20 ID: 1765807 ·

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1765817 - Posted: 17 Feb 2016, 21:26:18 UTC Just to throw some more randomness on the massive pile of work to do, the Vulkan API was just released... "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1765817 ·

TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1765818 - Posted: 17 Feb 2016, 21:26:35 UTC - in response to Message 1765809. Where do you get this 'unofficial' from? SETI encourages people to build their own Apps, Compiling and testing You can compile SETI@home for new platforms, or (if you know what you're doing) compile it or revise it to run faster on an existing platform. In either case, you'll need to know about BOINC's anonymous platform mechanism. http://setiathome.berkeley.edu/sah_porting.php Nothing 'unofficial' about it. If you can make SETI run faster Go For It. Isn't that what you are testing with SoG? Do you consider that 'unofficial'? ID: 1765818 ·

Ulrich Metzner Volunteer tester Send message Joined: 3 Jul 02 Posts: 1256 Credit: 13,565,513 RAC: 13	Message 1765821 - Posted: 17 Feb 2016, 21:28:45 UTC - in response to Message 1765806. I'm very interested in that special build, including the manual Nvidia optimizations possible. If not openly available somewhere, I would be very excited getting a pm with more info. ;) Aloha, Uli ID: 1765821 ·

petri33 Volunteer tester Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156	Message 1765835 - Posted: 17 Feb 2016, 21:46:33 UTC - in response to Message 1765821. I'm very interested in that special build, including the manual Nvidia optimizations possible. If not openly available somewhere, I would be very excited getting a pm with more info. ;) Sorry Ulrich, your NVIDIA cards look unfamiliar to me. I have not had time to investigate their instruction set and/or cache hierarchy. 1) The AP manual optimization is hardware and .cl file version dependent. I have done it with the 780 and 980. The 750 has a similar instruction set and cache hierarchy to the 980. I think I could do that for TBar's 750Ti for testing purposes. But since the bin file changes every time a new driver is used or a new .cl file version is used, I do not think it will ever be mainstream unless someone close to AP bin file creation is willing to make an automated replacement system for each different kernel with a different RAM usage pattern. More info can be found in the CUDA PTX manuals from NVIDIA. 2) The source for the special version of CUDA MB is in the svn. This thread is for AP. To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones ID: 1765835 ·
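
As a rough illustration of the "automated replacement system" idea in the post above, here is a minimal sketch that rewrites plain global loads in PTX text on a per-kernel basis. Everything here is an assumption made for illustration: that the cached bin file is readable PTX text, the kernel names, the `KERNEL_LOAD_HINTS` table, and the `patch_ptx` helper are all hypothetical and are not part of the AP application or any NVIDIA tool. Only the `ld.global.cs.nc.f32` form itself is taken from the thread.

```python
import re

# Hypothetical table: kernel name -> qualifiers to splice into its global loads.
# "cs.nc" reproduces the streaming-read form quoted in the thread. Note that
# ".nc" is only safe when the loaded data is read-only for the whole kernel.
KERNEL_LOAD_HINTS = {"example_streaming_kernel": "cs.nc"}

ENTRY_RE = re.compile(r"\.entry\s+(\w+)")
PLAIN_LOAD_RE = re.compile(r"\bld\.global\.(f32|u32|s32)\b")


def patch_ptx(ptx_text, hints):
    """Return ptx_text with plain ld.global loads rewritten per kernel.

    Simplification: every line after an .entry directive is attributed to
    that kernel until the next .entry appears.
    """
    patched_lines = []
    current_kernel = None
    for line in ptx_text.splitlines():
        entry = ENTRY_RE.search(line)
        if entry:
            current_kernel = entry.group(1)
        qualifiers = hints.get(current_kernel)
        if qualifiers:
            line = PLAIN_LOAD_RE.sub(
                lambda load: f"ld.global.{qualifiers}.{load.group(1)}", line
            )
        patched_lines.append(line)
    return "\n".join(patched_lines)


sample = """.visible .entry example_streaming_kernel(
    ld.global.f32 %f1, [%rd7];
.visible .entry other_kernel(
    ld.global.f32 %f2, [%rd8];
"""

patched = patch_ptx(sample, KERNEL_LOAD_HINTS)
print(patched)
```

A table keyed by kernel name reflects the point made in the post that each kernel has its own RAM usage pattern, so a single global search-and-replace would not be appropriate. A real tool would also have to cope with the bin file being regenerated after every driver or .cl change, which this sketch does not attempt.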

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1765838 - Posted: 17 Feb 2016, 21:50:22 UTC - in response to Message 1765821. Last modified: 17 Feb 2016, 21:53:52 UTC I'm very interested in that special build, including the manual Nvidia optimizations possible. If not openly available somewhere, I would be very excited getting a pm with more info. ;) Current status for the Cuda MB build in terms of stock integration, is the code is sitting in the Berkeley svn, specifically alpha subdirectory of the client sources, labelled Petri. Rolling your own is encouraged [my own stipulation adding as long as you know what you are doing], abide by the GPL requirements in full if wanting to distribute anything, and the code was put there for that purpose. Only things preventing public releases in binary form are: - no specific documentation for it (other than lots of developer talk) - that code breaks the existing codebase for certain kinds of supported cards, it needs to be integrated via options and supporting logic, e.g. amount of Vram, whether device even can stream. (It's happened before, idiots + builds meant for something specific they don't have = mass error/invalids, and possibly make the scheduler more complex) - For my end, Tooling up for production on 4 platforms at the same time is taking a lot more work than I anticipated, So working towards streamlining all that is where I'm up to. With Baseline (old generic code) builds operational, and temporarily stuck at documentation and deployment stage, now vulkan has been released it needs factoring into the alpha structure as well "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1765838 ·

TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1765839 - Posted: 17 Feb 2016, 21:52:19 UTC - in response to Message 1765828. Where do you get this 'unofficial' from? SETI encourages people to build their own Apps, Compiling and testing You can compile SETI@home for new platforms, or (if you know what you're doing) compile it or revise it to run faster on an existing platform. In either case, you'll need to know about BOINC's anonymous platform mechanism. http://setiathome.berkeley.edu/sah_porting.php Nothing 'unofficial' about it. If you can make SETI run faster Go For It. Isn't that what you are testing with SoG? Do you consider that 'unofficial'? Of course the SoG is unofficial, it's clearly a test version (ask Raistmer), but it was released openly here on main, by Raistmer at least. And I do not boast about it being much faster than CUDA50, on my computer. I keep those comments in the test thread (apples and oranges) AFAIK the 6.X CUDA version you're running hasn't been released at all to be built (is it the branch you programming people (Oops, I almost used the nerd word) call it?) Have fun though, comparing oranges (CUDA 5) and apples (CUDA 6) and boast about that apples are better than oranges. Especially since the better oranges aren't "sold" yet :-*) Hmm, anyhow, wasn't this thread about AP? Yes, and to your post about AP 'pay' I pointed out that AP 'pay' is irrelevant on some machines running the 'Official' code from the SETI Repository. The Code is here BTW, https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt/Xbranch/client/alpha/PetriR_raw. The SoG code is in the Repository as well. How is Code in the Repository 'Unofficial'? Even if the code wasn't in the Repository SETI encourages people to 'revise' the Code as already noted. I don't see how 'unofficial' enters into the discussion. ID: 1765839 ·

TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1765841 - Posted: 17 Feb 2016, 22:06:17 UTC - in response to Message 1765840. Are you going to continue until you get the last word? (Is the U.S the best and biggest country in the world? :-) ) Be my guest (refrains from using the A-word). Geeze.... I will continue until you stop calling Apps compiled from 'Code' in the SETI Repository 'Unofficial'. You can stop anytime you wish... It's up to you. ID: 1765841 ·

TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1765845 - Posted: 17 Feb 2016, 22:25:09 UTC - in response to Message 1765843. Last modified: 17 Feb 2016, 22:26:01 UTC Here is the 'Official' word. http://setiathome.berkeley.edu/sah_porting.php You can compile SETI@home for new platforms, or (if you know what you're doing) compile it or revise it to run faster on an existing platform... ID: 1765845 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874	Message 1765846 - Posted: 17 Feb 2016, 22:31:00 UTC SETI is Open Source software. Anybody is free to contribute improvements to it, and anybody is free to download it and tinker with it. Even the most basic, CPU only, non-optimised, Berkeley-compiled 'stock' applications contain algorithms contributed by volunteers who have never been on the Berkeley payroll (just look at the AUTHORS file sometime). To facilitate the free exchange of ideas and techniques, some external volunteers have been given permission to use the SETI repository as a shared 'code warehouse': those coders have been entrusted with the responsibility of acting as gatekeepers, storing shared code contributed by other volunteers in the central warehouse. The fact that code has been accepted by one or other of those gatekeepers as worthy of sharing more widely via the repository in no way makes it 'official' code. And likewise, the mere fact of warehouse storage is no guarantee that it will compile, run, or generate scientifically accurate results. ID: 1765846 ·

betreger Send message Joined: 29 Jun 99 Posts: 11362 Credit: 29,581,041 RAC: 66	Message 1765888 - Posted: 18 Feb 2016, 1:19:21 UTC APs have been split since approx. 10:30 AM PST and I've only got 7 so far; this is not causing me to go into my happy dance. ID: 1765888 ·

tullio Volunteer tester Send message Joined: 9 Apr 04 Posts: 8797 Credit: 2,930,782 RAC: 1	Message 1765955 - Posted: 18 Feb 2016, 5:46:20 UTC I got a GPU AP on my Windows 10 PC with its newly installed GeForce GTX 750, with the 361.91 NVIDIA driver. Tullio ID: 1765955 ·

Phil Burden Send message Joined: 26 Oct 00 Posts: 264 Credit: 22,303,899 RAC: 0	Message 1765999 - Posted: 18 Feb 2016, 9:20:40 UTC - in response to Message 1765888. APs have been split since approx. 10:30 AM PST and I've only got 7 so far; this is not causing me to go into my happy dance. I got 2 at 0840 UTC yesterday, and nothing since ;-( P. ID: 1765999 ·

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1766024 - Posted: 18 Feb 2016, 13:25:16 UTC - in response to Message 1765835. Last modified: 18 Feb 2016, 13:28:31 UTC But since the bin file changes every time a new driver is used or a new .cl file version is used, I do not think it will ever be mainstream unless someone close to AP bin file creation is willing to make an automated replacement system for each different kernel with a different RAM usage pattern. The various AP bin files can be edited to use NVIDIA special cache modifiers for GPU RAM loads and stores. ld.global.cs.nc.f32 %f1, [%rd7]; // streaming read There are other variations for reads that are random, random but closely spread, last read, ... And the same applies to writes too. Did you find a way to tell the NV compiler to include those instructions at the CL-file level? If the speedup is considerable and covers a large number of NV models, it should be possible to distribute the kernels in a PTX-based (instead of CL-based) file with those modifications. But it would be better to discuss this in a more "visible" place: I came across these messages only by luck, because this thread is too unspecific and I stopped monitoring it long ago. ID: 1766024 ·

betreger Send message Joined: 29 Jun 99 Posts: 11362 Credit: 29,581,041 RAC: 66	Message 1766071 - Posted: 18 Feb 2016, 17:45:09 UTC APs are no longer being split. I only got 24, and I find this last run to have been most unsatisfactory. ID: 1766071 ·

petri33 Volunteer tester Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156	Message 1766083 - Posted: 18 Feb 2016, 18:44:18 UTC - in response to Message 1766024. Last modified: 18 Feb 2016, 18:46:01 UTC But since the bin file changes every time a new driver is used or a new .cl file version is used, I do not think it will ever be mainstream unless someone close to AP bin file creation is willing to make an automated replacement system for each different kernel with a different RAM usage pattern. The various AP bin files can be edited to use NVIDIA special cache modifiers for GPU RAM loads and stores. ld.global.cs.nc.f32 %f1, [%rd7]; // streaming read There are other variations for reads that are random, random but closely spread, last read, ... And the same applies to writes too. Did you find a way to tell the NV compiler to include those instructions at the CL-file level? If the speedup is considerable and covers a large number of NV models, it should be possible to distribute the kernels in a PTX-based (instead of CL-based) file with those modifications. But it would be better to discuss this in a more "visible" place: I came across these messages only by luck, because this thread is too unspecific and I stopped monitoring it long ago. I have not found a way to do that yet. And to my knowledge the optimizations affect only Kepler and Maxwell. The speedup in MB pulse finding is noticeable (about 15%). I have no measured percentage for AP. An approximation is a drop from 30 to 25 minutes when running two tasks at a time, 17% maybe. I'll PM you later this week. To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones ID: 1766083 ·
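
The "17% maybe" estimate above can be reproduced with a small sketch. The 30 and 25 minute figures are the approximate AP run times from the post; the function names are only illustrative. The same change looks different depending on whether it is stated as time saved or as extra throughput.

```python
def time_saved_fraction(before_min, after_min):
    """Fraction of the original run time that is no longer needed."""
    return (before_min - after_min) / before_min


def throughput_gain_fraction(before_min, after_min):
    """Fractional increase in tasks completed per unit of time."""
    return before_min / after_min - 1.0


BEFORE_MIN = 30.0  # approximate time for two AP tasks before the edit
AFTER_MIN = 25.0   # approximate time for two AP tasks after the edit

saved = time_saved_fraction(BEFORE_MIN, AFTER_MIN)
gain = throughput_gain_fraction(BEFORE_MIN, AFTER_MIN)

print(f"run time reduced by {saved:.1%}")
print(f"throughput increased by {gain:.1%}")
```

The run-time reduction is the roughly 17% quoted in the post; expressed as throughput, the same figures correspond to about 20% more tasks per hour.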

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1766115 - Posted: 18 Feb 2016, 21:15:16 UTC - in response to Message 1766083. I have not found a way to do that yet. And to my knowledge the optimizations affect only Kepler and Maxwell. The speedup in MB pulse finding is noticeable (about 15%). I have no measured percentage for AP. An approximation is a drop from 30 to 25 minutes when running two tasks at a time, 17% maybe. I'll PM you later this week. Not bad at all. As a first measure, it would be good to have a manual/how-to for making those mods by hand. After initial deployment and a quick round of bug fixing, an app usually stays untouched long enough to justify some tweaks if they give such a speedup. So instructions on what to change would be very useful IMHO. Maybe it is worth putting them in a sticky thread then. ID: 1766115 ·

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.