Anything relating to AstroPulse tasks

Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11362
Credit: 29,581,041
RAC: 66
United States
Message 1765766 - Posted: 17 Feb 2016, 18:35:52 UTC

Don't look now, but a few are being split.
ID: 1765766 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1765790 - Posted: 17 Feb 2016, 20:30:24 UTC - in response to Message 1765787.  

Do the math on this machine: http://setiathome.berkeley.edu/result.php?resultid=4733535830
On that machine an AP takes about 35 minutes on the 750Ti; a shorty takes 2.7 minutes and pays ~45 credits.
35 minutes divided by 2.7 = ~12.96 shorties, x 45 credits = ~583 credits in 35 minutes for shorties, versus around 550 credits for an AP.
APs are irrelevant on that machine.
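
For anyone who wants to redo the comparison with their own run times, here is a minimal Python sketch. It uses only the figures quoted in this post (35 minutes and roughly 550 credits per AP, 2.7 minutes and roughly 45 credits per shorty); the function name `credits_per_hour` is made up for the example.

```python
def credits_per_hour(credit_per_task, minutes_per_task):
    """Credit earned per hour when tasks of one kind run back to back."""
    return credit_per_task / minutes_per_task * 60.0


# Figures quoted in the post for the 750Ti host.
AP_MINUTES, AP_CREDIT = 35.0, 550.0
SHORTY_MINUTES, SHORTY_CREDIT = 2.7, 45.0

# How many shorties fit into the run time of one AP task,
# and what they pay in total over that time.
shorties_per_ap = AP_MINUTES / SHORTY_MINUTES
shorty_credit_in_ap_time = shorties_per_ap * SHORTY_CREDIT

print(f"shorties per AP run time: {shorties_per_ap:.2f}")
print(f"shorty credit in that time: {shorty_credit_in_ap_time:.0f}")
print(f"AP credit per hour: {credits_per_hour(AP_CREDIT, AP_MINUTES):.0f}")
print(f"shorty credit per hour: {credits_per_hour(SHORTY_CREDIT, SHORTY_MINUTES):.0f}")
```

Swap in the numbers from your own host; the balance shifts quickly with the GPU and the app in use.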
ID: 1765790 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1765806 - Posted: 17 Feb 2016, 21:10:47 UTC - in response to Message 1765790.  
Last modified: 17 Feb 2016, 21:20:16 UTC

Well, not completely obsolete.

The various AP bin files can be edited to use NVIDIA's special cache modifiers for GPU RAM loads and stores.

ld.global.cs.nc.f32 %f1, [%rd7]; // streaming read

There are other variations for reads that are random, random but closely spaced, last read, ... And the same applies to writes too.
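
As a quick reference, the cache operators that the PTX ISA defines for `ld` and `st` can be kept in a small lookup table. This is only a sketch in Python: the descriptions are paraphrased from the PTX manual, the helper `describe_cache_op` is a made-up name, and which operator corresponds to which of the access patterns listed above is not stated in this thread.

```python
# Cache operators for global loads (ld), paraphrased from the PTX ISA manual.
LOAD_CACHE_OPS = {
    "ca": "cache at all levels; data likely to be accessed again",
    "cg": "cache at global level (L2 and below, not L1)",
    "cs": "cache streaming; data likely to be accessed once",
    "lu": "last use",
    "cv": "don't cache; fetch again on each access",
}

# Cache operators for global stores (st), paraphrased from the same manual.
STORE_CACHE_OPS = {
    "wb": "write-back at all coherent levels",
    "cg": "cache at global level (L2 and below, not L1)",
    "cs": "cache streaming; data likely to be accessed once",
    "wt": "write-through to system memory",
}


def describe_cache_op(instruction):
    """Return the description of the cache operator in a PTX ld/st line, or None."""
    opcode_parts = instruction.split()[0].split(".")
    if opcode_parts[0] == "ld":
        table = LOAD_CACHE_OPS
    elif opcode_parts[0] == "st":
        table = STORE_CACHE_OPS
    else:
        return None
    # Skip the opcode and the state space ("ld", "global"); look at the rest.
    for part in opcode_parts[2:]:
        if part in table:
            return table[part]
    return None


print(describe_cache_op("ld.global.cs.nc.f32 %f1, [%rd7];"))
```

The `.nc` in the example instruction is separate from the cache operator: it routes the load through the non-coherent (read-only) cache path.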

YMMV (see pm for testing)
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1765806 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1765807 - Posted: 17 Feb 2016, 21:12:21 UTC - in response to Message 1765803.  

ID: 1765807 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1765817 - Posted: 17 Feb 2016, 21:26:18 UTC

Just to throw some more randomness on the massive pile of work to do, the Vulkan API was just released...
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1765817 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1765818 - Posted: 17 Feb 2016, 21:26:35 UTC - in response to Message 1765809.  

Where do you get this 'unofficial' from? SETI encourages people to build their own Apps,
Compiling and testing

You can compile SETI@home for new platforms, or (if you know what you're doing) compile it or revise it to run faster on an existing platform. In either case, you'll need to know about BOINC's anonymous platform mechanism.
http://setiathome.berkeley.edu/sah_porting.php

Nothing 'unofficial' about it. If you can make SETI run faster Go For It.
Isn't that what you are testing with SoG? Do you consider that 'unofficial'?
ID: 1765818 · Report as offensive
Ulrich Metzner
Volunteer tester
Joined: 3 Jul 02
Posts: 1256
Credit: 13,565,513
RAC: 13
Germany
Message 1765821 - Posted: 17 Feb 2016, 21:28:45 UTC - in response to Message 1765806.  

I'm very interested in that special build, including the manual Nvidia optimizations possible.
If not openly available somewhere, I would be very excited getting a pm with more info. ;)
Aloha, Uli

ID: 1765821 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1765835 - Posted: 17 Feb 2016, 21:46:33 UTC - in response to Message 1765821.  

I'm very interested in that special build, including the manual Nvidia optimizations possible.
If not openly available somewhere, I would be very excited getting a pm with more info. ;)


Sorry Ulrich, your NVIDIA cards look unfamiliar to me. I have not had time to investigate their instruction set and/or cache hierarchy.

1) The AP manual optimization is hardware and .cl file version dependent.

I have done it with the 780 and the 980. The 750 has a similar instruction set and cache hierarchy to the 980. I think I could do that for TBar's 750Ti for testing purposes.

But since the bin file changes every time a new driver or a new .cl file version is used, I do not think it will ever be mainstream unless someone close to AP bin file creation is willing to make an automated replacement system for each different kernel with its own RAM usage pattern.

More info can be found in CUDA PTX manuals from NVIDIA.

2) The source for the special version of CUDA MB is in the svn. This thread is for AP.
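
Purely as an illustration of the "automated replacement system" idea in point 1, here is a minimal Python sketch. It assumes the cached bin file is readable PTX text, which this thread does not actually show. The function `patch_kernel_loads`, the kernel names `dechirp_kernel` and `other_kernel`, and the per-kernel operator table are all hypothetical. It only inserts a cache operator such as `.cs` into plain `ld.global` loads; it does not add `.nc`, because that is only valid for data that stays read-only during the kernel and needs a per-kernel decision.

```python
import re

# Matches the "ld.global." prefix of a plain global load, i.e. one that is
# followed directly by an optional vector width and a type such as f32 or u8.
# Loads that already carry a cache operator or .nc do not match.
PLAIN_GLOBAL_LOAD = re.compile(r"\bld\.global\.(?=(?:v[24]\.)?[busf]\d+\b)")

# Matches a kernel entry point declaration and captures the kernel name.
KERNEL_ENTRY = re.compile(r"\.entry\s+(\w+)")


def patch_kernel_loads(ptx_text, kernel_ops):
    """Insert a cache operator into plain global loads, kernel by kernel.

    kernel_ops maps a kernel name to the operator to insert (e.g. "cs").
    Kernels missing from the mapping are copied through unchanged.
    Simplification: everything after an .entry line is treated as part of
    that kernel until the next .entry line appears.
    """
    patched_lines = []
    current_op = None
    for line in ptx_text.splitlines():
        entry = KERNEL_ENTRY.search(line)
        if entry:
            current_op = kernel_ops.get(entry.group(1))
        if current_op:
            line = PLAIN_GLOBAL_LOAD.sub("ld.global." + current_op + ".", line)
        patched_lines.append(line)
    return "\n".join(patched_lines)


# Tiny hand-written sample, not taken from a real AP bin file.
sample_ptx = "\n".join([
    ".entry dechirp_kernel(",
    "\tld.global.f32 %f1, [%rd7];",
    ".entry other_kernel(",
    "\tld.global.f32 %f2, [%rd8];",
])

print(patch_kernel_loads(sample_ptx, {"dechirp_kernel": "cs"}))
```

The table of which kernel gets which operator would still have to be worked out by hand for each card generation and re-checked against every new driver and .cl file version.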
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1765835 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1765838 - Posted: 17 Feb 2016, 21:50:22 UTC - in response to Message 1765821.  
Last modified: 17 Feb 2016, 21:53:52 UTC

I'm very interested in that special build, including the manual Nvidia optimizations possible.
If not openly available somewhere, I would be very excited getting a pm with more info. ;)


The current status of the CUDA MB build, in terms of stock integration, is that the code is sitting in the Berkeley svn, specifically in the alpha subdirectory of the client sources, labelled Petri.

Rolling your own is encouraged [my own stipulation added: as long as you know what you are doing]. Abide by the GPL requirements in full if you want to distribute anything; the code was put there for that purpose.

The only things preventing public releases in binary form are:
- There is no specific documentation for it (other than lots of developer talk).
- That code breaks the existing codebase for certain kinds of supported cards. It needs to be integrated via options and supporting logic, e.g. amount of VRAM, or whether the device can even stream. (It's happened before: idiots + builds meant for something specific they don't have = mass errors/invalids, and possibly a more complex scheduler.)
- On my end, tooling up for production on 4 platforms at the same time is taking a lot more work than I anticipated.

So working towards streamlining all that is where I'm up to. The Baseline (old generic code) builds are operational but temporarily stuck at the documentation and deployment stage, and now that Vulkan has been released it needs factoring into the alpha structure as well.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1765838 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1765839 - Posted: 17 Feb 2016, 21:52:19 UTC - in response to Message 1765828.  

Where do you get this 'unofficial' from? SETI encourages people to build their own Apps,
Compiling and testing

You can compile SETI@home for new platforms, or (if you know what you're doing) compile it or revise it to run faster on an existing platform. In either case, you'll need to know about BOINC's anonymous platform mechanism.
http://setiathome.berkeley.edu/sah_porting.php

Nothing 'unofficial' about it. If you can make SETI run faster Go For It.
Isn't that what you are testing with SoG? Do you consider that 'unofficial'?

Of course the SoG is unofficial, it's clearly a test version (ask Raistmer), but it was released openly here on main, by Raistmer at least. And I do not boast about it being much faster than CUDA50 on my computer. I keep those comments in the test thread (apples and oranges).

AFAIK the 6.X CUDA version you're running hasn't been released to be built at all (is 'branch' what you programming people (oops, I almost used the nerd word) call it?)

Have fun, though, comparing oranges (CUDA 5) and apples (CUDA 6) and boasting that apples are better than oranges. Especially since the better oranges aren't "sold" yet :-*)

Hmm, anyhow, wasn't this thread about AP?

Yes, and to your post about AP 'pay' I pointed out that AP 'pay' is irrelevant on some machines running the 'Official' code from the SETI Repository. The code is here, BTW: https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt/Xbranch/client/alpha/PetriR_raw. The SoG code is in the Repository as well. How is code in the Repository 'Unofficial'? Even if the code wasn't in the Repository, SETI encourages people to 'revise' the code, as already noted. I don't see how 'unofficial' enters into the discussion.
ID: 1765839 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1765841 - Posted: 17 Feb 2016, 22:06:17 UTC - in response to Message 1765840.  

Are you going to continue until you get the last word? (Is the U.S the best and biggest country in the world? :-) )
Be my guest (refrains from using the A-word).

Geeze....

I will continue until you stop calling Apps compiled from 'Code' in the SETI Repository 'Unofficial'.
You can stop anytime you wish...
It's up to you.
ID: 1765841 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1765845 - Posted: 17 Feb 2016, 22:25:09 UTC - in response to Message 1765843.  
Last modified: 17 Feb 2016, 22:26:01 UTC

Here is the 'Official' word.
http://setiathome.berkeley.edu/sah_porting.php
You can compile SETI@home for new platforms, or (if you know what you're doing) compile it or revise it to run faster on an existing platform...
ID: 1765845 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14654
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1765846 - Posted: 17 Feb 2016, 22:31:00 UTC

SETI is Open Source software. Anybody is free to contribute improvements to it, and anybody is free to download it and tinker with it. Even the most basic, CPU only, non-optimised, Berkeley-compiled 'stock' applications contain algorithms contributed by volunteers who have never been on the Berkeley payroll (just look at the AUTHORS file sometime).

To facilitate the free exchange of ideas and techniques, some external volunteers have been given permission to use the SETI repository as a shared 'code warehouse': those coders have been entrusted with the responsibility of acting as gatekeepers, storing shared code contributed by other volunteers in the central warehouse.

The fact that code has been accepted by one or other of those gatekeepers as worthy of sharing more widely via the repository in no way makes it 'official' code. And likewise, the mere fact of warehouse storage is no guarantee that it will compile, run, or generate scientifically accurate results.
ID: 1765846 · Report as offensive
Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11362
Credit: 29,581,041
RAC: 66
United States
Message 1765888 - Posted: 18 Feb 2016, 1:19:21 UTC

APs have been split since approx. 10:30 AM PST and I've only got 7 so far; this is not causing me to go into my happy dance.
ID: 1765888 · Report as offensive
Profile tullio
Volunteer tester

Joined: 9 Apr 04
Posts: 8797
Credit: 2,930,782
RAC: 1
Italy
Message 1765955 - Posted: 18 Feb 2016, 5:46:20 UTC

I got a GPU AP on my Windows 10 PC with its newly installed GeForce GTX 750, with the 361.91 NVIDIA driver.
Tullio
ID: 1765955 · Report as offensive
Phil Burden

Joined: 26 Oct 00
Posts: 264
Credit: 22,303,899
RAC: 0
United Kingdom
Message 1765999 - Posted: 18 Feb 2016, 9:20:40 UTC - in response to Message 1765888.  

APs have been split since approx. 10:30 AM PST and I've only got 7 so far; this is not causing me to go into my happy dance.


I got 2 at 0840 UTC yesterday, and nothing since ;-(

P.
ID: 1765999 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1766024 - Posted: 18 Feb 2016, 13:25:16 UTC - in response to Message 1765835.  
Last modified: 18 Feb 2016, 13:28:31 UTC


But since the bin file changes every time a new driver or a new .cl file version is used, I do not think it will ever be mainstream unless someone close to AP bin file creation is willing to make an automated replacement system for each different kernel with its own RAM usage pattern.


The various AP bin files can be edited to use NVIDIA's special cache modifiers for GPU RAM loads and stores.

ld.global.cs.nc.f32 %f1, [%rd7]; // streaming read

There are other variations for reads that are random, random but closely spaced, last read, ... And the same applies to writes too.



Did you find a way to tell the NV compiler to include those instructions at the CL-file level?

If the speedup is considerable and covers a large number of NV models, it should be possible to distribute the kernels in a PTX-based (instead of CL-based) file with those modifications.
But it would be better to discuss this in more "visible" places. I came across these messages just by luck, because this thread is too unspecific and I stopped monitoring it long ago.
ID: 1766024 · Report as offensive
Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11362
Credit: 29,581,041
RAC: 66
United States
Message 1766071 - Posted: 18 Feb 2016, 17:45:09 UTC

APs are no longer being split. I only got 24, and I find this last run to have been most unsatisfactory.
ID: 1766071 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1766083 - Posted: 18 Feb 2016, 18:44:18 UTC - in response to Message 1766024.  
Last modified: 18 Feb 2016, 18:46:01 UTC


But since the bin file changes every time a new driver or a new .cl file version is used, I do not think it will ever be mainstream unless someone close to AP bin file creation is willing to make an automated replacement system for each different kernel with its own RAM usage pattern.


The various AP bin files can be edited to use NVIDIA's special cache modifiers for GPU RAM loads and stores.

ld.global.cs.nc.f32 %f1, [%rd7]; // streaming read

There are other variations for reads that are random, random but closely spaced, last read, ... And the same applies to writes too.



Did you find a way to tell the NV compiler to include those instructions at the CL-file level?

If the speedup is considerable and covers a large number of NV models, it should be possible to distribute the kernels in a PTX-based (instead of CL-based) file with those modifications.
But it would be better to discuss this in more "visible" places. I came across these messages just by luck, because this thread is too unspecific and I stopped monitoring it long ago.


I have not found a way to do that yet. And to my knowledge the optimizations affect only Kepler and Maxwell. The speedup in MB pulse finding is noticeable (about 15%). I have no measured percentage for AP. A rough estimate is a drop from 30 to 25 minutes when running two tasks at a time, so maybe 17%. I'll PM you later this week.
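
For what it's worth, the "17% maybe" figure can be read two ways, and a short Python sketch makes the difference explicit. It uses only the rough 30 and 25 minute figures from this post; the function names `time_saved` and `throughput_gain` are made up for the example.

```python
def time_saved(before_minutes, after_minutes):
    """Fraction of the original run time that is no longer needed."""
    return (before_minutes - after_minutes) / before_minutes


def throughput_gain(before_minutes, after_minutes):
    """Fractional increase in tasks completed per unit of time."""
    return before_minutes / after_minutes - 1.0


# Rough figures from the post: two AP tasks at a time, 30 -> 25 minutes.
BEFORE, AFTER = 30.0, 25.0

print(f"run time saved:  {time_saved(BEFORE, AFTER):.1%}")
print(f"throughput gain: {throughput_gain(BEFORE, AFTER):.1%}")
```

So the estimate corresponds to roughly 17% less run time, or about 20% more tasks per day, depending on which way you count.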
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1766083 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1766115 - Posted: 18 Feb 2016, 21:15:16 UTC - in response to Message 1766083.  


I have not found a way to do that yet. And to my knowledge the optimizations affect only Kepler and Maxwell. The speedup in MB pulse finding is noticeable (about 15%). I have no measured percentage for AP. A rough estimate is a drop from 30 to 25 minutes when running two tasks at a time, so maybe 17%. I'll PM you later this week.


Not bad at all. As a first measure, it would be good to have a manual/how-to for making those mods by hand. After initial deployment and a quick round of bug fixing, an app usually stays untouched long enough to make some tweaks worthwhile if they give such a speedup. So instructions on what to change would be very useful, IMHO. Maybe worth putting them in a sticky thread then.
ID: 1766115 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.