Anything relating to AstroPulse (3) tasks

Message boards : Number crunching : Anything relating to AstroPulse (3) tasks
Profile David@home
Volunteer tester
Joined: 16 Jan 03
Posts: 755
Credit: 5,040,916
RAC: 28
United Kingdom
Message 2011933 - Posted: 14 Sep 2019, 12:02:05 UTC - in response to Message 2011931.  

I only run SETI@home once in a while now, so I'm pleased to see some Astropulse in my recent run.

As these are from April 2009, are they:

1) A rerun of previous data?
2) Old data not sent out before?
3) Data run before but now being re-run with new processing algorithms?
The current split is ap_09se19aa: "ap" = AstroPulse, "09" = 9th, "se" = September & "19" = 2019. ;-)

Cheers.


Ah, my mistake - thanks; so it's new data :-)
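(Side note for anyone who wants to decode these names in a script: here is a minimal sketch based on Wiggo's explanation above. Only "se" = September is confirmed here, plus "mr" = March from a task name later in the thread; the rest of the two-letter month codes are my assumption.)

import re

# Assumed two-letter month codes; only "se" (September) and "mr" (March) are
# confirmed by names seen in this thread, the rest are guesses.
MONTHS = {"ja": 1, "fe": 2, "mr": 3, "ap": 4, "my": 5, "jn": 6,
          "jl": 7, "au": 8, "se": 9, "oc": 10, "no": 11, "de": 12}

def split_date(name):
    """Pull (year, month, day) out of a splitter name like 'ap_09se19aa'."""
    m = re.match(r"ap_(\d{2})([a-z]{2})(\d{2})", name)
    if not m:
        raise ValueError("not an AstroPulse task name: " + name)
    day, mon, yy = m.groups()
    return 2000 + int(yy), MONTHS[mon], int(day)

print(split_date("ap_09se19aa"))   # -> (2019, 9, 9)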
ID: 2011933
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 2011936 - Posted: 14 Sep 2019, 13:00:20 UTC - in response to Message 2011925.  

Just wondering if any of you have some favorite settings for ap_cmdline_win_x86_SSE2_OpenCL_NV.txt for 980s? This is for Win10, not Linux.
Seems I lost track of what I was doing there.
Thx in advance ... Jim


Linux or Windows doesn't matter in this case.

Try -unroll 20 -oclFFT_plan 256 16 256 -tune 1 64 4 1 -tune 2 64 4 1 -ffa_block 5660 -ffa_block_fetch 2830

My times are slightly faster with this command line than yours, on a mid-range AMD R9 380.
So I'm sure you could improve yours on the 980.
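(If anyone wants to script the change rather than edit the file by hand, here's a minimal sketch. It assumes, as the thread's use of the file name suggests, that ap_cmdline_win_x86_SSE2_OpenCL_NV.txt simply holds the switches on one line, and the path shown is only the usual default BOINC data directory - adjust for your own install.)

from pathlib import Path

# Assumed default BOINC data directory on Windows - change to match your install.
cmdline_file = Path(r"C:\ProgramData\BOINC\projects\setiathome.berkeley.edu"
                    r"\ap_cmdline_win_x86_SSE2_OpenCL_NV.txt")

# Mike's suggested switches, copied verbatim from the post above.
switches = ("-unroll 20 -oclFFT_plan 256 16 256 -tune 1 64 4 1 -tune 2 64 4 1 "
            "-ffa_block 5660 -ffa_block_fetch 2830")

cmdline_file.write_text(switches + "\n")
print("wrote", cmdline_file)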


With each crime and every kindness we birth our future.
ID: 2011936
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 2011970 - Posted: 14 Sep 2019, 22:05:17 UTC - in response to Message 2011936.  

Try -unroll 20 -oclFFT_plan 256 16 256 -tune 1 64 4 1 -tune 2 64 4 1 -ffa_block 5660 -ffa_block_fetch 2830
Thanks, Mike! I'll give it a shot on both.
l8r, Jim ...
ID: 2011970
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2011974 - Posted: 14 Sep 2019, 22:13:26 UTC

Mike, is your command line the best you have found for your R9 380, or is that command line what you recommend as the best generic one for all card types?

I came up with:
-unroll 20 -oclFFT_plan 256 16 256 -ffa_block 2304 -ffa_block_fetch 1152 -tune 1 64 8 1 -tune 2 64 8 1 for my Nvidia cards as the best generic setting to cover everything from a 1070 to a 2080.

I tested my previous AP command line against this one in Rick's BenchMT program. The new one was much better than my previous line, where I had simply thrown large values at everything.
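(For anyone without BenchMT handy, the comparison it automates boils down to something like this sketch: run the same reference task a few times with each command line, collect the elapsed times, and compare averages. The numbers below are placeholders, not my results.)

from statistics import mean

# Elapsed seconds for the same reference AP task under each command line.
# Placeholder values purely for illustration.
old_cmdline_times = [812.4, 805.1, 818.9]
new_cmdline_times = [655.3, 649.8, 661.0]

old_avg, new_avg = mean(old_cmdline_times), mean(new_cmdline_times)
print(f"old: {old_avg:.1f} s   new: {new_avg:.1f} s   speedup: {old_avg / new_avg:.2f}x")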
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2011974
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 2011975 - Posted: 14 Sep 2019, 22:31:11 UTC - in response to Message 2011974.  

Mike, is your command line the best you have found for your R9 380, or is that command line what you recommend as the best generic one for all card types?

I came up with:
-unroll 20 -oclFFT_plan 256 16 256 -ffa_block 2304 -ffa_block_fetch 1152 -tune 1 64 8 1 -tune 2 64 8 1 for my Nvidia cards as the best generic setting to cover everything from a 1070 to a 2080.

I tested my previous AP command line against this one in Rick's BenchMT program. The new one was much better than my previous line, where I had simply thrown large values at everything.

As I look, that's what's in the AIO as well. Thanks, Keith ...
ID: 2011975
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 2011982 - Posted: 15 Sep 2019, 0:03:55 UTC - in response to Message 2011925.  

Just wondering if any of you have some favorite settings for ap_cmdline_win_x86_SSE2_OpenCL_NV.txt for 980s? This is for Win10, not Linux.
Seems I lost track of what I was doing there.
Thx in advance ... Jim
Glad you asked that question; I gave the supplied values a go with my existing SBS values and knocked over 2 minutes off my GTX 1070's AP times, and maybe as much as 7 minutes off my GTX 750 Ti times.
Grant
Darwin NT
ID: 2011982
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2011984 - Posted: 15 Sep 2019, 0:11:59 UTC

I got 27 APs for the 14th today. I also see I picked up one resend yesterday.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2011984
Profile Bill G Special Project $75 donor
Joined: 1 Jun 01
Posts: 1282
Credit: 187,688,550
RAC: 182
United States
Message 2011987 - Posted: 15 Sep 2019, 0:23:33 UTC - in response to Message 2011984.  

14 for the 14th here.

SETI@home classic workunits 4,019
SETI@home classic CPU time 34,348 hours
ID: 2011987
Profile Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 2011988 - Posted: 15 Sep 2019, 0:30:33 UTC

I managed to grab 8 for the 14th UTC.

Cheers.
ID: 2011988
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 2011989 - Posted: 15 Sep 2019, 0:37:08 UTC - in response to Message 2011987.  

14 for the 14th here.

Ditto
ID: 2011989
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 2011990 - Posted: 15 Sep 2019, 0:39:16 UTC - in response to Message 2011982.  
Last modified: 15 Sep 2019, 1:10:48 UTC

Just wondering if any of you have some favorite settings for ap_cmdline_win_x86_SSE2_OpenCL_NV.txt for 980s? This is for Win10, not Linux.
Seems I lost track of what I was doing there.
Thx in advance ... Jim
Glad you asked that question; I gave the supplied values a go with my existing SBS values and knocked over 2 minutes off my GTX 1070's AP times, and maybe as much as 7 minutes off my GTX 750 Ti times.
Just so I understand, Grant, your improvement was gained with Keith's values, a.k.a. the AIO values? Specifically, what was the full command line?
Also wondering: do you differentiate between the 750s and 1070s as far as tuning goes, i.e. a command-line override via AstroPulse_NV_config.xml?
Thanks ...
ID: 2011990
Profile Stargate (SA)
Volunteer tester
Joined: 4 Mar 10
Posts: 1854
Credit: 2,258,721
RAC: 0
Australia
Message 2011991 - Posted: 15 Sep 2019, 0:57:35 UTC

Picked up 7 for the 14th UTC
ID: 2011991
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 2011994 - Posted: 15 Sep 2019, 1:20:34 UTC - in response to Message 2011990.  

Just so I understand, Grant, your improvement was gained with Keith's values, a.k.a. the AIO values?
I used the values supplied to replace my existing ones, but kept my existing -hp -sbs 1024 -unroll 5 settings.

Also wondering: do you differentiate between the 750s and 1070s as far as tuning goes, i.e. a command-line override via AstroPulse_NV_config.xml?
Nah, I just use the ap_cmdline_win_x86_SSE2_OpenCL_NV.txt file. But I set the -unroll to 5 in deference to the GTX 750 Tis, and only run 1 WU at a time.
Running 2 WUs at a time often gave the best throughput, but with such a disparity in ability between the cards there were times when throughput was reduced. So the less aggressive settings and 1 WU at a time were the best compromise.
Grant
Darwin NT
ID: 2011994
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2011997 - Posted: 15 Sep 2019, 1:38:12 UTC

I used to have the AP readme's suggested values for high-end cards. I thought that matching the unroll to the number of SMs on the card was best, but then Mike told me that anything over 22 SMs was ignored. So I used to run 16384/8192 for the buffer sizes and an unroll of 28 for the 1080 Ti. Then I revised that to the command line I posted: when Rick's BenchMT was updated to handle AP tasks at my request, I actually benchmarked various parameter values and found that the smaller buffer sizes were the fastest and that 18 was generally the best unroll. So I compromised on the 20 unroll value to handle both the 1080 Ti and the 2080s.
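(For anyone curious what such a sweep looks like, here's a minimal sketch of generating the candidate command lines to feed to a benchmarking run. The value grids are illustrative only, not a recommendation, and how you time each line - BenchMT or by hand - is up to you.)

from itertools import product

# Illustrative value grids - just the shape of the sweep, not suggested settings.
unrolls    = [12, 16, 18, 20, 22]
ffa_blocks = [(2304, 1152), (4608, 2304), (8192, 4096), (16384, 8192)]
tunes      = ["-tune 1 64 8 1 -tune 2 64 8 1", "-tune 1 64 4 1 -tune 2 64 4 1"]

candidates = [
    f"-unroll {u} -oclFFT_plan 256 16 256 -ffa_block {blk} -ffa_block_fetch {fetch} {tune}"
    for u, (blk, fetch), tune in product(unrolls, ffa_blocks, tunes)
]

print(len(candidates), "command lines to benchmark")
print(candidates[0])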
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2011997
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 2011998 - Posted: 15 Sep 2019, 1:46:25 UTC - in response to Message 2011997.  

I actually benchmarked various parameter values and found that the smaller buffer sizes were the fastest and that 18 was generally the best unroll. So I compromised on the 20 unroll value to handle both the 1080 Ti and the 2080s.
What about the effect of the -sbs values?
Grant
Darwin NT
ID: 2011998
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2012000 - Posted: 15 Sep 2019, 2:07:09 UTC - in response to Message 2011998.  

I actually benchmarked various parameter values and found that the smaller buffer sizes were the fastest and that 18 was generally the best unroll. So I compromised on the 20 unroll value to handle both the 1080 Ti and the 2080s.
What about the effect of the -sbs values?

Those were some of the variables I changed. I started with my original 16384/8192 values and kept working downwards. I threw in some lopsided values too: small first tune, large second tune, etc. I came to land on the values of 2304/1152 as the best compromise for my 1070/1080/2080 test subjects.

I kept one readout from the test. Unfortunately it truncates the full parameter set, and I have since deleted the AP parameter file from the directory, so I don't know what this particular capture was run with.
Job#  xPU  start     finish    tot_time     state
0     GPU  16:40:11  16:46:38  0:06:27.328  COMPLETE
1     GPU  16:40:11  16:49:26  0:09:15.443  COMPLETE
2     GPU  16:40:11  16:51:06  0:10:54.513  COMPLETE

All three jobs ran astropulse_7.08_x86_64-pc-linux-gnu__opencl_nvidia_100 on workunit ap_05mr19aa_B0_P1_00214_20190306_16556, with args "-device N -unroll 20 -oclFFT_plan 256 16 256 -ffa_block 230..." (cut off in the capture).

Device 0 was the 2080, Device 1 was the 1080, and Device 2 was the 1070.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2012000
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 2012014 - Posted: 15 Sep 2019, 8:35:43 UTC - in response to Message 2011974.  

Mike, is your command line the best you have found for your R9 380, or is that command line what you recommend as the best generic one for all card types?

I came up with:
-unroll 20 -oclFFT_plan 256 16 256 -ffa_block 2304 -ffa_block_fetch 1152 -tune 1 64 8 1 -tune 2 64 8 1 for my Nvidia cards as the best generic setting to cover everything from a 1070 to a 2080.

I tested my previous AP command line against this one in Rick's BenchMT program. The new one was much better than my previous line, where I had simply thrown large values at everything.


It's the best I've found.
I tested all the values you can possibly try.
It seems the new Nvidia GPUs are different.


With each crime and every kindness we birth our future.
ID: 2012014
Profile Kissagogo27 Special Project $75 donor
Joined: 6 Nov 99
Posts: 716
Credit: 8,032,827
RAC: 62
France
Message 2012020 - Posted: 15 Sep 2019, 11:19:35 UTC

Some explanation of the AP parameters here: http://lunatics.kwsn.info/index.php/topic,1437.0.html


Got 2 for the 14th ^^
ID: 2012020
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2012073 - Posted: 15 Sep 2019, 18:31:05 UTC - in response to Message 2012014.  

Mike, is your command line the best you have found for your R9 380, or is that command line what you recommend as the best generic one for all card types?

I came up with:
-unroll 20 -oclFFT_plan 256 16 256 -ffa_block 2304 -ffa_block_fetch 1152 -tune 1 64 8 1 -tune 2 64 8 1 for my Nvidia cards as the best generic setting to cover everything from a 1070 to a 2080.

I tested my previous AP command line against this one in Rick's BenchMT program. The new one was much better than my previous line, where I had simply thrown large values at everything.


It's the best I've found.
I tested all the values you can possibly try.
It seems the new Nvidia GPUs are different.

I think you have to rejigger the command-line parameters for Turing cards. The SMs on Turing are not equivalent to those on earlier models; the CUDA core count per SM got cut in half, from 128 for Pascal to 64 for Turing.
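(To put rough numbers on that: the core counts below are the published specs as I remember them, so treat them as assumptions, and the cap of 22 is the limit Mike mentioned earlier in the thread.)

# Assumed CUDA core counts per card, and cores per SM for each architecture.
cards = {
    "GTX 1070":    (1920, 128),   # Pascal: 128 CUDA cores per SM
    "GTX 1080":    (2560, 128),
    "GTX 1080 Ti": (3584, 128),
    "RTX 2080":    (2944, 64),    # Turing: 64 CUDA cores per SM
}

UNROLL_CAP = 22  # per Mike, values above this are ignored

for card, (cores, per_sm) in cards.items():
    sms = cores // per_sm
    print(f"{card:12s} {sms:3d} SMs -> match-the-SMs unroll, capped: {min(sms, UNROLL_CAP)}")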
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2012073
Profile David@home
Volunteer tester
Joined: 16 Jan 03
Posts: 755
Credit: 5,040,916
RAC: 28
United Kingdom
Message 2012099 - Posted: 15 Sep 2019, 20:18:33 UTC
Last modified: 15 Sep 2019, 20:19:05 UTC

What are the odds of getting two Astropulse workunits and having the same wingman on both of them?

I noticed because both are flagged as inconclusive.

https://setiathome.berkeley.edu/workunit.php?wuid=3652425865

https://setiathome.berkeley.edu/workunit.php?wuid=3652425907

If the quorum goes the wingman's way, will I see this flagged as invalid or as an error in my stats?
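(Purely as a back-of-envelope illustration, with a made-up host count: if each AP workunit goes out to two hosts and your wingman is drawn roughly uniformly from the hosts currently asking for AP work, the chance that two workunits land on the same wingman is about 1 in N.)

# N is an assumed number of hosts currently pulling AP work - not a real figure.
N = 50_000

p_same_wingman_twice = 1 / N   # chance the second WU's wingman matches the first
print(f"roughly 1 in {N:,} (p = {p_same_wingman_twice:.6f})")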
ID: 2012099