SETI@home v8.22 Windows GPU applications support thread

Mike, Volunteer tester, Germany
Message 1951654 - Posted: 23 Aug 2018, 20:52:42 UTC

I don't use this app arg because my GPU doesn't get an improvement from it.
Sadly I had no chance to test it on a 1080.
It certainly would be interesting to see how it performs in high production mode.


With each crime and every kindness we birth our future.
ID: 1951654

Marco Vandebergh (SETI orphan), Netherlands
Message 1951664 - Posted: 23 Aug 2018, 21:13:41 UTC - in response to Message 1951654.  

OK, I'll leave this running for a while to see how it performs over the longer run.

In GPU-Z the GPU load is now around 96 to 99%, and the memory controller load is about 86 to 91%.
It was really low before when running with standard parameters, even when running 3 WUs at a time.


I do have to mention that the GPU gets hotter, so be careful. My system is all watercooled, so I have no issues; max GPU temp is 46 °C.


my app_config:

<app_config>
<app_version>
<app_name>setiathome_v8</app_name>
<plan_class>opencl_nvidia_SoG</plan_class>
<avg_ncpus>1</avg_ncpus>
<ngpus>0.33</ngpus>
<cmdline>-sbs 1024 -total_GPU_instances_num 5 -pref_wg_num_per_cu 20 -cpu_lock -period_iterations_num 1 -hp -high_prec_timer -high_perf -tt 1500</cmdline>
</app_version>
<app_version>
<app_name>astropulse_v7</app_name>
<plan_class>opencl_nvidia_100</plan_class>
<avg_ncpus>1</avg_ncpus>
<ngpus>1</ngpus>
<cmdline>-unroll 28 -oclFFT_plan 256 16 256 -ffa_block 12288 -ffa_block_fetch 6144 -tune 1 64 4 1 -tune 2 64 4 1 -hp -instances_per_device 1</cmdline>
</app_version>
</app_config>
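
For anyone new to these switches, here is a quick gloss of that cmdline. These are my paraphrases of the Lunatics/SoG readme descriptions, so treat them as a rough guide rather than the authoritative word:

-sbs 1024                   single buffer size (in MB) the app may allocate on the GPU
-total_GPU_instances_num 5  total number of GPU app instances across all GPUs (used together with -cpu_lock)
-pref_wg_num_per_cu 20      preferred number of work-groups per compute unit
-cpu_lock                   pin each instance to its own CPU core
-period_iterations_num 1    run PulseFind in one kernel launch instead of splitting it up (fast, but can make the GUI laggy)
-hp                         raise the process priority
-high_prec_timer            use the high-precision Windows timer
-high_perf                  tuning preset aimed at high-end, dedicated crunchers
-tt 1500                    kernel run-time target, in milliseconds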
ID: 1951664

ab1vl-pi, United States
Message 1952063 - Posted: 25 Aug 2018, 21:32:38 UTC

Looking for Fermi or later GPU Nvidia hardware.

As I understand it, only AstroPulse cares about the buggy post-340.x drivers on pre-Fermi hardware, so I might be open to that too.

I have a handful of ASUS mobos with Athlon 64 2x and 4x cores. The 2x with a GT-220 GPU runs blc04_2bit_guppi WUs (showing as v8.8.01 cuda60) on Linux and Win7, same box.

If you have a used GPU you'd like to sell, PCI or PCIe, that is faster than the 1.2 cc the GT-220 offers, I'm interested in talking.

If you can recommend a low cost NEW card that offers decent performance, I'd like to hear your recommendations. I don't game, I just want a few decent co-processors to stuff into these systems I rescued from the transfer center.

Please use PM.
tnx, chuck
ID: 1952063

Grant (SSSF), Volunteer tester, Australia
Message 1952064 - Posted: 25 Aug 2018, 21:40:46 UTC - in response to Message 1952063.  

If you can recommend a low cost NEW card that offers decent performance, I'd like to hear your recommendations. I don't game, I just want a few decent co-processors to stuff into these systems I rescued from the transfer center.

For cost, performance, and power use, it's hard to go past the GTX 750 Ti.
Grant
Darwin NT
ID: 1952064

Stephen "Heretic", Volunteer tester, Australia
Message 1952076 - Posted: 25 Aug 2018, 23:40:57 UTC - in response to Message 1951664.  
Last modified: 25 Aug 2018, 23:41:40 UTC

OK, I'll leave this running for a while to see how it performs over the longer run.

In GPU-Z the GPU load is now around 96 to 99%, and the memory controller load is about 86 to 91%.
It was really low before when running with standard parameters, even when running 3 WUs at a time.


I do have to mention that the GPU gets hotter, so be careful. My system is all watercooled, so I have no issues; max GPU temp is 46 °C.


my app_config:

<app_config>
<avg_ncpus>1</avg_ncpus>
<ngpus>0.33</ngpus>
<cmdline>-sbs 1024 -total_GPU_instances_num 5 -pref_wg_num_per_cu 20 -cpu_lock -period_iterations_num 1 -hp -high_prec_timer -high_perf -tt 1500</cmdline>

</app_config>

. . So you are still running 3 at a time? With the change in settings, how does it utilise the GPU running one or two at a time?

Stephen

??
ID: 1952076

Zalster, Volunteer tester, United States
Message 1952092 - Posted: 26 Aug 2018, 0:20:43 UTC - in response to Message 1952076.  

If he's on Windows running SoG on a 1080 Ti, then running 2 at a time is slower than running 3 at a time (average time).
ID: 1952092

Stephen "Heretic", Volunteer tester, Australia
Message 1952136 - Posted: 26 Aug 2018, 4:13:03 UTC - in response to Message 1952092.  

If he's on Windows running SoG on a 1080 Ti, then running 2 at a time is slower than running 3 at a time (average time).


. . But he is running a parameter I haven't heard anything about before -

-pref_wg_num_per_cu 20

and he says it increases the GPU utilisation. I was wondering if it might increase it to the max even when running 2 or maybe even one. That would be useful.

Stephen

? ?
ID: 1952136

ab1vl-pi, United States
Message 1952243 - Posted: 26 Aug 2018, 20:38:41 UTC - in response to Message 1952064.  

Tnx Grant, Amazon has one by NXDA (China) labelled as a 2GB GTX 750 Ti for $70.90, about 1/5 of the "pure" NVIDIA price.

Any intelligence on NXDA? This would be fulfilled from China, ARO 30-45 days.

Another concern is trade wars; the Amazon boilerplate leaves us open to any new duties.

Other than that I'd give it a shot. Unless it was DOA, it would probably be a big improvement.
ID: 1952243

Zalster, Volunteer tester, United States
Message 1952252 - Posted: 26 Aug 2018, 21:03:42 UTC - in response to Message 1952136.  



-pref_wg_num_per_cu 20


It's in the readme file of the Lunatics installer.
ID: 1952252

Stephen "Heretic", Volunteer tester, Australia
Message 1952264 - Posted: 26 Aug 2018, 22:33:29 UTC - in response to Message 1952252.  



-pref_wg_num_per_cu 20


It's in the readme file of the Lunatics installer.


. . Yes, but have you used it? Because he feels it increases utilisation, and I was interested to know if that increase is enough to run only one or two tasks at a time and still have high and efficient use of the 1080/1080 Ti.

Stephen

? ?
ID: 1952264

Marco Vandebergh (SETI orphan), Netherlands
Message 1952939 - Posted: 30 Aug 2018, 20:32:43 UTC - in response to Message 1952264.  
Last modified: 30 Aug 2018, 20:47:49 UTC

I am not running it 24/7 due to work and such; I do not leave it unattended, but I run it as long as I can, on a regular basis.

The results depend on the workunit type: there are workunits that finish in 4-ish minutes when running 1 at a time with -pref_wg_num_per_cu 1,
and 9-ish minutes when running 3 at a time with -pref_wg_num_per_cu 20.

I think it is a slight improvement.
GPU load is 99% most of the time, between 97 and 99%.

Interestingly, the memory controller usage went up when using 3 at a time; it's 88 to 89%-ish.
Lowering -pref_wg_num_per_cu from 20 also decreases GPU and memory controller usage, so something is happening there.

This is lower when running only 1 or 2 WUs at a time.

I also found that using -cpu_lock with hyperthreading off stuffs the GPU more too.


Still trying some settings out though.

I made a screenshot; how do I post that?


picture upload
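
If anyone wants to repeat this kind of test, the only thing that has to change between batches is the cmdline (keep ngpus and -instances_per_device matched to how many you run at once). A sketch of one step of such a sweep, reusing the flags from the config above with illustrative values; run a batch of WUs at each setting, note the average elapsed time, then move on to the next value:

<app_config>
<app_version>
<app_name>setiathome_v8</app_name>
<plan_class>opencl_nvidia_SoG</plan_class>
<avg_ncpus>1</avg_ncpus>
<ngpus>0.33</ngpus>
<!-- change only this value between batches: e.g. 1, 8, 16, 20, 32 -->
<cmdline>-sbs 1024 -pref_wg_num_per_cu 16 -period_iterations_num 1 -hp -tt 1500</cmdline>
</app_version>
</app_config>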
ID: 1952939

Stephen "Heretic", Volunteer tester, Australia
Message 1953035 - Posted: 31 Aug 2018, 12:54:14 UTC - in response to Message 1952939.  

The results depend on the workunit type: there are workunits that finish in 4-ish minutes when running 1 at a time with -pref_wg_num_per_cu 1,
and 9-ish minutes when running 3 at a time with -pref_wg_num_per_cu 20.
I think it is a slight improvement.
GPU load is 99% most of the time, between 97 and 99%.
Interestingly, the memory controller usage went up when using 3 at a time; it's 88 to 89%-ish.
Lowering -pref_wg_num_per_cu from 20 also decreases GPU and memory controller usage, so something is happening there.
This is lower when running only 1 or 2 WUs at a time.
I also found that using -cpu_lock with hyperthreading off stuffs the GPU more too.
Still trying some settings out though.


. . Thanks for that reply, that was enlightening.

Stephen

:)
ID: 1953035

Marco Vandebergh (SETI orphan), Netherlands
Message 1954282 - Posted: 7 Sep 2018, 19:26:31 UTC - in response to Message 1953035.  
Last modified: 7 Sep 2018, 19:29:40 UTC

Alright, a little update. These will be my final settings for a while (or if someone has a suggestion, I can try that).

Now I'm completing most guppies in the 4.25/4.40-minutes-ish range when running 2 instances together on the GTX 1080 Ti.
In comparison, a single guppi finishes in the 2.30/2.40-minutes-ish range on my system.

The GPU usage is about the same as in the picture in my previous post above, with 3 instances running.
Increasing -pref_wg_num_per_cu above 32 no longer has a positive effect on my system. Decreasing it results in slower WU completion times (on my system).
After all, I think it's best to run just 2.
My OS is Win10.

My app_config.xml:

<app_config>
<app_version>
<app_name>setiathome_v8</app_name>
<plan_class>opencl_nvidia_SoG</plan_class>
<avg_ncpus>0.4</avg_ncpus>
<ngpus>0.5</ngpus>
<cmdline>-sbs 1024 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 512 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 -pref_wg_num_per_cu 32 -period_iterations_num 1 -hp -high_prec_timer -high_perf -tt 1500 -cpu_lock -instances_per_device 2</cmdline>
</app_version>
<app_version>
<app_name>astropulse_v7</app_name>
<avg_ncpus>0.4</avg_ncpus>
<ngpus>0.33</ngpus>
<plan_class>opencl_nvidia_100</plan_class>
<cmdline>-unroll 28 -oclFFT_plan 256 16 256 -ffa_block 12288 -ffa_block_fetch 6144 -tune 1 64 4 1 -tune 2 64 4 1 -hp -cpu_lock -instances_per_device 3</cmdline>
</app_version>
<app_version>
<app_name>astropulse_v7</app_name>
<avg_ncpus>0.4</avg_ncpus>
<ngpus>0.33</ngpus>
<plan_class>cuda_opencl_100</plan_class>
<cmdline>-unroll 28 -oclFFT_plan 256 16 256 -ffa_block 12288 -ffa_block_fetch 6144 -tune 1 64 4 1 -tune 2 64 4 1 -hp -cpu_lock -instances_per_device 3</cmdline>
</app_version>
<app_version>
<app_name>astropulse_v7</app_name>
<avg_ncpus>0.4</avg_ncpus>
<ngpus>0.33</ngpus>
<plan_class>opencl_nvidia_cc1</plan_class>
<cmdline>-unroll 28 -oclFFT_plan 256 16 256 -ffa_block 12288 -ffa_block_fetch 6144 -tune 1 64 4 1 -tune 2 64 4 1 -hp -cpu_lock -instances_per_device 3</cmdline>
</app_version>
<app_version>
<app_name>astropulse_v7</app_name>
<avg_ncpus>0.4</avg_ncpus>
<ngpus>0.33</ngpus>
<plan_class>cuda_opencl_cc1</plan_class>
<cmdline>-unroll 28 -oclFFT_plan 256 16 256 -ffa_block 12288 -ffa_block_fetch 6144 -tune 1 64 4 1 -tune 2 64 4 1 -hp -cpu_lock -instances_per_device 3</cmdline>
</app_version>
</app_config>
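
For anyone copying these: app_config.xml goes in the SETI@home project folder inside the BOINC data directory, and the client re-reads it via Options > Read config files in the Manager (or on restart). On a default Windows install that is:

C:\ProgramData\BOINC\projects\setiathome.berkeley.edu\app_config.xml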
ID: 1954282

rob smith, Volunteer moderator, United Kingdom
Message 1954617 - Posted: 10 Sep 2018, 8:48:33 UTC

Guys - I've suffered a bit of brain fade here - what is the minimum version of OpenCL required for the SoG application to work?
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1954617

Stephen "Heretic", Volunteer tester, Australia
Message 1954619 - Posted: 10 Sep 2018, 9:04:50 UTC - in response to Message 1954617.  

Guys - I've suffered a bit of brain fade here - what is the minimum version of OpenCL required for the SoG application to work?


. . I thought it worked with 1.1 but it certainly works with 1.2 ...

Stephen

? ?
ID: 1954619

Wiggo, Australia
Message 1954620 - Posted: 10 Sep 2018, 9:08:20 UTC - in response to Message 1954619.  

Guys - I've suffered a bit of brain fade here - what is the minimum version of OpenCL required for the SoG application to work?


. . I thought it worked with 1.1 but it certainly works with 1.2 ...

Stephen

? ?
It definitely works on 1.1.

Cheers.
ID: 1954620

rob smith, Volunteer moderator, United Kingdom
Message 1954625 - Posted: 10 Sep 2018, 10:40:26 UTC

Ta

Any votes (either way) for 1.0??
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1954625

Mike, Volunteer tester, Germany
Message 1954637 - Posted: 10 Sep 2018, 12:23:54 UTC - in response to Message 1954625.  

Ta

Any votes (either way) for 1.0??


Yes, it should work with 1.0 as well.
The code itself is 1.0 - or at least it was the last time I asked Raistmer about it.


With each crime and every kindness we birth our future.
ID: 1954637

lunkerlander, United States
Message 1961230 - Posted: 20 Oct 2018, 22:10:09 UTC - in response to Message 1842133.  

Last night, I decided to try downloading and installing the Lunatics v0.45 app. I ran the installer and selected only the default options for the CPU and ATI GPU apps.

Stock BOINC was running the stock app v8.08 on my i5 4590 CPU and the v8.22 app on my RX 560 and 570.

My stock times were usually 3900 seconds for CPU tasks, 1150 seconds on my RX 560, and 650 seconds on my RX 570.

Now with the Lunatics app, my CPU tasks are quicker at about 3300 seconds each... but my GPU task run times increased to about 1220-1250 seconds and 660 seconds respectively.

Is there a way to run only the Lunatics CPU app and the stock BOINC GPU app? Should I change any other settings?

Thanks for your help!
ID: 1961230

Grant (SSSF), Volunteer tester, Australia
Message 1961232 - Posted: 20 Oct 2018, 22:29:01 UTC - in response to Message 1961230.  

I ran the installer and selected only the default options for the CPU and ATI GPU apps.

That's the problem.
When running stock, the manager will download all possible applications for your hardware and then process several WUs with each one. Usually (but not always) it will end up choosing the fastest one.

With Lunatics, for the CPU application the installer takes a stab at which is best (even so, it's worth checking the readme file to see which one is most likely to be best for your CPU).

When it comes to the GPU application, the default is just that: the default. It's the most compatible application for the widest range of hardware. For the best application for your particular hardware you need to select the appropriate GPU application (check the readme file, or wait for a response here from someone who knows, e.g. Mike).

The readme file also has suggestions for command-line options to further improve performance; higher-end hardware can cope with more aggressive settings than lower-end hardware of the same architecture.
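
The same app_config.xml mechanism shown earlier in the thread can also pass those command-line options to an ATI/AMD build, if you do keep a Lunatics GPU app. A minimal sketch only: the plan class below is a placeholder (substitute whatever plan class your GPU tasks actually report), and the option values are untested, conservative guesses:

<app_config>
<app_version>
<app_name>setiathome_v8</app_name>
<!-- placeholder: replace with the plan class shown on your own GPU tasks -->
<plan_class>YOUR_ATI_PLAN_CLASS</plan_class>
<avg_ncpus>1</avg_ncpus>
<ngpus>1</ngpus>
<!-- assumed starting point, not tuned settings -->
<cmdline>-sbs 256 -period_iterations_num 20</cmdline>
</app_version>
</app_config>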
Grant
Darwin NT
ID: 1961232