Message boards :
Number crunching :
SETI@home v8.22 Windows GPU applications support thread
Author | Message |
---|---|
Mike Send message Joined: 17 Feb 01 Posts: 34253 Credit: 79,922,639 RAC: 80 |
I don't use this app arg because my GPU doesn't get an improvement from it. Sadly I had no chance to test it on a 1080. It certainly would be interesting to see how it performs in high production mode. With each crime and every kindness we birth our future. |
Marco Vandebergh ( SETI orphan ) Send message Joined: 27 Aug 10 Posts: 39 Credit: 12,630,994 RAC: 9 |
Ok, I'll leave this running for a while to see how it performs over the longer run. In GPU-Z the GPU load is now around 96 to 99% and memory controller load about 86 to 91%. It was really low before when running with standard parameters, even with 3 WUs at a time. I do have to mention the GPU gets hotter, so be careful. My system is all watercooled so I have no issues; max GPU temp is 46 °C. My app_config:

<app_config>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <plan_class>opencl_nvidia_SoG</plan_class>
    <avg_ncpus>1</avg_ncpus>
    <ngpus>0.33</ngpus>
    <cmdline>-sbs 1024 -total_GPU_instances_num 5 -pref_wg_num_per_cu 20 -cpu_lock -period_iterations_num 1 -hp -high_prec_timer -high_perf -tt 1500</cmdline>
  </app_version>
  <app_version>
    <app_name>astropulse_v7</app_name>
    <plan_class>opencl_nvidia_100</plan_class>
    <avg_ncpus>1</avg_ncpus>
    <ngpus>1</ngpus>
    <cmdline>-unroll 28 -oclFFT_plan 256 16 256 -ffa_block 12288 -ffa_block_fetch 6144 -tune 1 64 4 1 -tune 2 64 4 1 -hp -instances_per_device 1</cmdline>
  </app_version>
</app_config> |
ab1vl-pi Send message Joined: 11 Jun 18 Posts: 4 Credit: 68,014 RAC: 0 |
Looking for Fermi or later Nvidia GPU hardware. As I understand it, only Astropulse cares about the buggy post-340.x drivers for pre-Fermi hardware, so I might be open to that too. I have a handful of ASUS mobos with Athlon 64 2x and 4x cores. The 2x with a GT 220 GPU runs blc04_2bit_guppi WUs (showing as v8.8.01 cuda60) on Linux and Win7, same box. If you have a used GPU you'd like to sell, PCI or PCIe, that is faster than the 1.2 CC the GT 220 offers, I'm interested in talking. If you can recommend a low cost NEW card that offers decent performance, I'd like to hear your recommendations. I don't game, I just want a few decent co-processors to stuff into these systems I rescued from the transfer center. Please use PM. tnx, chuck |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13720 Credit: 208,696,464 RAC: 304 |
If you can recommend a low cost NEW card that offers decent performance, I'd like to hear your recommendations. I don't game, I just want a few decent co-processors to stuff into these systems I rescued from the transfer center. For cost, performance and power use it's hard to go past the GTX 750Ti. Grant Darwin NT |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
Ok, I'll leave this running for a while to see how it performs over the longer run. . . So you are still running 3 at a time? With the change in settings, how does it utilise the GPU running one or two at a time? Stephen ?? |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
If he's on Windows running SoG on a 1080Ti, then running 2 at a time is slower than running 3 at a time (average time). |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
If he's on Windows running SoG on a 1080Ti, then running 2 at a time is slower than running 3 at a time (average time). . . But he is running a parameter I haven't heard anything about before, -pref_wg_num_per_cu 20, and he says it increases the GPU utilisation. I was wondering if it might increase it to the max even when running 2, or maybe even one, at a time. That would be useful. Stephen ? ? |
ab1vl-pi Send message Joined: 11 Jun 18 Posts: 4 Credit: 68,014 RAC: 0 |
Tnx Grant. Amazon has one by NXDA (China) labelled as a 2GB GTX 750Ti for $70.90, about 1/5 of the "pure" NVIDIA price. Any intelligence on NXDA? This would be fulfilled from China, ARO 30-45 days. Another concern is trade wars; the Amazon boilerplate leaves us open to any new duties. Other than that I'd give it a shot. Unless it was DOA, it would probably be a big improvement. |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
It's in the read me file of the lunatics installer |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . Yes, but have you used it? Because he feels it increases utilisation, and I was interested to know if that increase is enough to run only one or two tasks at a time and still have high and efficient use of the 1080/1080Ti. Stephen ? ? |
Marco Vandebergh ( SETI orphan ) Send message Joined: 27 Aug 10 Posts: 39 Credit: 12,630,994 RAC: 9 |
I am not running it 24/7 due to work and such; I do not leave it unattended, but I run it as long as I can, on a regular basis. The results depend on the workunit type: there are work units that finish in about 4 minutes when running 1 at a time with -pref_wg_num_per_cu 1, and about 9 minutes when running 3 at a time with -pref_wg_num_per_cu 20. I think it is a slight improvement. GPU load is 99% most of the time, between 97 and 99%. Interesting to see is that the memory controller usage went up when using 3 at a time, to 88-89%; it is lower when running only 1 or 2 WUs at a time. Lowering -pref_wg_num_per_cu from 20 also decreases GPU and memory controller usage, so something is happening there. I also found that using cpu_lock with no hyperthreading stuffs the GPU more too. Still trying some settings out, though. I made a screenshot, how do I post that? picture upload |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
It depends on the workunit type what the results are, there are work units that finish in 4 ish minutes when running 1 at a time with -pref_wg_num_per_cu 1, . .Thanks for that reply, that was enlightening. Stephen :) |
Marco Vandebergh ( SETI orphan ) Send message Joined: 27 Aug 10 Posts: 39 Credit: 12,630,994 RAC: 9 |
Alright, a little update. These will be my final settings for a while (or if someone has a suggestion, I can try that). Now I'm completing most guppies in the 4.25 to 4.40 minute range when running 2 instances together on the GTX 1080Ti. In comparison, a single guppi would finish in the 2.30 to 2.40 minute range on my system. The GPU usage is about the same as in my previous post's picture above with 3 instances running. Increasing -pref_wg_num_per_cu above 32 no longer has a positive effect on my system; decreasing it results in slower WU completion times (on my system). After all I think it's best to run just 2. My OS is Win10. app_config.xml:

<app_config>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <plan_class>opencl_nvidia_SoG</plan_class>
    <avg_ncpus>0.4</avg_ncpus>
    <ngpus>0.5</ngpus>
    <cmdline>-sbs 1024 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 512 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 -pref_wg_num_per_cu 32 -period_iterations_num 1 -hp -high_prec_timer -high_perf -tt 1500 -cpu_lock -instances_per_device 2</cmdline>
  </app_version>
  <app_version>
    <app_name>astropulse_v7</app_name>
    <plan_class>opencl_nvidia_100</plan_class>
    <avg_ncpus>0.4</avg_ncpus>
    <ngpus>0.33</ngpus>
    <cmdline>-unroll 28 -oclFFT_plan 256 16 256 -ffa_block 12288 -ffa_block_fetch 6144 -tune 1 64 4 1 -tune 2 64 4 1 -hp -cpu_lock -instances_per_device 3</cmdline>
  </app_version>
  <app_version>
    <app_name>astropulse_v7</app_name>
    <plan_class>cuda_opencl_100</plan_class>
    <avg_ncpus>0.4</avg_ncpus>
    <ngpus>0.33</ngpus>
    <cmdline>-unroll 28 -oclFFT_plan 256 16 256 -ffa_block 12288 -ffa_block_fetch 6144 -tune 1 64 4 1 -tune 2 64 4 1 -hp -cpu_lock -instances_per_device 3</cmdline>
  </app_version>
  <app_version>
    <app_name>astropulse_v7</app_name>
    <plan_class>opencl_nvidia_cc1</plan_class>
    <avg_ncpus>0.4</avg_ncpus>
    <ngpus>0.33</ngpus>
    <cmdline>-unroll 28 -oclFFT_plan 256 16 256 -ffa_block 12288 -ffa_block_fetch 6144 -tune 1 64 4 1 -tune 2 64 4 1 -hp -cpu_lock -instances_per_device 3</cmdline>
  </app_version>
  <app_version>
    <app_name>astropulse_v7</app_name>
    <plan_class>cuda_opencl_cc1</plan_class>
    <avg_ncpus>0.4</avg_ncpus>
    <ngpus>0.33</ngpus>
    <cmdline>-unroll 28 -oclFFT_plan 256 16 256 -ffa_block 12288 -ffa_block_fetch 6144 -tune 1 64 4 1 -tune 2 64 4 1 -hp -cpu_lock -instances_per_device 3</cmdline>
  </app_version>
</app_config> |
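[Editor's note: a quick sanity check of the timings reported above, as a Python sketch. The run times are the approximate figures from the post (~2.5 min for one guppi alone, ~4.33 min each when two run together), so the throughput numbers are illustrative, not measured.]

```python
# Approximate guppi run times reported in the post above (minutes).
single_time = 2.5   # one task at a time finishes in ~2.30-2.40 min
two_up_time = 4.33  # two at a time, each finishes in ~4.25-4.40 min

# Tasks completed per hour in each mode.
throughput_single = 60 / single_time       # 24.0 tasks/hour
throughput_two_up = 2 * 60 / two_up_time   # ~27.7 tasks/hour

print(f"1 at a time: {throughput_single:.1f} tasks/hour")
print(f"2 at a time: {throughput_two_up:.1f} tasks/hour")
```

Running two concurrently only pays off because each task takes less than twice as long as a lone task; if the two-up time crept above about 5 minutes, running one at a time would win.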
rob smith Send message Joined: 7 Mar 03 Posts: 22160 Credit: 416,307,556 RAC: 380 |
Guys - I've suffered a bit of brain fade here - what is the minimum version of OpenCL required for the SoG application to work? Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
Guys - I've suffered a bit of brain fade here - what is the minimum version of OpenCL required for the SoG application to work? . . I thought it worked with 1.1, but it certainly works with 1.2 ... Stephen ? ? |
Wiggo Send message Joined: 24 Jan 00 Posts: 34744 Credit: 261,360,520 RAC: 489 |
Guys - I've suffered a bit of brain fade here - what is the minimum version of OpenCL required for the SoG application to work? It definitely works on 1.1. Cheers. |
rob smith Send message Joined: 7 Mar 03 Posts: 22160 Credit: 416,307,556 RAC: 380 |
Ta Any votes (either way) for 1.0?? Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Mike Send message Joined: 17 Feb 01 Posts: 34253 Credit: 79,922,639 RAC: 80 |
Ta Yes, it should work with 1.0 as well. The code itself is 1.0, at least it was the last time I asked Raistmer about it. With each crime and every kindness we birth our future. |
lunkerlander Send message Joined: 23 Jul 18 Posts: 82 Credit: 1,353,232 RAC: 4 |
Last night I decided to try downloading and installing the Lunatics v0.45 app. I ran the installer and selected only the default options for the CPU and ATI GPU apps. Stock BOINC was running the stock app v8.08 on my i5 4590 CPU and the app v8.22 on my RX 560 and 570. My stock times were usually 3900 seconds for CPU tasks, 1150 seconds on my RX 560, and 650 seconds on my RX 570. Now with the Lunatics app my CPU tasks are quicker at about 3300 seconds each... but my GPU tasks increased their run times to about 1220-1250 seconds and 660 seconds each. Is there a way to run only the Lunatics CPU app and the stock BOINC GPU app? Should I change any other settings? Thanks for your help! |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13720 Credit: 208,696,464 RAC: 304 |
I ran the installer and selected only the default options for the CPU and ATI gpu apps. That's the problem. When running stock, the manager will download all possible applications for your hardware and then process several WUs with each one. Usually (but not always) it will end up choosing the fastest one. With Lunatics, for the CPU application the installer takes a stab at which is best (even so, it's worth checking the readme file to see which one is most likely to be best for your CPU). When it comes to the GPU application, the default is just that: the default. It's the most compatible application for the widest range of hardware. For the best application for your particular hardware you need to select the appropriate GPU application (check the readme file, or wait for a response here from someone who knows, e.g. Mike). The readme file also has suggestions for command line options to further improve performance; higher end hardware can cope with more aggressive settings than lower end hardware of the same architecture. Grant Darwin NT |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.