Ryzen 1700x Build
Author | Message |
---|---|
Grant (SSSF) Joined: 19 Aug 99 Posts: 13904 Credit: 208,696,464 RAC: 304 |
If you eliminate HT with SMT, you have a 6 core unit running 9 SETI units. That is a 50% increase over the number of the physical cores. My Ryzen system is running 12 SETI units, which is a 50% increase over the number of physical cores. The number of threads isn't relevant. It's the single-thread performance that matters (yes, leaving some threads unused will boost the performance of the others). When it finishes several minutes ahead of my Intel system across both file types, what conclusion do you draw? That you're not comparing the same type of WU. The APR for the Intel system is higher than that for the Ryzen, so for a given WU, the Intel is crunching it faster. The only time I've seen the APR not be indicative of the actual processing rate is on GPUs running more than 1 WU at a time. Grant Darwin NT |
Un4given Joined: 1 May 04 Posts: 19 Credit: 7,983,035 RAC: 13 |
When it finishes several minutes ahead of my Intel system across both file types, what conclusion do you draw? Would you be so kind as to tell me where you are finding these numbers? Yes, I realize this Intel system has been crunching for several years, and the Ryzen for (realistically) a few days, but I know not where you are gathering these numbers you claim. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13904 Credit: 208,696,464 RAC: 304 |
Would you be so kind as to tell me where you are finding these numbers? Yes, I realize this Intel system has been crunching for several years, and the Ryzen for (realistically) a few days, but I know not where you are gathering these numbers you claim. Click on your name on any of these posts; next to Computers click on View, then click on Details. Down near the bottom of the screen is Application details: click on Show, scroll down to the bottom of the page to SETI@home v8 8.05 windows_x86_64 (your CPU application), and near the bottom is Average Processing Rate (APR). EDIT: your i7 has actually gone up from 18.98 to 19.08 during all these posts, which shows that the value is updated regularly. And as the work mix changes, so too will the APR value. Grant Darwin NT |
Un4given Joined: 1 May 04 Posts: 19 Credit: 7,983,035 RAC: 13 |
Would you be so kind as to tell me where you are finding these numbers? Yes, I realize this Intel system has been crunching for several years, and the Ryzen for (realistically) a few days, but I know not where you are gathering these numbers you claim. OK, I see that. So, how do I associate that with a specific system? I have three computers running multiple DC programs. I'm curious because my fastest GPU can't run Milkyway units because it crashes; it worked for a while, but I get bored with modifying .INI files to fix their crappy ass programming. |
Un4given Joined: 1 May 04 Posts: 19 Credit: 7,983,035 RAC: 13 |
When it finishes several minutes ahead of my Intel system across both file types, what conclusion do you draw? Oh, and what the hell is APR? |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13904 Credit: 208,696,464 RAC: 304 |
OK, I see that. So, how do I associate that with a specific system? Click on the Details link for that specific system. Oh, and what the hell is APR? Average Processing Rate (APR), given in GFLOPS (giga floating-point operations per second). Like Dhrystone, Whetstone, MIPS and all sorts of other numbers, such figures are usually of little use or relevance, or not even particularly representative of what they're meant to be measuring. However, in this instance it does allow you to make an accurate comparison between systems and applications (allowing, of course, for different work mixes: some people reschedule work between the GPU and CPU, so their APR may be higher or lower than it otherwise would be, depending on just what work they're shuffling about). Grant Darwin NT |
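As a rough illustration of why APR lets you compare hosts: for the same work unit, the host with the higher APR should finish sooner. Here is a minimal Python sketch. The work unit size `wu_gflop` and the second host's APR of 15.0 are invented for the example and are not real SETI@home figures; only the 19.08 GFLOPS value comes from this thread.

```python
def estimated_runtime_seconds(wu_gflop: float, apr_gflops: float) -> float:
    """Estimate run time as total work (GFLOP) divided by processing rate (GFLOPS)."""
    if apr_gflops <= 0:
        raise ValueError("APR must be positive")
    return wu_gflop / apr_gflops


# Hypothetical work unit size, chosen only to make the arithmetic visible.
wu_gflop = 40_000.0

intel_s = estimated_runtime_seconds(wu_gflop, 19.08)  # APR quoted in the thread
other_s = estimated_runtime_seconds(wu_gflop, 15.0)   # invented lower APR

print(f"APR 19.08: {intel_s / 60:.1f} min, APR 15.0: {other_s / 60:.1f} min")
```

This only holds when both hosts are crunching the same kind of work unit, which is the caveat about work mixes raised in the post above.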
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . Hi Keith, . . I have read those posts 4 times now and I am sure they are not about -sbs values. They are about tuning iterations and tt values to keep performance high but eliminate lag as much as possible. And now, how to use all this info for guided optimization? 1) Look at 'max' values. If they are significantly higher than 'mean' values then increase -period_iterations_num value to reduce lag. This is an indication that the starting point value requires correction. 2) Look at 'mean' and 's_mean' values. The first few lines show the hardest cases for the GPU so concentrate on them. If you see 'mean' times ~60ms (at default settings) that means your setup will respond to a -tt value increase (there is optimization potential here). If even in the first few lines 's_mean' is lower than the default of 60 (or lower than the set -tt value) it means there isn't enough work to fully load the GPU. Any further increase of -tt value will not help, so consider increasing -sbs value instead or just try running multiple tasks on such a GPU. On the other hand, if listed 's_mean' times are already considerably lower than the set -tt value but lag persists then playing with -period_iterations_num/tt will not help in this case. Try describing your config in the forum support thread and maybe some specific solution can be found. It is possible that on slow devices with -tt set to a low value the first few lines will show 's_mean' about equal to tt value, but the lines closer to the bottom (with bigger FFT sizes) will still have 's_mean' greater than the desired tt value and then GUI lags cannot be reduced. If you experience such a situation please describe your config in the forum support thread (for 8.19 it's https://setiathome.berkeley.edu/forum_thread.php?id=80381 ; others will be created for new releases). EDIT: it seems things have started to change regarding pre-emption on GPU devices. NV Pascal architecture has adequate preemption mechanisms according to the architecture description: http://www.anandtech.com/show/10325/the-nvidia-geforce-gtx-1080-and-1070-founders-edition-review/10 Still have to prove that experimentally though. . . The values listed down the left side are pulse find kernel runlengths and not buffer sizes. The only reference to -sbs value is a recommendation to increase it from the default (As of 6 June 2016 default value for -sbs N is 128) if the GPU utilisation is too low. In several posts Raistmer has stated he didn't see much point in increasing -sbs over 512 but that was earlier on. I think on more modern cards 512 is probably where the default is or should be. . . But if you are happy with whatever settings you are using then that is fine too. :) . . But I have learnt something by poring over these posts. I need to change my iterations and tt values :) Also that high_perf remark on the right indicates when the tt allows kernels to be merged to reduce overheads and the 'sum' values in this case will be higher but that is not a bad thing. Stephen :) |
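The two-step guidance quoted above can be read as a small decision procedure. The Python sketch below is one possible reading, not Raistmer's method or anything the app does itself. The `KernelStats` type, the `spike_ratio` and `near_pct` thresholds, and the hint strings are all assumptions made for illustration; the readme only says "significantly higher" and "~60ms".

```python
from dataclasses import dataclass


@dataclass
class KernelStats:
    """One line of the pulse-find timing table; all times in milliseconds."""
    mean_ms: float
    s_mean_ms: float
    max_ms: float


def suggest_tuning(first_rows, tt_ms=60.0, spike_ratio=2.0, near_pct=0.1):
    """Return tuning hints for the first few (hardest) lines of the table.

    spike_ratio and near_pct are hypothetical thresholds, not documented values.
    """
    if not first_rows:
        return []
    hints = []
    # Step 1: 'max' far above 'mean' suggests lag spikes.
    if any(r.max_ms > spike_ratio * r.mean_ms for r in first_rows):
        hints.append("raise -period_iterations_num")
    # Step 2: 's_mean' below the target time means the GPU is underloaded,
    # so a larger -tt will not help.
    if all(r.s_mean_ms < tt_ms for r in first_rows):
        hints.append("raise -sbs or run multiple tasks")
    # Otherwise, 'mean' sitting near the target suggests -tt has headroom.
    elif any(abs(r.mean_ms - tt_ms) <= near_pct * tt_ms for r in first_rows):
        hints.append("raise -tt")
    return hints


print(suggest_tuning([KernelStats(mean_ms=60.0, s_mean_ms=61.0, max_ms=200.0)]))
```

The function only encodes the order of the checks; the right thresholds depend on the card and have to be found by experiment.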
Joined: 17 Feb 01 Posts: 34468 Credit: 79,922,639 RAC: 80 |
On modern Nvidia cards, 1060 and above, -sbs 1024 or more is the way to go, especially with -period_iterations_num <30. On AMD cards it doesn't improve much. With each crime and every kindness we birth our future. |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Hi Stephen, I hear what you are saying. You are going to make me try and dig out somewhere Raistmer's answer to some of my questions I had regarding -SBS back when he wrote the document. He primarily wrote that document focusing on the lowest common denominator for running the SoG app. He was more interested in getting the app to run on low end hardware without too much impact to the system. I think he figured that the high-end cards would take care of themselves. When questioned he said his -SBS values were probably too low for high-end cards and to mainly just experiment around to find what works in my system. The SoG app as deployed to Main as the stock app is configured for the "set and forget" squad. And that seems to have borne out as expected. I very often look at my wingmen's stderr.txt output to see where they have configured the app. Almost invariably, it has been run as stock as evident with the -SBS buffer size of 201KB. It is up to the end-user to tune it for higher grade hardware. As Mike has said earlier, for 1060 >> level cards with 4GB-8GB memory, 1024 SBS buffer is a good and sensible starting point. I did in fact try a couple of different buffer sizes at first starting with 512. I saw a big jump from 512 to 1024. Not so big a jump in performance from 1024 to 2048. From his original ReadMe_MultiBeam_OpenCL_NV_SoG.txt document: ============= NV specific info ================ Known issues: - With NV drivers past 267.xx GPU usage can be low if CPU fully used with another loads. App performance can be increased by using -cpu_lock switch in this case, and CPU time savings are possible with -use_sleep switch. 
Suggested command line switches: Entry Level cards NV x20 x30 x40 -sbs 128 -spike_fft_thresh 2048 -tune 1 2 1 16 (*requires testing) If you experience screen lags or driver restarts use -sbs 128 -period_iterations_num 250 -no_defaults_scaling Mid range cards x50 x60 -sbs 192 -spike_fft_thresh 2048 -tune 1 64 1 4 (*requires testing) Super clocked x50TI / x60TI -sbs 256 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 32 -oclfft_tune_cw 32 (*requires testing) High end cards x8x x80TI Titan / Titan Z -sbs 384 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 (*requires testing) =============================================== I'm happy with where I'm at with the 2048 buffer size. :) Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Joined: 12 Dec 08 Posts: 51 Credit: 16,807,684 RAC: 0 |
So I just put the AMD Windows patch onto the machine for the Power Settings change so Windows 10 will handle the Ryzen a little better. I have also just set the CPU clock to 3.6GHz, so we will see if it stays stable while crunching. Next I will play with some memory timings and see what can be done there. "I know I am insignificant...Just look how many stars there are...!" |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . Well that is the main thing. Stephen .. |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
So I just put the AMD Windows patch onto the machine for the Power Settings change so Windows 10 will handle the Ryzen a little better. The memory thing is a crap shoot right now. We will only know where we really stand when AMD releases the 1.0.0.5 AGESA microcode this month. There is good evidence that not only can a motherboard influence how fast you can run your memory, but also, from overclockers' experiments, that the chips themselves play a part. Identical chips swapped in and out of the same system will run the memory at different max clocks and timings. The thinking is that it depends on the quality of the IMC and Data Fabric backbone of each individual chip. I have had no issue tightening down my timings by two clocks just by changing them in the BIOS. No change to stock RAM voltage was necessary. I'm sure it has a lot to do with having Samsung B-die chips. Ryzen REALLY LIKES Samsung B-dies. It really improved my memory latency; benchmark scores are higher and app loading is snappier. But the latest BIOSes have also seen big improvements in getting SK Hynix and Micron die sticks running at much faster memory clocks too. So progress is being made. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Joined: 12 Dec 08 Posts: 51 Credit: 16,807,684 RAC: 0 |
So I just put the AMD Windows patch onto the machine for the Power Settings change so Windows 10 will handle the Ryzen a little better. Thanks for that info. Definitely food for thought. I will continue watching the development of BIOS/RAM etc. for Ryzen with interest. I am thinking of putting an AIO cooler onto the processor if the processor stays stable at the current speed and if heat becomes an issue. I love it when new equipment comes out.... Brings out the inner child in me....!! "I know I am insignificant...Just look how many stars there are...!" |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I know what you mean. I haven't been this interested in computing hardware for a long time. Good that the status quo was upset by AMD. I have had fun learning about the new platform and have definitely learned some new tricks. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Un4given Joined: 1 May 04 Posts: 19 Credit: 7,983,035 RAC: 13 |
OK, I see that. So, how do I associate that with a specific system? Gotcha. Well, I would dare say there may be other reasons for the differences I'm seeing. 1. My Intel box is my main computer with all of my installed software, so there are definitely background services running there that aren't present on my AMD systems. The AMD systems have only Windows, drivers, and BOINC. 2. Because my Intel box is also my gaming system, it has my most powerful GPU, handling quite a few more GPU units than the AMD systems. 3. Because the Intel is my main box, I will also run SETI while doing e-mail, light browsing etc., and not have my dedicated boxes running at all. 4. Due to some issues my GPU has had with some other projects, I almost exclusively run SETI on this Intel box, and will bounce my AMD systems between all of the projects I run depending on where I am at within those projects. In any case, for the ~$630 (shipped) I paid for the chip, motherboard, and RAM, I couldn't be happier with the performance I'm getting. |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I notice a difference in APR between Pipsqueek and Keith-Windows7. Even though both systems are configured almost identically, Pipsqueek usually does more work even with a 200 MHz core clock deficit compared to Keith-Windows7, which is my daily driver and has all the programs I normally use. Pipsqueek is just a cruncher with very minimal additional programs loaded. There is a noticeable difference in responsiveness between the two systems. The extra overhead of background services running on Keith-Windows7 does impact its APR compared to Pipsqueek. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Joined: 12 Dec 08 Posts: 51 Credit: 16,807,684 RAC: 0 |
So I have installed Lunatics now and have the machine crunching two WUs per GPU. Will let it run and see what happens. If anyone wants to go over my config, which I shamelessly pinched from Keith and his Ryzen setup, please feel free to do so. "I know I am insignificant...Just look how many stars there are...!" |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I think that the 1050Ti only has 4GB of VRAM on each card. Correct? If that is so, you need to correct the -SBS 2048 parameter you pinched from me. You need to knock that down to -SBS 1024. You're using more RAM at two tasks per card than the card has onboard. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
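The arithmetic behind that advice can be sketched in a few lines of Python. This is a back-of-the-envelope check only. It assumes that the -sbs value is in megabytes and that each running task may claim a buffer of that size, and the `headroom_mb` allowance for everything else on the card is an invented figure. Real memory use depends on the app and driver and is not measured here.

```python
def sbs_fits(vram_mb: int, tasks_per_gpu: int, sbs_mb: int, headroom_mb: int = 512) -> bool:
    """Rough check: do the per-task buffers plus some headroom fit in VRAM?

    Assumes -sbs is in MB and one buffer per task; headroom_mb is hypothetical.
    """
    return tasks_per_gpu * sbs_mb + headroom_mb <= vram_mb


# 4GB card running two tasks, as in the post above.
print(sbs_fits(4096, tasks_per_gpu=2, sbs_mb=2048))
print(sbs_fits(4096, tasks_per_gpu=2, sbs_mb=1024))
```

Under these assumptions, two tasks at -sbs 2048 already ask for the card's entire 4GB before anything else is counted, while -sbs 1024 leaves room to spare.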
Joined: 12 Dec 08 Posts: 51 Credit: 16,807,684 RAC: 0 |
I think that the 1050Ti only has 4GB of VRAM on each card. Correct? If that is so, you need to correct the -SBS 2048 parameter you pinched from me. You need to knock that down to -SBS 1024. You're using more ram at two tasks per card than the card has onboard. Nice to know.... Done. Thanks Keith. "I know I am insignificant...Just look how many stars there are...!" |
Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
Replying to someone in the help desk forum, who was looking for a good temperature monitor that supports Ryzen CPUs, I found this one: AMD Ryzen Master Utility for Overclocking Control. Perhaps of use for someone here? |
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.