Ryzen and Threadripper

Message boards : Number crunching : Ryzen and Threadripper

Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5126
Credit: 276,046,078
RAC: 462
Message 1988131 - Posted: 31 Mar 2019, 13:29:41 UTC - in response to Message 1988126.  

jsm wrote:
"Update after a week running with 28 threads. I am sorry to have to report that, on my system with this hardware and software, this is NOT the sweet spot. My criterion is to monitor the daily credit on the three AMD machines (two 1800X and the 2990WX) because I believe that eliminates extraneous factors such as internet or BOINC problems, as all three are on the same network. When running at 50 threads the 2990WX averaged 1.25 * (ryzen1 + ryzen2). After seven days at 28 threads this has fallen to less than 1.00. I am conscious of advice from other operators but that is my finding.
Before trying anything else, including changing the software, I am now going to repeat the test by running at max threads for a week. It will be interesting to see if the throughput increases or falls."


Data drives all guesses. Whatever maximizes your RAC (aka lowers your average processing time) is what you should be going for. We offered our own experience and advice, but you should go with whatever maximizes your production. Period. With anything else you are not getting the most benefit.

I will freely admit, if you drop a used GTX 1060 3GB (~$130 USD) into each machine, the RAC will jump. :) But adding hardware changes the equation(s).

Thank you for your patience and reports.

Tom
A proud member of the OFA (Old Farts Association).

Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1988152 - Posted: 31 Mar 2019, 15:44:16 UTC - in response to Message 1988126.  

You still haven't really controlled all the variables. Do you have the same memory and memory speed in the 1800X boxes as in the 2990WX box? Are they running the same clock speeds? Do realize that everyone's RAC has been dropping because of the project upsets during your test, and also because the mix of work has been changing to reduce or eliminate the Arecibo tasks. For the past 3 months the Arecibo tasks have been raising everyone's RAC.

If you tried the benchMT tool, that would eliminate the variables. Still, it is up to you how to run your machines. You have to accommodate no one else but yourself. Whatever makes you happy.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)

jsm
Joined: 1 Oct 16
Posts: 124
Credit: 51,135,572
RAC: 298
Isle of Man
Message 1988161 - Posted: 31 Mar 2019, 16:04:43 UTC - in response to Message 1988152.  

I have not had much spare time to experiment. As for external factors such as a change in work mix or BOINC problems, that is exactly why I am comparing production between the machines rather than looking at the 2990 alone. Provided I do not change anything on the Ryzens, such as load or task types, then if I run the 2990 at different thread counts and record the daily credit on all m/cs, a change in the ratio is evidence of a change in overall performance.
Now if next weekend the ratio is less than one I will have to think again. I suppose it is possible that my set of hardware/software has a sweet spot somewhere higher than 28 but lower than 60?
jsm

Tom M
Message 1988241 - Posted: 1 Apr 2019, 2:27:54 UTC - in response to Message 1988161.  


jsm wrote:
"I suppose it is possible that my set of hardware/software has a sweet spot somewhere higher than 28 but lower than 60?"


It is certainly "possible"; that's why the experimentation is warranted.

Tom
A proud member of the OFA (Old Farts Association).

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13915
Credit: 208,696,464
RAC: 304
Australia
Message 1988267 - Posted: 1 Apr 2019, 5:48:27 UTC - in response to Message 1988126.  
Last modified: 1 Apr 2019, 5:52:13 UTC

jsm wrote:
"My criterion is to monitor the daily credit on the three AMD machines"

Which is the worst possible way, as significant system changes will take 4-8 weeks for RAC to stabilise, if there are no server or other issues in that time frame. Even minor changes can take several weeks for RAC to settle around its new level, once again if nothing else occurs.
The present situation is a good example- due to the change in work mix and Seti server issues RAC is presently falling for most people, for some it will be a big drop. For others only slight.
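
To put a rough number on the settling time, here is a minimal sketch. It assumes RAC behaves as an exponentially decaying average with a half-life of about one week (BOINC's documented behaviour, treated here as an assumption) and that credit arrives at a perfectly steady rate, which real hosts never manage. The function name and the credit figures are made up for illustration.

```python
HALF_LIFE_DAYS = 7.0  # assumed RAC half-life of one week

def rac_after(days, old_rate, new_rate):
    """RAC after `days` at a steady new daily credit rate,
    starting from a RAC that had settled at `old_rate`."""
    decay = 0.5 ** (days / HALF_LIFE_DAYS)
    return new_rate + (old_rate - new_rate) * decay

# Hypothetical host whose true output rises from 40,000 to 50,000 credits/day
for weeks in (1, 2, 4, 8):
    rac = rac_after(7 * weeks, old_rate=40000, new_rate=50000)
    print(f"after {weeks} week(s): RAC is about {rac:,.0f}")
```

Under these assumptions the RAC has covered only half of the change after one week and needs roughly four weeks to get within about 6% of the new level, before any server problems or validation delays are taken into account.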

What matters - and is the best indicator of work being done - is the number of WUs processed each hour. Of course even then, you need to make sure you're comparing the same types of WU. Different types of WUs will have different run times.
Grant
Darwin NT
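
One way to act on the WUs-per-hour suggestion is sketched below. It is only an illustration: the task-type labels and run times are invented, and it assumes every thread is always busy with one CPU task, so throughput for a type is simply threads * 60 / average run time in minutes.

```python
from collections import defaultdict

def tasks_per_hour(tasks, threads):
    """Estimate tasks/hour for each task type from observed run times,
    assuming `threads` tasks of that type run concurrently."""
    run_times = defaultdict(list)
    for task_type, run_minutes in tasks:
        run_times[task_type].append(run_minutes)
    return {
        task_type: threads * 60.0 / (sum(times) / len(times))
        for task_type, times in run_times.items()
    }

# Hypothetical completed tasks: (type label, run time in minutes)
completed = [("arecibo", 50), ("arecibo", 54), ("other", 95), ("other", 105)]
for task_type, rate in tasks_per_hour(completed, threads=28).items():
    print(f"{task_type}: about {rate:.1f} tasks/hour")
```

Keeping the types separate is the point: a change in thread count should be judged by comparing the same type before and after, not the blended figure.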

rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22742
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1988280 - Posted: 1 Apr 2019, 9:23:22 UTC

...however if one were to do it manually by recording the total credit at a fixed time then one might be able to get some indication in which direction things are moving. But that ignores the fact that many tasks aren't validated for several days which could easily swing the data one way or the other.

Grant's technique of tasks per hour is a guide but, as he says, it needs to be taken with a pinch of salt.

Long-term averages, like Recent Average Credit, tend to smooth out the impact of late validation and task-type, but have their own problems.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
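
The manual method of noting total credit at a fixed time could be sketched roughly as follows. The readings are hypothetical, the function name is made up, and the caveat from the post still applies: credit for tasks awaiting validation arrives late, so individual days can swing either way.

```python
def daily_gains(snapshots):
    """Turn (day_number, total_credit) readings into average credit gained per day.
    Assumes each reading is taken on a different day."""
    ordered = sorted(snapshots)
    gains = []
    for (day_a, credit_a), (day_b, credit_b) in zip(ordered, ordered[1:]):
        gains.append((credit_b - credit_a) / (day_b - day_a))
    return gains

# Hypothetical readings taken at the same time of day (day 3 was missed)
readings = [(0, 51_000_000), (1, 51_041_000), (2, 51_079_000), (4, 51_165_000)]
print(daily_gains(readings))
```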

jsm
Message 1988555 - Posted: 3 Apr 2019, 6:14:46 UTC - in response to Message 1988267.  

" My criterion is to monitor the daily credit on the three AMD machines


Which is the worst possible way .................."

I beg to differ! Of course external factors will affect the WUs on all machines, but I suggest that if there is a problem with, say, validation, it is likely to affect all three of my large computers, if not machines worldwide. Provided you allow a reasonable time period to assess and average, the ratio (not the quantities) is quite a good measure of whether a single change on one m/c is beneficial or otherwise. Even after three days I observe that the ratio, i.e. 2990 credit / (ryzen1 + ryzen2) credit, has jumped from less than 1 with 28 threads to 1.15 with 60 threads.
jsm
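
For anyone wanting to track the same ratio, a minimal sketch follows. The daily credit figures are invented, and averaging the per-day ratios is just one reasonable choice; summing a week's credit first and dividing once would also work.

```python
def credit_ratio(tr_daily, ryzen1_daily, ryzen2_daily):
    """Mean of the daily ratio: 2990 credit / (ryzen1 + ryzen2) credit."""
    ratios = [
        tr / (r1 + r2)
        for tr, r1, r2 in zip(tr_daily, ryzen1_daily, ryzen2_daily)
        if (r1 + r2) > 0  # skip days when the reference machines earned nothing
    ]
    if not ratios:
        raise ValueError("no usable days")
    return sum(ratios) / len(ratios)

# Made-up daily credit figures, for illustration only
threadripper = [40000, 42000, 41000]
ryzen1 = [18000, 18500, 17500]
ryzen2 = [17000, 17500, 18500]
print(f"mean ratio: {credit_ratio(threadripper, ryzen1, ryzen2):.3f}")
```

The ratio only cancels out factors that hit all three machines equally, which is the assumption being argued over in this thread.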

Tom M
Message 1988590 - Posted: 3 Apr 2019, 14:56:28 UTC
Last modified: 3 Apr 2019, 15:05:27 UTC

Here are some theoretical calculations for the kind of RAC you might get depending on the number of threads and the average time to process a task.

(minutes in a day / avg. minutes needed to process a task ) * estimated credits per task * number of threads = possible RAC.

(1440 / 120) * 62 * 60 =~ 44,000
(1440 / 90) * 62 * 40 =~ 39,000
(1440 / 75) * 62 * 40 =~ 47,000
(1440 / 60 ) * 62 * 28 =~ 41,000
(1440 / 47 ) * 62 * 28 =~ 53,000

As you can see, both average processing times and number of threads drive the RAC up/down.
And because of the changing mix of data the average processing time will go up/down irrespective of the # of threads.
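
The same arithmetic as a small script, in case anyone wants to plug in their own numbers. The function name is made up; the 62 credits per task and the thread/time pairs are the ones used above. It is an upper bound that assumes every thread runs tasks back to back all day.

```python
MINUTES_PER_DAY = 1440

def theoretical_rac(avg_task_minutes, credits_per_task, threads):
    """Possible daily credit if every thread processes tasks back to back."""
    tasks_per_thread_per_day = MINUTES_PER_DAY / avg_task_minutes
    return tasks_per_thread_per_day * credits_per_task * threads

# (average minutes per task, number of threads) pairs from the list above
scenarios = [(120, 60), (90, 40), (75, 40), (60, 28), (47, 28)]

for minutes, threads in scenarios:
    rac = theoretical_rac(minutes, 62, threads)
    print(f"{threads:2d} threads at {minutes:3d} min/task -> about {rac:,.0f} credits/day")
```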

I didn't pick 26-28 threads out of thin air. There was a published article that did benchmarking while watching how busy the PCIe channels were (I think). The benchmark they were using showed a "cliff effect" at around 26 threads: the congestion went up and the time the benchmark took went up radically. I think it was running Windows. This was in a CNET publication (I think).

There was another published benchmark of repeated runs with the same app that showed a "plateau" between 30 and 40 threads (Linux). Basically any number of threads in that region didn't show much effect up or down on the average time to process. This was in a technical publishing website.

So it will take iron patience to see which settings maximize your production on your 2990wx cpu.

I think I have published links to both of the above in "My 2990wx" thread.
I found them by patiently googling, again and again "2990wx benchmarks".
--edit--
Lately I have been averaging 1 hour and 36 minutes (it has been as high as just under 2 hours) on CPU tasks on an Intel box with a maximum of 40 threads (20c/40t) set to 95% of the available CPU threads in the BOINC Manager. Since this system only turbos to 3.3GHz and yours turbos to 3.6+ GHz, you should be able to beat these numbers. https://setiathome.berkeley.edu/show_host_detail.php?hostid=8676008
--edit--
I hope this post was useful.

Tom
A proud member of the OFA (Old Farts Association).

Tom M
Message 1988670 - Posted: 4 Apr 2019, 4:17:12 UTC

Here is a useful Linux cpu speed tracker that Keith told me about back when I had a 2990wx. I have run it on "several" different Linux systems.

watch -n1 "cat /proc/cpuinfo | grep \"^[c]pu MHz\"" 


It will let you see how fast each cpu thread is running.
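
If a one-shot summary is more convenient than a continuously refreshing screen, the same "cpu MHz" lines can be read with a short script. This is only a sketch: it assumes a Linux x86 system whose /proc/cpuinfo has one "cpu MHz" line per thread, and the function names are made up.

```python
import re

def parse_cpu_mhz(cpuinfo_text):
    """Return the 'cpu MHz' value of every logical CPU, in file order."""
    matches = re.findall(r"^cpu MHz\s*:\s*([0-9.]+)", cpuinfo_text, re.MULTILINE)
    return [float(m) for m in matches]

def summarize(speeds):
    """Build a one-line min/max/mean summary of the per-thread clock speeds."""
    if not speeds:
        return "no 'cpu MHz' lines found"
    mean = sum(speeds) / len(speeds)
    return (f"{len(speeds)} threads: min {min(speeds):.0f}, "
            f"max {max(speeds):.0f}, mean {mean:.0f} MHz")

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(summarize(parse_cpu_mhz(f.read())))
    except OSError:
        print("/proc/cpuinfo is not available on this system")
```

Because it does not depend on shell quoting, it behaves the same whichever login shell is in use.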

Tom
A proud member of the OFA (Old Farts Association).

Keith Myers
Message 1988679 - Posted: 4 Apr 2019, 5:54:55 UTC

Yes, that is very helpful in showing how Ryzen and Threadripper constantly dynamically move tasks around the cores to redistribute the hot spots on the package. You will see large variances in core clocks as it processes a task on a cpu. The only way to keep that from happening is to either use a fixed clock multiplier or use one of the more aggressive PBO or Performance Enhancement settings. Or set up several FID/PID P-state overclocking levels at various clocks and voltages for different loads.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)

jsm
Message 1988773 - Posted: 4 Apr 2019, 19:29:25 UTC - in response to Message 1988670.  

Would you mind just checking this cmd line please? After typing it I got unmatched '"' so I copied and pasted from your post and got exactly the same error. I must say they look matched to me so maybe a space is missing somewhere?
jsm

Keith Myers
Message 1988776 - Posted: 4 Apr 2019, 19:46:44 UTC - in response to Message 1988773.  

jsm wrote:
"Would you mind just checking this cmd line please? After typing it I got unmatched '"' so I copied and pasted from your post and got exactly the same error. I must say they look matched to me so maybe a space is missing somewhere?"

Just copied and pasted in Terminal from that code quote in the post. Worked as expected. Something in your environment must not be set correctly.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)

jsm
Message 1988779 - Posted: 4 Apr 2019, 19:56:50 UTC - in response to Message 1988776.  

Just tried again and got the same error. I wonder what could be adrift in the environment.
jsm

Keith Myers
Message 1988789 - Posted: 4 Apr 2019, 20:21:45 UTC - in response to Message 1988779.  

jsm wrote:
"Just tried again and got the same error. I wonder what could be adrift in the environment."

Look at the entry for bash.rc in environment
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)

jsm
Message 1988792 - Posted: 4 Apr 2019, 20:53:32 UTC - in response to Message 1988789.  

Solved. I always use tcsh, so switching to bash allowed it to run.
Output for all threads seems to be between 3300 and 3400 every second with an occasional drop to 1700.
I can't highlight it to copy so as to show you a sample.
jsm

Tom M
Message 1988800 - Posted: 4 Apr 2019, 21:35:17 UTC - in response to Message 1988792.  
Last modified: 4 Apr 2019, 21:36:15 UTC

jsm wrote:
"Solved. I always use tcsh, so switching to bash allowed it to run.
Output for all threads seems to be between 3300 and 3400 every second with an occasional drop to 1700.
I can't highlight it to copy so as to show you a sample."


The "occasional drop to 1700" indicates either that "SpeedStep" (aka Cool'n'Quiet) or some similar feature is throttling, or that there is PCIe memory congestion.

Check to see if anything with C-states is still enabled or set to auto, and try disabling it. Seems to me you already reported that Cool'n'Quiet/TSS? had gone away. Is Precision Boost on? And CPB boost?

In theory, your 2990WX "should" be able to run at least 3.7GHz. And people have had luck as high as 4.0GHz. It gets tricky though. You have to have the right increased CPU voltage plus the right LLC (Load-Line Calibration) setting.

Tom
A proud member of the OFA (Old Farts Association).

Keith Myers
Message 1988808 - Posted: 4 Apr 2019, 22:22:53 UTC - in response to Message 1988800.  
Last modified: 4 Apr 2019, 23:02:49 UTC

The LLC becomes important when you increase the cpu loading and want to maintain or run higher than stock clocks. But every board manufacturer defines their LLC differently in their BIOS. One mfr might have LLC 1 be the highest level of LLC and produce the lowest voltage sag under load. Another mfr might have LLC 8 level be the highest level of LLC. You need to read your manual, visit your motherboard forums and ask questions of other users with the same board.

The Ryzen and TR boost algorithms are pretty smart and functional. They will automatically boost clocks within the power and temperature profile limits of the board. Cool the CPU and VRMs better and the board will boost the clocks higher. Read the manual for the Core Performance Boost and Precision Boost Overdrive functions and again ask questions of other users in your board forums. If I remember correctly from the 2990WX reviews, the reviewers were able to maintain at least 3.9GHz on all cores with all 64 threads loaded, using liquid cooling.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)

bloodrain
Volunteer tester
Joined: 8 Dec 08
Posts: 231
Credit: 28,112,547
RAC: 1
Antarctica
Message 1989014 - Posted: 6 Apr 2019, 8:45:30 UTC - in response to Message 1988808.  

Keith, yes they did. I also asked someone who had a setup like that.
Even the 1950X runs hot; the top-of-the-line CPUs of both generations do.
I got a stellar air cooler, but once I put dual GPUs in the case the CPU got hotter due to the overall heat in the case.
It's a solid case too, with very good air flow.

Tom M
Message 1989033 - Posted: 6 Apr 2019, 13:03:58 UTC - in response to Message 1989014.  

bloodrain wrote:
"Keith, yes they did. I also asked someone who had a setup like that.
Even the 1950X runs hot; the top-of-the-line CPUs of both generations do.
I got a stellar air cooler, but once I put dual GPUs in the case the CPU got hotter due to the overall heat in the case.
It's a solid case too, with very good air flow."


Sometimes when things are running "too hot" a quick Seti fix is to take off the side of the case and put a floor fan/other directed flow fan to blow directly on the motherboard.

Tom
A proud member of the OFA (Old Farts Association).

jsm
Message 1989089 - Posted: 7 Apr 2019, 10:01:52 UTC - in response to Message 1988808.  

OK, 7 day report. At the 95% CPU setting in BOINC my ratio has increased from 0.9 (at 45% CPU) to 1.142. Now set at 85% to see the change next week.
Seti down again?
jsm
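
For reference, the CPU percentages map onto thread counts on a 64-thread 2990WX roughly as sketched below. The rounding-down rule and the function name are assumptions, not taken from BOINC documentation; the rule was chosen because it matches the figures quoted in this thread (45% giving 28 threads, 95% giving 60).

```python
def usable_threads(total_threads, cpu_percent):
    """Threads BOINC would use, assuming it rounds down and never drops below one."""
    return max(1, int(total_threads * cpu_percent / 100))

for percent in (45, 85, 95):
    print(f"{percent}% of 64 threads -> {usable_threads(64, percent)} threads")
```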