Multi-core AMD stories (aka: Ryzen 7, Threadripper and Threadripper2)

Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1965382 - Posted: 15 Nov 2018, 17:44:40 UTC - in response to Message 1965368.  

OK, that sounds about right for 3.9GHz at 1.2V. The objective is to stay away from 1.3V to keep the thermals in check.

Pretty sure that kit is not B-die but Hynix, unfortunately. So if you are able to run them at 3200MHz with the memory preset, that is probably all you will get out of them, especially if they are a mixed set and not matched. If they are stable then just stand pat. Or you could see if you can drop the latencies to CL14 to speed up the system.


Running at 4GHz on 1.2 volts (so far).
Tried 14-14-14-14-34 just now, it didn't make it out of the bios.
Tried 16-18-18-18-38 and it came right up.
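
For a rough sense of what the tighter timings would buy, first-word CAS latency in nanoseconds is CL x 2000 / data rate (MT/s). A minimal sketch, assuming the kit runs at the 3200 preset mentioned above; the function name is only for illustration:

```python
def cas_latency_ns(cl: int, data_rate_mts: int) -> float:
    """First-word CAS latency in nanoseconds.

    DDR transfers twice per clock, so one memory clock cycle
    lasts 2000 / data_rate_mts nanoseconds.
    """
    return cl * 2000 / data_rate_mts


if __name__ == "__main__":
    # Compare the timing that booted (CL16) with the one that did not (CL14).
    for cl in (16, 14):
        print(f"DDR4-3200 CL{cl}: {cas_latency_ns(cl, 3200):.2f} ns")
```

This only covers the first timing number; the other sub-timings matter as well, so treat it as a ballpark.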

So I am fairly happy. Oops, my 3rd lockup. Time to bump the cpu voltage to 1.2125.

At least I can run a "realistic" stress test. Hey, if it won't run stable in BoincMgr/Seti then it needs "tuning" :)

Just caught a task looking like it will run in 50 minutes. That's more like it ;)

Tom
A proud member of the OFA (Old Farts Association).
ID: 1965382 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1965414 - Posted: 15 Nov 2018, 20:19:03 UTC - in response to Message 1965382.  

Yes, that timing of 16-18-18-18-38 is absolutely indicative of Hynix memory. To use CL14 you almost have to exclusively use Samsung B-die. Though there are new sticks being made with Hynix CJR dies that are proving to be very overclockable with good latencies. But they have only been around for a couple of months now. Your used set had no chance other than being the normal Hynix E dies or similar ilk.

That 1.2125V is still very safe. Remember to use your LLC settings and power phase delivery controls to prevent too much voltage sag under load.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1965414 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1965419 - Posted: 15 Nov 2018, 20:42:17 UTC - in response to Message 1965414.  

Yes, that timing of 16-18-18-18-38 is absolutely indicative of Hynix memory. To use CL14 you almost have to exclusively use Samsung B-die. Though there are new sticks being made with Hynix CJR dies that are proving to be very overclockable with good latencies. But they have only been around for a couple of months now. Your used set had no chance other than being the normal Hynix E dies or similar ilk.

That 1.2125V is still very safe. Remember to use your LLC settings and power phase delivery controls to prevent too much voltage sag under load.


I am still a moron studying to be an idiot here. What is LLC? (I am sure it is not a "Limited Liability Corporation" ;) ) And while I have a lot of settings, I am not sure which ones are "power phase delivery" controls. I am depending on a lot of "auto" here but I am studying :)

Tom
A proud member of the OFA (Old Farts Association).
ID: 1965419 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1965423 - Posted: 15 Nov 2018, 21:50:56 UTC - in response to Message 1965419.  

I'll start with somebody that answered the question long ago.
https://www.pcgamer.com/what-is-load-line-calibration-in-my-bios-and-how-can-i-use-it/
If you really want to understand the topic and concept, view Buildzoid's YouTube video where he delves into the subject in great detail.
https://www.youtube.com/watch?v=NMIh8dTdJwI

Power phase delivery is in the Digi+ Power Control section of the AMD BIOS. You need to set the CPU Load-Line Calibration to a more aggressive setting than Auto. The trouble comes with each vendor's definitions of the available levels. The ASUS ZE X399 had nine levels where other brands may only have 4 or 5. Also, ASUS may define LLC5 as the highest setting while MSI may define LLC1 as the highest setting. READ your manual or the BIOS help prompts.

What you are looking for is for the Vcpu readback voltage in your system monitor to match what you set in the BIOS while under your typical crunching load. Try increasing levels of LLC until you observe no droop. But don't set it too high or you will spike the voltage when you remove the load.
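
The droop check can be sketched roughly as below. This is only an illustration: the function name and the 0.0125 V tolerance (one voltage step, like the 1.2 to 1.2125 bump earlier in the thread) are assumptions, and the sample readings are made up rather than taken from a real board.

```python
def droop_report(set_voltage, readback_samples, tolerance=0.0125):
    """Compare Vcore readback samples against the BIOS setpoint.

    set_voltage      -- the Vcore entered in the BIOS, in volts
    readback_samples -- readback voltages logged under load and after
                        the load is removed, in volts
    tolerance        -- assumed acceptable deviation, in volts
    """
    droop = set_voltage - min(readback_samples)      # sag under load
    overshoot = max(readback_samples) - set_voltage  # spike off load
    return {
        "droop_v": round(droop, 4),
        "overshoot_v": round(overshoot, 4),
        "droop_ok": droop <= tolerance,
        "overshoot_ok": overshoot <= tolerance,
    }


if __name__ == "__main__":
    # Hypothetical readings: the voltage sags to 1.19 V under load.
    print(droop_report(1.2125, [1.2125, 1.20, 1.19]))
```

If droop_ok comes back False, try the next LLC level; if overshoot_ok turns False, the LLC level is too aggressive.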

Power phase delivery usually has several choices such as Auto, Optimized or Manual. It determines how many phases are being used to deliver the voltages to the CPU. You can also raise the conversion frequency to get smaller voltage ripples. READ your manual or use the BIOS help prompts.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1965423 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1965443 - Posted: 15 Nov 2018, 23:33:01 UTC - in response to Message 1964167.  
Last modified: 15 Nov 2018, 23:34:04 UTC

After playing with my Intel systems while waiting on it I am more than half convinced that I should try the following and sit and watch it for maybe a couple of weeks or more.

1) No OverClocking.
2) Enable XFR (sp?)
3) Run the default CPU thingy that does CPU boost.
4) Disable Quiet&Cool.
5) Run BOINC/Seti at 90% of threads/cores.
6) Use -nobs on the GPUs.
I see you are following your own advice as closely as others'.
ID: 1965443 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1965496 - Posted: 16 Nov 2018, 4:46:19 UTC - in response to Message 1965443.  

After playing with my Intel systems while waiting on it I am more than half convinced that I should try the following and sit and watch it for maybe a couple of weeks or more.

1) No OverClocking.
2) Enable XFR (sp?)
3) Run the default CPU thingy that does CPU boost.
4) Disable Quiet&Cool.
5) Run BOINC/Seti at 90% of threads/cores.
6) Use -nobs on the GPUs.
I see you are following your own advice as closely as others'.


Yup, ran it long enough to establish the trends and then got my "new" to me ram that was supposed to arrive in another week or so. Since it is supposed to be faster I installed it. The first thing I noticed is it didn't change the automagic overclocking at all.

Then I did the manual OC thing. And then it occurred to me that to have a "real" comparison I needed to set my core count equal to my Intel box.

I was able to establish to my satisfaction that the "auto-magic" OCing that my setup used would not exceed the turbo boost that my Intel e5-2690v2 was using. Since the "general" idea has been to build a box with cpu processing equal to or faster than my 40 core/thread box I was hoping to get performance equal based on the same core amounts.

I will plead guilty on the "sit on it a couple of weeks" charge however ;)

Tom
A proud member of the OFA (Old Farts Association).
ID: 1965496 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1965497 - Posted: 16 Nov 2018, 4:50:20 UTC - in response to Message 1965423.  

Thank you.

READ your manual or the BIOS help prompts.

Guess I will have to read the BIOS help prompts and Google after I read up the rest of the stuff you added to my bulk shelf. My manual has NO detail on this subject.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1965497 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1965501 - Posted: 16 Nov 2018, 5:16:12 UTC - in response to Message 1965414.  

Yes, that timing of 16-18-18-18-38 is absolutely indicative of Hynix memory. To use CL14 you almost have to exclusively use Samsung B-die. Though there are new sticks being made with Hynix CJR dies that are proving to be very overclockable with good latencies. But they have only been around for a couple of months now. Your used set had no chance other than being the normal Hynix E dies or similar ilk.

That 1.2125V is still very safe. Remember to use your LLC settings and power phase delivery controls to prevent too much voltage sag under load.


Darn. I thought I was using the compiled list of Samsung B-die ram chip sets. So either the packaging I copied to answer your question is not really what is in the machine or I have a "sub-par" chip set. Or the search I was using slipped a gear. :(

Tom
A proud member of the OFA (Old Farts Association).
ID: 1965501 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1965503 - Posted: 16 Nov 2018, 5:37:46 UTC - in response to Message 1961640.  

Once you get up on the 396 or 410 drivers, you can run the static library special app which is a lot faster than the 390 driver one.


I am re-reading the thread and saw this. Is the latest "Tbar all in one" running with the static driver? I believe I am running the 396 driver on my 2990WX.

Thank you.
Tom
A proud member of the OFA (Old Farts Association).
ID: 1965503 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1965513 - Posted: 16 Nov 2018, 6:17:21 UTC - in response to Message 1965501.  

Darn. I thought I was using the compiled list of Samsung B-die ram chip sets. So either the packaging I copied to answer your question is not really what is in the machine or I have a "sub-par" chip set. Or the search I was using slipped a gear.

To get Samsung B-die you generally need to either get CL14 latency at any speed rating, or get any kit rated at 3600MHz or greater. That is the broad-strokes rule of thumb. Your limitations primarily come from needing a quad kit, as the choices are limited below CL16 latencies. It would have been smarter to get two 2-stick 3200 CL14 kits.
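
As a quick sketch, that rule of thumb looks like this in code. It is a heuristic only, not a guarantee of what dies are on a given stick; the function name is made up, and treating anything at or below CL14 as a match is my reading of "CL14 latency".

```python
def likely_b_die(speed_mhz: int, cas_latency: int) -> bool:
    """Broad-strokes check: CL14 (or tighter) at any rated speed,
    or any kit rated at 3600 or faster."""
    return cas_latency <= 14 or speed_mhz >= 3600


if __name__ == "__main__":
    # A 3200 CL16 kit fails the rule; 3200 CL14 and 3600 CL16 pass it.
    for speed, cl in ((3200, 16), (3200, 14), (3600, 16)):
        print(f"{speed} CL{cl}: {likely_b_die(speed, cl)}")
```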
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1965513 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1965515 - Posted: 16 Nov 2018, 6:27:54 UTC - in response to Message 1965503.  
Last modified: 16 Nov 2018, 6:28:59 UTC

Once you get up on the 396 or 410 drivers, you can run the static library special app which is a lot faster than the 390 driver one.


I am re-reading the thread and saw this. Is the latest "Tbar all in one" running with the static application? I believe I am running the 396 driver on my 2990WX.

Thank you.
Tom

Any app other than the zi3v CUDA9 version app is the static compiled application. The zi3v app required you to either download the cudart and cufft libraries from the external link in the docs or to get the later package which included those libraries. The current 0.97 CUDA91 and CUDA92 are statically compiled to include the cudart and cufft libraries into the executable. That is what makes them faster as they don't have to go outside the executable to refer to the external CUDA libraries.
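
One way to tell which kind of build you have is to run ldd on the executable and see whether cudart or cufft show up as shared-library dependencies. A minimal sketch that scans ldd text pasted into a string; the sample lines below are illustrative, not output from the actual app:

```python
def shared_cuda_libs(ldd_output: str) -> list:
    """Return the cudart/cufft shared libraries named in ldd output.

    An empty list suggests those libraries are compiled into the
    executable instead of being loaded at run time.
    """
    found = []
    for line in ldd_output.splitlines():
        # Each ldd line starts with the library name, then " => path".
        name = line.strip().split(" ", 1)[0]
        if name.startswith(("libcudart", "libcufft")):
            found.append(name)
    return found


if __name__ == "__main__":
    # Illustrative ldd-style text for a dynamically linked build.
    sample = (
        "\tlinux-vdso.so.1 (0x00007ffd00000000)\n"
        "\tlibcudart.so.9.2 => /usr/local/cuda/lib64/libcudart.so.9.2 (0x00007f0000000000)\n"
        "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0000100000)\n"
    )
    print(shared_cuda_libs(sample))
```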

You need to be running the 396 or later drivers to use the static compiled executables.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1965515 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1965551 - Posted: 16 Nov 2018, 12:28:36 UTC - in response to Message 1965414.  


That 1.2125V is still very safe. Remember to use your LLC settings and power phase delivery controls to prevent too much voltage sag under load.


Woke up this morning and the machine was locked up. So bumped the cpu voltage up another step.

Would "sagging" voltages cause long term system instability or is that still the OC testing issue?

Tom
A proud member of the OFA (Old Farts Association).
ID: 1965551 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1965553 - Posted: 16 Nov 2018, 12:33:18 UTC - in response to Message 1965423.  


What you are looking for is for the Vcpu readback voltage in your system monitor to match what you set in the BIOS while under your typical crunching load. Try increasing levels of LLC until you observe no droop. But don't set it too high or you will spike the voltage when you remove the load.

I apparently missed class when we were introduced to "system monitor". Which app is that? The one I can't get to work?

Tom
A proud member of the OFA (Old Farts Association).
ID: 1965553 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1965559 - Posted: 16 Nov 2018, 13:42:08 UTC - in response to Message 1965553.  


What you are looking for is for the Vcpu readback voltage in your system monitor to match what you set in the BIOS while under your typical crunching load. Try increasing levels of LLC until you observe no droop. But don't set it too high or you will spike the voltage when you remove the load.

I apparently missed class when we were introduced to "system monitor". Which app is that? The one I can't get to work?

Tom


Just searched for and found the unknown SIO 0xd352. Found reference for installing nct6775 stuff, appear to have successfully compiled and installed it. Here are the results.

tom@EJS-GIFT:~/Downloads/nct6775$ sensors
iwlwifi-virtual-0
Adapter: Virtual device
temp1:        +26.0°C  

k10temp-pci-00d3
Adapter: PCI adapter
temp1:       +101.4°C  (high = +70.0°C)

k10temp-pci-00c3
Adapter: PCI adapter
temp1:       +101.2°C  (high = +70.0°C)

nct6795-isa-0a20
Adapter: ISA adapter
in0:                    +1.25 V  (min =  +0.00 V, max =  +1.74 V)
in1:                    +0.99 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in2:                    +3.34 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in3:                    +3.31 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in4:                    +1.01 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in5:                    +0.15 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in6:                    +0.15 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in7:                    +3.34 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in8:                    +3.26 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in9:                    +1.80 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in10:                   +0.69 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in11:                   +0.69 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in12:                   +1.15 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in13:                   +0.69 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in14:                   +1.53 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
fan1:                  2830 RPM  (min =    0 RPM)
fan2:                  1370 RPM  (min =    0 RPM)
fan3:                   996 RPM  (min =    0 RPM)
fan4:                  1140 RPM  (min =    0 RPM)
fan5:                  1157 RPM  (min =    0 RPM)
SYSTIN:                 +36.0°C  (high =  +0.0°C, hyst =  +0.0°C)  ALARM  sensor = CPU diode
CPUTIN:                 +56.0°C  (high = +110.0°C, hyst = +89.0°C)  sensor = thermistor
AUXTIN0:               +109.0°C  (high = +110.0°C, hyst = +89.0°C)  sensor = thermistor
AUXTIN1:                +43.0°C    sensor = thermistor
AUXTIN2:                +43.0°C    sensor = thermistor
AUXTIN3:                 -2.0°C    sensor = thermistor
SMBUSMASTER 0:         +101.0°C  
PCH_CHIP_CPU_MAX_TEMP:   +0.0°C  
PCH_CHIP_TEMP:           +0.0°C  
PCH_CPU_TEMP:            +0.0°C  
intrusion0:            ALARM
intrusion1:            ALARM
beep_enable:           disabled

k10temp-pci-00db
Adapter: PCI adapter
temp1:        +85.5°C  (high = +70.0°C)

k10temp-pci-00cb
Adapter: PCI adapter
temp1:        +93.0°C  (high = +70.0°C)

tom@EJS-GIFT:~/Downloads/nct6775$ 
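
To pick out the readings that sit above their own "high" limit in a dump like the one above, a small parser along these lines could help. It is a sketch only: it assumes the plain-text layout shown here (chip name on the first line of each block), skips limits of 0.0 as unset, and uses a short excerpt of the output as sample input.

```python
import re

# Matches e.g. "temp1:   +101.4°C  (high = +70.0°C)"; the limit is optional.
READING = re.compile(
    r"^(?P<label>[^:]+):\s+(?P<temp>[+-]?\d+(?:\.\d+)?)°C"
    r"(?:\s+\(high\s*=\s*(?P<high>[+-]?\d+(?:\.\d+)?)°C)?"
)

SAMPLE = """\
k10temp-pci-00d3
Adapter: PCI adapter
temp1:       +101.4°C  (high = +70.0°C)

nct6795-isa-0a20
Adapter: ISA adapter
SYSTIN:                 +36.0°C  (high =  +0.0°C, hyst =  +0.0°C)  ALARM  sensor = CPU diode
CPUTIN:                 +56.0°C  (high = +110.0°C, hyst = +89.0°C)  sensor = thermistor

k10temp-pci-00db
Adapter: PCI adapter
temp1:        +85.5°C  (high = +70.0°C)
"""


def over_limit(sensors_text):
    """Return (chip, label, temp, high) for temperatures above 'high'."""
    hits = []
    chip = None
    new_block = True
    for raw in sensors_text.splitlines():
        line = raw.strip()
        if not line:
            new_block = True          # blank line separates chips
            continue
        if new_block:
            chip = line               # first line of a block is the chip name
            new_block = False
            continue
        m = READING.match(line)
        if m and m.group("high") is not None:
            temp = float(m.group("temp"))
            high = float(m.group("high"))
            if high > 0 and temp > high:   # 0.0 is treated as "no limit set"
                hits.append((chip, m.group("label"), temp, high))
    return hits


if __name__ == "__main__":
    for chip, label, temp, high in over_limit(SAMPLE):
        print(f"{chip} {label}: {temp} C is above high = {high} C")
```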


I am interested in any comments you have. This was run well after the system was running under load. On the voltages front, do you have any recommendations?

I got no direction from the bios help on the LLC so I picked an entry (it had 8-9?) in the middle. Anything you see here should be reflecting that.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1965559 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1965596 - Posted: 16 Nov 2018, 16:53:11 UTC - in response to Message 1965551.  
Last modified: 16 Nov 2018, 16:53:23 UTC

A locked-up host on Zen architecture is ALWAYS a sign of insufficient Vcpu voltage. Either bump the Vcore up some more or apply more aggressive LLC to keep from drooping the voltage too far under load.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1965596 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1965639 - Posted: 16 Nov 2018, 19:18:12 UTC - in response to Message 1965596.  

A locked-up host on Zen architecture is ALWAYS a sign of insufficient Vcpu voltage. Either bump the Vcore up some more or apply more aggressive LLC to keep from drooping the voltage too far under load.


Is there any chance that pushing the ram with the 16-18-18-18-38 would also produce a lock up?

I am finding it won't boot even 2 steps below 1.3 Volts.

I have no guidance in the Bios or manual about which is "more aggressive" for the LLC. I guess I will have to test and see.

Thank you for your guidance!
Tom
A proud member of the OFA (Old Farts Association).
ID: 1965639 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1965641 - Posted: 16 Nov 2018, 19:26:03 UTC - in response to Message 1965639.  

Why don't you ask Google the question? Or at least ask the question in your motherboard's forums.

Overly aggressive memory settings won't normally lock up a machine. They will just lead to a lot of task errors with segfaults.

Basic rule of overclocking: set memory to stock and find the highest stable CPU overclock. Then set the CPU to stock clocks and find the highest stable memory settings. Then combine the two and back down one setting each.
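
Written out as a sketch, the procedure looks like this. Everything here is a stand-in: the step lists are made-up examples ordered from stock upward, and the two stability checks are hypothetical callables that in practice would be a long run under your real crunching load.

```python
def highest_stable(steps, is_stable):
    """Index of the last step that passes before the first failure.

    steps[0] is the stock setting and is assumed to be stable.
    """
    best = 0
    for i in range(1, len(steps)):
        if not is_stable(steps[i]):
            break
        best = i
    return best


def combined_overclock(cpu_steps, mem_steps, cpu_ok, mem_ok):
    """Test CPU and memory separately, then back each down one step."""
    cpu_i = highest_stable(cpu_steps, cpu_ok)  # memory left at stock
    mem_i = highest_stable(mem_steps, mem_ok)  # CPU left at stock
    return cpu_steps[max(cpu_i - 1, 0)], mem_steps[max(mem_i - 1, 0)]


if __name__ == "__main__":
    cpu_steps = [3.7, 3.8, 3.9, 4.0, 4.1]   # GHz, hypothetical
    mem_steps = [2933, 3066, 3200, 3333]    # hypothetical speed presets
    # Pretend stress-test results: CPU holds up to 4.0, memory up to 3200.
    print(combined_overclock(cpu_steps, mem_steps,
                             lambda ghz: ghz <= 4.0,
                             lambda speed: speed <= 3200))
```

The combined setting still needs its own soak test, since two parts that are stable alone may not be stable together.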
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1965641 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1965642 - Posted: 16 Nov 2018, 19:28:00 UTC

Did you ever run the Ryzen DDR4 RAM calculator to see where your sticks should be run?
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1965642 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1965647 - Posted: 16 Nov 2018, 20:21:32 UTC - in response to Message 1965642.  

Did you ever run the Ryzen DDR4 RAM calculator to see where your sticks should be run?


Yes I did. My attempt to apply "all" parameters took a long time (subjectively speaking) maybe 15 minutes and resulted in a non-booting machine that wouldn't even make it out of the bios.

Since I am impatient at times, I have been using the memory presets built into the BIOS.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1965647 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1965648 - Posted: 16 Nov 2018, 20:24:37 UTC - in response to Message 1965641.  

Why don't you ask Google the question? Or at least ask the question in your motherboard's forums.


Normally I start up the BoincMgr and then start up the browser. The last half dozen times I have had a locked up system before I could get in any useful research.

Right now I am not running the BoincMgr while I attempt some research.

I am aware that I may end up back at 3.9GHz which, I think, was stable. Or maybe not.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1965648 · Report as offensive
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.