My 2990WX

Message boards : Number crunching : My 2990WX

Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1966879 - Posted: 24 Nov 2018, 18:32:02 UTC
Last modified: 24 Nov 2018, 18:34:27 UTC

I went back and re-read the article where I got the "46 cores" recommendation and realized he actually had multiple results that would turn into likely testing points.
The article is here: https://www.reddit.com/r/Amd/comments/9mhr5q/amd_threadripper_2990wx_for_scientific_workloads/

If I am understanding it more clearly this time my test points are 20 cores, 32 cores, 40 cores and 46 cores (since I am already testing that number).

Since I have already experimented for a couple of days at 24 cores and 56 cores I will not re-try those for longer periods at this time.

The distinction between 40 cores and 46 cores lies in the plateau side each represents. 40 is the beginning of that plateau and 46 is the end. After that he reports results that go down.

Right now I am getting an estimated processing time for 46 cores of between 1 hour 40 minutes and 1 hour 36 minutes. This is the lowest time estimate I have gotten since I started running at 3.7GHz without interruptions. Next Tuesday sounds like a good time to move it to the 40 core setting and see if anything shifts.

I am not expecting a shift. After the new ram is installed I am going to continue running at 3.7GHz and 40 cores long enough to see if anything shifts.

Assuming nothing shifts I will be running some Memtest86 time at baseline & 3.7GHz. I need to understand exactly what the utility can help me test for and under what conditions before I want to push it any higher.

Eventually I am going to want to test 32 cores and 20 cores. To make the comparison useful I need to continue running at 3.7GHz. Once I have gotten that testing done I will take the Memtest86 results into account as well as my previous attempts at 3.8GHz, 3.9GHz and 4GHz to decide what I want to try next.

One other test I want to do is try 4.1GHz. Why? Because previously it wouldn't boot at that setting in the MB Automatic OC mode. Maybe the ram will do the trick. Or maybe not :) Even if it boots I am not sure I want to test at that speed. It is on the published "outer limits" and every OC advice I have read says take it back a "notch" for long term settings. The Automatic/Built-in OC increments are in 100MHz jumps so it is more likely I would stay at 4GHz if the testing supports trying it in production.

If the next bottleneck turns out to be MB voltage handling I am going to have to wait till after New Years and see if the seasonal, temporary retirement job I am in is going to evaporate or not (they do lay off seasonally).

I think the above plans could take a bit of time depending on exactly how the results of the Memtest86 play out. If I need to run Memtest86 for a day at a time, the Seti maintenance period on Tuesday surely beckons :)

Tom
A proud member of the OFA (Old Farts Association).
ID: 1966879 · Report as offensive
Profile StFreddy
Joined: 4 Feb 01
Posts: 35
Credit: 14,080,356
RAC: 26
Hungary
Message 1966887 - Posted: 24 Nov 2018, 18:54:35 UTC - in response to Message 1966879.  

I think the best motherboard for your beast is: https://www.msi.com/Motherboard/MEG-X399-CREATION
Maximum Power: 19 phases digital power with heat-pipe heatsink design


I wouldn't try overclocking above 3.8GHz without this mobo.
ID: 1966887 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1966953 - Posted: 25 Nov 2018, 4:48:43 UTC - in response to Message 1966887.  

I think the best motherboard for your beast is: https://www.msi.com/Motherboard/MEG-X399-CREATION
Maximum Power: 19 phases digital power with heat-pipe heatsink design


I wouldn't try overclocking above 3.8GHz without this mobo.


I have been looking around and I think that is close to the consensus. Even AMD sent that out (I think I read that) with one of its review kits for the 2990WX.

It's going to be a while before the OC experimenting starts. I have to get through the other core changes to see how the time to process on the cpu varies.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1966953 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1967487 - Posted: 28 Nov 2018, 3:53:26 UTC - in response to Message 1966953.  


It's going to be a while before the OC experimenting starts. I have to get through the other core changes to see how the time to process on the cpu varies.

Tom


Based on my notes the "average estimated time to complete" looks like it is actually drifting down a couple of minutes. Or the data mix has changed again. My other machine seems to be taking longer than its usual 1 hour 8 minutes so maybe the data we are processing is heterogeneous, again :)

Given the possibility that a change from 46 cores to 40 is actually going to make a difference, I am going to have to wait on installing that ram upgrade until I hit another Tuesday checkpoint. By then, I hope to be able to see/record another fairly stable performance trend so that when I upgrade the ram it will be possible to distinguish if it makes any difference at my current OC speed (3.7GHz).

Tom
ps. "I want patience and I want it RIGHT NOW!"
A proud member of the OFA (Old Farts Association).
ID: 1967487 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1967639 - Posted: 29 Nov 2018, 3:06:58 UTC - in response to Message 1967487.  


Based on my notes the "average estimated time to complete" looks like it is actually drifting down a couple of minutes. Or the data mix has changed again.


On 11/25/2018 I had just finished 2 weeks of cpu processing on 46 cores @ 3.7GHz, and the average estimated time to completion was ranging from 1 hour and 32 minutes to 1 hour and 33 minutes.
Switched to 40 cores on that date (11/25).
11/28/2018 (40 cores): average estimated time has dropped to a range of 1 hr 25 minutes to 1 hr 27 minutes. (this morning?)
11/28/2018 6:30pm (40 cores): 1 hr 17 minutes to 1 hr 21 minutes.
==============================

And I have my CL14 DDR4-3200 ram waiting. It showed up in the mail today. But I still want to wait till next Tuesday. I want to see if I have a short trend or a longer one. I can't commit to the ram upgrade on Tuesday until I see what the cpu avg processing time estimates are doing.
A proud member of the OFA (Old Farts Association).
ID: 1967639 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1968205 - Posted: 1 Dec 2018, 20:49:37 UTC

A block diagram of how a 2990WX accesses its memory looks like this (Thank you Anandtech).


What I am wondering is why sometimes I get 6-10 cpu tasks that are running 44-49 minutes rather than up to 32 tasks running like that. It seems like all tasks with direct access to memory (and running under SMT) should run at that kind of speed.

This means that while the CPU will likely fill up the cores close to memory first, it will not be a simple case of filling up all of those cores first – the system may get to 12-14 cores loaded before going out to the two new bits of silicon.
https://www.anandtech.com/show/13124/the-amd-threadripper-2990wx-and-2950x-review/2

The above quote would tend to support that only 12-14 cpu tasks are going to get very low processing times in comparison to the rest.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1968205 · Report as offensive
Profile Kissagogo27 Special Project $75 donor
Joined: 6 Nov 99
Posts: 715
Credit: 8,032,827
RAC: 62
France
Message 1968282 - Posted: 2 Dec 2018, 12:20:18 UTC

perhaps you have to test which cores have direct access to memory with process lasso or another utility like it ( bill's process manager works fine for me on a X2 athlon ^^ )

and then configure which cores the processes run on, so that they stay at the better crunching times ... ?
ID: 1968282 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1968295 - Posted: 2 Dec 2018, 14:15:13 UTC - in response to Message 1968282.  
Last modified: 2 Dec 2018, 14:32:32 UTC

perhaps you have to test which cores have direct access to memory with process lasso or another utility like it ( bill's process manager works fine for me on a X2 athlon ^^ )

and then configure which cores the processes run on, so that they stay at the better crunching times ... ?


Unless I am confused, process lasso is a windows only product. I am running Linux. I will see if I can find a Linux version of "bill's process manager".

Edit - Both products are Windows only. Thanks for the idea. Seems like I read something someplace let me chase that down. Yet another testing cycle :)
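
For what it is worth, the core-pinning part of those tools can be roughed out on Linux with a few lines of Python. This is only a sketch under some assumptions: it needs Python 3 on Linux (where `os.sched_getaffinity` and `os.sched_setaffinity` exist), the helper names `select_cpus` and `pin_to_cpus` are made up for illustration, and which logical CPU numbers belong to the dies with direct memory access still has to be checked on the actual machine.

```python
import os

def select_cpus(requested, allowed):
    """Return the requested logical CPUs that are actually usable, sorted."""
    usable = sorted(set(requested) & set(allowed))
    if not usable:
        raise ValueError("none of the requested CPUs are available")
    return usable

def pin_to_cpus(pid, requested):
    """Restrict process `pid` (0 means the calling process) to the requested CPUs.

    Linux only: relies on os.sched_getaffinity / os.sched_setaffinity.
    """
    allowed = os.sched_getaffinity(pid)
    os.sched_setaffinity(pid, select_cpus(requested, allowed))
    return sorted(os.sched_getaffinity(pid))

# Hypothetical usage: keep the calling process on logical CPUs 0-15.
# pin_to_cpus(0, range(16))
```

The `taskset` command that ships with most distributions does a similar job from the shell, so the script is mainly useful if the pinning needs to be repeated for many tasks.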

Tom
A proud member of the OFA (Old Farts Association).
ID: 1968295 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1968546 - Posted: 3 Dec 2018, 19:28:51 UTC

I have installed the new CL14/3200 memory. At the time I installed it the cpu tasks were averaging about 1 hour 20 minutes. This is with 40 cores at an Automated OC of 3.7GHz.

After watching it for a couple of days I decided to restart my testing with the highest feasible # of cores. So I reset the cores to 60 out of 64.

So far things are as expected. I am seeing signs of some cpu tasks taking more than 2 hours. I had to start running the gpus with the "-nobs" command line option, or they appear to slow down by maybe 20%. I am not sure if the 8-10 cpu tasks that have been running in the 40-50 minute range will disappear or not.

Might have some more useful results in a week or more.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1968546 · Report as offensive
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 20147
Credit: 7,508,002
RAC: 20
United Kingdom
Message 1968554 - Posted: 3 Dec 2018, 20:38:35 UTC - in response to Message 1968546.  

... highest feasible # of cores. So I reset the cores to 60 out of 64...

What is limiting the number of cores?

Or are you pulling back from a CPU resource or memory bandwidth bottleneck?


Happy cool crunchin'!
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 1968554 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1968644 - Posted: 4 Dec 2018, 6:54:14 UTC - in response to Message 1968554.  

... highest feasible # of cores. So I reset the cores to 60 out of 64...

What is limiting the number of cores?

Or are you pulling back from a CPU resource or memory bandwidth bottleneck?


Happy cool crunchin'!
Martin


What is going on is there is evidence across multiple brands and models of machines that you get more production if you don't use all the cpu cores. It basically leaves "some" of the system resources available for "other" non-app activity and/or to provide additional cpu resources to the app.

This practice apparently is useful on Intel-based systems and presumably is useful on this "bi-memory" model system too.

I do know that if I drop the total number of cores from 64 to 60 the task manager goes from 99%-100% to 94%. What I don't have is recorded/empirical proof that it works on this memory model too.

There is a published review that tends to indicate that 40 cores is the most productive compromise between # of cores and lowest cpu seconds per task.

The Threadripper 2 series apparently suffers from a compromise that was caused by promising to keep the TR4 socket for a couple of generations of cpus. That compromise means half the cpu cores do not have direct access to ram. It takes "2 hops" (sp) instead of the single hop of the cpus with direct access. Since Seti tasks are larger than the available cache for the non-direct access cpus, they are going to slow down significantly in processing.

I am a Linux newbie and don't know if there are tools to instrument and get at the "memory bandwidth" question. I have read a statement that the 2 hop access was at least 30% slower than the direct access. I have every reason to believe that when all or nearly all the cpu cores are going "full tilt" there will be a slowdown of total processing due to memory access congestion.
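
On the tools question, one possible starting point is the NUMA information that Linux exposes under `/sys/devices/system/node`. The sketch below is untested on a 2990WX and rests on a few assumptions: that the kernel presents the dies as separate NUMA nodes, that each node directory has the usual `cpulist` and `meminfo` files, and that nodes without local memory report a `MemTotal` of 0. The function names are made up for illustration. It shows which logical CPUs belong to which node, not the bandwidth itself.

```python
from pathlib import Path

def parse_cpulist(text):
    """Expand a kernel cpulist string such as "0-3,8,10-11" into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # an empty cpulist means the node has no CPUs
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus

def numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Map each NUMA node name to its logical CPUs and total local memory in kB."""
    nodes = {}
    for node_dir in sorted(Path(sysfs_root).glob("node[0-9]*")):
        cpus = parse_cpulist((node_dir / "cpulist").read_text())
        mem_total_kb = 0
        for line in (node_dir / "meminfo").read_text().splitlines():
            # Lines look like: "Node 0 MemTotal:       16316412 kB"
            fields = line.split()
            if len(fields) >= 4 and fields[2] == "MemTotal:":
                mem_total_kb = int(fields[3])
                break
        nodes[node_dir.name] = {"cpus": cpus, "mem_total_kb": mem_total_kb}
    return nodes

# Hypothetical usage on the machine itself:
# for name, info in numa_nodes().items():
#     print(name, info["mem_total_kb"], info["cpus"])
```

The commands `numactl --hardware` and `lscpu` report much the same topology without any scripting.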

The reason for my last memory upgrade to CL14/3200 was that the Threadripper 2's have been reported to be very sensitive to memory speed and CL for intense processing.

I hope I made sense. I am very aware of the amount of information I don't have and/or understand.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1968644 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1968646 - Posted: 4 Dec 2018, 7:02:59 UTC - in response to Message 1968546.  


Might have some more useful results in a week or more.

Tom


Can't tell if this result is stable but the "estimated processing time" has ballooned from 40 cores and maybe 1 hour 7 minutes to 60 cores and 1 hour 30 odd minutes in less than 12 hours.

I can also see that none of the tasks are running in the sub hour range. I would say that is pretty good evidence that when the memory channels get fully used, the high speed memory access of the direct memory access cpus is slowed down. Otherwise the below 1 hour tasks would continue to show up.

I am not convinced that the "estimated time" will stop at 1 hour 30 minutes so total production could easily slow to less than the 40 core production level.
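
To put a rough number on that, here is a back-of-the-envelope sketch in Python. It assumes total production is simply the number of threads divided by the average minutes per task, it ignores the gpu tasks, and it uses the approximate 1 hour 7 minutes (67 minutes) figure for 40 cores quoted above. The function name is made up for illustration.

```python
def break_even_minutes(ref_minutes, ref_threads, new_threads):
    """Average task time at which `new_threads` matches the reference throughput.

    Throughput is modelled as threads / minutes per task, so the break-even
    time scales with the ratio of thread counts.
    """
    return ref_minutes * new_threads / ref_threads

# 40 threads at about 67 minutes per task versus 60 threads:
print(break_even_minutes(67, 40, 60))  # 100.5
```

Under those assumptions the 60 core setting only falls behind the 40 core setting once the average estimate climbs past roughly 1 hour 40 minutes.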

Tom
A proud member of the OFA (Old Farts Association).
ID: 1968646 · Report as offensive
Profile bloodrain
Volunteer tester
Joined: 8 Dec 08
Posts: 231
Credit: 28,112,547
RAC: 1
Antarctica
Message 1968656 - Posted: 4 Dec 2018, 10:42:01 UTC - in response to Message 1968646.  

at 4ghz. it will stay stable. the issue is the vrm on mobo(been some interesting talk on both intel and amd)and the ram. tr is very picky with ram. compare to base ryzen chip set.
i have my 1950x running 3.5 atm with 64gb of ram. with running seti,primegrid and plex.. again you got to get the 2 thing i mention in before dialed in.
also air flow. most mobo with the cpu cooler on this are so dam close to hitting mobo parts and gpu. the one issue i had with this socket.
ID: 1968656 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1968677 - Posted: 4 Dec 2018, 20:39:29 UTC - in response to Message 1968656.  

at 4ghz. it will stay stable. the issue is the vrm on mobo(been some interesting talk on both intel and amd)and the ram. tr is very picky with ram. compare to base ryzen chip set.
i have my 1950x running 3.5 atm with 64gb of ram. with running seti,primegrid and plex.. again you got to get the 2 thing i mention in before dialed in.
also air flow. most mobo with the cpu cooler on this are so dam close to hitting mobo parts and gpu. the one issue i had with this socket.


I am running a water cooled cpu setup. I have had no luck at 4GHz. It boots, starts the app and then either locks up or crashes. Since I am testing other changes (like different # of cores in processing) I have been ragged to only change one thing at a time. So right now I am running at 3.7GHz.

I would dearly love to hear of a voltage, CCL(?) combination with the cpu multiplier that works at 4GHz but so far haven't found one. It is very possible that my MB simply won't do that because its VRM isn't up to it.

I think I have notes that say 3.9GHz is probably stable. But I am currently doing other testing so I will be waiting.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1968677 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1969131 - Posted: 7 Dec 2018, 19:16:03 UTC

I am currently testing 30 cores/threads@3.7GHz and the average estimated processing time has reached 1 hour. I think the time estimate is still going down.
My previous best combination was 40 cores/threads@3.7GHz with previous CL16? ram. It was running 1 hour and about 7 minutes.

It is running slightly hotter according to Psensor. And this morning when I got up it had locked up. :( It has been a long time (more than a month I think) since I have had any trouble at 3.7GHz.

Darn.

I just spotted a task that is on track to run 38 minutes.

I am inferring that running nearly any of the non-direct access cpu cores (aka: CCX?) slows down the memory access of the direct access cpus.

If the average estimated processing time gets low enough 30 cores will become the new top producer.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1969131 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1969368 - Posted: 8 Dec 2018, 18:48:07 UTC

I think this article does a good job of summarizing the results of running a 2990WX under a heavy multi-app load and argues for buying a lower cost/lower core AMD CPU to get the same or probably better results in Seti processing.

https://www.techspot.com/review/1680-threadripper-2-mega-tasking/

Since I am still in love with the idea of having a 32c/64t CPU and probably can't recoup my investment if I were to sell the hardware, I will continue to work to see what the optimum # of cores is to maximize the total processing of the CPU.

Right now I am guessing it is still 40 threads. That was tested with CL16 memory. So I could be wrong. It might actually be a higher core count with CL14 memory.

I am still testing the 30 thread level. The results are encouraging. I think I was seeing 56 minutes / task average estimated times before the Seti server stopped delivering tasks. It will have to get much lower for me to get more production than the 40 core setting was showing.

For anyone else pondering buying a 2990WX for Seti CPU processing, think long and hard about the cost / tasks issue.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1969368 · Report as offensive
Profile Kissagogo27 Special Project $75 donor
Joined: 6 Nov 99
Posts: 715
Credit: 8,032,827
RAC: 62
France
Message 1969470 - Posted: 9 Dec 2018, 11:12:47 UTC

trying to understand your productivity

3.7ghz 80mn x 40  > 24*60/80 > 18 tasks/core /day  > 720 tasks /day
x.xghz 90mn x 60  > 24*60/90 > 16 tasks/core /day  > 960 tasks /day 
3.7ghz 60mn x 30  > 24*60/60 > 24 tasks/core /day  > 720 tasks /day 


is it right ?
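
The same arithmetic as a small Python sketch, assuming every thread finishes tasks at the quoted average time and counting cpu tasks only (the function name is just for illustration):

```python
MINUTES_PER_DAY = 24 * 60

def tasks_per_day(minutes_per_task, threads):
    """Estimated cpu tasks per day if every thread runs at the average task time."""
    return MINUTES_PER_DAY / minutes_per_task * threads

# (average minutes per task, number of threads) for the three settings above
for minutes, threads in [(80, 40), (90, 60), (60, 30)]:
    total = tasks_per_day(minutes, threads)
    print(f"{minutes} min x {threads} threads -> {total:.0f} tasks/day")
```

This reproduces the 720, 960 and 720 tasks/day figures, so the comparison depends entirely on how well the average time estimates hold up.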
ID: 1969470 · Report as offensive
Profile lunkerlander
Joined: 23 Jul 18
Posts: 82
Credit: 1,353,232
RAC: 4
United States
Message 1969472 - Posted: 9 Dec 2018, 11:55:24 UTC - in response to Message 1969368.  

Would the TR 2950X have been a better investment, with its direct access to memory? Also, if the rumors are true, there may be a Ryzen 9 CPU coming next year with 16C/32T for $450 that uses the AM4 socket so motherboards would be cheaper as well.
ID: 1969472 · Report as offensive
Profile lunkerlander
Joined: 23 Jul 18
Posts: 82
Credit: 1,353,232
RAC: 4
United States
Message 1969508 - Posted: 9 Dec 2018, 15:37:16 UTC - in response to Message 1969472.  

Here's where I saw the info on the ryzen 3000 series: https://www.digitaltrends.com/computing/amd-ryzen-3000-cpu-everything-you-need-to-know/
ID: 1969508 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1969573 - Posted: 9 Dec 2018, 18:26:22 UTC

I would take the rumors about the next generation Ryzen with a good dose of salt.

I wanted the HEDT platform to support four graphics cards. That forced me into either X99 or X399. So being mainly an AMD shop, that decided for me to go with Threadripper. The benefits of Ryzen+ in the latencies and support for higher memory clocks decided for me to go with the second generation Threadripper. The cheapest 2nd Gen. TR available now was the 2920X so that is what I went with. I don't need massive amounts of cpu threads, just enough to drive the four gpus which do the majority of the work.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1969573 · Report as offensive