Message boards : Number crunching : My 2990WX
Author | Message |
---|---|
Tom M Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
With memory interleaving off: Auto GHz. Mem "try it" 3200/CL14

tom@EJS-GIFT:~/Downloads/ml_test/Linux$ sudo ./mlc
[sudo] password for tom:
Intel(R) Memory Latency Checker - v3.6
Measuring idle latencies (in ns)...
                Numa node
Numa node            0       1       2       3
       0          64.3       -   103.2       -
       1         104.5       -   102.4       -
       2         103.3       -    64.5       -
       3         102.9       -   106.0       -

With memory interleaving auto:

tom@EJS-GIFT:~/Downloads/ml_test/Linux$ sudo ./mlc
[sudo] password for tom:
Intel(R) Memory Latency Checker - v3.6
Measuring idle latencies (in ns)...
                Numa node
Numa node            0       1       2       3
       0          64.0       -   103.3       -
       1         106.0       -   103.1       -
       2         103.4       -    64.1       -
       3         103.1       -   105.9       -

With socket:

tom@EJS-GIFT:~/Downloads/ml_test/Linux$ sudo ./mlc
[sudo] password for tom:
Intel(R) Memory Latency Checker - v3.6
Measuring idle latencies (in ns)...
                Numa node
Numa node            0       1       2       3
       0          63.1       -   101.9       -
       1         106.0       -   102.9       -
       2         103.1       -    64.1       -
       3         103.1       -   106.0       -

With die:

tom@EJS-GIFT:~/Downloads/ml_test/Linux$ sudo ./mlc
[sudo] password for tom:
Intel(R) Memory Latency Checker - v3.6
Measuring idle latencies (in ns)...
                Numa node
Numa node            0       1       2       3
       0          63.9       -   102.9       -
       1         105.7       -   102.9       -
       2         103.0       -    64.0       -
       3         102.8       -   105.7       -

With channel:

Intel(R) Memory Latency Checker - v3.6
Measuring idle latencies (in ns)...
                Numa node
Numa node            0       1       2       3
       0          64.0       -   103.3       -
       1         105.9       -   102.9       -
       2         103.2       -    62.7       -
       3         102.8       -   105.9       -

I don't believe I have any BIOS setting that says "UMA" on it. I suppose it might be that you can't get UMA because of the bi-memory model.

Tom
A proud member of the OFA (Old Farts Association). |
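To put a number on the local versus remote gap in the results above, here is a small Python sketch that averages the figures from the "interleaving off" run. It assumes the usual reading of the MLC matrix (rows are the node running the test, columns are the node that owns the memory) and that the "-" columns mean nodes 1 and 3 have no memory attached. Both are assumptions about the output, not something the post states.

```python
# Idle latencies in ns, copied from the "interleaving off" run.
# Assumption: row = NUMA node running the test, column = node owning the memory.
# None stands for the "-" entries (assumed: nodes 1 and 3 have no local memory).
LATENCY_NS = [
    [64.3, None, 103.2, None],
    [104.5, None, 102.4, None],
    [103.3, None, 64.5, None],
    [102.9, None, 106.0, None],
]


def summarize(matrix):
    """Return (local average, remote average, remote/local ratio)."""
    local, remote = [], []
    for cpu_node, row in enumerate(matrix):
        for mem_node, ns in enumerate(row):
            if ns is None:
                continue  # no memory behind this node, nothing was measured
            if cpu_node == mem_node:
                local.append(ns)
            else:
                remote.append(ns)
    local_avg = sum(local) / len(local)
    remote_avg = sum(remote) / len(remote)
    return local_avg, remote_avg, remote_avg / local_avg


local_avg, remote_avg, ratio = summarize(LATENCY_NS)
print(f"local average : {local_avg:.1f} ns")
print(f"remote average: {remote_avg:.1f} ns")
print(f"remote/local  : {ratio:.2f}x")
```

On these numbers a remote access costs roughly 1.6 times a local one, and the other four runs are within a nanosecond or two of the same pattern.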
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I think with all your testing variables, which match mine in my BIOS, you are correct. With your quad-die architecture, only the NUMA model is available. When I try all the memory interleaving options in my BIOS, I get a change in each parameter test. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Tom M Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
"I think with all your testing variables which match mine in my BIOS, you are correct. With your quad die architecture, only NUMA model is available. When I try all the memory interleaving options in my BIOS, I get a change in each parameter test"

Well. As someone someplace has said, "It has been a learning experience." :)

To revisit a question raised previously about the best choice for highest-speed cpu crunching under Seti: I believe we can say that the 2920X/2950X are the top consumer choices by far (best bang for the buck). The 2920X/2950X give their cpus full-speed access to memory, which is the issue with cpu crunching on Seti.

Right now it looks as if I can get 26 cores of cpu processing plus however many gpus I have, and that is the most production I can manage. That is without -nobs. With a 2950X I should be able to run a full 90% cpu crunching core plus x number of gpus running -nobs. That looks to be 27-28 cpu cores. Yes, I can see that it is only 1-2 more cores than I am now running. Might be able to run 100%, which would give 32 cores plus the gpu threads.

But if someone is upgrading to this level instead of downgrading, it would save in the neighborhood of $800, which could be devoted to another high-end Nvidia gpu.

If the E5-2670 v1 cpus from data center upgrades had not hit the market at those incredible $100 prices, I don't think we would ever have got up to this high a core count until AMD introduced the Ryzen 7/Threadripper products.

I think I will still try retests at 30 cores and 40 cores because I want to see if I can beat my last, best production with the slower ram (67 minutes at 40 cores). With the current mix of slower tasks I am running about 52 minutes, which is up from the near 47 minutes I was running.

I was trying out a cpu multiplier at 3.5GHz earlier and it was "unstable" (it ran, but the cpu frequency was not staying near 3.5). I still have the ambition to return to the 3.7GHz that was stable at one time. :)

Tom
A proud member of the OFA (Old Farts Association). |
ML1 Joined: 25 Nov 01 Posts: 20289 Credit: 7,508,002 RAC: 20 |
Thanks for a good summary.

If you're into tweaking a Linux kernel, then you could try these two kernel configs:

CONFIG_NUMA:

Enable NUMA (Non Uniform Memory Access) support.

The kernel will try to allocate memory used by a CPU on the
local memory controller of the CPU and add some more
NUMA awareness to the kernel.

For 64-bit this is recommended if the system is Intel Core i7
(or later), AMD Opteron, or EM64T NUMA.

Symbol: NUMA [=n]
Type  : bool
Prompt: Numa Memory Allocation and Scheduler Support
  Location:
    -> Processor type and features
  Defined at arch/x86/Kconfig:1515
  Depends on: SMP [=y] && (X86_64 [=y] || X86_32 [=n] && HIGHMEM64G [=n] && X86_BIGSMP [=n])

and, I don't know if s@h (or the kernel opportunistically for an app) would take advantage of this:

CONFIG_KSM:

Enable Kernel Samepage Merging: KSM periodically scans those areas
of an application's address space that an app has advised may be
mergeable. When it finds pages of identical content, it replaces
the many instances by a single page with that content, so
saving memory until one or another app needs to modify the content.
Recommended for use with KVM, or with other duplicative applications.
See Documentation/vm/ksm.rst for more information: KSM is inactive
until a program has madvised that an area is MADV_MERGEABLE, and
root has set /sys/kernel/mm/ksm/run to 1 (if CONFIG_SYSFS is set).

Symbol: KSM [=n]
Type  : bool
Prompt: Enable KSM for page merging
  Location:
    -> Memory Management options
  Defined at mm/Kconfig:297
  Depends on: MMU [=y]

That would be very cool if that could save you a few repeated MBytes of CPU cache for the same one s@h app running multiple times, to then allow more memory bandwidth for the compute data...

Happy fast crunchin',
Martin

See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3) |
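Before rebuilding anything, it may be worth checking what the running kernel already has. The Python sketch below looks for the two options in the kernel config and reads the KSM run switch mentioned in the help text. The config locations it tries (`/boot/config-<release>` and `/proc/config.gz`) are common distro conventions and are an assumption here; not every system exposes either. The function names are made up for this example.

```python
import gzip
import platform

WANTED = ("CONFIG_NUMA", "CONFIG_KSM")


def parse_kconfig(text, wanted=WANTED):
    """Return {option: value} for the wanted options found in config text."""
    found = {name: "not found" for name in wanted}
    for raw in text.splitlines():
        line = raw.strip()
        for name in wanted:
            if line.startswith(name + "="):
                found[name] = line.split("=", 1)[1]
            elif line == f"# {name} is not set":
                found[name] = "n"
    return found


def read_running_config():
    """Return the running kernel's config text, or None if it is not exposed."""
    # Assumed locations; distros differ on which one (if any) they provide.
    for path in (f"/boot/config-{platform.release()}", "/proc/config.gz"):
        try:
            opener = gzip.open if path.endswith(".gz") else open
            with opener(path, "rt") as fh:
                return fh.read()
        except OSError:
            continue
    return None


def ksm_run_state(path="/sys/kernel/mm/ksm/run"):
    """Return the contents of the KSM run switch, or None if it is absent."""
    try:
        with open(path) as fh:
            return fh.read().strip()
    except OSError:
        return None


if __name__ == "__main__":
    text = read_running_config()
    if text is None:
        print("kernel config not found in the assumed locations")
    else:
        for name, value in parse_kconfig(text).items():
            print(f"{name} = {value}")
    print("ksm run:", ksm_run_state())
```

As the help text says, KSM only scans regions an application has marked with madvise, so a kernel with `CONFIG_KSM=y` and `run` set to 1 would still do nothing for an app that never asks for it.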
ML1 Joined: 25 Nov 01 Posts: 20289 Credit: 7,508,002 RAC: 20 |
Not sure I should post this here... :-P Building a Compact Monster PC: Threadripper Meets Micro-ATX and Custom Liquid Cooling ... we set out to build a powerful and quiet desktop beast that lives in a relatively svelte form factor while still packing ... high-end performance... Does the "w" in "threadripper-2990wx" mean "water"?... Strictly for comparison :-) Enjoy, Happy cool fast crunchin'! Martin See new freedom: Mageia Linux Take a look for yourself: Linux Format The Future is what We all make IT (GPLv3) |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Not sure I should post this here... :-P No, the WX suffix for TR means "workstation", i.e. the professional high-core-count cpus. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Tom M Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
Not sure I should post this here... :-P Some article/review said it meant "way extreme" :) A proud member of the OFA (Old Farts Association). |
Tom M Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
Thanks for a good summary. I would say "it's above my pay grade" except, wait. I'm not getting paid for this so I can't even claim that ;/ I think if it was possible to shoehorn the data and working space into the cpu cache we would see amazing performance. But this is so far outside of what I have studied in Seti that I can't even hazard a guess except the negative one. I suppose I should start saving for a 64 core, high clock EPYC :( Tom A proud member of the OFA (Old Farts Association). |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I'll post here as well as in the GPUUG forum. Anyone using the latest LTS release 4.19 kernel? I am interested in getting the correct temperatures reported for my 2920X cpu. It seems you need at least a 4.18 kernel to get the updated k10temp driver, which correctly reports the Threadripper 2 cpus, because the fixed k10temp driver isn't going to be backported to the stock 4.15 kernel in Ubuntu 18.04. Anyone tried it yet with the Nvidia 410 series drivers? Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
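For anyone who wants to check their own box, here is a rough Python sketch that compares the running kernel against the 4.18 threshold mentioned above and prints whatever the k10temp driver exposes through the standard hwmon sysfs files. The 4.18 figure is taken from the post, not verified here, and the function names are invented for the example. hwmon reports temperatures in millidegrees Celsius.

```python
import glob
import os
import platform
import re


def kernel_at_least(release, minimum=(4, 18)):
    """True if a release string such as '4.15.0-45-generic' is >= minimum."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        return False
    return (int(match.group(1)), int(match.group(2))) >= minimum


def read_k10temp(hwmon_root="/sys/class/hwmon"):
    """Return {'hwmonN/tempM_input': degrees C} for every k10temp chip found."""
    readings = {}
    for name_file in glob.glob(os.path.join(hwmon_root, "hwmon*", "name")):
        with open(name_file) as fh:
            if fh.read().strip() != "k10temp":
                continue
        chip_dir = os.path.dirname(name_file)
        for temp_file in sorted(glob.glob(os.path.join(chip_dir, "temp*_input"))):
            with open(temp_file) as fh:
                millidegrees = int(fh.read().strip())
            key = os.path.basename(chip_dir) + "/" + os.path.basename(temp_file)
            readings[key] = millidegrees / 1000.0
    return readings


if __name__ == "__main__":
    release = platform.release()
    print(f"kernel {release}: 4.18 or newer = {kernel_at_least(release)}")
    temps = read_k10temp()
    if not temps:
        print("no k10temp sensor found")
    for key, celsius in temps.items():
        print(f"{key}: {celsius:.1f} C")
```

On an older kernel the sensor may still show up, so a reading on its own does not prove the value is the corrected one.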
Tom M Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
I have a C state listing for my cpu parameters. On the Intel cpu, when I turn down/off the C states they run faster. Does anyone have any experience one way or the other on the AMD cpus we are running? Tom Yes, I have re-used a 3.7GHz OC button profile I have and so far, so good. A proud member of the OFA (Old Farts Association). |
Tom M Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
I'll post here as well in the GPUUG forum. Anyone using the latest LTS release 4.19 kernel? I am interested in getting the correct temperatures to be reported with my 2920X cpu. Seems you need at least 4.18 kernels to get the updated k10temp driver which correctly reports the Threadripper 2 cpus because the fixed k10temp driver isn't going to be backported to the stock 4.15 kernel in Ubuntu 18.04? Would another good place to ask be over in the Windows to Linux thread? I am not sure how many Linux people are reading here, but certainly there should be even more over there. I know there are issues with cross-posting; just hoping it won't get us moderated here and on Stephen's thread. Tom A proud member of the OFA (Old Farts Association). |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I have a C state listing for my cpu parameters. If you want to keep your cpu clocks up at full load all the time, don't use C-states, as that gives the OS the ability to downclock opportunistically when it thinks the cpu thread is under light loading. So when a task finishes up and before the next occupies it, the OS can knock the clocks down. But the OS can't transition back to full clocks that fast and has some rather large latencies to get back to speed. So for many seconds, the task starts computing at reduced clocks before it moves back to high gear. Better to keep the C-State always at 0 for compute loads. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
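To see what the BIOS setting actually changes, it can help to look at what Linux reports before and after. The Python sketch below lists the idle states the kernel knows about for one CPU, with their exit latencies, and prints the frequency governor alongside. Strictly speaking, C-states are idle states and clock scaling is handled separately by cpufreq, so looking at both should show which one a given BIOS option touches on a particular board. The function names are invented for the example, and either sysfs directory may be missing on some systems.

```python
import glob
import os
import re


def list_idle_states(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Return one dict per cpuidle state of the given CPU, in state order."""
    pattern = os.path.join(sysfs_root, f"cpu{cpu}", "cpuidle", "state[0-9]*")

    def state_index(path):
        return int(re.search(r"(\d+)$", path).group(1))

    def read(state_dir, attribute):
        try:
            with open(os.path.join(state_dir, attribute)) as fh:
                return fh.read().strip()
        except OSError:
            return None  # attribute not provided by this kernel or driver

    states = []
    for state_dir in sorted(glob.glob(pattern), key=state_index):
        states.append({
            "state": os.path.basename(state_dir),
            "name": read(state_dir, "name"),
            "latency_us": read(state_dir, "latency"),  # exit latency
            "disabled": read(state_dir, "disable"),    # "1" means disabled
        })
    return states


def scaling_governor(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Return the cpufreq governor for the given CPU, or None if unavailable."""
    path = os.path.join(sysfs_root, f"cpu{cpu}", "cpufreq", "scaling_governor")
    try:
        with open(path) as fh:
            return fh.read().strip()
    except OSError:
        return None


if __name__ == "__main__":
    for state in list_idle_states():
        print(state)
    print("governor:", scaling_governor())
```

If the list shrinks to a single shallow state after the BIOS change, the setting took effect at the OS level.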
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I'll post here as well in the GPUUG forum. Anyone using the latest LTS release 4.19 kernel? I am interested in getting the correct temperatures to be reported with my 2920X cpu. Seems you need at least 4.18 kernels to get the updated k10temp driver which correctly reports the Threadripper 2 cpus because the fixed k10temp driver isn't going to be backported to the stock 4.15 kernel in Ubuntu 18.04? I didn't want to spam multiple threads, and thought this one was targeted at high-performance TR. I already looked through the Top 100 hosts lists and didn't see anyone running anything newer than the 4.15 kernels. But there could be a lightly used TR way down the lists that may be on the newer kernels. I didn't page through a hundred pages trying to find the needle in the haystack. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Brent Norman Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
But there could be a lightly used TR way down the lists that may be on the newer kernels. You could try the data export files for off-site stats. That might be the easiest way to search. |
Tom M Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
I have a C state listing for my cpu parameters. Thank you. I have disabled the C state parameter for now. I have managed to get 4 gpus racked onto it, but the cpu was running hot and became unstable in its cpu frequency at 3.7GHz. So it's back down at 3.35GHz. I guess I will go back and get the case(s) you recommended onto my wish list. It is pretty clear that I am going to need the original recommendation for cooling that you made. And to do that, I am going to need a bigger case. Either that or downgrade to a 2950X and buy another faster gpu after I sell the 2990WX :) Tom A proud member of the OFA (Old Farts Association). |
Tom M Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
This article puts some numbers on the reason for the massive ballooning in processing time when I run 56-60 cores of Seti processing. That is a very good question. If I turn off SMT but don't change the number of threads, will it get faster, slower or stay the same? I will bet it gets slower because more physical cores that don't have direct memory access would have to be engaged. Even though the SMT penalty goes away. Tom A proud member of the OFA (Old Farts Association). |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
No, that isn't the way it works. The physical cores still are the same . . . how could they not. You still have four dies with 8 cores and the dies with direct memory access are still connected to memory in exactly the same way. You just will only have one instruction pipeline going through each core instead of two. That means the task occupying the core has exclusive access to the core's registers without having to share timeslices with another thread. By turning off SMT, the task should speed up because it doesn't have to share resources. Try it and see. I would be very surprised if it stayed the same or got worse. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
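One way to check the before and after of an SMT change from Linux is to group the logical CPUs by the physical core they share. The Python sketch below does that from the sysfs topology files and also reads the kernel's SMT flag where it exists. The `smt/active` file is only present on reasonably recent kernels, so the code treats it as optional; the function names are invented for the example.

```python
import glob
import os


def parse_cpu_list(text):
    """Turn a sysfs CPU list such as '0-3,8' into [0, 1, 2, 3, 8]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def physical_cores(sysfs_root="/sys/devices/system/cpu"):
    """Return a sorted list of tuples, one per core, of its logical CPUs."""
    cores = set()
    pattern = os.path.join(
        sysfs_root, "cpu[0-9]*", "topology", "thread_siblings_list"
    )
    for path in glob.glob(pattern):
        with open(path) as fh:
            cores.add(tuple(parse_cpu_list(fh.read())))
    return sorted(cores)


def smt_active(sysfs_root="/sys/devices/system/cpu"):
    """Return True/False from the kernel's SMT flag, or None if not exposed."""
    try:
        with open(os.path.join(sysfs_root, "smt", "active")) as fh:
            return fh.read().strip() == "1"
    except OSError:
        return None


if __name__ == "__main__":
    cores = physical_cores()
    print(f"{len(cores)} physical cores, SMT active: {smt_active()}")
    for siblings in cores:
        print(siblings)
```

With SMT on, each tuple should hold two logical CPUs; with it off, one. Timing the same number of tasks in both configurations is then the experiment suggested above.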
Tom M Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
I was off reading, and when I came back the system had crashed. Turned on the C state thingy (it is on by default). Will try it again later. Tom A proud member of the OFA (Old Farts Association). |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
You know with all the things you have tried on your host, I come away with the feeling that the motherboard just isn't up to the task. MSI of some flavor, correct? I think you would likely have much better luck with another brand of board a little higher up the price point. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Tom M Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
You know with all the things you have tried on your host, I come away with the feeling that the motherboard just isn't up to the task. MSI of some flavor, correct? I think you would likely have much better luck with another brand of board a little higher up the price point. I think I agree, but the jump is $200 higher cost for the apparent best choice. And I have good evidence that I may be able to run at least 3.7GHz (but not 4.0) if I can get the cpu temperature to stay down out of the "outer limits", so it seems like a new case and better cooling is the first step. Then if my seasonal job lasts long enough I probably will go for that top end MB. Tom A proud member of the OFA (Old Farts Association). |