Ryzen and Threadripper


jsm
Joined: 1 Oct 16
Posts: 124
Credit: 51,135,572
RAC: 298
Isle of Man
Message 1984184 - Posted: 8 Mar 2019, 19:42:50 UTC - in response to Message 1984134.  

I am not familiar with github but managed to create a repository and a text file with the output in it. If you could check that it is visible to the guys who might need to see it, I would be thankful.
With respect to your queries about sensors, I did not build this m/c and am not familiar with the board layout. Given time I might be able to identify the locations and see if there are any sensors connected.

However, the prime objective is to see if Ubuntu is significantly better than WIN10 with this hardware and so far the answer appears to be 'yes' but it is by how much that interests me.
jsm
ID: 1984184

Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1984188 - Posted: 8 Mar 2019, 20:03:08 UTC - in response to Message 1984184.  

OK, I guess I should have been more specific. I wanted you to post your sensors output on the electrified/asus-wmi-sensors repository Issues page as a new issue, questioning whether the outputs for the temp sensors are correct.

https://github.com/electrified/asus-wmi-sensors/issues

You would just click the New Issue green button on that page and fill in a subject and the sensors output text. That way the developer can look at it and then direct you to provide further details and information about your system.

I tried searching github for your Seti user name jsm but did not find you. You will have to provide the link to where you created your repository and issue so I can find it.

Repositories are for developed code created by the developer. The issues related to each repository's code should be input against that specific code repository on the Issues page for proper attention and tracking.

Since you did not build the machine you will either have to ask the builder if they included any temp sensors or you will have to look for sensors plugged into the motherboard. Your motherboard manual has the locations of all headers on the motherboard.

Your motherboard manual is here https://dlcdnets.asus.com/pub/ASUS/mb/socketTR4/ROG_ZENITH_EXTREME/E13369_ROG_ZENITH_EXTREME_UM_V2_WEB.pdf?_ga=2.218648795.375815061.1552074954-701731815.1552074954

You need to look at pages 1-19 and 1-28 for the location of your temp sensors and see if anything is plugged in. I suspect that unless you told your system builder you wanted temp sensors, they did not include any. The only way to know is to either ask or physically inspect for the sensors.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1984188

jsm
Joined: 1 Oct 16
Posts: 124
Credit: 51,135,572
RAC: 298
Isle of Man
Message 1984189 - Posted: 8 Mar 2019, 20:27:48 UTC - in response to Message 1984188.  

I searched for ASUS, sensors, threadripper etc but did not get any hits which is why I set up a new repository. As you say having the name of the existing one is a help.
I used the name captainiom because jsm was taken, and I named the repository to try and convey that I was interested in sensor output.
I am a bit tied up with a charity project (ramseypier.im) but can look at the manual when I get a moment. Almost certainly the builder will not have added any extra sensors as I didn't specify same.
jsm
ID: 1984189

Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1984191 - Posted: 8 Mar 2019, 20:55:37 UTC - in response to Message 1984189.  

Ok, I created a new issue for you linking to your repository. The developer will answer you there I assume.

https://github.com/electrified/asus-wmi-sensors/issues/10
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1984191

jsm
Joined: 1 Oct 16
Posts: 124
Credit: 51,135,572
RAC: 298
Isle of Man
Message 1984974 - Posted: 13 Mar 2019, 19:39:39 UTC - in response to Message 1984191.  

Unfortunately no response yet.
Update now that SETI is on line without dropping out: I am achieving roughly double the work units of the 1800X Ryzen m/cs, which is much better than with WIN10, where the ratio was about 1.25. However, there are four times as many cores rather than twice as many, so this is still disappointing in view of the cost difference.
I wonder, Tom, whether I should cut back the number of cpus in BOINC in Ubuntu as I did for Windows?
jsm
ID: 1984974

Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1984978 - Posted: 13 Mar 2019, 20:14:21 UTC - in response to Message 1984974.  

Unfortunately no response yet.
Update now that SETI is on line without dropping out. I am achieving roughly double the work units of the 1800X Ryzen m/cs which is much better than with WIN10 which was about 1.25. However there are four times as many cores rather than twice as many which is still disappointing in view of the cost difference.
I wonder Tom whether I should cut back the number of cpus in BOINC in Ubuntu as I did for Windows?
jsm


All my experience with a 2990wx was under Linux/CUDA91. The figure of around 24-26 threads that I suggested came from a mixture of results from the top performing 2990wx here (who gets cpu processing times below an hour, down to 47-odd minutes) and from some of the stuff I have read via googling lots of 2990wx reviews/benchmarks.

Even if you are not running the Cuda91 with only its cpu apps, Linux cpu tasks should run significantly faster if you are using the Boinc Manager to restrain them to 24-26 threads.
My results were with SMT turned on.

I have looked at your processing time for cpus. It doesn't look like you are using the CUDA91 project. I think if you start running 24-26 threads you will see a significant drop in the average cpu processing time. Your goal is to see at least 10 of your cores running sub-47 minute processing times. If you are not getting at least 10 cores/threads doing that, you are probably still running more than 24-26 threads.
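
As a rough illustration of what restricting the thread count means in practice: the BOINC computing preferences express the limit as "use at most N% of the CPUs" rather than as a thread count. The sketch below converts a target thread count into that percentage for a 64-thread 2990wx. The function name is hypothetical, and it assumes the client truncates (threads x percent / 100) to a whole number, which is an assumption rather than something confirmed in this thread.

```python
def cpu_percent_for_threads(target_threads: int, total_threads: int) -> float:
    """Smallest percentage (two decimals) that still allows target_threads.

    Assumption: the client computes int(total_threads * percent / 100),
    i.e. it truncates, so the percentage is rounded up, never down.
    """
    if not 1 <= target_threads <= total_threads:
        raise ValueError("target_threads must be between 1 and total_threads")
    # Ceiling division in integer arithmetic, in hundredths of a percent.
    hundredths = -(-target_threads * 10000 // total_threads)
    return hundredths / 100


if __name__ == "__main__":
    # 64 is the thread count of a 2990wx with SMT turned on.
    for threads in (24, 26, 30):
        print(threads, "threads ->", cpu_percent_for_threads(threads, 64), "%")
```

Rounding up is deliberate: rounding down could leave the machine one thread short of the target.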

Tom
A proud member of the OFA (Old Farts Association).
ID: 1984978

Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1984993 - Posted: 13 Mar 2019, 22:15:39 UTC

Since I see no coprocessor listed for the host, I assume he is using the default Nouveau drivers, which don't support compute. So I don't know whether he has an AMD or an Nvidia card.

Or possibly he has configured Seti with no gpu support in Seti@home Preferences on the website.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1984993

Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1985003 - Posted: 13 Mar 2019, 23:08:06 UTC - in response to Message 1984993.  
Last modified: 13 Mar 2019, 23:14:03 UTC

Since I see no coprocessor listed for the host, I assume he is using the default Nouveau drivers which don't support compute. So don't know whether he has an AMD or a Nvidia.

Or possibly he has configured Seti with no gpu support in Seti@home Preferences on the website.


I haven't seen any gpu processing. He hasn't told us what video card he is using. Apparently, he is only interested in CPU processing on the 2990wx box. I will admit, it would muddy the waters if he starts gpu processing on the 2990wx.

He started processing under Linux about 10 days ago. And I don't think he is using Tbars-all-in-One so I am not sure if he has the fastest CPU tasks available or not.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1985003

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 1985035 - Posted: 14 Mar 2019, 4:00:03 UTC

I haven't checked all the way through this thread, but have we confirmed the number of memory modules & that they are in the correct memory slots?
We have had situations in the past with Quad channel & multi CPU socket systems where the CPU performance was woeful because the memory modules weren't in the correct slots.
Grant
Darwin NT
ID: 1985035

Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1985037 - Posted: 14 Mar 2019, 5:16:52 UTC - in response to Message 1985035.  

I haven't checked all the way through this thread, but have we confirmed the number of memory modules & that they are in the correct memory slots?
We have had situations in the past with Quad channel & multi CPU socket systems where the CPU performance was woeful because the memory modules weren't in the correct slots.

He started with two 8GB sticks (2 X 8GB) for 16GB. He did not build the system. A completely unknowledgeable system builder did.

He then purchased two more sticks and put them in but installed them into the same side of the cpu. He did not realize he needed to install them into the other banks of memory on the other side of the socket. He corrected this after I informed him. So theoretically he has installed all 4 sticks of memory into the proper slots of the motherboard according to the manual and normal convention. He has said he is now running in proper 4 channel mode.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1985037

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 1985041 - Posted: 14 Mar 2019, 5:44:10 UTC - in response to Message 1985037.  

He has said he is now running in proper 4 channel mode.

No worries.
The run times look rather odd. Some are OK, others are a bit long (but not too bad), yet others are way, way, way too long.
Grant
Darwin NT
ID: 1985041

Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1985042 - Posted: 14 Mar 2019, 5:48:58 UTC - in response to Message 1985041.  

That is an indication of what happens in a heavily loaded 2990WX system, which Tom has documented very well. He determined that about 24-26 tasks is the highest loading that still gives equal runtimes on all tasks. The issue arises from the fact that two of the 8 core dies do not have direct memory access but have to reach system memory by passing through another 8 core die. So the memory latency for those indirectly attached dies is double that of the dies that have direct memory access.
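
For anyone who wants to see this layout on their own Linux box, here is a minimal sketch that lists NUMA nodes and flags the ones with no local memory. It assumes the kernel exposes the dies as NUMA nodes under /sys/devices/system/node (the standard sysfs location); the function names are made up for illustration, and the output depends on BIOS memory mode settings.

```python
import re
from pathlib import Path


def numa_nodes(root="/sys/devices/system/node"):
    """Return {node name: {"cpus": cpulist string, "mem_total_kb": int}}."""
    nodes = {}
    for node_dir in sorted(Path(root).glob("node[0-9]*")):
        cpus = (node_dir / "cpulist").read_text().strip()
        meminfo = (node_dir / "meminfo").read_text()
        # Lines look like: "Node 0 MemTotal:       16314716 kB"
        match = re.search(r"MemTotal:\s+(\d+)\s+kB", meminfo)
        nodes[node_dir.name] = {
            "cpus": cpus,
            "mem_total_kb": int(match.group(1)) if match else 0,
        }
    return nodes


def memoryless_nodes(nodes):
    """Names of nodes that report no local memory at all."""
    return [name for name, info in nodes.items() if info["mem_total_kb"] == 0]


if __name__ == "__main__":
    found = numa_nodes()
    for name, info in found.items():
        print(name, "cpus", info["cpus"], "mem_kB", info["mem_total_kb"])
    print("nodes without local memory:", memoryless_nodes(found))
```

Threads on any node listed as having no local memory are the ones that must go through another die for every memory access.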
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1985042

jsm
Joined: 1 Oct 16
Posts: 124
Credit: 51,135,572
RAC: 298
Isle of Man
Message 1985044 - Posted: 14 Mar 2019, 7:25:50 UTC - in response to Message 1985042.  

I confirm the following:-
4 x 8GB memory is in the correct slots as per the manual, giving quad channel operation.
Only a 2GB video card is installed as I am not interested in gpu processing. BOINC itself dismissed this as beneath its consideration!
Standard Ubuntu BOINC at the moment. I need a baseline.
Operating at 60 threads at the moment, again to give a baseline.

When discussing how many cpus/threads to specify, surely we are not trying to get the average processing time for a task down but to get the overall throughput up? i.e. it doesn't matter if many tasks running simultaneously run slower, as long as the total throughput over time is greater than that of a smaller number of tasks running faster.

I am suggesting that my approach of giving each operation sufficient time to run is the correct way to assess performance in order to ascertain whether a tweak is beneficial or not.
jsm
ID: 1985044

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 1985045 - Posted: 14 Mar 2019, 7:34:46 UTC - in response to Message 1985044.  
Last modified: 14 Mar 2019, 7:40:59 UTC

I am suggesting that my approach of giving each operation sufficient time to run is the correct way to assess performance in order to ascertain whether a tweak is beneficial or not.

Yep, but making use of the lessons learnt by others is also a valid method.
Given that the goal is to process as many WUs per hour as possible, the fact that presently some WUs take 4 times longer than they should is a good indication that things could be a lot better than they presently are.
Fine tuning will take many days (or even weeks). Coarse tuning on the other hand can be done over a matter of hours.


Edit-
When discussing how many cpus/threads to specify surely we are not trying to get the average processing time for a task down but to get the overall throughput up? i.e. it doesnt matter if many tasks running simultaneously run slower if the total throughput over time is greater than a smaller number of tasks running faster.

And my systems are a good example of that. With hyperthreading off, run times would be faster, however the actual amount of work done per hour would be less. However, and it's a big however, my run times are consistent for a given type of WU. Your runtimes vary from OK, to extremely not OK.
Having stable run times will result in much greater throughput- once you've got the runtimes stable then you can see if less or more threads running gives more work per hour. But with such variable runtimes, determining how much work is being done per hour is near on impossible, and would have to be done over very long time frames.
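
A minimal sketch of how that variability could be put into numbers: take the run times of a batch of completed WUs of the same type and look at the spread relative to the mean. The function name is hypothetical and the sample run times below are invented purely for illustration, not taken from anyone's host.

```python
from statistics import mean, pstdev


def runtime_spread(runtimes_min):
    """Summarise how consistent a list of run times (in minutes) is."""
    avg = mean(runtimes_min)
    return {
        "mean_min": avg,
        # Coefficient of variation: 0 means every task took the same time.
        "cv": pstdev(runtimes_min) / avg,
        # Slowest task relative to the fastest one.
        "max_over_min": max(runtimes_min) / min(runtimes_min),
    }


if __name__ == "__main__":
    # Hypothetical samples: one stable host, one with wildly varying times.
    stable = [58, 60, 61, 59, 62]
    erratic = [55, 70, 120, 230, 64]
    print("stable :", runtime_spread(stable))
    print("erratic:", runtime_spread(erratic))
```

Once the spread is small, comparing thread counts by tasks completed per hour becomes meaningful over much shorter time frames.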
Grant
Darwin NT
ID: 1985045

Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1985046 - Posted: 14 Mar 2019, 7:42:10 UTC - in response to Message 1985044.  

Yes, you can test for maximum production by looking at the daily cpu task count reported at various cpu core counts. But different types of tasks take longer or shorter to crunch based on their innate difficulty, which has nothing to do with which device crunches them, so the credit they earn varies widely. Some tasks award only a few credits if they are early or late overflows. Some Breakthrough Listen tasks only award 40-50 credits and some Arecibo tasks award over a hundred credits.

BOINC or Seti credit takes approximately two months to stabilize, so your testing timeframes are going to be long. The general rule of thumb, which comes from comparing multiple concurrent tasks on a gpu against single task crunching, is that if two tasks run together take more than twice as long as a single task run alone, you will crunch more tasks per day by crunching single tasks. The same rule applies to cpu tasks. If you can turn in more tasks per day crunching 30 tasks at one time than crunching 60 tasks at one time, you will receive more credit.
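
The rule of thumb can be sketched as a tasks-per-day comparison. The function names are made up for illustration and the run times in the example are hypothetical, not measurements from any host in this thread.

```python
MINUTES_PER_DAY = 1440


def tasks_per_day(concurrent_tasks, minutes_per_task):
    """Tasks finished per day when concurrent_tasks run side by side."""
    return concurrent_tasks * MINUTES_PER_DAY / minutes_per_task


def higher_count_wins(low_count, low_minutes, high_count, high_minutes):
    """True if the higher concurrency setting finishes more tasks per day."""
    return tasks_per_day(high_count, high_minutes) > tasks_per_day(low_count, low_minutes)


if __name__ == "__main__":
    # Hypothetical: 30 tasks at 60 minutes each vs 60 tasks at 150 minutes each.
    print("30 at a time:", tasks_per_day(30, 60), "tasks/day")
    print("60 at a time:", tasks_per_day(60, 150), "tasks/day")
    print("60 is better:", higher_count_wins(30, 60, 60, 150))
```

Doubling the task count only pays off while the run time stays below double; in the hypothetical case above it has gone past that point, so 30 at a time wins.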

But what I should point out is that pioneers before you have tested this scenario already and are very familiar with which method and at what number of tasks to crunch concurrently on the 2990WX is most productive. You are certainly welcome to repeat the test yourself to verify the outcome. But you might also consider that may be unnecessary as others have already mapped that path and you can simply follow in their footsteps.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1985046

jsm
Joined: 1 Oct 16
Posts: 124
Credit: 51,135,572
RAC: 298
Isle of Man
Message 1985052 - Posted: 14 Mar 2019, 10:02:34 UTC - in response to Message 1985046.  

I thought there was little experience logged for the 2990wx as it is pretty new? Could it be that this new architecture requires as much data as possible to ensure optimisation? After all, 64 threads have not been available in desktop mode before as far as I know.
I note the info from looking at individual task times and will have to pay more attention to that, but my main marker is how much daily credit shows in the FC stats page for each of my computers. This gives a direct comparison between the 1800X m/cs and the threadripper, as the environment is identical, e.g. internet loading etc.
jsm
ID: 1985052

Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1985053 - Posted: 14 Mar 2019, 10:05:10 UTC

There is a "plateau" of sorts between 30-40 threads, based on both my experience and some benchmarking in another article. Setting the total threads in that area gets about the same level of production.

I think that 24-26 produces the maximum production. But I can't confirm that.

I can confirm that the actual time it takes to process cpu tasks changes "the shape of the graph", with more tasks showing much faster processing once the # of threads gets below 30. That is why it can't hurt to step from 60 threads down to 30 threads, a week at a time. The RAC will eventually follow suit.

An estimated RAC formula:
(minutes per day / average processing time in minutes) X # of threads X estimated credit per task = estimated RAC

(1440 / 47) X 26 X 55 credits =~ 43,813
(1440 / 60) X 26 X 55 credits =~ 34,320
(1440 / 180) X 40 X 55 credits =~ 17,600
(1440 / 240) X 60 X 55 credits =~ 19,800

These are some examples of possible estimated RAC's based on average processing time vs. # of threads.
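
The formula above is easy to turn into a few lines of Python for trying other combinations. This is only a sketch: the function name is made up, and the default of 55 credits per task is the same rough guess used in the examples.

```python
MINUTES_PER_DAY = 1440


def estimated_rac(avg_minutes_per_task, threads, credit_per_task=55):
    """(minutes per day / avg processing time) X threads X credit per task."""
    return (MINUTES_PER_DAY / avg_minutes_per_task) * threads * credit_per_task


if __name__ == "__main__":
    # Same (average minutes, threads) pairs as the worked examples.
    for minutes, threads in ((47, 26), (60, 26), (180, 40), (240, 60)):
        print(f"{threads} threads at {minutes} min: {estimated_rac(minutes, threads):,.0f}")
```

Swapping in a measured average processing time and a different credit figure shows quickly how sensitive the estimate is to each input.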

So basically, if JSM will tell us what his 60 thread estimated processing time results currently are we can get an idea of where the RAC might settle. I think the credit # of 55 per task may now be too low. But that is what another Setizen has been using to figure out a prediction of where the RAC would land.

HTH,
Tom
A proud member of the OFA (Old Farts Association).
ID: 1985053

jsm
Joined: 1 Oct 16
Posts: 124
Credit: 51,135,572
RAC: 298
Isle of Man
Message 1985060 - Posted: 14 Mar 2019, 10:48:17 UTC - in response to Message 1985053.  

I am not sure what info you need but this is the link to the stats I use on a daily basis.
https://stats.free-dc.org/stats.php?page=user&proj=sah&name=10381838

I propose to reduce the cpus available under BOINC to 50 at the weekend to see what happens.
jsm
ID: 1985060

Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1985061 - Posted: 14 Mar 2019, 10:58:38 UTC

For what it is worth Kevin Olley's 2990wx is running 57 Gflops processing speed on his standard cpu tasks. He reports he is running 24 threads (I think).
https://setiathome.berkeley.edu/show_host_detail.php?hostid=8528566

And you are running about 13 Gflops.

Kevin is running 128 Gflops on his AP cpu tasks. You are running 16 or 50 Gflops on your AP tasks.
https://setiathome.berkeley.edu/show_host_detail.php?hostid=8680050

Both of you are running Linux. Both of you are running 2990wx cpus. You are running two different versions of the Boinc client (Repository, I think, vs. Tbars-all-in-One). I know he is running a fairly fast cpu, maybe he is stable at 4.0 GHz.

He is running 24 threads, you are running 60.

If you want an exact "apples to apples" comparison on the threads number you would need to install Tbars-all-in-one. No one has had good luck going from the Repository Boinc Manager to the "all-in-one" without an OS re-install. You should be able to merge the resulting 2 seti id's with no trouble afterwards.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1985061

Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1985062 - Posted: 14 Mar 2019, 11:00:42 UTC - in response to Message 1985060.  
Last modified: 14 Mar 2019, 11:08:20 UTC

I am not sure what info you need but this is the link to the stats I use on a daily basis.
https://stats.free-dc.org/stats.php?page=user&proj=sah&name=10381838

I propose to reduce the cpus available under BOINC to 50 at the weekend to see what happens.
jsm


Step by step :)
Make sure you take notes on your current estimated time to process (display all tasks, take the # from the not yet started apps). Before and after (time/date stamp too).

7 days of processing will show a solid trend. I think you will notice a difference within a couple of days but not much of one at that # of threads. If you were to drop to 30 there should be a much larger difference :)

Your RAC should continue upwards no matter what. However, the slope could get steeper as the # of threads goes down to 24-26. Yes, you can go below that # but I believe that even with the average speed going up, total production will go down.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1985062
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.