From FX to Ryzen

Profile RueiKe Special Project $250 donor
Volunteer tester
Joined: 14 Feb 16
Posts: 492
Credit: 378,512,430
RAC: 785
Taiwan
Message 1854616 - Posted: 11 Mar 2017, 3:56:04 UTC - in response to Message 1854613.  

What is the ambient temperature?


22C
GitHub: Ricks-Lab
Instagram: ricks_labs
ID: 1854616
Profile RueiKe Special Project $250 donor
Volunteer tester
Joined: 14 Feb 16
Posts: 492
Credit: 378,512,430
RAC: 785
Taiwan
Message 1854620 - Posted: 11 Mar 2017, 4:11:40 UTC

I just posted a video on my experience with the R7 1700 so far. It includes benchmarks with the FX-8370, i7-3930K, and i7-6950X. I have included the AVX MB app in my benchmarks. https://youtu.be/YpEOmxG3zV4
GitHub: Ricks-Lab
Instagram: ricks_labs
ID: 1854620
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1854667 - Posted: 11 Mar 2017, 7:03:35 UTC - in response to Message 1854616.  

What is the ambient temperature?


22C

Nice! That is what I expected from a real motherboard/system environment at a typical ambient temperature. I knew that the readings posted so far were not typical of real work. I assume those readings were taken with what you described earlier: 15 MB CPU tasks running currently and one CPU core supporting one GPU task?

I have good feelings about my project now. Should have it up and running this next week after all the bits arrive.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1854667
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1854672 - Posted: 11 Mar 2017, 7:30:38 UTC - in response to Message 1854620.  

I just posted a video on my experience with the R7 1700 so far. It includes benchmarks with the FX-8370, i7-3930K, and i7-6950X. I have included the AVX MB app in my benchmarks. https://youtu.be/YpEOmxG3zV4


Subbed and will be watching for those GPU loading things. Been chatting with Wendell and have seen PCPer's article, whose testing shows that it's not Windows' scheduler slowing gaming performance (as some have surmised). It will be interesting to see where the bottlenecks really lie, and whether they impact us at all. My money's on the microcode still needing some work with respect to cache, but I guess that'll come out in the wash.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1854672
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1854677 - Posted: 11 Mar 2017, 7:42:06 UTC

As was mentioned in a comment on your YT video, Ryzen really responds to higher memory clocks, since the Infinity Fabric is the foundation everything else is built on. Higher memory clocks bring up the transfer rates and reduce latency in the L1>L2>L3 cache hits. Definitely update to the 0902 BIOS next week for stability. You might want to look over these links, which cover the general overclocking guide and the details of the clock domains.

asus-rog-crosshair-vi-hero-extreme-overclocking-guide

amd-ryzen-clock-domains-detailed

Also, a tip: set an offset for the CPU SOC voltage to help with memory overclocking and to prevent the CPU SOC voltage from increasing too far when overclocked in Auto. This offset also helps prevent BIOS issues.

CPU SOC voltage offset mode
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1854677
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1854701 - Posted: 11 Mar 2017, 9:13:58 UTC
Last modified: 11 Mar 2017, 9:16:28 UTC

Rick, I suggest using the SSE4.1 app on your Ryzen.
The AVX version seems rather slow.

Edit: I'm guessing running 15 instances isn't the best idea.
Check with just 8 instances.


With each crime and every kindness we birth our future.
ID: 1854701
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1854704 - Posted: 11 Mar 2017, 9:20:51 UTC - in response to Message 1854701.  

Rick, I suggest using the SSE4.1 app on your Ryzen.
The AVX version seems rather slow.


That was predicted once the details of Ryzen became known: SSE and SSSE3 code was expected to give the best performance, given the limitations of Ryzen's AVX implementation.
Zen offers more FP flexibility than Sandy Bridge and will deliver much better performance on SSE code. Haswell and Skylake, however, provide twice the flops per clock using AVX FMA instructions and, more importantly, twice the cache bandwidth to feed the FP and SIMD execution units.

Grant
Darwin NT
ID: 1854704
Profile RueiKe Special Project $250 donor
Volunteer tester
Joined: 14 Feb 16
Posts: 492
Credit: 378,512,430
RAC: 785
Taiwan
Message 1854721 - Posted: 11 Mar 2017, 10:03:16 UTC - in response to Message 1854701.  

Rick, I suggest using the SSE4.1 app on your Ryzen.
The AVX version seems rather slow.

Edit: I'm guessing running 15 instances isn't the best idea.
Check with just 8 instances.


Hi Mike, where do I get that app? I checked your site and only found SSE2, SSE3, and AVX.
GitHub: Ricks-Lab
Instagram: ricks_labs
ID: 1854721
Profile RueiKe Special Project $250 donor
Volunteer tester
Joined: 14 Feb 16
Posts: 492
Credit: 378,512,430
RAC: 785
Taiwan
Message 1854733 - Posted: 11 Mar 2017, 11:35:07 UTC - in response to Message 1854701.  

Rick, I suggest using the SSE4.1 app on your Ryzen.
The AVX version seems rather slow.

Edit: I'm guessing running 15 instances isn't the best idea.
Check with just 8 instances.


I have tried the 0.46 Beta6 installer and found when I select SSE4.1 or SSE4.2, the only CPU app available after install is MB_win_x64_SSE3_VS2008_r3330. I have bench tested AVX against SSE2 and SSE3 and found AVX is fastest.
GitHub: Ricks-Lab
Instagram: ricks_labs
ID: 1854733
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1854735 - Posted: 11 Mar 2017, 11:38:26 UTC - in response to Message 1854733.  

Rick, I suggest using the SSE4.1 app on your Ryzen.
The AVX version seems rather slow.

Edit: I'm guessing running 15 instances isn't the best idea.
Check with just 8 instances.


I have tried the 0.46 Beta6 installer and found when I select SSE4.1 or SSE4.2, the only CPU app available after install is MB_win_x64_SSE3_VS2008_r3330. I have bench tested AVX against SSE2 and SSE3 and found AVX is fastest.


From the scant details I've been able to glean, one or another of the AVX implementations should still be superior, despite running at effectively half rate, because it needs fewer instructions per byte of data.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1854735
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1854743 - Posted: 11 Mar 2017, 12:30:07 UTC - in response to Message 1854721.  

Rick, I suggest using the SSE4.1 app on your Ryzen.
The AVX version seems rather slow.

Edit: I'm guessing running 15 instances isn't the best idea.
Check with just 8 instances.


Hi Mike, where do I get that app? I checked your site and only found SSE2, SSE3, and AVX.


You are right, there is no SSE4.1 app for Windows.
I mixed it up with the Linux versions.


With each crime and every kindness we birth our future.
ID: 1854743
Kiska
Volunteer tester

Joined: 31 Mar 12
Posts: 302
Credit: 3,067,762
RAC: 0
Australia
Message 1854746 - Posted: 11 Mar 2017, 12:37:33 UTC
Last modified: 11 Mar 2017, 12:38:20 UTC

I believe I read somewhere that Ryzen does:
8 AVX(128 bit) ops/cycle and 8 AVX2(256 bit) ops/cycle
Whereas Intel's does:
16 AVX(128 bit) ops/cycle and 8 AVX2(256 bit) ops/cycle

So AMD's AVX2 implementation is as good as Intel's, but AVX is half that of Intel
ID: 1854746
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1854893 - Posted: 11 Mar 2017, 23:48:44 UTC - in response to Message 1854620.  

I just posted a video on my experience with the R7 1700 so far. It includes benchmarks with the FX-8370, i7-3930K, and i7-6950X. I have included the AVX MB app in my benchmarks. https://youtu.be/YpEOmxG3zV4

It looks like normal AR Arecibo tasks are running ~2hrs with the AVX app.
Is that at the 3.5GHz OC you mentioned?

I would have hoped for run times in the 1-1.5hr range at that speed, given that my dual E5-2670 system @ 3.0GHz takes ~2.5hrs when running 32 at a time.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1854893
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1854896 - Posted: 11 Mar 2017, 23:57:45 UTC - in response to Message 1854746.  
Last modified: 11 Mar 2017, 23:58:21 UTC

I believe I read somewhere that Ryzen does:
8 AVX(128 bit) ops/cycle and 8 AVX2(256 bit) ops/cycle
Whereas Intel's does:
16 AVX(128 bit) ops/cycle and 8 AVX2(256 bit) ops/cycle

So AMD's AVX2 implementation is as good as Intel's, but AVX is half that of Intel


There is a lot more to it, which I've discussed in the past with Joe Segur (who authored the AVX code in the stock CPU app). Our processing on modern hardware tends to be memory bound, because of its sparsity. This acts as a great equaliser between the two vendors: the memory can't keep up with the processing, and the resulting stalls are costly on both.
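
The memory-bound argument can be illustrated with a simple roofline-style estimate, where attainable throughput is the smaller of peak compute and memory bandwidth times arithmetic intensity. This is only a rough sketch: the function name and every number below are hypothetical placeholders, not measurements of Ryzen, any Intel CPU, or the SETI@home apps.

```python
def attainable_gflops(peak_gflops, mem_bandwidth_gbs, flops_per_byte):
    """Roofline-style bound: throughput is capped either by peak compute
    or by how fast memory can feed the execution units."""
    if min(peak_gflops, mem_bandwidth_gbs, flops_per_byte) <= 0:
        raise ValueError("all inputs must be positive")
    return min(peak_gflops, mem_bandwidth_gbs * flops_per_byte)


# Hypothetical CPUs: B has twice the peak SIMD throughput of A,
# but both share the same (assumed) memory bandwidth.
LOW_INTENSITY = 0.5   # assumed flops per byte for a sparse, memory-bound kernel
cpu_a = attainable_gflops(peak_gflops=200.0, mem_bandwidth_gbs=40.0,
                          flops_per_byte=LOW_INTENSITY)
cpu_b = attainable_gflops(peak_gflops=400.0, mem_bandwidth_gbs=40.0,
                          flops_per_byte=LOW_INTENSITY)

if __name__ == "__main__":
    # Both land on the memory roof, so the 2x peak advantage disappears.
    print(f"CPU A: {cpu_a:.1f} GFLOP/s, CPU B: {cpu_b:.1f} GFLOP/s")
```

Under these assumed numbers both CPUs hit the same memory-imposed ceiling, which is the "equaliser" effect described above; with a high enough flops-per-byte ratio the peak compute difference would show up again.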
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1854896
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1854902 - Posted: 12 Mar 2017, 0:11:35 UTC - in response to Message 1854896.  
Last modified: 12 Mar 2017, 0:14:56 UTC

I believe I read somewhere that Ryzen does:
8 AVX(128 bit) ops/cycle and 8 AVX2(256 bit) ops/cycle
Whereas Intel's does:
16 AVX(128 bit) ops/cycle and 8 AVX2(256 bit) ops/cycle

So AMD's AVX2 implementation is as good as Intel's, but AVX is half that of Intel


There is a lot more to it, which I've discussed in the past with Joe Segur (who authored the AVX code in the stock CPU app). Our processing on modern hardware tends to be memory bound, because of its sparsity.

From memory, the caches for Ryzen are about the same size as Intel's, but due to the way they are arranged, the bandwidth available for AVX on Ryzen is half that of Intel's.

A close look at Ryzen's architecture.
David Kanter on Ryzen
Grant
Darwin NT
ID: 1854902
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1854906 - Posted: 12 Mar 2017, 0:27:30 UTC - in response to Message 1854893.  

It looks like normal AR Arecibo tasks are running ~2hrs with the AVX app.
Is that at the 3.5GHz OC you mentioned?

I would have hoped for run times in the 1-1.5hr range at that speed, given that my dual E5-2670 system @ 3.0GHz takes ~2.5hrs when running 32 at a time.

For reference: my i7 2600, with HyperThreading on and 2 cores reserved for GPU use, running at 3.4GHz with the Lunatics AVX application, is knocking WUs over in about 2hrs (2hrs 20min for VLARs), 6 at a time.
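
Since these systems run different numbers of tasks at once, per-task run time alone doesn't show overall output. A rough sketch of the throughput arithmetic follows, using the figures quoted in this thread; the function name is hypothetical, and the Ryzen entry combines the 15 concurrent tasks and ~2hr run time mentioned in earlier posts, so treat it as an approximation.

```python
def tasks_per_hour(concurrent_tasks, hours_per_task):
    """Steady-state throughput: tasks running at once divided by
    the wall-clock hours each one takes."""
    if concurrent_tasks <= 0 or hours_per_task <= 0:
        raise ValueError("inputs must be positive")
    return concurrent_tasks / hours_per_task


# (concurrent tasks, approximate hours per task) as quoted in the thread.
SYSTEMS = {
    "dual E5-2670 @ 3.0GHz": (32, 2.5),
    "i7 2600 @ 3.4GHz": (6, 2.0),
    "R7 1700 @ 3.5GHz (approximate)": (15, 2.0),
}

if __name__ == "__main__":
    for name, (count, hours) in SYSTEMS.items():
        print(f"{name}: {tasks_per_hour(count, hours):.1f} tasks/hour")
```

This ignores differences between work unit types (VLAR vs. normal AR) and any GPU support load, so it is only a first-order comparison.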
Grant
Darwin NT
ID: 1854906
Profile Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1854914 - Posted: 12 Mar 2017, 0:45:31 UTC - in response to Message 1854906.  

It looks like normal AR Arecibo tasks are running ~2hrs with the AVX app.
Is that at the 3.5GHz OC you mentioned?

I would have hoped for run times in the 1-1.5hr range at that speed, given that my dual E5-2670 system @ 3.0GHz takes ~2.5hrs when running 32 at a time.

For reference: my i7 2600, with HyperThreading on and 2 cores reserved for GPU use, running at 3.4GHz with the Lunatics AVX application, is knocking WUs over in about 2hrs (2hrs 20min for VLARs), 6 at a time.

1hr 10-15mins on my i5 3570K locked @ 3.4GHz, about 5mins longer on my i5 2500K which is also locked @ 3.4GHz.

Cheers.
ID: 1854914
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1854959 - Posted: 12 Mar 2017, 3:26:43 UTC

It took forever to find any of my CPU tasks that were either VLAR or standard 0.42-0.44 AR. There is no sign of any BLC CPU task or standard 0.42-0.44 AR task still hanging around. What I remember is that the BLC tasks usually run 2 hours 10 minutes to 2 hours 15 minutes. The standard Arecibo CPU task runs 1 hour 55 minutes; I found 3 of those in my finished tasks. I also found several VLAR CPU tasks at 0.009-0.010 AR, and they crunched in almost the same 1 hour 55 minutes. I wanted to try to compare what Rick is getting out of his 1700 at 3.6 GHz versus my FX-8370 at 4.6 GHz, but I still can't do an apples-to-apples comparison. I am hopeful that by the end of next week I will have my Ryzen system put together and can make some comparisons to what it replaces, my FX-8300 at 4.0 GHz.

What I think Rick and I are shooting for is an overall increase in productivity, not because of the speed of the chip but because it has double the cores of what it replaces. In Rick's case, I believe he is hoping that it has enough throughput to support his 5 ATI cards. I think he is bottlenecking 1 card currently. I am thinking I might eventually be able to support more than 2 GPU tasks per card, since I would have enough cores to support an increase in GPU tasks along with maybe more standalone CPU task production. I should not have to limit myself to 4 real CPU cores, as on the FX processors, and should be able to run 8 real CPU cores on my 1700X, because the FPU registers on the Ryzen aren't shared in the same fashion they were on the FX.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1854959
Profile RueiKe Special Project $250 donor
Volunteer tester
Joined: 14 Feb 16
Posts: 492
Credit: 378,512,430
RAC: 785
Taiwan
Message 1854979 - Posted: 12 Mar 2017, 7:28:20 UTC - in response to Message 1854746.  

I believe I read somewhere that Ryzen does:
8 AVX(128 bit) ops/cycle and 8 AVX2(256 bit) ops/cycle
Whereas Intel's does:
16 AVX(128 bit) ops/cycle and 8 AVX2(256 bit) ops/cycle

So AMD's AVX2 implementation is as good as Intel's, but AVX is half that of Intel


I have compared AVX performance between the 4 CPUs I have:
FX-8370@4.7GHz: 5604s
R7-1700@3.6GHz: 3372s
i7-3930K@4.1GHz: 3931s
i7-6950X@4.2GHz: 3666s
Using MB_win_x64_AVX_VS2010_r3330 on WU 21jl16ad.13182.18067.14.41.184_vlar_CPU.
So it does well even against the 6950X. But this was with only a single task running on a single thread. I probably need to compare performance on recent data sets in production on these systems.
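
To make the comparison easier to read, the run times above can be turned into speedups over the FX-8370 and a crude clock-normalised cost (GHz times seconds, roughly billions of cycles per task). This is a minimal sketch using only the numbers listed above; the function names are hypothetical, and the clock normalisation assumes each CPU held its stated frequency for the whole run.

```python
# (clock in GHz, single-task run time in seconds) from the list above.
RESULTS = {
    "FX-8370": (4.7, 5604),
    "R7-1700": (3.6, 3372),
    "i7-3930K": (4.1, 3931),
    "i7-6950X": (4.2, 3666),
}


def relative_speed(results, baseline):
    """Speedup of each CPU over the baseline (higher is faster)."""
    base_seconds = results[baseline][1]
    return {cpu: base_seconds / secs for cpu, (_, secs) in results.items()}


def ghz_seconds(results):
    """Clock-normalised cost per task (lower means more work per clock)."""
    return {cpu: ghz * secs for cpu, (ghz, secs) in results.items()}


if __name__ == "__main__":
    speed = relative_speed(RESULTS, "FX-8370")
    cost = ghz_seconds(RESULTS)
    for cpu in RESULTS:
        print(f"{cpu}: {speed[cpu]:.2f}x vs FX-8370, {cost[cpu]:.0f} GHz*s per task")
```

As noted, this is a single task on a single thread, so it says nothing about how each chip scales once all cores are loaded.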
GitHub: Ricks-Lab
Instagram: ricks_labs
ID: 1854979
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1854982 - Posted: 12 Mar 2017, 7:47:06 UTC - in response to Message 1854979.  

Hi Rick, yes, I watched your video, and the numbers you came up with for scaling look mighty fine for the Ryzen 1700. Are you still feeling out the system, or have you started to tear down the 7-way block for your ATI cards? I'm curious how that turns out and whether the 1700 and CH6 board solve your bottlenecking on the GPUs.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1854982
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.