
AMD's Ryzen Cache Analyzed - Improvements; Improveable; CCX Compromises

I'm seeing too many badly theorycrafted reasons for that bad gaming performance (that disabling SMT fixes)
We actually covered that in an update on Sunday, if you are interested: http://www.hardware.fr/articles/956-8/retour-smt-mode-high-performance.html

Long story short, some Windows 10 Anniversary Update scheduler settings aren't set the same way for Ryzen and Intel CPUs. We tested that and updated our article accordingly.

My understanding from hardware.fr is that the CCX complex runs at the same frequency as the memory, and somehow the bandwidth is shared between inter module communication and memory access.

This is why higher memory frequency gives much better results: the bandwidth for inter-module communication increases with frequency. Going from 2133 to 3200, the bandwidth for internal communication rises from 34 GB/s to 51 GB/s. That's why the Witcher 3 benchmark posted above scales so well: not necessarily because of faster memory as such, which by itself has little impact as we have seen numerous times, but because communication between modules improves drastically with higher memory frequency.
Actually the Witcher 3 "bench" is from a MSI/Intel advert (if I remember correctly).

But your overall point is exactly correct: the data fabric clock is set with memory (so DDR4-2400 = 1200 MHz clock for that bus), so if you are limited there, you'll see a compounding effect by pushing memory higher.
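As a rough sanity check on the figures quoted above, here is a minimal Python sketch of how fabric bandwidth would scale with memory speed. The 32 bytes per fabric clock is an assumption inferred from the 34 GB/s and 51 GB/s numbers in this thread, not something stated here, and the function names are made up for illustration.

```python
# Sketch: data fabric clock and bandwidth as a function of DDR4 speed.
# ASSUMPTION: the fabric moves 32 bytes per clock. That width is inferred
# from the 34 GB/s (DDR4-2133) and 51 GB/s (DDR4-3200) figures quoted in
# this thread; it is not given in the thread itself.
BYTES_PER_FABRIC_CLOCK = 32  # assumed


def fabric_clock_mhz(ddr_rating: int) -> float:
    """DDR transfers twice per clock, so DDR4-2400 runs a 1200 MHz clock."""
    return ddr_rating / 2


def fabric_bandwidth_gbs(ddr_rating: int) -> float:
    """Bandwidth in GB/s (1 GB = 1e9 bytes) under the assumed width."""
    return fabric_clock_mhz(ddr_rating) * 1e6 * BYTES_PER_FABRIC_CLOCK / 1e9


for rating in (2133, 2400, 3200):
    print(f"DDR4-{rating}: {fabric_clock_mhz(rating):.1f} MHz fabric clock, "
          f"{fabric_bandwidth_gbs(rating):.1f} GB/s")
```

If the assumed width is right, the bandwidth scales linearly with the memory rating, which would fit the compounding effect described above for workloads limited by the fabric.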
 

When was final BIOS?

Because here's what Stilt originally got, which is nowhere near what they got. Though this is from last Thursday, and I've not been keeping up with how often BIOSes have been released.

Code:
Logical Processor to Cache Map:
*---------------  Data Cache          0, Level 1,   32 KB, Assoc   8, LineSize  64
*---------------  Instruction Cache   0, Level 1,   64 KB, Assoc   4, LineSize  64
*---------------  Unified Cache       0, Level 2,  512 KB, Assoc   8, LineSize  64
*---------------  Unified Cache       1, Level 3,   16 MB, Assoc  16, LineSize  64
-*--------------  Data Cache          1, Level 1,   32 KB, Assoc   8, LineSize  64
-*--------------  Instruction Cache   1, Level 1,   64 KB, Assoc   4, LineSize  64
-*--------------  Unified Cache       2, Level 2,  512 KB, Assoc   8, LineSize  64
-*--------------  Unified Cache       3, Level 3,   16 MB, Assoc  16, LineSize  64
--*-------------  Data Cache          2, Level 1,   32 KB, Assoc   8, LineSize  64
--*-------------  Instruction Cache   2, Level 1,   64 KB, Assoc   4, LineSize  64
--*-------------  Unified Cache       4, Level 2,  512 KB, Assoc   8, LineSize  64
--*-------------  Unified Cache       5, Level 3,   16 MB, Assoc  16, LineSize  64
---*------------  Data Cache          3, Level 1,   32 KB, Assoc   8, LineSize  64
---*------------  Instruction Cache   3, Level 1,   64 KB, Assoc   4, LineSize  64
---*------------  Unified Cache       6, Level 2,  512 KB, Assoc   8, LineSize  64
---*------------  Unified Cache       7, Level 3,   16 MB, Assoc  16, LineSize  64
----*-----------  Data Cache          4, Level 1,   32 KB, Assoc   8, LineSize  64
----*-----------  Instruction Cache   4, Level 1,   64 KB, Assoc   4, LineSize  64
----*-----------  Unified Cache       8, Level 2,  512 KB, Assoc   8, LineSize  64
----*-----------  Unified Cache       9, Level 3,   16 MB, Assoc  16, LineSize  64
-----*----------  Data Cache          5, Level 1,   32 KB, Assoc   8, LineSize  64
-----*----------  Instruction Cache   5, Level 1,   64 KB, Assoc   4, LineSize  64
-----*----------  Unified Cache      10, Level 2,  512 KB, Assoc   8, LineSize  64
-----*----------  Unified Cache      11, Level 3,   16 MB, Assoc  16, LineSize  64
------*---------  Data Cache          6, Level 1,   32 KB, Assoc   8, LineSize  64
------*---------  Instruction Cache   6, Level 1,   64 KB, Assoc   4, LineSize  64
------*---------  Unified Cache      12, Level 2,  512 KB, Assoc   8, LineSize  64
------*---------  Unified Cache      13, Level 3,   16 MB, Assoc  16, LineSize  64
-------*--------  Data Cache          7, Level 1,   32 KB, Assoc   8, LineSize  64
-------*--------  Instruction Cache   7, Level 1,   64 KB, Assoc   4, LineSize  64
-------*--------  Unified Cache      14, Level 2,  512 KB, Assoc   8, LineSize  64
-------*--------  Unified Cache      15, Level 3,   16 MB, Assoc  16, LineSize  64
--------*-------  Data Cache          8, Level 1,   32 KB, Assoc   8, LineSize  64
--------*-------  Instruction Cache   8, Level 1,   64 KB, Assoc   4, LineSize  64
--------*-------  Unified Cache      16, Level 2,  512 KB, Assoc   8, LineSize  64
--------*-------  Unified Cache      17, Level 3,   16 MB, Assoc  16, LineSize  64
---------*------  Data Cache          9, Level 1,   32 KB, Assoc   8, LineSize  64
---------*------  Instruction Cache   9, Level 1,   64 KB, Assoc   4, LineSize  64
---------*------  Unified Cache      18, Level 2,  512 KB, Assoc   8, LineSize  64
---------*------  Unified Cache      19, Level 3,   16 MB, Assoc  16, LineSize  64
----------*-----  Data Cache         10, Level 1,   32 KB, Assoc   8, LineSize  64
----------*-----  Instruction Cache  10, Level 1,   64 KB, Assoc   4, LineSize  64
----------*-----  Unified Cache      20, Level 2,  512 KB, Assoc   8, LineSize  64
----------*-----  Unified Cache      21, Level 3,   16 MB, Assoc  16, LineSize  64
-----------*----  Data Cache         11, Level 1,   32 KB, Assoc   8, LineSize  64
-----------*----  Instruction Cache  11, Level 1,   64 KB, Assoc   4, LineSize  64
-----------*----  Unified Cache      22, Level 2,  512 KB, Assoc   8, LineSize  64
-----------*----  Unified Cache      23, Level 3,   16 MB, Assoc  16, LineSize  64
------------*---  Data Cache         12, Level 1,   32 KB, Assoc   8, LineSize  64
------------*---  Instruction Cache  12, Level 1,   64 KB, Assoc   4, LineSize  64
------------*---  Unified Cache      24, Level 2,  512 KB, Assoc   8, LineSize  64
------------*---  Unified Cache      25, Level 3,   16 MB, Assoc  16, LineSize  64
-------------*--  Data Cache         13, Level 1,   32 KB, Assoc   8, LineSize  64
-------------*--  Instruction Cache  13, Level 1,   64 KB, Assoc   4, LineSize  64
-------------*--  Unified Cache      26, Level 2,  512 KB, Assoc   8, LineSize  64
-------------*--  Unified Cache      27, Level 3,   16 MB, Assoc  16, LineSize  64
--------------*-  Data Cache         14, Level 1,   32 KB, Assoc   8, LineSize  64
--------------*-  Instruction Cache  14, Level 1,   64 KB, Assoc   4, LineSize  64
--------------*-  Unified Cache      28, Level 2,  512 KB, Assoc   8, LineSize  64
--------------*-  Unified Cache      29, Level 3,   16 MB, Assoc  16, LineSize  64
---------------*  Data Cache         15, Level 1,   32 KB, Assoc   8, LineSize  64
---------------*  Instruction Cache  15, Level 1,   64 KB, Assoc   4, LineSize  64
---------------*  Unified Cache      30, Level 2,  512 KB, Assoc   8, LineSize  64
---------------*  Unified Cache      31, Level 3,   16 MB, Assoc  16, LineSize  64
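To make it easier to see what is odd about this map, here is a small Python sketch that tallies a listing in this format per cache level. The regular expression and the `summarize` helper are my own illustration, written only against the lines shown above; they are not part of any tool.

```python
import re
from collections import defaultdict

# A few lines copied from the listing above (logical processors 0 and 1).
SAMPLE = """\
*---------------  Data Cache          0, Level 1,   32 KB, Assoc   8, LineSize  64
*---------------  Unified Cache       1, Level 3,   16 MB, Assoc  16, LineSize  64
-*--------------  Data Cache          1, Level 1,   32 KB, Assoc   8, LineSize  64
-*--------------  Unified Cache       3, Level 3,   16 MB, Assoc  16, LineSize  64
"""

# Hypothetical pattern for one line: processor mask, cache kind, index,
# level, and size with unit. Associativity and line size are ignored.
LINE_RE = re.compile(
    r"^(?P<mask>[*-]+)\s+(?P<kind>\w+) Cache\s+(?P<index>\d+), "
    r"Level (?P<level>\d), \s*(?P<size>\d+) (?P<unit>KB|MB)"
)


def summarize(listing: str) -> dict:
    """Map (kind, level) -> (instances, max sharers, total size in KB)."""
    per_level = defaultdict(list)
    for line in listing.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip headers and blank lines
        size_kb = int(m["size"]) * (1024 if m["unit"] == "MB" else 1)
        sharers = m["mask"].count("*")  # logical processors using this cache
        per_level[(m["kind"], int(m["level"]))].append((sharers, size_kb))
    return {
        key: (len(v), max(s for s, _ in v), sum(kb for _, kb in v))
        for key, v in per_level.items()
    }


for (kind, level), (count, sharers, total_kb) in summarize(SAMPLE).items():
    print(f"L{level} {kind}: {count} instance(s), "
          f"up to {sharers} sharer(s) each, {total_kb} KB total")
```

Run over the full listing, every cache would come out with a single sharer, i.e. a separate 16 MB L3 for each of the 16 logical processors, whereas my understanding is that an L3 on this chip should be shared by all threads of a CCX.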
 
When was final BIOS?
The Asus BIOS (5704) is dated 23/02 (it wasn't publicly available then, obviously, but the same BIOS is available on Asus's website now; we checked checksums to confirm it's the same). This is the BIOS that includes the "final" (before launch) microcode update from AMD. Cache is shown correctly there, as wark0 posted earlier in the thread (here: http://forum.hardware.fr/hfr/Hardware/hfr/dossier-1800x-retour-sujet_1017196_20.htm#t10089095 )

To be clear, 5704 was not the BIOS given to reviewers (I think 5702?) on the motherboards by AMD; you had to flash it yourself, but that's pretty common with launches and AMD gave plenty of heads-up on that.
 
I'm seeing too many badly theorycrafted reasons for that bad gaming performance (that disabling SMT fixes)

I can confirm that Windows 10 is a big issue and could cost 10 FPS in most games.
SMT is also an issue, but Windows 10 is worse than Windows 7 and Linux in terms of performance, and most games have been benched in Windows 10.
I can't reproduce the same findings with a Xeon 2680 V2: it only drops 1 FPS in Windows 10, instead of 10 (and 20 in CS:GO for me).

I can confirm something is iffy.
 
Again, we confirmed that the scheduler isn't configured the same way for Ryzen and Intel CPUs in Windows 10, which explains the discrepancies between SMT off and on in games; check my link above.
 
So...... Is this AMD's equivalent to Nvidia not doing Async? And can software coding help address this?

This is Windows load balancing working the way it did on Nehalem and first-gen Skylake.

Basically, Windows treats Ryzen as a massive 16-core CPU instead of 8C/16T, and that creates all of the other problems this CPU has. Normally Windows puts the heavy workloads on the physical cores and leaves the rest to the logical ones, but here Windows throws everything at everything, so the CPU ends up "stealing" from system RAM because Windows thinks it has a massive 138 MB L3.

And due to the nature of SMT, sometimes when Windows keeps a thread on the CPU (remember, AMD says a CCX is a CPU, not a core), the data in the L3 gets "lost", so Windows reissues a new load to that thread. But the data is already in L3, which results in the core parking bug, because the CPU has to pause the new workload to flush the identical one already in the L3.
 
How did you come to this conclusion?
 
One does wonder if the 4 core parts will suffer the same fate since it will be one straight core complex.

The quad cores might be a beast!
 
How did you come to this conclusion?
We already have the full picture of the problems,

and we have already had similar problems in the past (identical, to be honest). It's not really hard to connect the dots, especially when we know that SMT taps into all three caches.

also a really good video to watch
 
If this is the case, why on earth didn't AMD just send an email to Microsoft to modify the scheduler the way they wanted just before the launch, or even better, why didn't they release a driver? In the old days, for the Athlon X2, there was a driver called Dual-Core Optimizer.
Good question. Also, even if all of that makes a lot of sense, it might be just a load of BS, who knows. But like I've said too many times already, for the love of God, please, somebody disable SMT and one of the CCXs, bench games, and compare. If the scores suck just the same, it means the thread/cache shuffle has nothing to do with it.
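A rough way to approximate that comparison from software is to pin the game to a subset of logical processors. The Python sketch below only computes the affinity masks. It assumes that logical processors 2k and 2k+1 are the two SMT threads of core k and that cores 0-3 form the first CCX; neither is confirmed in this thread, and the helper name is made up.

```python
# ASSUMPTIONS (not confirmed anywhere in this thread):
#  - logical processors 2k and 2k+1 are the two SMT threads of core k
#  - cores 0-3 sit in the first CCX, cores 4-7 in the second
CORES_PER_CCX = 4
THREADS_PER_CORE = 2


def affinity_mask(ccxs, use_smt):
    """Build a bitmask of logical processors for the given CCX indices."""
    mask = 0
    threads = THREADS_PER_CORE if use_smt else 1
    for ccx in ccxs:
        for core in range(ccx * CORES_PER_CCX, (ccx + 1) * CORES_PER_CCX):
            first = core * THREADS_PER_CORE  # first logical processor of core
            for lp in range(first, first + threads):
                mask |= 1 << lp
    return mask


configs = {
    "both CCXs, SMT on": affinity_mask([0, 1], use_smt=True),
    "both CCXs, one thread per core": affinity_mask([0, 1], use_smt=False),
    "one CCX, SMT on": affinity_mask([0], use_smt=True),
    "one CCX, one thread per core": affinity_mask([0], use_smt=False),
}
for name, mask in configs.items():
    print(f"{name}: 0x{mask:04X}")
```

A hex mask like these can be handed to anything that accepts a processor affinity mask, for example the `/affinity` option of the Windows `start` command. This is only an approximation of disabling SMT or a CCX in the BIOS, since the other cores stay active and can still run everything else.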
 
In their review they either found the answer to Ryzen's poor gaming performance or they are doing something very wrong with the 7700K, lol. Because in their test Ryzen more or less matches the 7700K. And that is not bad, taking into consideration that Ryzen will dominate everything else.
 
The Asus BIOS (5704) is dated 23/02 (it wasn't publicly available then, obviously, but the same BIOS is available on Asus's website now; we checked checksums to confirm it's the same). This is the BIOS that includes the "final" (before launch) microcode update from AMD. Cache is shown correctly there, as wark0 posted earlier in the thread (here: http://forum.hardware.fr/hfr/Hardware/hfr/dossier-1800x-retour-sujet_1017196_20.htm#t10089095 )

To be clear, 5704 was not the BIOS given to reviewers (I think 5702?) on the motherboards by AMD; you had to flash it yourself, but that's pretty common with launches and AMD gave plenty of heads-up on that.

Ok I understand you now, thanks.

How much performance is left in Ryzen once things get tweaked out? 10%?
 
I can confirm that Windows 10 is a big issue and could cost 10 FPS in most games.
SMT is also an issue, but Windows 10 is worse than Windows 7 and Linux in terms of performance, and most games have been benched in Windows 10.
I can't reproduce the same findings with a Xeon 2680 V2: it only drops 1 FPS in Windows 10, instead of 10 (and 20 in CS:GO for me).

I can confirm something is iffy.

My windows 10 constantly keeps switching to Power Saving Mode setting.
 
I highly recommend everyone to take a look at this:


Some very salient points in there, also showing that Bulldozer seems to have overtaken the 2500K in gaming over time ...
 
It all depends on how you see things really. If you want absolutely the best gaming performance then stick with Intel. If you need a CPU with lots of threads that can also game pretty well then go for the R7s.

My main gaming rig still has a 3770k, haven't found a reason to upgrade yet. Same with my Steambox build, running a 4590. I'll replace all my crunchers with 1700s, that's a given.
 
Getting GTX970 feels for some reason. Also, Bulldozer "modularity."
Anyone tried to disable 4 cores (or 7, to be safe) in BIOS and see how it fares?


It all depends on how you see things really. If you want absolutely the best gaming performance then stick with Intel. If you need a CPU with lots of threads that can also game pretty well but doesn't cost an arm and a leg, then go for the R7s.

Fixed that for you.
Admittedly, first time I saw the Zen reviews I thought that it'd kill intel in anything outside games. Then I remembered that even industry standard productivity software can be -and often is- embarrassingly single threaded, so a combination of both (core count and per-core IPC) is often needed, and here Intel still reigns, as long as price isn't a factor.


I highly recommend everyone to take a look at this:


Some very salient points in there, also showing that Bulldozer seems to have overtaken the 2500K in gaming over time ...

IMO, saying that that channel is merely an AMD apologist would be an understatement.
(And I have a nagging feeling that I've already said that somewhere around here....)
 
If the Skylake-E and Kaby Lake-E samples are already finished, I don't know how much Intel can change to improve its tragic position, where its $1700 CPU loses to a $500 AMD chip with 2 fewer cores and much lower power consumption, almost half.
Even if Intel catches AMD, it will be with 8- and 10-core processors at 150 W power consumption.
Because of that, upgrading to AMD is a good choice at the moment,
especially if someone wants a small PC: mATX board, fanless 500 W PSU, and RX 580 + 1800X.

I don't want to comment at all on rumors about some strange lags and hidden problems with AMD.
On paper their CPU shines; the numbers are fantastic. If mighty Intel has fallen so low that it needs to justify its presence with the i7-7700K at
4.5 GHz in games locked to 2 and 4 cores, and distract customers from AMD that way, then there are really no words. Nothing will help you except the i7-7700K.
Everyone who sabotages the real picture of AMD's processor is an enemy of enthusiasts and of progress, and is shooting themselves in the foot.
AMD gives you a CPU capable of beating the i7-6950X on LN2 for $500: you can buy a world-record holder for $500, with 2 fewer cores and far lower power consumption.

In Windows 10 and DX12 people could get far better performance than on Intel Broadwell-E. But Intel didn't do anything to provide that. We constantly hear about walls and no room for improvement. No room, yet they milked the same architecture for 5 years; everything they did with X79 and X99 could have fit in a single socket, but there is room for new generations.

Well said, enthusiasts should be thankful we have real competition in high-end CPUs. This was my first AMD CPU and I am clearly impressed by the 1800X, and amazed at how much money I have thrown at Intel for better IPC. And AMD just gave me a 6900K/6850K equivalent for $499.
 
When one core accesses another core's memory, it behaves like an L4 instead of an L3. The extra latency at L4 is still much better than the system RAM so, I really don't see the cause for the fuss.

In fact, it does look like 1800X's dedicated L3 (~15 ns) is faster than i7-6900K (~17ns).
 
I think people were bored and haven't needed to talk about AMD for like 5 years, and it all came out at once. As the workstation CPU that it is, it is great, and it handles some gaming on the side pretty damn well. If we get 4-core/8-thread CPUs from AMD that still can't clock higher than 4.0 GHz and fall well behind 3000-series i5 CPUs, everyone can panic. Until then it's bug-squishing and industry adjustment time!
 
You'll have to wait till the refresh for (hopefully) better clocks.
 
I am going to wait to see Ryzen 5 in action. We have no concrete information about those chips and how they OC. Overclocking 8 cores is a different animal than overclocking 4 cores... historically, at least. And I don't think the limit in OC is entirely the architecture, but we will find out.
 
I think people were bored and haven't needed to talk about AMD for like 5 years, and it all came out at once. As the workstation CPU that it is, it is great, and it handles some gaming on the side pretty damn well. If we get 4-core/8-thread CPUs from AMD that still can't clock higher than 4.0 GHz and fall well behind 3000-series i5 CPUs, everyone can panic. Until then it's bug-squishing and industry adjustment time!
CPU matters less and less with the rise of Vulkan and D3D12. I am utterly unconcerned about it.

Proof: http://www.techspot.com/review/1348-amd-ryzen-gaming-performance/

In games that are well multithreaded, Ryzen does fine. In games that aren't, it does well enough. The higher the resolution and detail, the more Ryzen closes the gap with Intel.
 