# AMD Ryzen Infinity Fabric Ticks at Memory Speed



## btarunr (Mar 17, 2017)

Memory clock speed will go a long way toward improving the performance of an AMD Ryzen processor, according to new information from the company. It reveals that Infinity Fabric, the high-bandwidth interconnect that links the two quad-core complexes (CCXs) on 6-core and 8-core Ryzen processors with other uncore components, such as the PCIe root complex and the integrated southbridge, is synced with the memory clock. AMD made this revelation in response to a question posed by Reddit user CataclysmZA.

Infinity Fabric, the successor to HyperTransport, is AMD's latest interconnect technology, connecting the various components on the Ryzen "Summit Ridge" processor and on the upcoming "Vega" GPU family. According to AMD, it is a 256-bit wide bi-directional crossbar. Think of it as the chip's town square, where tagged data and instructions change hands between the various components. Within a CCX, the L3 cache handles some inter-core connectivity. The speed of the Infinity Fabric crossbar on a "Summit Ridge" Ryzen processor is determined by the memory clock: when paired with DDR4-2133 memory, for example, the crossbar ticks at 1066 MHz (SDR, actual clock). Using faster memory, according to AMD, therefore has a direct impact on the bandwidth of this interconnect.
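As a rough illustration of the relationship, here is a quick Python sketch. The 256-bit width and memory-clock sync come from AMD's statement above; the bandwidth figure is a naive theoretical derivation (one transfer per clock per direction), not an AMD-published number:

```python
def fabric_clock_mhz(ddr_rating: int) -> float:
    """Infinity Fabric ticks at the memory's actual clock: half the DDR rating."""
    return ddr_rating / 2

def fabric_bandwidth_gbs(ddr_rating: int, width_bits: int = 256) -> float:
    """Naive one-way bandwidth of a 256-bit crossbar moving one transfer per clock."""
    return (width_bits / 8) * fabric_clock_mhz(ddr_rating) * 1e6 / 1e9

for rating in (2133, 2667, 3200):
    print(f"DDR4-{rating}: fabric ~{fabric_clock_mhz(rating):.0f} MHz, "
          f"~{fabric_bandwidth_gbs(rating):.1f} GB/s per direction")
```

Going from DDR4-2133 to DDR4-3200 would, under these assumptions, raise the fabric clock from 1066 MHz to 1600 MHz, a 50% increase in interconnect bandwidth.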





*View at TechPowerUp Main Site*


----------



## eidairaman1 (Mar 17, 2017)

So in other news, it is memory-bandwidth intensive. Can it utilize the bandwidth properly and show considerable gains from memory performance, unlike Piledriver?


----------



## ratirt (Mar 17, 2017)

Hmm, one thought then. So those latencies were actually architectural, and not a Windows scheduler problem like some people said? Wonder if they will release a refurbished Ryzen now, or how that is going to work. Also, it would be great if the "crossbar" connection you mentioned ticked not at half the speed of the memory but at full speed. That would kick things up a notch, I'd say.


----------



## eidairaman1 (Mar 17, 2017)

ratirt said:


> Hmm. one thought then. So those low latencies were actually architectural not like some people said windows scheduler problem?
> Wonder if they will release refurbished Ryzen now or how that is going to work.. also it would be great if the "crossbar" connection You mentioned tick not with half the speed of the memory but full speed. That would kick things up a notch I'd say.




There is a reason the Ryzen logo is an incomplete circle: it means the arch is open to improvements, big and small.


----------



## Legacy-ZA (Mar 17, 2017)

This is the only thing I don't like about what I've read about the Ryzen CPUs.

Is there room for improvement? Yep. Will it cost you a new motherboard and CPU in the near future? Yep.


----------



## Caring1 (Mar 17, 2017)

Why couldn't they make it 512 bit instead to increase the bandwidth?


----------



## IceScreamer (Mar 17, 2017)

Legacy-ZA said:


> This is the only thing I don't like what I read about the Ryzen CPU's.
> 
> Is there room for improvement? Yep. Will it cost you a new motherboard and CPU in the near future? Yep.


New CPU sure, but I don't think you'll need a new board, seeing how AMD is staying on this platform for about 4 years, and Zen 2 is supposedly coming out sooner than that.


----------



## ratirt (Mar 17, 2017)

Caring1 said:


> Why couldn't they make it 512 bit instead to increase the bandwidth?


That's the point. Maybe they couldn't do it. But since they see where it is now, maybe they want this somewhere else, meaning they will increase the bandwidth.


----------



## geon2k2 (Mar 17, 2017)

This was already known, and some applications see a real benefit from improved interconnect/memory speed.


----------



## Taloken (Mar 17, 2017)

ratirt said:


> Hmm. one thought then. So those low latencies were actually architectural not like some people said windows scheduler problem? Wonder if they will release refurbished Ryzen now or how that is going to work.. also it would be great if the "crossbar" connection You mentioned tick not with half the speed of the memory but full speed. That would kick things up a notch I'd say.



It actually ticks at full speed. The real frequency of a DDR module is always half its DDR rating (e.g. DDR4-3200 -> 1600 MHz).


----------



## ratirt (Mar 17, 2017)

Taloken said:


> It actually tick with the full speed. The real frequency of a DDR module is always half its DDR-rating (eg DDR4-3200 -> 1600 MHz).


Yeah, right. Forgot it is dual channel DDR.
Thanks for the clarification. Well then, in this case the only thing is to get memory with a higher frequency, although I'm wondering now if it is worth the additional money. Will this better-performing memory really make a noticeable difference? From a consumer standpoint this difference should be noticeable if you go with good 3200 MHz memory; otherwise it's pointless.


----------



## chaosmassive (Mar 17, 2017)

AMD needs to drop this "CPU block style"; interfaces between 'groups' of CPUs tend to be bottlenecked by bandwidth.
Look back at Intel's C2Q, and the Pentium D linked via the FSB, but Intel ultimately dropped that.
AMD needs to make real 'individual' cores, with a shared L3 cache across all 8 cores like Intel does.

I don't know, maybe AMD tried to save R&D cost by making a 'blueprint' of a 4-core configuration and simply 'copy-pasting' cores to silicon.


----------



## Legacy-ZA (Mar 17, 2017)

IceScreamer said:


> New CPU sure, but I don't think you'll need a new board, seeing how AMD is staying on this platform for about 4 years, and Zen 2 is supposedly coming out sooner than that.


Dual channel seems to be one of the problems; if they had brought it out with triple/quad channel, these would have performed way better.


----------



## erek (Mar 17, 2017)

Why did they even decide against a monolithic design? Can't believe we're talking about two separate modules called CCXs (CPU Complexes)... it just seems like an obsolete design, back to the first dual cores that had to reach out to the FSB to communicate with each other. This is unbelievable to me. I know it's better than going out to the FSB, but imagine how crazy Ryzen could have been with a monolithic design... it'd be crazy fast, I imagine.

Tired of anything related to modules with slow interconnects.


----------



## the54thvoid (Mar 17, 2017)

Very happy now that I hunted for 3200 G.Skill memory for my build. I knew it responded better to frequency, but I also knew compatibility was an issue.


----------



## uuuaaaaaa (Mar 17, 2017)

Zen's uarch makes sense from a server perspective; also, Naples (32C/64T server Zen) runs on eight-channel memory. At least there will be a reason to buy high-end ultra-fast RAM now! (Let's wait for motherboard support...)


----------



## NC37 (Mar 17, 2017)

erek said:


> Why did they even decide against a Monolithic design?   Can't believe we're talking about two separate modules called CCXs (CPU Complex)... just seems like an obsolete design back to the first dual cores that had to reach out to the FSB to communicate between each other.   This is unbelievable to me, I know it's better than going out to the FSB, but it imagine how crazy Ryzen could of been with a Monolithic design... it'd be crazy fast I imagine...
> 
> Tired of anything related to modules with slow interconnects.



Could be limitations in what is and isn't patented, as well as other limitations we don't know about but AMD's engineers might.

Look back at the old G4 chips in classic Macs. In the generation where there was a switch from the 7410 to the 7450, the 10s held a performance advantage due to shorter data paths. The 50s didn't get anywhere until the 55s, when they brought in L3 and found ways to negate the longer paths. But Motorola couldn't just go back to the 7410s at the time: they'd only clock up to 600-650 MHz. The pathways being so short caused problems with running faster than that. Apple was in the big push to 1 GHz back then, so they opted for the less optimal 50s in order to get the MHz.

Limitations in a design forced the engineers to adopt a less optimal design. People really didn't know about it till the more hardcore Mac overclockers got into the designs and really analyzed them, which took longer back then than it would these days.


----------



## nem.. (Mar 17, 2017)

You might guess that the Intel platform has higher native RAM support than Ryzen, but it doesn't; Ryzen has higher native support, running 2667 MHz without OC.

AMD X370

Support for DDR4 3600(O.C.) / 3400(O.C.) / 3200(O.C.) / 2933(O.C.) / 2667 / 2400 / 2133 MHz memory modules

GA-Z270-Gaming K3

Support for DDR4 3866(O.C.) / 3800(O.C.) / 3733(O.C.) / 3666(O.C.) / 3600(O.C.) / 3466(O.C.) / 3400(O.C.) / 3333(O.C.) / 3300(O.C.) / 3200(O.C.) / 3000(O.C.) / 2800(O.C.) / 2666(O.C.) / 2400 / 2133 MHz memory modules

link. http://www.gigabyte.com/Motherboard/GA-Z270-Gaming-K3-rev-10#sp





link. http://www.gigabyte.com/Motherboard/GA-AX370-Gaming-K7-rev-10#sp


----------



## medi01 (Mar 17, 2017)

eidairaman1 said:


> So in other news it is memory bandwidth intensive.



Huh?
This is a core-cluster-to-core-cluster thing; normal memory isn't even involved.

"Runs at memory frequency" is quite a revelation, actually; what you state doesn't cover it at all.




chaosmassive said:


> AMD need to make real 'individual' cores, with shared L3 cache across 8 cores like Intel do


They might do that once people actually start buying their products and they have more money to spend on R&D.


----------



## Aenra (Mar 17, 2017)

@nem.. dude how many more threads are you going to post that in? We got it


----------



## Evildead666 (Mar 17, 2017)

edit3 : talking out my arse.


----------



## deu (Mar 17, 2017)

ratirt said:


> Hmm. one thought then. So those low latencies were actually architectural not like some people said windows scheduler problem? Wonder if they will release refurbished Ryzen now or how that is going to work.. also it would be great if the "crossbar" connection You mentioned tick not with half the speed of the memory but full speed. That would kick things up a notch I'd say.




The latency is due to architectural differences but can be mitigated by making the scheduler handle tasks differently. Basically, Ryzen has core complexes, and the latency comes from handing tasks from one complex to another. Ryzen is 4(8)+4(8), or 3+3, or 2+2. This is not necessarily a problem if the scheduler KNOWS to minimize task handling across complexes. So if the scheduler identifies a Ryzen CPU, it could handle all gaming on one complex and everything else on the other, and the issues that have hurt performance could be somewhat corrected. (Ryzen still has a clock disadvantage, but at anything above 1080p this should be minuscule.)

Feel free to correct me if I'm wrong, but do it in a nice way
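Pinning a game's threads to a single CCX, as described above, can be approximated by hand with CPU affinity. A minimal Linux-only sketch; the assumption that logical CPUs 0-7 map to CCX0 is hypothetical, so the real topology must be checked per system (e.g. with lstopo):

```python
import os

def pin_to_ccx(ccx_index: int, logical_per_ccx: int = 8) -> set:
    """Restrict the current process to one CCX's logical CPUs (Linux only).

    The 0-7 -> CCX0, 8-15 -> CCX1 mapping here is an assumption for
    illustration; verify the real layout before relying on it.
    """
    start = ccx_index * logical_per_ccx
    wanted = set(range(start, start + logical_per_ccx))
    available = os.sched_getaffinity(0)       # CPUs we may actually use
    cpus = (wanted & available) or available  # fall back on smaller machines
    os.sched_setaffinity(0, cpus)             # pin this process
    return cpus
```

A launcher would do the equivalent per game process; `taskset -c 0-7 ./game` achieves the same from the shell.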


----------



## deu (Mar 17, 2017)

chaosmassive said:


> AMD need to drop this "CPU block style", interface between between 'group' of CPUs tend to be bottlenecked by bandwidth
> look at back, Intel C2Q, Pentium D linked via FSB speed, but ultimately dropped it
> AMD need to make real 'individual' cores, with shared L3 cache across 8 cores like Intel do
> 
> I dont know, maybe AMD try to save R&D cost by making 'blue print' of 4 cores configuration and simply 'copy-paste' cores to silicon



All this is done to make a cheaper CPU: instead of 1100 dollars you get one for 399 dollars. As I understand it, there are REALLY few downsides except inter-complex communication latencies, and unless you have ONE application that needs 16 threads intertwined, it should not be a problem (you want to keep your task on the same core anyway). I'm not saying you can't create an application that exposes this "bottleneck", but it would not make sense to code an application that way, if we're talking gaming / everyday user applications.

In fact, can anyone name an application where this WILL be a problem (granted that the scheduler understands the CCX architecture)? I'm not trying to be smart or anything, but I can't come up with one where this would actually be a limit (granted that the architecture is taken into account).


----------



## mastrdrver (Mar 17, 2017)

Good video showing how talking across the CCXs through the fabric hurts performance. This also shows that the MS Windows 10 scheduler needs some tweaking.


----------



## BiggieShady (Mar 17, 2017)

deu said:


> Feel free to correct me if im wrong but do it in an nice way





deu said:


> The latency is due to architechtural differences but can be solved in making the scheduler handle task differently.


I'll be making up numbers to illustrate the point here: the core problem is that you can only tweak the scheduler to improve performance in one use case (gaming) by 0.5% while at the same time degrading performance in other use cases (productivity) by 20%... and you can't have both behaviors in the scheduler, because games and other software often run at the same time.


Caring1 said:


> Why couldn't they make it 512 bit instead to increase the bandwidth?


It would be great if latency issues could be fixed by increasing the bandwidth, but sadly it ain't so


----------



## IceScreamer (Mar 17, 2017)

Legacy-ZA said:


> Dual Channel seems to be one of the problems, if they brought it out with Triple / Quad, these would have performed way better.


Yea, never thought about that actually, makes sense now that you mention it.

Also a question, could this Infinity Fabric in theory enable on-die Crossfire/SLI connection between two GPUs, removing (or reducing) the need for software?


----------



## uuuaaaaaa (Mar 17, 2017)

IceScreamer said:


> Yea, never thought about that actually, makes sense now that you mention it.
> 
> Also a question, could this Infinity Fabric in theory enable on-die Crossfire/SLI connection between two GPUs, removing (or reducing) the need for software?



I think that is exactly what they want to do, since Vega also supports connection to this Infinity Fabric thing. It also applies to inter-CPU connection on their Naples server platform, which sports a healthy 8-channel memory configuration.


----------



## fynxer (Mar 17, 2017)

So if you're building a Ryzen gaming rig on a budget,

less is more:

better to use something like 8 GB of expensive, super-fast memory to get more performance.

What AMD should do is revive their Radeon memory brand and sell super-fast DDR4 memory with Ryzen-only profiles at very low cost, to push their Ryzen CPU business.

This way gamers are more inclined to upgrade to Ryzen if they can get maximum performance at a reasonable price. What they don't make in the memory business they will gain tenfold in the CPU business.


----------



## RejZoR (Mar 17, 2017)

This also explains stability issues with super-high-clocked RAM: it also clocks the Infinity Fabric bus very high...


----------



## bug (Mar 17, 2017)

Legacy-ZA said:


> Dual Channel seems to be one of the problems, if they brought it out with Triple / Quad, these would have performed way better.


The number of channels (or bandwidth) is not the issue here. The issue is that the crossbar switch operates at the same frequency as the RAM. With slower RAM, the crossbar switch has higher latency -> the interconnect is slower.

Not a big issue per se, but it depends on whether memory speeds can be fixed with a simple BIOS update or require hardware changes.


----------



## Hood (Mar 17, 2017)

So, the gist of this thread is, Ryzen should have been designed like an Intel CPU, with more memory channels, a monolithic core design, and a more capable IMC.  Of course, it would cost more to make (just like Intel).  So let's just make Ryzen into a clone of Intel's HEDT chips, and at the same price level.  But hey, at least we can put on an AMD case badge, to let everyone know how much we hate Intel...


----------



## papupepo (Mar 17, 2017)

The majority of game developers assume all the cores of a processor are linked through the L3 cache, because that's how Intel's processors are built. They might carelessly share critical data and pay an enormous cost on Ryzen.

Programmers must know about cache mechanisms. They can calculate concurrently but must update in one thread. If data is shared by many cores and they all update it, there will be a total mess, especially on Ryzen.


----------



## bug (Mar 17, 2017)

papupepo said:


> Programmers must know about cache mechanisms.



No, not really. Cache is there to assist while the programmers do their thing. In a world of virtualization, the software rarely knows what it's running on anyway.
Optimizing data chunks wrt cache size is required in some instances, but knowing the intricacies of cache's implementation is certainly not a requirement for a programmer.


----------



## deu (Mar 17, 2017)

mastrdrver said:


> Good video showing how talking across the CCXs through the fabric hurts performance. This also shows that MS Windows 10 scheduler need some tweaking.


Good video to illustrate the issue! Now we just need people to understand that this is something that is fixable (like, really fixable).


----------



## papupepo (Mar 17, 2017)

bug said:


> Optimizing data chunks wrt cache size is required in some instances, but knowing the intricacies of cache's implementation is certainly not a requirement for a programmer.


This is a principle, not a technical detail. And this principle is obvious to anyone who has studied any kind of cache mechanism before.


----------



## bug (Mar 17, 2017)

papupepo said:


> This is a principle, not a technical detail. And this principle is obvious for anyone who have studied any kind of cache mechanism before.


You lost me.


----------



## cdawall (Mar 17, 2017)

chaosmassive said:


> AMD need to drop this "CPU block style", interface between between 'group' of CPUs tend to be bottlenecked by bandwidth
> look at back, Intel C2Q, Pentium D linked via FSB speed, but ultimately dropped it
> AMD need to make real 'individual' cores, with shared L3 cache across 8 cores like Intel do
> 
> I dont know, maybe AMD try to save R&D cost by making 'blue print' of 4 cores configuration and simply 'copy-paste' cores to silicon



You mean like the athlon x2, phenom and phenom II...


----------



## r9 (Mar 17, 2017)

So again the bottleneck is the connection between the two CCXs and L3.
So if the Windows scheduler handles the threads and L3 cache properly, not moving threads between the two CCXs, this Infinity Fabric should not be an issue, and AMD said the Windows scheduler is aware of the Ryzen architecture.
I'm confused.
And wishful thinking, but maybe in the next BIOS updates we could unlink the bus from the memory and overclock it.


----------



## papupepo (Mar 17, 2017)

bug said:


> You lost me.


Did I? Parallel programming is extremely difficult. You must know many principles for it. You can't benefit from multicore processors if you write a program without care.

If you are not a skilled programmer and don't know much about parallel programming, you should write single-threaded programs. Caches always help you there.

And you should know the details of the hardware if you write performance-critical software, like a game.


----------



## mcraygsx (Mar 17, 2017)

deu said:


> Good video to illustrate the issue!  Now we just need people to understand that this is something that is fixable  (like really fixable.)




That was a fantastic video. This also means benchmarking results can and will vary from run to run, depending on how Windows schedules threads between the two CCXs.


----------



## bug (Mar 17, 2017)

r9 said:


> So again the bottleneck is the connection between the two CCX and L3.
> So if the Windows scheduler handles the threads and L3 cache properly, not moving threads between the two CCX this Infinity Fabric should not be an issue, and AMD said that Windows Scheduler is aware of the Ryzen Architecture.
> I'm confused.
> And wishful thinking but maybe in the next BIOS updates we could unlink the bus from the memory and overclock it.



AMD themselves said Win scheduler is not the issue, but what do they know? http://www.windowscentral.com/amd-says-windows-scheduler-isnt-blame-ryzen-performance



papupepo said:


> Did I? Parallel programming is extremely difficult. You must know many principles for it. You can't benefit by multicore processors if you write a program freely.
> 
> If you are not skilled programmer and don't know much about parallel programming, you should write single-threaded programs. Caches always helps you in there.
> 
> And you should know the details of hardware if you write performance-critical software, like a game.



Parallel programming is not extremely difficult. In fact, it can be fairly easy to do (look at Erlang or Go's goroutines). But most of the time it is more tedious to write and harder to test/maintain.
Caching has nothing to do with multi-threading. Caching is there to avoid memory read/writes, it doesn't actually care whether the CPU is running 1 or 1,000 threads.
L1 and L2 caches are always split and I know of no one trying to write multithreaded code in order not to upset L1 and L2 caches. If anything, that's a compiler's or a scheduler's job. I don't see why things would be any different when we're talking about L3 cache.


----------



## r9 (Mar 17, 2017)

bug said:


> AMD themselves said Win scheduler is not the issue, but what do they know? http://www.windowscentral.com/amd-says-windows-scheduler-isnt-blame-ryzen-performance
> 
> 
> 
> ...



Ryzen's L3 cache is split between the two CCXs, so it has 2x8 MB instead of 1x16 MB. This means that when a thread is moved from one CCX to the other, the cached information needs to be moved to the appropriate L3.
That's where the Infinity Fabric bottleneck takes place and heavily affects performance.


----------



## bug (Mar 17, 2017)

r9 said:


> Ryzen L3 cache is split between the two CCX. So it has 2x8MB instead of 1x16MB. Which means when if thread is moved from one CCX to the other the cached information needs to be moved to the appropriate L3.
> That's where the Infinity Fabric bottleneck takes place and heavily affects performance.


I know that. But that's AMD's design decision. And when going for max performance, programmers will need to account for that. But I don't think it's fair to say programmers should (much less must) take into account cache implementation details when writing a game.
And even so, thread core affinity is typically the responsibility of the OS.


----------



## r9 (Mar 17, 2017)

bug said:


> I know that. But that's AMD's design decision. And when going for max performance, programmers will need to account for that. But I don't think it's fair to say programmers should (much less must) take into account cache implementation details when writing a game.
> And even so, thread core affinity is typically the responsibility of the OS.



And calling the interconnect Infinity Fabric is like putting racing stripes on a car and expecting it to go faster.
Something is not adding up here. From the information that was floating around, it sounded like the Infinity Fabric is the bottleneck due to threads moving between CCXs.
But with AMD releasing that statement that nothing is wrong with the Windows scheduler, it looks like that bus is the bottleneck in all scenarios.
And it sounds like all the memory issues are related to the bus being in sync with the memory.
Looks like a huge oversight on AMD's side.
But I'm willing to bet they will offer a significant IPC improvement in Zen 2.0, and it will be largely due to addressing the bus speed.


----------



## Captain_Tom (Mar 17, 2017)

chaosmassive said:


> AMD need to drop this "CPU block style", interface between between 'group' of CPUs tend to be bottlenecked by bandwidth
> look at back, Intel C2Q, Pentium D linked via FSB speed, but ultimately dropped it
> AMD need to make real 'individual' cores, with shared L3 cache across 8 cores like Intel do
> 
> I dont know, maybe AMD try to save R&D cost by making 'blue print' of 4 cores configuration and simply 'copy-paste' cores to silicon



It is incredibly cheap, and nearly infinitely scalable, for AMD to do it this way.


You can thank this new tech for Ryzen's low cost and its massive 32-core brethren. In fact I hope (and expect) AMD will apply this to their GPU archs within a year. Imagine a 1200 mm^2, 10,000-SP monster gaming card.


----------



## OSdevr (Mar 17, 2017)

Different CPUs have different cache designs, and they have become quite complicated. A game developer may be able to *slightly* improve performance using good programming habits, but they could just as easily hinder it. Caches are designed to improve performance for the average program. Also, memory allocation isn't the program's job; it's done by the OS, and the OS indeed plays tricks with caches (page coloring, for example).

It is possible to play the caches like a fiddle (Memtest86+ doesn't disable them), but it's quite difficult and not something that can be done under an OS.

BTW, hasn't AMD done something like this before? I think they once had an FSB that was synced with main memory, or had to be for good performance.


----------



## prtskg (Mar 17, 2017)

erek said:


> Why did they even decide against a Monolithic design?   Can't believe we're talking about two separate modules called CCXs (CPU Complex)... just seems like an obsolete design back to the first dual cores that had to reach out to the FSB to communicate between each other.   This is unbelievable to me, I know it's better than going out to the FSB, but it imagine how crazy Ryzen could of been with a Monolithic design... it'd be crazy fast I imagine...
> 
> Tired of anything related to modules with slow interconnects.


I thought the reason was obvious. They don't have enough money and human resources to do monolithic designs for CPUs from 2 to 8 cores, APUs from 2 to 4 cores, GPUs from small to big, custom chips for consoles, other embedded designs, etc. They also needed some interconnect for servers as well as the HPC APU. So they chose the best compromise for AMD: a design that helps them with computational tasks, i.e. server CPUs and APUs, over gaming. And I think they did well; I never expected them to come so close to Intel. Zen CPUs should serve them well in servers, and this will give them enough money for better products down the line. I'm now happy enough with their product to assemble some AM4 systems, something I didn't do with their BD products.


----------



## AcesNDueces (Mar 17, 2017)

The improvements seen have very little to do with actual memory bandwidth and more to do with the side benefit of the higher memory speed increasing the Infinity Fabric clock. Essentially, faster RAM overclocks the uncore/southbridge/cache speeds. That's where the jump is coming from.


----------



## Legacy-ZA (Mar 17, 2017)

bug said:


> The number of channels (or bandwidth) is not the issue here. The issue is the crossbar switch operates at the same frequency as the RAM. With slower RAM, the crossbars switch has higher latency -> interconnect is slower.
> 
> Not a big issue per se, but it depends whether memory speeds can be fixed with a simple BIOS update or they require hardware changes.



I did say "one" of the issues. *sigh*


----------



## bug (Mar 17, 2017)

Legacy-ZA said:


> I did say "one" of the issues. *sigh*


You were still wrong.


----------



## bug (Mar 17, 2017)

r9 said:


> And calling the interconnect  Infinite Fabric is like putting race stripe on a car and expecting it to go faster.
> Something is not adding up here. From the information that was floating around it sounded like the Infinite Fabric is the bottleneck due to threads moving between CCX.
> But with AMD releasing that statement that nothing wrong with the Windows scheduler it looks like that bus is the bottleneck in all scenarios.
> And its sounds like all the memory issues are related to the bus being in sync with the memory.
> ...


From what I've read, AMD consciously compromised on the memory-performance front to get the product out. My guess is they'll enable faster DDR for this generation and come up with an improved solution in the next iteration.

But this compromise is just like when we "compromise" and buy whatever CPU we can, even if we know a better one is just around the corner. If we waited for the perfect CPU, we'd never buy anything. The same goes for AMD: if they wanted to fix everything, they'd never release, because once one bottleneck was fixed, the bottleneck would simply move somewhere else, and once that was fixed it would move again, and so on.


----------



## nem.. (Mar 17, 2017)




----------



## Particle (Mar 17, 2017)

ratirt said:


> Yeah right. Forgot it is dual channel DDR.
> Thanks for clarification. Well then in this case the only thing is to get the memory with higher frequency although I'm wondering now if it is worth additional money? Will this better performing memory really make noticeable difference. From a consumer stand point this difference should be noticeable if you wanna go with good 3200Mhz mem. Otherwise it's pointless.



It doesn't have anything to do with being dual channel. A DDR4-3200 module isn't named for its physical clock speed but rather its transfer rate. The link between the module and the memory controller uses DDR signalling, meaning each data line toggles out two data bits for each cycle of the master clock. The master clock for the module is 1600 MHz (real).

The memory ICs on the DIMM itself run at an even lower clock speed; in this example, it is 400 MHz. DDR4 links at 4x the IC clock. DDR3 also links at 4x the IC clock. DDR2 links at 2x the IC clock. DDR links at 1x the IC clock. The underlying memory chips don't get much faster over time; mostly they just get more dense.
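The clock-domain ratios described above can be written out as a small Python sketch (the generation multipliers are taken from the post, nothing else is assumed):

```python
# Clock domains per DDR generation: data rate = 2 x link clock (DDR signalling),
# and the link clock is a generation-specific multiple of the internal IC clock.
LINK_MULT = {"DDR": 1, "DDR2": 2, "DDR3": 4, "DDR4": 4}

def clocks_mhz(generation: str, data_rate_mts: int):
    """Return (link clock, internal IC clock) in MHz for a module rating."""
    link = data_rate_mts / 2          # two transfers per link-clock cycle
    return link, link / LINK_MULT[generation]

print(clocks_mhz("DDR4", 3200))  # (1600.0, 400.0)
```

So a DDR4-3200 stick has a 1600 MHz link clock but only a 400 MHz internal array clock, while old DDR-400 ran its array at the full 200 MHz link clock.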


----------



## papupepo (Mar 18, 2017)

bug said:


> Parallel programming is not extremely difficult. In fact, it can be fairly easy to do (look at Erlang or Go's goroutines). But most of the time it is more tedious to write and harder to test/maintain.
> Caching has nothing to do with multi-threading. Caching is there to avoid memory read/writes, it doesn't actually care whether the CPU is running 1 or 1,000 threads.
> L1 and L2 caches are always split and I know of no one trying to write multithreaded code in order not to upset L1 and L2 caches. If anything, that's a compiler's or a scheduler's job. I don't see why things would be any different when we're talking about L3 cache.


Like I said, we should learn about cache coherency. When data is shared by multiple cores and one core updates it, the other cores must be notified immediately and their caches invalidated.
When data is shared across CCXs, the cost is very high. When data is shared by the cores on one CCX, or on an Intel processor, the cost is much lower, but it's still costly. So you shouldn't update shared data carelessly; you should create an update thread and put tasks on it. This can't be done by any scheduler or compiler.
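A minimal Python sketch of the update-thread pattern described above: workers never mutate the shared state directly; they enqueue deltas and one dedicated thread applies them, so only one core ever dirties the cache line holding the state. (Python's GIL hides the actual coherency cost; the structure is what carries over to native code.)

```python
import queue
import threading

updates = queue.Queue()
state = {"counter": 0}

def writer():
    """The only thread that ever mutates `state`."""
    while True:
        delta = updates.get()
        if delta is None:            # sentinel: shut down
            break
        state["counter"] += delta

w = threading.Thread(target=writer)
w.start()

# Any number of worker threads can enqueue updates safely.
workers = [threading.Thread(target=updates.put, args=(1,)) for _ in range(8)]
for t in workers:
    t.start()
for t in workers:
    t.join()

updates.put(None)                    # sentinel goes in after all deltas
w.join()
print(state["counter"])              # 8
```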


----------



## Nebulous (Mar 18, 2017)

Aenra said:


> @nem.. dude how many more threads are you going to post that in? We got it



@Aenra    He's multi-threaded, although his OS seems to have copy/paste issues


----------



## bug (Mar 18, 2017)

papupepo said:


> Like I said, we should learn about cache coherency. When data is shared by multiple cores, if a core update the data, the other cores must be immediately notified and the caches of the cores must be invalidated.
> When data are shared by multiple CCX, the cost is very high. When data are shared by the cores on one CCX, or on a Intel processor, the cost is much lower, but it's still costly. So you shouldn't update shared data carelessly. You should create an update thread and put tasks on it. This can't be done by any scheduler or compiler.


You might want to read this: https://arstechnica.com/gadgets/201...s-ryzen-at-games-but-how-much-does-it-matter/
It touches on how developers feel about optimizing for each and every CPU, but there's more interesting stuff in there.


----------



## Super XP (Mar 18, 2017)

Legacy-ZA said:


> This is the only thing I don't like what I read about the Ryzen CPU's.
> 
> Is there room for improvement? Yep. Will it cost you a new motherboard and CPU in the near future? Yep.


Nope, Socket AM4 is guaranteed to remain the main socket until at least mid-to-late 2019, when AMD transitions to Socket AM4+ with Zen 3. So Zen 1, Zen 1+ and Zen 2 are all based on Socket AM4, and Zen 3 will be based on Socket AM4+ but backwards compatible with AM4, the same scenario, I would assume, as with AM3 and AM3+. This is going by their road maps.



bug said:


> You might want to read this: https://arstechnica.com/gadgets/201...s-ryzen-at-games-but-how-much-does-it-matter/
> It touches on how developers feel about optimizing for each and every CPU, but there's more interesting stuff in there.



I haven't read the link yet, but judging by AMD's strategy, they are making it easy to optimize and develop for their architecture, both the CPU and the GPU. AMD wants to push multi-threading as much as possible, and now devs have the ammunition to do such a thing. Will this catch on? I hope so; we've been stuck with single threading for far too long, and 4-core setups should have been dumped in the garbage by now.


----------



## cdawall (Mar 18, 2017)

Super XP said:


> Nope, Socket AM4 is guaranteed to remain the main socket until at least mid-to-late 2019, when AMD transitions to Socket AM4+ with Zen 3. So Zen 1, Zen 1+ and Zen 2 are all based on Socket AM4, and Zen 3 will be based on Socket AM4+ but backwards compatible with AM4, the same scenario, I would assume, as with AM3 and AM3+. This is going by their road maps.



Using terms like "guaranteed" in the PC world is never the best plan. AMD road maps have changed so many times it is honestly laughable to quote them as a truth. AMD will use whatever socket allows them to make the most money. If they end up with a massive change to design over the next couple of years or need to add additional PCI-e lanes expect that to change.


----------



## Super XP (Mar 18, 2017)

cdawall said:


> Using terms like "guaranteed" in the PC world is never the best plan. AMD road maps have changed so many times it is honestly laughable to quote them as a truth. AMD will use whatever socket allows them to make the most money. If they end up with a massive change to design over the next couple of years or need to add additional PCI-e lanes expect that to change.


I fully agree with your post. I am just going by what AMD said on national television when they launched Socket AM4 and Ryzen: they pretty much promised to keep AM4 for as long as possible.


----------



## cdawall (Mar 19, 2017)

Super XP said:


> I fully agree with your post. I am just going by what AMD said on national television when they launched Socket AM4 and Ryzen: they pretty much promised to keep AM4 for as long as possible.



They also said on national TV that Bulldozer was the way of the future and performed better than anything Intel had ever produced.


----------



## newtekie1 (Mar 19, 2017)

Hmm... Kind of makes me wonder if this "Infinity Fabric" would go unstable with faster RAM.  I mean, AMD recommends 3200 and 3500 MT/s RAM, so the Infinity Fabric runs at 1600/1750 MHz.  But what about 4000 MT/s RAM?  That would put the Infinity Fabric at 2000 MHz.  Will it be stable at that speed?  What if we push the RAM further?

Yes, I realize memory speeds that high are not supported, and AFAIK you can't actually even select memory speeds that high yet.  But I'm thinking of the future.  Is this interconnect going to become a limitation on memory speed down the road?  If they have problems getting the interconnect stable at higher speeds, we might be limited to 3500 MT/s memory in the future, which would suck.
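The MT/s-to-fabric-clock arithmetic above can be written out in a few lines. This is a toy Python helper assuming the 1:1 fabric-to-memory-clock tie AMD described (`fabric_clock_mhz` is a made-up name, not anything from AMD):

```python
def fabric_clock_mhz(mem_mts):
    """Infinity Fabric clock implied by a 1:1 tie to the memory clock.

    DDR memory transfers twice per clock, so the actual memory clock
    (and, per AMD, the fabric clock on "Summit Ridge") is half the
    MT/s rating. Purely illustrative arithmetic.
    """
    return mem_mts / 2

# DDR4-2133 -> 1066.5 MHz (the ~1066 MHz from the article),
# DDR4-3200 -> 1600 MHz, DDR4-4000 -> 2000 MHz.
for mts in (2133, 3200, 3500, 4000):
    print(f"DDR4-{mts} -> fabric at {fabric_clock_mhz(mts):g} MHz")
```

This reproduces the 1600/1750/2000 MHz figures in the post above.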


----------



## erek (Mar 19, 2017)

newtekie1 said:


> Hmm... Kind of makes me wonder if this "Infinity Fabric" would go unstable with faster RAM.  I mean, AMD recommends 3200 and 3500MT/s RAM, so the Infinity Fabric is running at 1600/1750MHz.  But what about 4000MT/s RAM?  That would put the Infinity Fabric running at 2000MHz.  Will it be stable at that speed?  If we push the RAM further?
> 
> Yes, I realize memory speeds that high are not supported, and AFAIK you can't actually even select memory speeds that high yet.  But I'm thinking of the future.  Is this interconnect going to become a limitation to memory speed down the road.  If they have problems getting the interconnect stable at higher speeds, we might be limited to 3500MT/s memory in the future, which would suck.



Are there any multiplier settings that impact the Infinity Fabric?  And what about voltage settings?


----------



## Super XP (Mar 19, 2017)

All I know is that Infinity Fabric scales with DDR4 RAM speed: the faster the RAM, the better the performance. AMD said they are working to increase RAM speed support as much as possible.  
I don't see why Infinity Fabric would become unstable at faster speeds. 
Infinity Fabric has a 256-bit wide bi-directional crossbar interface, or something like that lol


----------



## newtekie1 (Mar 19, 2017)

erek said:


> Are there any multiplier settings that impact the Infinity Fabric?  And what about voltage settings?



As far as I can tell, there are no multiplier settings for the Infinity Fabric; it is just 1:1 with the memory clock.  That might change in the future.  I don't know what voltage settings affect it either, but there might be an option to adjust its voltage.



Super XP said:


> The faster the Ram the better performance. AMD said they are working to increase RAM speed support as much as possible.
> I don't see why Infinity Fabric Would become unstable at faster speeds.



You very well might have answered your own question.  They need to work on adding faster RAM support because the Infinity Fabric is becoming unstable at the faster speeds.

Increasing the clock speed of anything always runs the risk of it becoming unstable.


----------



## Scrizz (Mar 19, 2017)

Hood said:


> So, the gist of this thread is, Ryzen should have been designed like an Intel CPU, with more memory channels, a monolithic core design, and a more capable IMC.  Of course, it would cost more to make (just like Intel).  So let's just make Ryzen into a clone of Intel's HEDT chips, and at the same price level.  But hey, at least we can put on an AMD case badge, to let everyone know how much we hate Intel...



That's the same vibe I'm getting.


----------



## erek (Mar 19, 2017)

Scrizz said:


> That's the same vibe I'm getting.




Can't they just make it so we can disable an entire CCX module without losing memory channels and L3 cache?  Why can't they just release a Ryzen 4-core that's a single CCX module?  I bet it would OC well, and also not suffer the horrible restrictions of the Infinity Fabric...   Seriously, it's really nutso that the 1500X is configured as 2x 2-core CCX modules and not a single 4-core CCX module.


----------



## cdawall (Mar 19, 2017)

erek said:


> Can't they just make it so we can disable an entire CCX module without losing memory channels and L3 cache?  Why can't they just release a Ryzen 4-core that's a single CCX module... bet it would OC well, and also not suffer the horrible restrictions of the Infinite Fabric....   seriously it's really nutso that the 1500X is configured to be 2x 2-core CCX modules and not a single 4-core CCX module



That should leave it with a 16 MB L3 instead of 8 MB. I would imagine in the future we will see an "Athlon"-style product using a single CCX.


----------



## newtekie1 (Mar 19, 2017)

erek said:


> Can't they just make it so we can disable an entire CCX module without losing memory channels and L3 cache?  Why can't they just release a Ryzen 4-core that's a single CCX module... bet it would OC well, and also not suffer the horrible restrictions of the Infinite Fabric....   seriously it's really nutso that the 1500X is configured to be 2x 2-core CCX modules and not a single 4-core CCX module



The L3 cache is part of the CCX, so you can't disable a CCX and keep all the L3 cache.

But I guarantee we will see a 4-core that is just a single CCX.  The CCX has no effect on memory channels; the memory controller is not part of the CCX.


----------



## Super XP (Mar 19, 2017)

Yes, the Ryzen 3 is a quad core; 8 threads coming soon. 

I read somewhere that AMD is going to release an update to enable a higher DDR4 multiplier to increase the RAM speed, hence increasing the Infinity Fabric clock.


----------



## erek (Mar 20, 2017)

Super XP said:


> Yes the Ryzen 3 is a Quad Core. 8threads coming soon.
> 
> I read somewhere that AMD is going to release an update to enable a higher DDR4 multiplier to increase the RAM speed, hence increasing the Infinity Fabric.




I want a full-bore (1800X-style) single-module 4-core CCX chip to totally do away with the Infinity Fabric interconnect across cores.   It is my thinking that such a chip could OC very well, and perform better in gaming without dealing with the Infinity Fabric bus for core interconnect.


----------



## newtekie1 (Mar 20, 2017)

Super XP said:


> Yes the Ryzen 3 is a Quad Core. 8threads coming soon.
> 
> I read somewhere that AMD is going to release an update to enable a higher DDR4 multiplier to increase the RAM speed, hence increasing the Infinity Fabric.



I can almost guarantee the first batch of the Ryzen Quad-Cores will be 8 core chips with 4 cores disabled, 2 cores from each CCX disabled.

Eventually they will release the single-CCX 4-cores, probably with a model number bump to something like R3 1250X instead of the 1200X.


----------



## erek (Mar 20, 2017)

newtekie1 said:


> I can almost guarantee the first batch of the Ryzen Quad-Cores will be 8 core chips with 4 cores disabled, 2 cores from each CCX disabled.
> 
> Eventually they will release the single CCX 4 cores, probably with a model number bump to something like R3 1250X or something instead of the 1200X.



Pretty sad about 2x 2-core CCX modules making up a 4-core.    Just want to be rid of the Infinity Fabric slowness/limitations.


----------



## cdawall (Mar 20, 2017)

erek said:


> Pretty sad about 2x 2-core CCX modules to make up a 4-core    Just want rid of Infinite Fabric slowness / limitations



Why would AMD not try to capitalize on half-dead CCXs? It also gives customers 16 MB of L3 cache on a quad core.


----------



## cadaveca (Mar 20, 2017)

cdawall said:


> Why would AMD not try to capitalize on half dead CCX's? It also gives customers a 16MB L3 cache on a quad core.


Look at the die shots and you'll know why that won't happen. 8 MB, maybe; 16 MB? Not very likely. It looks almost as though it is 8x 2 MB when you examine the die shots.


----------



## cdawall (Mar 20, 2017)

cadaveca said:


> Look at the die shots and you'll know why that won't happen. 8 MB, maybe, 16 MB? Not very likely. It is almost as though it is 8x 2 MB, when you look at the die shots.



The spec charts list full cache for the quads still? I mean they are prerelease, but still.


----------



## bug (Mar 20, 2017)

cdawall said:


> The spec charts list full cache for the quads still? I mean they are prerelease, but still.


Think about it: even if the whole L3 cache were left in there, the only way to address it would be over the Infinity Fabric. Same as reading from RAM.


----------



## cadaveca (Mar 20, 2017)

bug said:


> Think about it: even if the whole L3 cache was left in there, the only way to address it would be over InfinityFabric. Same as reading from RAM.


Exactly. If they kill 2 cores on each CCX, half the L3 would effectively be an L4 cache over the Infinity Fabric. Which might not be a bad thing. 







With this picture, you can see that the L3 cache is surrounded by 4 cores. Each core (with SMT) is effectively its own device with its own slice of L3, and we have 16 blocks of L3 in each CCX.


----------



## L'Eliminateur (Mar 20, 2017)

Wow, this is incredibly SHITTY. No matter how you spin it, at the tech level it's terribly bad. Let me expand:

First, you have a non-monolithic-ish CPU design, not quite MCM but you could call it "MCM on die", whose CCXs communicate with each other, and with the other uncore parts, over a 256-bit crossbar. That by itself already kills your performance, since inter-CCX communication uses this slow-ass bus; as PC Perspective's benchmarks showed, it wreaks havoc on cache coherency and L3 access beyond the local cache (Intel does not have this issue).

And then you compound that by tying the bus to EXTERNAL RAM speed, essentially making the memory controllers the "bus master". That's beyond bad design; it's appallingly bad.

Intel, with their HCC (high core count) Xeons, does something similar, as their ring bus maxes out at 16 cores: for 22 cores there are 2 ring buses connected to each other with bus bridges that have a very small impact on performance, and each ring has a dedicated memory controller. In the BIOS you can enable "cluster on die" mode, which turns the single chip (it's still a monolithic die) into NUMA nodes for performance reasons (to stop the right ring from accessing the left ring's RAM space over the bridges). THAT is what you call an elegant and sophisticated design; the ring bus does not depend on the RAM or core clocks.

On one hand, AMD touts Ryzen as the expensive-Intel killer... but it needs the most expensive, un-buyable memory to actually perform better? Top kek there, AMD...

So Intel not only has a very big IPC lead with KBL, but they can maintain that IPC regardless of whatever shitty cheap RAM you throw into the system.

Also remember that this is going into Naples, and server ECC RDIMM memory tops out at 2400, plus massive fabric overhead, as Naples will be an MCM (maybe even an interposer to route the massive 256-bit bus) of 4 Ryzen dies; each die provides 2 channels of RAM, so inter-chip RAM/cache access will be slow as molasses.


----------



## bug (Mar 20, 2017)

cadaveca said:


> Exactly. If they kill 2 cores on each CCX, 1/2 the L3 would effectively be an L4 cache over the infinity fabric. Which might not be a bad thing.
> 
> 
> 
> ...


Well, that "L4 cache" would be accessible at the same speed as RAM (because of InfinityFabric), so it would be pretty pointless. And that's on top of the thing that no OS or application is L4 cache aware in order to do anything useful with it.


----------



## cdawall (Mar 20, 2017)

L'Eliminateur said:


> so Intel has not only a very big IPC lead with KBL, but they can maintain that IPC regardless of whatever shitty cheap ram you throw into the system.



No they don't.



bug said:


> Think about it: even if the whole L3 cache was left in there, the only way to address it would be over InfinityFabric. Same as reading from RAM.



I don't disagree; I am just letting you know how it is listed. It could be used as a failover cache: fill the L3 and spill the rest over the Infinity Fabric, similar to NVIDIA's 3.5 GB + 512 MB GTX 970.


----------



## L'Eliminateur (Mar 20, 2017)

cdawall said:


> No they don't.


Yes they do. KBL IPC is far, far higher than Ryzen's; Ryzen's IPC compares somewhat to Broadwell, KBL is 2 gens above, and all the benchmarks sustain that.


----------



## cdawall (Mar 20, 2017)

L'Eliminateur said:


> yes they do, KBL IPC is far far higher than ryzen, ryzen IPC compares somewhat to broadwell and KBL is 2 gens above and all benchs do sustain that.





> It's not surprising - but it is a bit shocking to see in this form: *Kaby Lake truly does offer zero to the consumer in terms of clock for clock performance.* (In fact, a couple of the results show it slower than the Skylake, but these are within the margin of error.) Enthusiasts and analysts have often lamented the "slow" progression of IPC changes on Intel's Core architecture since the introduction of Sandy Bridge, increasing just 3-6% on the product release cadence.



source

There isn't even a 10% difference. Saying the IPC is "far far higher" is like describing the gap between Bulldozer and Skylake, something on the order of 50%. Single-digit differences are not "far far higher".


----------



## newtekie1 (Mar 20, 2017)

erek said:


> Pretty sad about 2x 2-core CCX modules to make up a 4-core    Just want rid of Infinite Fabric slowness / limitations



The only real limitation is when a core on one CCX has to access cache on the other CCX.  In that case, the L3 on the other CCX acts more like an L4, but it is still faster than accessing system RAM.  There is no getting rid of the Infinity Fabric; it is the interconnect between the CPU cores and the rest of the system.


----------



## Super XP (Mar 21, 2017)

They need to somehow increase the speed of the Infinity Fabric, hence the push for faster DDR4 RAM. I am wondering why AMD based the Infinity Fabric's speed on the RAM rather than on the CPU frequency.


----------



## erek (Mar 21, 2017)

https://www.hardocp.com/news/2017/03/20/amd_ryzen_iommu_b350_chipset_challenges


----------



## erek (Mar 21, 2017)

*AMD Confirms It's Issuing a Fix To Stop New Ryzen Processors From Crashing Desktops *


----------



## __isomorph__ (Apr 6, 2017)

L'Eliminateur said:


> wow this is incredibly SHITTY, no matter how you spin it, at the tech level it's terribly bad, letme expand:
> 
> first you have a non-monolithic-ish CPU design -not quite MCM but you could call it "MCM on die"- that communicates with each other on a 256bit bus as well as the other uncore parts with a crossbar configuration, that by tiself already kills your performance as the inter CCX comm use this slow-ass bus, as pc perspective benchmarks showed it wreaks havoc on cache coherency and L3 access beyond local cache(intel does not have this issue).
> 
> ...




christ!!!  thanks for the post. i read enough: not buying Ryzen to do my 1st build ever.  even though i am a noob, i could feel something was amiss as i pored over countless reviews, articles, and forum posts re Ryzen.  smells like smoke and the bird is crashing down in flames, it seems.  perhaps AMD will codename v2 'Phoenix'. that'd be funny. 

how i was hoping though to get something sensible to kick Intel's ass at a lower price point. damn, what a downer.


----------



## __isomorph__ (Apr 6, 2017)

i now hang my head in shame. everybody, cancel my last post.  after watching this:







the fog of war has lifted and it now seems clear Ryzen 7, especially 1700, but also 1800X, is the superior CPU.  

watch the stats of Ryzen vs 7700K running BF1 starting @ 12:57.  as the reviewer says, the Intel chip has no headroom left and is at its limit, and with optimization! whereas Ryzen, without any optimization, is barely breaking a sweat with plenty of headroom left.  i predict that once the optims start rolling in, carnage and mayhem will ensue leaving the broken carcass of the Intel 7700K on the floor like the rotten corpse that it is.


----------



## __isomorph__ (Apr 6, 2017)

L'Eliminateur said:


> wow this is incredibly SHITTY, no matter how you spin it, at the tech level it's terribly bad, letme expand:



i take back my previous comment. check out this reviewer's youtube vid i post above.


----------



## ratirt (Apr 6, 2017)

__isomorph__ said:


> i now hang my head in shame. everybody, cancel my last post.  after watching this :
> 
> 
> 
> ...


That was a great video, and I like that dude's Scottish accent.  Anyway, that explains everything. It is really nice that somebody managed to make such a video and explained how those Ryzen and Intel CPUs compare. Ryzen is better than the Intel i7 7700K, and within a year Ryzen will mop the floor with the i7 7xxx series. This video shows how you should be looking at these CPUs. Great point of view, and the only one that's actually valid.

BTW.
I wonder when we can expect a game fully optimized for Ryzen. I'd like to see how those CPUs perform then. Also, how will Ryzen+ perform, and when will they release it?


----------



## SkOrPn (Apr 10, 2017)

Legacy-ZA said:


> Dual Channel seems to be one of the problems, if they brought it out with Triple / Quad, these would have performed way better.



They are bringing out quad-channel memory for their high-end Ryzen desktop platform, which will have 12-core and 16-core Ryzen parts for the real PC enthusiast crowd, assuming that crowd exists. This not-yet-announced platform will require a new chipset, probably X399 (or something along those lines), and a new, much larger LGA socket, which will undoubtedly require more expensive boards. Expect $300-400 boards and $600-800 Ryzen parts using "roughly" the same socket as Naples. I say roughly because AMD will probably have the board designs built in such a way that Naples chips may not work on them at all, or heck, maybe they will.

I would love to see Asus release a Rampage Extreme-tier X399 board for the 12- or 16-core Ryzen parts. That would be a first for AMD. Although I'm not sure we have enthusiasts of that caliber any longer these days. Ten years ago I would have dropped 5K on a machine just because, but today I don't even want to spend 2K if I can help it.


----------



## Super XP (May 11, 2017)

The Infinity Fabric spec is a 256-bit wide bi-directional crossbar. Would a 512-bit wide bi-directional crossbar be more beneficial?


----------



## bug (May 11, 2017)

Super XP said:


> Infinity Fabric specs is a 256-bit wide bi-directional crossbar. Would a 512-bit wide bi-directional crossbar be more beneficial?


I'm pretty sure AMD has measured it from all points of view and 256 bits is the best compromise today. Whether it's that 512 bits wouldn't add any significant improvement, or that it would only do so by (further) sacrificing memory compatibility or blowing out the max TDP, there must be a solid reason they went with 256.

And please note that the IF in its current incarnation is not a problem. It can have an impact on some workloads that are rather rare in typical desktop usage. It's only mentioned because users need to know about it, lest they go with Ryzen and later find out (for whatever reason) they need to run one of these workloads for significant periods of time. For 99.99% of users, the IF has literally no impact.


----------



## uuuaaaaaa (May 11, 2017)

SkOrPn said:


> They are bringing out quad channel memory for their High-End Ryzen Desktop Platform which will have 12 core and 16 core Ryzen parts for the real PC Enthusiast crowd, assuming that crowd exists. This not yet announced platform will require a new chipset probably X399 (or something along those lines) and a new much larger LGA socket, which will undoubtedly require more expensive boards. Expect $300-400 boards and $600-800 Ryzen parts using "roughly" the same socket as Naples. I say roughly because AMD will probably have the board designs built in such a way that Naple chips may not work on them at all, or heck maybe they will.
> 
> I would love to see Asus release a Rampage+Extreme X399 tier using 12 or 16 core Ryzen parts. That would be a first for AMD. Although not sure we have Enthusiasts of that caliber any longer these days. Ten years ago I would have dropped 5K on a machine just because, but today I don't even want to spend 2K if I can help it.



Given AMD's history, it would not surprise me to see consumer-grade boards compatible with Naples if the socket is the same. Heck, we could see a dual-socket Naples board like the old eVGA SR-2; now that would be monstrous!


----------



## eidairaman1 (May 11, 2017)

uuuaaaaaa said:


> Given AMD's history it would not surprise me to see consumer grade boards compatible with Naples if the socket is the same. Heck We could see a dual socket Naples board like the old eVGA SR-2, now that would be monstrous!


It uses a different socket, akin to LGA 2011-3 or 2066.


----------



## uuuaaaaaa (May 11, 2017)

eidairaman1 said:


> Uses different socket, akin to 2011-3 or 2066


I am aware that this new HEDT platform will use a different socket. But if they used the same LGA socket as their Naples server platform, we could see some 32C/64T parts in "consumer" computers. In the past, AMD has allowed that.


----------

