Thursday, December 9th 2021

12-channel DDR5 Memory Support Confirmed for Zen 4 EPYC CPUs by AMD

Thanks to a Linux driver update, we now know that AMD's upcoming Zen 4 based EPYC CPUs will support up to 12 channels of DDR5 memory, an upgrade over the current eight. The EDAC (Error Detection and Correction) driver update from AMD contains details of the memory types supported by AMD's upcoming server and workstation CPUs, and although this doesn't tell us much about what we'll see from the desktop platform, some of it might spill over to a future Ryzen Threadripper CPU.

The driver also reveals that there will be support for both RDDR5 and LRDDR5, which translate to Registered DDR5 and Load-Reduced DDR5 respectively. LRDDR5 is the replacement for DDR4 LRDIMMs, which are used in current servers with very high memory densities. Although we don't know when AMD is planning to announce Zen 4, let alone the new EPYC processors, it's expected to be some time in the second half of next year.
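For a rough sense of scale, here is a back-of-the-envelope sketch of peak theoretical memory bandwidth per socket, counting channels as 64-bit DDR4 equivalents. The DDR5-4800 transfer rate is an assumption for illustration only; the driver update names the channel count and memory types, not speeds.

```python
# Rough, illustrative comparison of peak theoretical memory bandwidth per socket.
# The DDR5-4800 speed grade is an assumption; only channel count and memory
# types appear in the EDAC driver update.

def peak_gbs(channels: int, bits_per_channel: int, mts: int) -> float:
    """Peak bandwidth in GB/s: channels * channel width in bytes * transfers per second."""
    return channels * (bits_per_channel / 8) * mts * 1e6 / 1e9

current_epyc = peak_gbs(channels=8, bits_per_channel=64, mts=3200)   # 8x DDR4-3200
zen4_epyc = peak_gbs(channels=12, bits_per_channel=64, mts=4800)     # 12x DDR5 (assumed speed)

print(f"8-channel DDR4-3200:  ~{current_epyc:.0f} GB/s")  # ~205 GB/s
print(f"12-channel DDR5-4800: ~{zen4_epyc:.0f} GB/s")     # ~461 GB/s
```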
Source: Phoronix

63 Comments on 12-channel DDR5 Memory Support Confirmed for Zen 4 EPYC CPUs by AMD

#27
trsttte
Valantar: Doubtful - datacenter workloads are bandwidth limited, consumer workloads are almost never that. Doubling the channel count would also significantly increase the number of traces run from the CPU to RAM (which is already ~500 traces for a 2-channel setup), making motherboard design a lot more complex and expensive (on top of the increases we've already seen for DDR5). Pretty much the only consumer workload that would benefit from 4-channel RAM would be iGPU gaming, and they're definitely not designing AM5 around that as a central use case.
A solution would be to use the same split layout with memory on both sides of the CPU, like it's done on HEDT and server boards. Of course, this is just wishful thinking; it's not really needed and very unlikely to happen.

Anyway, there's still something to be said for how interesting the concept is: with how crazy high-performance the 5950X already is, and the big price premium on even the lowest-end Threadripper boards and CPUs, it would make for a more affordable "entry-level" workstation (still 16 freaking cores). But then again, why would AMD care when it's just more profitable not to, right?
#28
Valantar
trsttte: A solution would be to use the same split layout with memory on both sides of the CPU, like it's done on HEDT and server boards. Of course, this is just wishful thinking; it's not really needed and very unlikely to happen.

Anyway, there's still something to be said for how interesting the concept is: with how crazy high-performance the 5950X already is, and the big price premium on even the lowest-end Threadripper boards and CPUs, it would make for a more affordable "entry-level" workstation (still 16 freaking cores). But then again, why would AMD care when it's just more profitable not to, right?
But that's the thing: for the vast majority of even workstation use cases, the benefit of TR is cores, not memory bandwidth. The memory scaling articles I linked above include quite a few workstation tasks, and some of them scale to some degree with memory bandwidth, but even two channels of DDR5-6000 are way into diminishing returns on a 16c32t CPU. It's possible that the same workloads on a 32+ core CPU would be able to make use of more channels, but for the vast majority of cases there simply wouldn't be a point to more than two - especially with the bandwidth benefits of DDR5. Anandtech's Bench comparison engine lets us make good comparisons, for example of the 5950X and Threadripper Pro 3995WX (the 8-channel version that's essentially an EPYC), and while the 3995WX wins quite a few workloads, so does the 5950X, and margins are often much closer than you'd expect even in workstation tasks where the 64-core chip wins. IPC and clock speeds often matter quite a lot more than massive core counts and memory bandwidth (Anandtech tests at JEDEC RAM speeds, so both should be running at DDR4-3200, but the 3995WX then has 4x the bandwidth).

If a channel increase for consumer platforms were to happen, I think a split layout would be a necessity - otherwise you're looking at server-grade 10+ layer PCBs just to make the RAM work at all with 4 discrete channels to one side. Another disadvantage of this would be the wholesale exclusion of this range from the ITX/SFF market, which has seen dramatic growth over the past half decade or so. Ironically this is also where iGPU gaming makes the most sense, yet the only way to fit 4 DIMMs on an ITX board is to go SODIMM, and even then it gets tricky - and really expensive. I don't see why they would even consider this given that DDR5 already delivers 2x the bandwidth on an equivalent channel count and will go far higher still.
#29
user556
I was playing around with some random-number algorithm testing a while back on a Ryzen 1700X when it was still a new beast and, although I never compared with other hardware, one of the tests took a distinct per-task performance hit when ratcheting up parallel instances. The reason was clear enough: each instance had a 4 GB map that was randomly indexed. This naturally exhausted the cache in short order, and any duplication just exacerbated the problem.

More RAM channels, in that situation, I'm sure would've helped a lot.
#30
Valantar
user556: I was playing around with some random-number algorithm testing a while back on a Ryzen 1700X when it was still a new beast and, although I never compared with other hardware, one of the tests took a distinct per-task performance hit when ratcheting up parallel instances. The reason was clear enough: each instance had a 4 GB map that was randomly indexed. This naturally exhausted the cache in short order, and any duplication just exacerbated the problem.

More RAM channels, in that situation, I'm sure would've helped a lot.
Well, sure, but how common a workload is that? I never said there are no workloads that can make use of more memory bandwidth - there's a reason why servers want that bandwidth after all - just that they are exceedingly rare in consumer applications.
#31
user556
Wasn't trying to say you were wrong. Just not everything is your average gaming rig. And the stereotyped consumer shouldn't be the only focus, especially since the vast majority of them only use a phone for everything now anyway.

Hell, crypto-mining has proven that.

Oh, and sometimes, when things are standard issue they get used.
#32
trsttte
user556: Wasn't trying to say you were wrong. Just not everything is your average gaming rig. And the stereotyped consumer shouldn't be the only focus, especially since the vast majority of them only use a phone for everything now anyway.

Hell, crypto-mining has proven that.

Oh, and sometimes, when things are standard issue they get used.
I mean yes, it just so happens that for the not-average consumer/gaming rig you sadly have to pay the HEDT premium. Intel was king at this, making sure simple-ish things like ECC never got any traction in the consumer market.
#33
Valantar
user556: Wasn't trying to say you were wrong. Just not everything is your average gaming rig. And the stereotyped consumer shouldn't be the only focus, especially since the vast majority of them only use a phone for everything now anyway.

Hell, crypto-mining has proven that.

Oh, and sometimes, when things are standard issue they get used.
trsttte: I mean yes, it just so happens that for the not-average consumer/gaming rig you sadly have to pay the HEDT premium. Intel was king at this, making sure simple-ish things like ECC never got any traction in the consumer market.
Yeah, there's always a balancing act here, and Intel has historically skewed towards what I think is too restrictive overall - ECC is a good example of this. Still, it doesn't make sense to make MSDT platforms more expensive to accommodate tiny niche use cases that far less than 1% of users will ever touch, let alone have as a frequent task. That's what HEDT and enterprise platforms are for, after all - for those niche use cases.

Also, both Anandtech's and TPU's CPU benchmark suites go far beyond what would be done on "your average gaming rig" or what "stereotyped consumers" would do, including scientific computation, simulation, modelling, complex rendering, AI/ML, and other heavy workstation tasks - which is why I used those as examples of even demanding tasks often not scaling well with memory bandwidth. One way of wording this that I've heard repeated quite a lot across both forums and high-quality technical reporting: if you have a workload that benefits significantly from RAM bandwidth, chances are you are very well aware of this.

There is absolutely an argument for "build it and they will come" in terms of features (high-speed I/O is very much in that category - there are few applications that max out even PCIe 3.0 x16, let alone 4.0 or 5.0; there are very few peripherals that make use of even 10Gbps USB, let alone 20 or 40, etc.), but that needs to be weighed against the realism of that potential as well as the cost of implementing it, and increasing MSDT RAM channel counts just doesn't pass any type of reasonable bar there. If reasonably common tasks could make better use of RAM bandwidth they likely already would (HEDT exists, after all, and has for a decade), and the added cost would be very significant, on top of motherboard prices having increased dramatically over the past few years.
#34
user556
Except DDR5 DIMMs have channel doubling as standard. All it takes is to also add that internally to the RAM controllers in the CPUs.
#35
Valantar
user556: Except DDR5 DIMMs have channel doubling as standard. All it takes is to also add that internally to the RAM controllers in the CPUs.
CPUs have controllers to address every channel that's available from each DIMM - they're not leaving half the DIMM idle and unused. DDR4 controllers (typically) ran a single 64-bit channel each; DDR5 controllers (typically) run two 32-bit channels. They could of course make single-channel DDR5 controllers as well, but then they would always implement two of them, as the DIMMs all have two interfaces and leaving that performance on the table would be downright silly.

The question here isn't that; it's whether the "12 channels" in the leak means 12 actual channels (i.e. 12x32-bit bus width) or 12 DDR4-equivalent "channels", which is how channels have been described and spoken of for a decade or more (i.e. 12x2x32-bit bus width).
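To make the two readings concrete, here's a small illustrative sketch of the resulting bus widths and DIMM counts; it makes no claim about which reading the EDAC driver actually intends.

```python
# Two ways to read "12 channels" for a DDR5 memory controller (illustrative only).
SUBCHANNEL_BITS = 32  # DDR5 exposes two independent 32-bit sub-channels per DIMM

# Reading A: 12 actual DDR5 sub-channels -> 6 DIMMs per socket at 1DPC, 384-bit data bus
reading_a_bus = 12 * SUBCHANNEL_BITS
reading_a_dimms = 12 // 2

# Reading B: 12 DDR4-equivalent channels -> 24 sub-channels, 12 DIMMs at 1DPC, 768-bit data bus
reading_b_bus = 12 * 2 * SUBCHANNEL_BITS
reading_b_dimms = 12

print(f"A: 12 x 32-bit     = {reading_a_bus}-bit bus, {reading_a_dimms} DIMMs per socket (1DPC)")
print(f"B: 12 x 2 x 32-bit = {reading_b_bus}-bit bus, {reading_b_dimms} DIMMs per socket (1DPC)")
```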
#36
user556
Right, DDR5 DIMMs have two channels per DIMM. Intel clearly hasn't documented things that way thus far. Question is: is AMD following suit with these 12 channels? Is that 12 x 64-bit (as per Intel's 12900K) or 12 x 32-bit? That was the opening question in the first post of this topic - www.techpowerup.com/forums/threads/12-channel-ddr5-memory-support-confirmed-for-zen-4-epyc-cpus-by-amd.289751/post-4663234

Obviously, if Intel is technically correct in how their controller is operating, then that can only mean they've merged pairs of 32-bit channels - two pairs - to make half the number of 64-bit logical channels: the stated two channels. Which can make sense too, because the RAM controller also has to handle DDR4 - better reuse of hardware when switching modes.
#37
Valantar
user556: Right, DDR5 DIMMs have two channels per DIMM. Intel clearly hasn't documented things that way thus far. Question is: is AMD following suit with these 12 channels? Is that 12 x 64-bit (as per Intel's 12900K) or 12 x 32-bit? That was the opening question in the first post of this topic - www.techpowerup.com/forums/threads/12-channel-ddr5-memory-support-confirmed-for-zen-4-epyc-cpus-by-amd.289751/post-4663234
Yes, and that's what we've been discussing ever since, though with an aside into whether increasing consumer platform memory channels is feasible.
user556: Obviously, if Intel is technically correct in how their controller is operating, then that can only mean they've merged pairs of 32-bit channels - two pairs - to make half the number of 64-bit logical channels: the stated two channels. Which can make sense too, because the RAM controller also has to handle DDR4 - better reuse of hardware when switching modes.
Their controllers operate with two 32-bit channels, otherwise it wouldn't be a DDR5 controller - this is entirely determined by the standard. There is no way of "merging" them that wouldn't make it incompatible with the RAM. Thus they aren't technically correct, they are practically correct - their CPUs have "two channels" in terms of being equivalent to two DDR/DDR2/DDR3/DDR4 channels. That DDR5 mucks this up by splitting the channels in two makes communicating this to anyone not very technically inclined nearly impossible (calling it quad channel would verge on misleading advertising, as it would imply 2x more than previous generations, and getting into the weeds of channel width in marketing or anything like that is unlikely to go well). They thus have to choose between clarity of communication and technical accuracy, and (rightly, IMO) they have gone for the former.

AMD is IMO very likely to do the same, as the work of constantly reminding everyone that DDR5 channels are half as wide as DDR4 channels is going to get very annoying very quickly, and it would necessitate negatively loaded terms like "half-width channels" or something like that ("32-bit channels" wouldn't cut it, as customers can't be expected to know how wide channels were previously). It is also ultimately a moot point in anything but the technical workings of this, as the end result is ~the same due to each DIMM now also being dual-channel. In essentially every scenario where you're trying to communicate the capabilities of your CPU or platform, sticking to technically wrong "aggregate" channels is less confusing and more informative than technically accurate language.
#38
trsttte
Valantar: AMD is IMO very likely to do the same, as the work of constantly reminding everyone that DDR5 channels are half as wide as DDR4 channels is going to get very annoying very quickly, and it would necessitate negatively loaded terms like "half-width channels" or something like that ("32-bit channels" wouldn't cut it, as customers can't be expected to know how wide channels were previously). It is also ultimately a moot point in anything but the technical workings of this, as the end result is ~the same due to each DIMM now also being dual-channel. In essentially every scenario where you're trying to communicate the capabilities of your CPU or platform, sticking to technically wrong "aggregate" channels is less confusing and more informative than technically accurate language.
I can already see the confusion that would ensue: "but new computers are all 64 bits, how come now we're going back to 32 bits?"
#39
user556
Valantar: Yes, and that's what we've been discussing ever since, though with an aside into whether increasing consumer platform memory channels is feasible.

Their controllers operate with two 32-bit channels, otherwise it wouldn't be a DDR5 controller - this is entirely determined by the standard. There is no way of "merging" them that wouldn't make it incompatible with the RAM.
With existing DDR5 DIMMs, that would be a lot slower on both bandwidth and latency compared to DDR4 in a dual-channel setup - which is how all these CPU+mobo combos are typically configured. Testing has shown the 12900K getting higher bandwidth with DDR5.

When the specs say dual-channel, Intel will mean 2 x 64-bit in both DDR4 and DDR5. Merging 2 x 32-bit to make it act like 1 x 64-bit will be easy enough inside the RAM controller section of the CPU.
#40
Valantar
user556: With existing DDR5 DIMMs, that would be a lot slower on both bandwidth and latency compared to DDR4 in a dual-channel setup - which is how all these CPU+mobo combos are typically configured. Testing has shown the 12900K getting higher bandwidth with DDR5.

When the specs say dual-channel, Intel will mean 2 x 64-bit in both DDR4 and DDR5. Merging 2 x 32-bit to make it act like 1 x 64-bit will be easy enough inside the RAM controller section of the CPU.
Again: no. There is no merging going on. DDR5 is explicitly designed to run two concurrent 32-bit channels per DIMM (or per n DIMMs in an nDPC layout). The controller treats them separately because that's how they work. This entire discussion is about the terms used to describe this, as the reduction in channel width and the concurrent doubling in channel count makes this confusing. The only "merging" going on is in how this is described.
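A per-DIMM sketch of that point, using DDR4-3200 and DDR5-4800 purely as illustrative speed grades (not benchmark figures):

```python
# Per-DIMM view: DDR5's two 32-bit sub-channels add up to the same 64-bit data
# width as one DDR4 channel, so no bus width is "lost" - the extra bandwidth
# comes from the higher transfer rate, and the split allows two independent
# transactions in flight per DIMM. Speed grades below are illustrative only.

def dimm_peak_gbs(sub_channels: int, bits_each: int, mts: int) -> float:
    """Peak theoretical bandwidth of one DIMM in GB/s."""
    return sub_channels * (bits_each / 8) * mts * 1e6 / 1e9

ddr4 = dimm_peak_gbs(sub_channels=1, bits_each=64, mts=3200)  # one 64-bit channel
ddr5 = dimm_peak_gbs(sub_channels=2, bits_each=32, mts=4800)  # two 32-bit sub-channels

print(f"DDR4-3200 DIMM: 1 x 64-bit, ~{ddr4:.1f} GB/s, one transaction stream")
print(f"DDR5-4800 DIMM: 2 x 32-bit, ~{ddr5:.1f} GB/s, two independent streams")
```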
#41
user556
The Intel specs say two channels, and that'll be 2 x 64-bit; anything less would be a backward step.

To get that, the DDR5 DIMMs are effectively being treated as 1 x 64-bit channel each. And to do that, the two physical 32-bit channels of each DIMM have to be merged into an effective single 64-bit channel. Two effective channels being what Intel is specifying.
#42
Valantar
user556: The Intel specs say two channels, and that'll be 2 x 64-bit; anything less would be a backward step.

To get that, the DDR5 DIMMs are effectively being treated as 1 x 64-bit channel each. And to do that, the two physical 32-bit channels of each DIMM have to be merged into an effective single 64-bit channel. Two effective channels being what Intel is specifying.
No. They have to be spoken of as if they were analogous to previous DDR standards. Which makes sense when you're trying to communicate something. There is absolutely zero requirement for there to be an actual "merging" on any level except language for this to work. None. Language is a pragmatic medium, not an exact one, and has no causal or indexical relation to that which it signifies.

It's the same thing with LPDDR4X, which also has 32-bit channels - Renoir and other chips using it still report "dual channel" memory despite actually having 4 32-bit channels. Why? Because in terms of communication, keeping "channels" as meaning "64-bit channels or pairs of channels" (or fours, given that LPDDR4X can even use 16-bit channels) is the only way of ensuring some form of understanding in communication. The system handles however many actual channels there are on its own regardless of this - the purpose of these designations is communication, not perfect technical description. So there is no requirement for these channels to be merged in any way; the only requirement is a consideration of what makes for the clearest communication.
#43
user556
So you think that when Intel says it is dual channel that in reality it is four channels that can fetch/write four independent memory addresses in parallel?

What I'm saying is that there are only two independent channels - as stated by Intel. But to achieve that while still utilising all four electrical data buses of two DDR5 DIMMs, it will require merging. Otherwise two channels will be 2 x 32-bit only, which will lose throughput compared to 2 x 64-bit DDR4 DIMMs.
#44
Valantar
user556: So you think that when Intel says it is dual channel that in reality it is four channels that can fetch/write four independent memory addresses in parallel?
Yes. Exactly this. That is the entire point of DDR5 moving to its split-channel layout - there are performance and latency gains to be had through this, while bus width and trace layout complexity are kept the same.
user556: What I'm saying is that there are only two independent channels - as stated by Intel. But to achieve that while still utilising all four electrical data buses of two DDR5 DIMMs, it will require merging. Otherwise two channels will be 2 x 32-bit only, which will lose throughput compared to 2 x 64-bit DDR4 DIMMs.
But that's the thing: there are only two channels in how this is communicated to people. For the system, for the memory controller, for the DIMMs, there are four half-width channels. But to avoid confusion and people making bad comparisons and misunderstanding things, those are presented as "(64-bit-equivalent aggregate) channels" to humans outside of the few scenarios where the difference matters.
#45
user556
Valantar: But that's the thing: there are only two channels in how this is communicated to people. For the system, for the memory controller, for the DIMMs, there are four half-width channels. But to avoid confusion and people making bad comparisons and misunderstanding things, those are presented as "(64-bit-equivalent aggregate) channels" to humans outside of the few scenarios where the difference matters.
I think the CPU/controller-side implementations for separation into smaller independent 32-bit channels are still to come. For the moment, my hunch is there is a simplified pairing option that is being used to manage DDR5 DIMMs as effectively single 64-bit-wide channels, making them function more like DDR4 DIMMs. Hence the no-change in channel config in the CPU specs. One step at a time sort of thing.

And it seems AMD is doing the same. Which means 12 x 64-bit for the upcoming EPYCs - which is an eye-watering number of pins.
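If the 12 channels are indeed 64-bit equivalents, a rough count of the data pins alone illustrates the point. The 40-bit sub-channel (32 data + 8 ECC bits) assumed below is the usual server RDIMM arrangement, and command/address, clock, chip-select and power pins all come on top of this.

```python
# Ballpark of data (DQ + ECC) pins for a socket with 12 DDR4-equivalent DDR5
# channels, i.e. 24 sub-channels. Assumes 32 data + 8 ECC bits per sub-channel
# (typical ECC RDIMM layout); CA, clock, CS and power pins are not counted.

CHANNELS_64BIT_EQUIV = 12
SUBCH_PER_CHANNEL = 2
DQ_BITS = 32
ECC_BITS = 8

data_pins = CHANNELS_64BIT_EQUIV * SUBCH_PER_CHANNEL * (DQ_BITS + ECC_BITS)
print(f"DQ + ECC pins alone: {data_pins}")  # 960, before any command/address/clock pins
```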
#46
trsttte
user556: I think the CPU/controller-side implementations for separation into smaller independent 32-bit channels are still to come. For the moment, my hunch is there is a simplified pairing option that is being used to manage DDR5 DIMMs as effectively single 64-bit-wide channels, making them function more like DDR4 DIMMs. Hence the no-change in channel config in the CPU specs. One step at a time sort of thing.

And it seems AMD is doing the same. Which means 12 x 64-bit for the upcoming EPYCs - which is an eye-watering number of pins.
Given that Tiger Lake already used LPDDR4 on some computers, and LPDDR4 uses this split-bus scheme, I doubt there's any truth to that; it's a simple and very effective optimization. Same with AMD.

All in all, we're complicating what doesn't need to be complicated because we're a bunch of nerds who know in more depth how these things work. There was a possibility for mischievous suits to use this to fool customers, but they probably don't understand the thing well enough to do it, so the engineers were simply able to do more and present us with the performance gains without marketing trickery getting in the way.
#47
user556
Complicating in what way? That the 12900K is not 2 channels of 64-bit at all but really 4 channels of 32-bit, and Intel just isn't saying so? Or that it's really only 2 channels of 32-bit, all DIMMs are on the same string, and the extra bandwidth wasn't needed anyway?
#48
trsttte
user556: Complicating in what way? That the 12900K is not 2 channels of 64-bit at all but really 4 channels of 32-bit, and Intel just isn't saying so? Or that it's really only 2 channels of 32-bit, all DIMMs are on the same string, and the extra bandwidth wasn't needed anyway?
We're counting bits and looking at how they're arranged; that's what I mean by complicating. It's 2 channels, just like before, and the memory tech is better.

Now if you want to go more in depth, each channel is split (same overall bus width of 64 bits, but split into 32+32) into halves that can be addressed and push data individually, in a similar way to how a quad-channel system would work. Do you want to call that quad channel? Sure, though not really.
#49
user556
Well, it either is or isn't quad.
#50
Valantar
user556: I think the CPU/controller-side implementations for separation into smaller independent 32-bit channels are still to come. For the moment, my hunch is there is a simplified pairing option that is being used to manage DDR5 DIMMs as effectively single 64-bit-wide channels, making them function more like DDR4 DIMMs. Hence the no-change in channel config in the CPU specs. One step at a time sort of thing.
Not at all. As @trsttte said above, 32-bit channels have been in use on PCs for several years already. Also, the DDR5 standard is based around each DIMM having two 32-bit channels. You literally cannot make a DDR5-compliant and compatible controller without making it handle individual 32-bit channels. As there will always be two of these per DIMM, and some DDR5 controllers will also be DDR4-compatible, there is little doubt that these controllers will be similar to combined DDR4/LPDDR4X controllers: one block capable of controlling either a single 64-bit (DDR4) channel, or two separate 32-bit (LPDDR4X or DDR5, depending on the chip in question) channels. This does not make them merged or anything like that - it means that it's a dual-mode controller that can combine its two 32-bit interfaces into a single 64-bit interface when needed. DDR5 does not support or make use of such a combined interface, but bases its specific modes of operation and performance characteristics (such as lower actual latency vs. on-paper latency) on these channels being separate.

Open Task Manager on any LPDDR4X system and you'll find it saying "dual channel" despite the system having 4 32-bit channels. Why? Clarity of communication. Nothing else.

You really, really need to listen to what I've been saying all along: what is communicated to people, and what is technical reality, are not necessarily identical. There is no direct causal or indexical relation between words and what they signify - especially when we're stacking layers of abstraction like we are here. Often, technically inaccurate communication is better because it more effectively communicates the core characteristics of what is communicated. This is what we are seeing now. And that is what @trsttte said above: for anyone but us here, who actually know the bus width of RAM in PCs (which really takes some doing), this is a distinction without a difference. Whether it's 2 64-bit channels or 4 32-bit channels is immaterial - aggregate bandwidth is the most important value for comparison, both across and within generations of tech. And, as "channels" has been how this has been communicated since the dawn of DDR RAM and multi-channel systems, the only sensible approach to ensuring understandable communication is to abandon technical accuracy in service of communicative pragmatism. Thus, four separate 32-bit channels are then "dual channel", because they are equivalent to previous dual-channel setups. Calling them quad channels would lead people to expect "twice as much" as before; calling them 32-bit channels would cause people to ask "how many bits were my channels before this?"
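A toy illustration of the reporting convention being described here, collapsing whatever the physical interface looks like into 64-bit DDR4-equivalent "channels"; the helper name is purely for illustration.

```python
# Toy model of the convention: spec sheets and OS tools count "channels" in
# 64-bit DDR4-equivalent units, regardless of how the physical bus is split.

def reported_channels(physical_channels: int, channel_bits: int) -> int:
    """Collapse physical channels into 64-bit-equivalent 'channels'."""
    return physical_channels * channel_bits // 64

print(reported_channels(2, 64))  # DDR4:    2 x 64-bit            -> "dual channel"
print(reported_channels(4, 32))  # DDR5:    4 x 32-bit (2 DIMMs)  -> "dual channel"
print(reported_channels(4, 32))  # LPDDR4X: 4 x 32-bit            -> "dual channel"
print(reported_channels(8, 16))  # LPDDR4X: 8 x 16-bit channels   -> still "dual channel"
```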

Remember: even CPU spec sheets are not written to appeal to those seeking in-depth technical knowledge. They are simplifications made in order to communicate the capabilities and requirements of a component at a relatively high level of detail, but are by no means exhaustive.
user556: Well, it either is or isn't quad.
And this shows exactly that you haven't been listening at all. "Dual channel" DDR5 is quad channel. It will always be. But it won't be called that, because nobody knows the channels are half as wide, making calling it quad channel wildly misleading. It is, therefore, in all communications "dual (equivalent to previous generations) channel".