Saturday, October 26th 2024
AMD Ryzen 7 9800X3D Has the CCD on Top of the 3D V-cache Die, Not Under it
Much of the Ryzen 7 9800X3D teaser material from AMD carried the recurring buzzwords "X3D Reimagined," leaving us to speculate about what they could mean. 9550pro, a reliable source of hardware leaks, says that AMD has redesigned the way the CPU complex die (CCD) and 3D V-cache die (L3D) are stacked together. In past generations of X3D processors, such as the 5800X3D "Vermeer-X" and the 7800X3D "Raphael-X," the L3D is stacked on top of the CCD. It sits above the central region of the CCD that holds the on-die 32 MB L3 cache, while blocks of structural silicon are placed on top of the edges of the CCD that hold the CPU cores. These structural silicon blocks perform the crucial task of transferring heat from the CPU cores to the IHS above. This is about to change.
If the leaks are right, AMD has inverted the CCD-L3D stack with the 9000X3D series such that the "Zen 5" CCD is now on top and the L3D is below it, under the central region of the CCD. The CPU cores now dissipate heat to the IHS as they do on regular 9000 series processors without the 3D V-cache technology. The way we imagine they achieved this is by enlarging the L3D to match the size of the CCD, so that it serves as a kind of "base tile." The L3D would have to be peppered with TSVs that connect the CCD to the fiberglass substrate below. We know where AMD is going with this in the future. Right now, the L3D "base tile" contains the 64 MB 3D V-cache that gets appended to the 32 MB on-die L3 cache, but in the future (probably with "Zen 6"), AMD could design the CCDs with TSVs even for the per-core L2 caches.
This piece of speculation also neatly explains what "X3D boost" could be. With the CCD making direct contact with the IHS the way it does in non-X3D processors, the X3D processors could have the same overclocking capabilities as the regular chips. There are far fewer thermal hurdles in the way, and AMD can go ahead and give these chips the same TDP and PPT values as regular chips, as well as higher clock speeds. The company has been conservative with the PPT and clock speeds of its X3D processors in the past.
AMD is expected to launch the Ryzen 7 9800X3D on November 7, 2024.
Source:
HXL (Twitter)
110 Comments on AMD Ryzen 7 9800X3D Has the CCD on Top of the 3D V-cache Die, Not Under it
As if you're expected to have one 7800X3D for games, and a 7950X for work. Strictly thinking inside the box and turning it into law lol.
Or, people who won't stop bitching about why gaming laptops won't/shouldn't have cameras.. yeah you're supposed to buy another laptop for that, or a separate camera..
/end of rant
I remember a couple of months ago, my "ok" comment being deleted, for not adding anything to the conversation here.
Though indeed there shouldn't be any difference, it's still all in the same package and whatnot.
A "Page" has been set at 4096 bytes since the 1980s (even ARM systems are paged at 4k). There's a 4096-entry TLB in Zen5, meaning 4096 (entries) x 4096 bytes (per entry with default pages) == 16MB of RAM is indexed by the TLB before the CPU core runs out of entries.
That's smaller than the Zen5 X3D L3 cache. In fact, this curious slowdown has been true for quite a few generations (and is likely a reason why the TLB grew from 3072 entries to 4096 entries between Zen4 and Zen5).
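To make the arithmetic above concrete, here is a minimal sketch in Python. The function name `tlb_reach_bytes` is made up for illustration, and the 96 MB figure is simply the 32 MB on-die L3 plus the 64 MB 3D V-cache from the article; the entry counts are the ones quoted in the comment, not independently verified.

```python
KIB = 1024
MIB = 1024 * KIB

def tlb_reach_bytes(entries: int, page_size: int) -> int:
    """Memory covered by a TLB: one page per entry (hypothetical helper)."""
    return entries * page_size

# Figures quoted in the comment: 4096 entries (Zen5), 3072 entries (Zen4), 4 KiB pages.
zen5_reach = tlb_reach_bytes(4096, 4 * KIB)
zen4_reach = tlb_reach_bytes(3072, 4 * KIB)

# 32 MB on-die L3 + 64 MB 3D V-cache, as stated in the article.
x3d_l3 = (32 + 64) * MIB

print(zen5_reach // MIB)   # 16
print(zen4_reach // MIB)   # 12
print(zen5_reach < x3d_l3) # True: the TLB covers less memory than the L3 can hold
```

So with default pages, the TLB runs out of entries well before the stacked L3 is full, which is the commenter's point.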
--------
Modern computers can theoretically use "HugePages" (2MB or 1GB in size). Servers are configured to use them, but consumer hardware has so many backwards compatibility issues with Windows and Linux that the default page size remains 4k in practice. Still, if you can play with the right settings, switching to these larger page sizes leads to 10%+ improvements, as more data effectively fits in the TLB (a lookup that has to happen before the real cache is hit).
Since node shrinking will continue to be a tougher problem (fewer nm = lower process yields, more heat density, etc), AMD wants to make room for bigger CCDs even at 4nm or 3nm. The L3 cache takes up roughly the area of 4 Zen 5 cores. Putting that cache below the cores would allow not only putting more cores into a CCD, but also expanding the L3 cache and other caches, too. This way AMD can easily reach 10-12 cores per CCD with 96+ MB of cache in regular non-X3D processors.
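As a rough illustration of why larger pages help, the sketch below counts how many pages are needed to map a 96 MB working set (the combined L3 size from the article) at each page size. The helper `pages_needed` is hypothetical, and the sketch assumes, as the comment does, that all 4096 TLB entries can hold a page of any size, which real hardware may not allow.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

TLB_ENTRIES = 4096  # figure quoted in the comment; assumed usable for every page size

def pages_needed(working_set: int, page_size: int) -> int:
    """Number of pages (and so TLB entries) needed to map working_set bytes."""
    return -(-working_set // page_size)  # ceiling division

working_set = 96 * MIB  # 32 MB on-die L3 + 64 MB 3D V-cache

for name, size in [("4 KiB", 4 * KIB), ("2 MiB", 2 * MIB), ("1 GiB", GIB)]:
    n = pages_needed(working_set, size)
    print(f"{name}: {n} pages, fits in TLB: {n <= TLB_ENTRIES}")
```

Under these assumptions the 4 KiB case needs far more entries than the TLB has, while the 2 MiB case needs only a few dozen.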
Putting cache below CCD also allows for significant core clocks boost, basically the same clocks as you'd get with non-X3D CPUs.
One may start to wonder whether this is the beginning of the end of X3D processors as we know them.
Searching through memory takes time, and the bigger it is the more time it takes.
Giving more cores access to the same memory also racks up penalties.
Each core will only have limited time to read and write to the memory, and coordinating everything becomes even harder.
Also note that L3 isn't something that makes everything faster, if you look at the benchmarks provided here by the People of TPU you will see that it's only interesting for virtualisation and gaming.
And since gaming doesn't scale with an increasing number of cores, a second CCD with access to a big cache is worthless for gaming.
As for virtualisation, the shared L3 is nothing but a security risk. It's a joke that refers to www.imdb.com/title/tt0105929/
The 16 cores with all V-cache is not necessarily about thinking you need more than 8 cores for games. It's for people who want 16 cores for work, but don't want a compromise either way at that high price. Moved and doubled V-cache might help there. Unified, shared V-cache would be a possible next step, but maybe not feasible for one reason or another.
Then there's conflicting info about recommended hardware for Space Marine 2 at 4K, for instance. I haven't read into it, but 12 cores is recommended (both AMD and Intel) on Steam.
It's just that the 9800X3D actually can make use of it, not really a drawback. Just change it if you're not happy with it.
No, if what's stated ends up being correct, in that the thermal issue is solved and clocks are the same between the X3D and non-X3D parts, productivity performance will be equal to or better than that of non-X3D parts. It would eliminate the downside to X3D chips.
So yeah, adding L3 to both CCDs would reduce productivity for a minor gain in performance. What's worse is that it'll increase performance in unwanted situations, which they would want to mitigate through drivers anyway, because ideally you want the game's threads pinned to one CCD. In situations where a thread jumps to the other CCD, it won't match the 9800X3D's performance simply because of the latency incurred by the jump.
So you're looking at a slight benefit for games in edge cases and a slight hit to productivity for a CPU that costs more. Pretty sure AMD said the same during the 7950X3D launch when they did the math. Whether that changes remains to be seen.
2) You are assuming that the 9800X3D will be clocked as high as the 9950X3D. If they can increase the clocks on the new X3D, they may choose to further segment by having higher clocks on the higher end part.
Mind you, either way the frequency is increasing compared to prior-gen X3D parts, so relative to past X3D parts any performance difference as a result of the X3D cache will have changed this generation.
Ok now I understand. You read the article, you just don't know what you are talking about / can't understand it.
"slight benefit for games in edge cases"?
Clearly you are unaware that the 7950X3D was 14% faster on average than the 7950X in games.
Even if there were 0 frequency improvements to the 9950X3D, it would mirror that performance increase at the very least.
In the CPU world that isn't slight, it's what you typically get with a new architecture.
You also don't seem to understand what edge cases are. X3D's boost is not limited to edge cases. A wide array of games benefit from X3D. You seem to be arguing against X3D in general, which is just dumb. Every benchmark out there disproves you.
Also, since when is an increase in performance "unwanted"? Utter nonsense.