Wednesday, January 9th 2019

AMD 3rd Gen Ryzen AM4 Package Capable of Two 8-core Chiplets

At its CES 2019 keynote, AMD unveiled two killer client-segment products: the Radeon VII graphics card, which beats the GeForce RTX 2080, and a sneak preview of the 3rd generation Ryzen socket AM4 processor based on the company's "Zen 2" microarchitecture. As part of the unveil, CEO Lisa Su demonstrated an 8-core/16-thread 3rd generation Ryzen prototype processor in a head-to-head Cinebench nT face-off with the Intel Core i9-9900K processor, which has the same core count. The Ryzen narrowly beat the Intel flagship. Following this, Dr. Su held up a de-lidded sibling of the processor that was tested, revealing not one, but two dies.

This confirms that AMD is taking the heterogeneous multi-chip module approach to building its 3rd generation Ryzen processors, much like its 2nd generation EPYC processors that were unveiled late last year. The MCM Dr. Su held up had two chips. The smaller chip is an 8-core CPU chiplet built on the 7 nm process that appears to have the same die size as the 8-core chiplets that make up the 64-core 2nd gen EPYC MCMs. The larger die is an I/O controller built on the 14 nm process. This die controls the memory, PCIe, and SoC connectivity of the package. We noticed something curious about the way the two dies are arranged on the package substrate.

On close inspection of the substrate, we find that while the I/O controller die sits roughly centered along one side of the package, the sole 8-core CPU chiplet is not located at a similarly central position (think Intel "Clarkdale" MCMs). On zooming in further, we find that just south of the 8-core CPU chiplet die, there appear to be blank bumps protruding over an area similar to that of a chiplet, covered up by the outer layers of the substrate. This leads us to conclude that the AM4 package is capable of three dies: an I/O controller and two 8-core CPU chiplets. There very much will be a 16-core/32-thread Ryzen for the AM4 platform, and it's only a question of when.

The 16-core Ryzen AM4 MCM will be similar in concept to the larger 64-core SP3r2 EPYC/Threadripper MCMs: the CPU dies pack only the CPU cores and an Infinity Fabric interface, while the I/O controller die is wired to multiple CPU dies, and manages the memory, PCIe, and SoC connectivity of the processor.
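
The arithmetic behind the 16-core/32-thread estimate can be sketched in a few lines of Python. This is only an illustration: it assumes 8 cores per chiplet and 2 threads per core, as on the 8-core/16-thread prototype shown at CES, and the function name `package_config` is our own invention, not anything AMD has published.

```python
from typing import Tuple

CORES_PER_CHIPLET = 8  # assumption: every CPU chiplet is fully enabled with 8 cores
THREADS_PER_CORE = 2   # assumption: SMT on, as on the 8-core/16-thread demo chip


def package_config(cpu_chiplets: int) -> Tuple[int, int]:
    """Return (cores, threads) for a package with the given number of CPU chiplets."""
    if cpu_chiplets < 1:
        raise ValueError("a package needs at least one CPU chiplet")
    cores = cpu_chiplets * CORES_PER_CHIPLET
    return cores, cores * THREADS_PER_CORE


if __name__ == "__main__":
    # One chiplet is the part shown on stage; two is the layout the substrate hints at.
    for chiplets in (1, 2):
        cores, threads = package_config(chiplets)
        print(f"{chiplets} chiplet(s): {cores}-core/{threads}-thread")
```

Parts with partially disabled chiplets would give lower counts; the sketch covers only the fully enabled case.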

Interestingly, in the client segment, Intel dabbled with this concept a decade ago with "Clarkdale," which paired a 32 nm dual-core CPU die with a larger 45 nm die that controlled PCIe, memory, and an iGPU, with QPI serving as the interconnect between the two. Intel's requirements at the time were different. The company hadn't yet managed to put CPU and iGPU onto a single die, and needed the iGPU to sit closer to the memory interface. The company would go on to fuse CPU and iGPU with the 32 nm "Sandy Bridge."

AMD's engineering bravado with "Matisse" also unlocks the possibility of the Ryzen "Raven Ridge" APU successor being an MCM with one 8-core chiplet, and an oversized I/O controller die that packs a "Vega" or "Navi" based iGPU, in addition to memory, PCIe, SoC, and the works. Dies on that package could be arranged differently from this.

40 Comments on AMD 3rd Gen Ryzen AM4 Package Capable of Two 8-core Chiplets

#26
btarunr
Editor & Senior Moderator
Rahnak said: Su already hinted at parts with more cores after the keynote and if the previous leaks were somewhat right in relation to SKUs... Did Intel's 9900k just get beaten in MT Cinebench by a R5 3600X not running final clocks? :twitch:
I think AMD will come up with a Ryzen 9 extension. Ryzen 7 could be up to 16-thread, and Ryzen 9 up to 32-thread. The chip these guys showed has the MP performance of i9-9900K and market positioning against i7-9700K.

I think the bigger news is that AMD has fully caught up with Intel at IPC.
#28
Wavetrex
I love how the cameraman held in position for light to reflect from the "empty spot" showing clearly there's stuff under there.

100% designed for 16 cores on AM4!
Smoking!

Bring it faster AMD, my wallet is itching!
#29
stimpy88
I'm down for a Ryzen9 16 core CPU! AMD, you did it, you have achieved single core performance parity with Intel. Keep the momentum up, and you will win the market.

It's going to be a long 6 months waiting for this, which will be my first AMD system in 12 years.

Oh, and also nice to see AdoredTV being proved right again, despite all the negativity towards him.
#31
Slizzo
btarunr said: I think AMD will come up with a Ryzen 9 extension. Ryzen 7 could be up to 16-thread, and Ryzen 9 up to 32-thread. The chip these guys showed has the MP performance of i9-9900K and market positioning against i7-9700K.

I think the bigger news is that AMD has fully caught up with Intel at IPC.
They were already pretty much caught up in terms of IPC. Only clock speed was lacking.
#32
Darmok N Jalad
stimpy88 said: I'm down for a Ryzen9 16 core CPU! AMD, you did it, you have achieved single core performance parity with Intel. Keep the momentum up, and you will win the market.

It's going to be a long 6 months waiting for this, which will be my first AMD system in 12 years.

Oh, and also nice to see AdoredTV being proved right again, despite all the negativity towards him.
Intel hasn’t made any IPC advances since they’ve stalled out at 14nm. I believe that is something they are working toward with their Core replacement. Not to take away from AMD’s achievements here, but Intel is in a pretty big rut, and once they get themselves out of it, I expect them to come out swinging. Hopefully AMD knows this and is not resting for a second.
#33
InVasMani
Darmok N Jalad said: The score is very consistent with other reviews, so I don’t believe it’s gimped. Also, I think AMD more wanted to showcase that their demo Ryzen was matching the 9900K, but doing it at 75W vs 125W. It suggests they can not only match Intel in performance, but they have 50W of headroom to move past them. It’s not that far-fetched—Intel is behind on manufacturing, and they are pushing their current node to its absolute limits.

If that is true, then maybe those 5.0 GHz rumors are valid. I think Zen was more architecturally limited to the low 4 GHz range than it was node limited. Zen 2 pretty much has to correct that problem so they can advance performance.


Actually, if I remember right, the GPU could just be built right into the IO chip. I believe that’s what happened with the custom GPU ATI built for the Xbox360 and what Intel did with early Core MCM designs. If they shrink that IO+GPU chip to 7nm, it would probably fit the footprint of the base IO chip.
Interesting, so they could technically have perhaps two I/O GPU hub chips synchronizing with 2 CPU chiplets, which might use hyper-threading between each other for load balancing, either in perfect tandem or alternating in perfect sync. That way each I/O GPU hub gets its own CPU L1 cache latency rather than having to share it. In fact they could both run in parallel, but alternating the I/O GPU/CPU chips so each chip kind of automatically manages waste heat in a better way in terms of hot spots. Since voltage currents are waves (up/down, on/off), staggering them would be ideal for dealing with the heat, so make the two CPU dies diagonal from one another, and the same with the GPU I/O dies.

I was actually thinking of Navi a bit in another thread: AMD could have an I/O die with 6 chiplets, each similar to what a Vega 32 would be, aka a completely cut-in-half Vega 64. There'd be better yields naturally, efficiency would be quite a bit better, and the waste heat would be easier to manage. Basically 6 smaller Vega dies and one more monolithic I/O die sitting between all 6 of them, 3 on each side of it. It would be an absolute powerhouse. I mean, what does Vega 64 do for traditional ray tracing in terms of frame rates?

Cut that in half and presto: no RTX gimmick, just pure traditional ray tracing via the brute force of what is essentially a chiplet render farm of sorts. Now with path tracing it could be interesting. We need to get to a point with ray tracing and path tracing, though, where you can apply denoising to a scene selectively in real time, based on something like mipmap/LOD/culling-type behavior. Simply give the more distant, less important and less visible parts of the scene a bit less denoising, since it's less vital there anyway.
Vya Domus said: Not only this was clearly designed for two dies there's also room for another one TDP wise according to those power figures.
Not that surprising; if they shrink that I/O die to 7nm they could squeeze in 3 chiplets alongside it.
Darmok N Jalad said: Intel hasn’t made any IPC advances since they’ve stalled out at 14nm. I believe that is something they are working toward with their Core replacement. Not to take away from AMD’s achievements here, but Intel is in a pretty big rut, and once they get themselves out of it, I expect them to come out swinging. Hopefully AMD knows this and is not resting for a second.
The gist of it is Intel got really complacent in recent years. I think we'll eventually see something akin to the C2D/C2Q transition from AMD64 in response, but that kind of change doesn't happen overnight. Much like AMD64 kicked the pants off Intel for a while, and like Ryzen is doing pretty well now. I'm just hoping neither company gets too complacent and that they both try to one-up the other in tangible, significant ways, as opposed to: oh hey, here's a 2-3% performance boost for 100% of the cost of the last generation product lineup, enjoy the price gouging.
#34
Melvis
Good to see this 8-core/16-thread CPU beating out the 9900K @ 4.7 GHz in Cinebench, and at lower clocks would be my guess given that power consumption; 4.6 GHz would be my guess.
#35
efikkan
InVasMani said: Interesting, so they could technically have perhaps two I/O GPU hub chips synchronizing with 2 CPU chiplets, which might use hyper-threading between each other for load balancing, either in perfect tandem or alternating in perfect sync. That way each I/O GPU hub gets its own CPU L1 cache latency rather than having to share it. In fact they could both run in parallel, but alternating the I/O GPU/CPU chips so each chip kind of automatically manages waste heat in a better way in terms of hot spots. Since voltage currents are waves (up/down, on/off), staggering them would be ideal for dealing with the heat, so make the two CPU dies diagonal from one another, and the same with the GPU I/O dies.
There should be no need for any of this. The I/O die handles queuing of memory accesses from up to 8 chiplets as needed, and there should be no need for synchronizing threads, as memory accesses are not evenly distributed anyway. L1 and L2 cache are located on the chiplets.

-----

My concern is that when/if 12/16 core variants arrive, will the dual channel memory controller be enough? Many of the workloads which could utilize this many cores are very bandwidth intensive. I assume 3rd gen Ryzen will use memory beyond DDR4-2933, but to my knowledge only speeds up to DDR4-3200 are currently finalized in the JEDEC standard. And before anyone suggests overclocking memory, I would remind everyone that it's not a reliable solution.

Anyway, the benchmark conducted at CES was with DDR4-2666.
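
To put rough numbers on the bandwidth concern, here is a minimal Python sketch of theoretical peak dual-channel DDR4 bandwidth split across cores. It assumes the standard 64-bit (8-byte) DDR4 channel and uses the nominal transfer rates from the speed-grade names; the function names are made up for illustration, and sustained real-world bandwidth will be lower than these peaks.

```python
BYTES_PER_TRANSFER = 8  # one 64-bit DDR4 channel moves 8 bytes per transfer


def peak_bandwidth_gbs(mt_per_s: int, channels: int = 2) -> float:
    """Theoretical peak bandwidth in GB/s for a nominal DDR4 transfer rate (MT/s)."""
    return mt_per_s * BYTES_PER_TRANSFER * channels / 1000


def per_core_gbs(mt_per_s: int, cores: int, channels: int = 2) -> float:
    """Peak bandwidth per core if every core pulls an equal share."""
    return peak_bandwidth_gbs(mt_per_s, channels) / cores


if __name__ == "__main__":
    # Speed grades and core counts mentioned in this thread.
    for speed in (2666, 2933, 3200):
        for cores in (8, 12, 16):
            share = per_core_gbs(speed, cores)
            print(f"DDR4-{speed}, {cores} cores: {share:.2f} GB/s per core (peak)")
```

An even split is a simplification, since memory accesses are not evenly distributed across cores, but it shows how quickly the per-core share shrinks as core count doubles on the same two channels.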
#36
Captain_Tom
efikkan said: There should be no need for any of this. The I/O die handles queuing of memory accesses from up to 8 chiplets as needed, and there should be no need for synchronizing threads, as memory accesses are not evenly distributed anyway. L1 and L2 cache are located on the chiplets.

-----

My concern is that when/if 12/16 core variants arrive, will the dual channel memory controller be enough? Many of the workloads which could utilize this many cores are very bandwidth intensive. I assume 3rd gen Ryzen will use memory beyond DDR4-2933, but to my knowledge only speeds up to DDR4-3200 are currently finalized in the JEDEC standard. And before anyone suggests overclocking memory, I would remind everyone that it's not a reliable solution.

Anyway, the benchmark conducted at CES was with DDR4-2666.
Exactly. 3200 is still 20% higher than the demo that destroyed the 9900K. It should be plenty for the 12-core 3700X, and after that who cares?

I mean people buying the 16-core models will likely not be buying them for gaming first, and I doubt they will lose any performance. Remember that the 2990WX loses in gaming only because it has "island dies" - it's not because of a lack of bandwidth, it's a lack of a decent connection to bandwidth. And even then, they have shown the 2990WX would perform only ~10% worse in games than the other Ryzen chips if Windows would schedule correctly.
#37
InVasMani
You could actually fit 3 CPU chiplets if you rotate one 90 degrees and shift the I/O die into the corner. That's without even needing to shrink the I/O die; I tried it in mspaint out of curiosity. If AMD engineers are savvy enough they could probably re-design such a chip eventually on the same socket w/ or w/o an I/O die shrink. Intel has its hands full for a while; I guess AMD has an ace or two up its sleeve.
#38
TheLaughingMan
Captain_Tom said: Exactly. 3200 is still 20% higher than the demo that destroyed the 9900K. It should be plenty for the 12-core 3700X, and after that who cares?

I mean people buying the 16-core models will likely not be buying them for gaming first, and I doubt they will lose any performance. Remember that the 2990WX loses in gaming only because it has "island dies" - it's not because of a lack of bandwidth, it's a lack of a decent connection to bandwidth. And even then, they have shown the 2990WX would perform only ~10% worse in games than the other Ryzen chips if Windows would schedule correctly.
Actually, Level1Techs proved that the issue is definitely the Windows scheduler. It has trouble understanding that the other cores are available all the time and gets itself caught in a loop of moving threads around instead of processing them. I am not sure how much of that translates to games, but I would have to assume the behavior is the same.
#39
Unregistered
InVasMani said: You could actually fit 3 CPU chiplets if you rotate one 90 degrees and shift the I/O die into the corner. That's without even needing to shrink the I/O die; I tried it in mspaint out of curiosity. If AMD engineers are savvy enough they could probably re-design such a chip eventually on the same socket w/ or w/o an I/O die shrink. Intel has its hands full for a while; I guess AMD has an ace or two up its sleeve.
Yeah, a 24-core Ryzen on the AM4 platform would be a very nice, and dense, chip. Competition might bring this out, but IMO not until AMD feels pressure to do so, which right now they surely don't. Also, they need income from existing products, since their net profit margin per unit is much lower than Intel's.
#40
InVasMani
I'm thinking, given it's only dual channel memory and also the power/heat limits of the platform, 18c/36t or just 18c/18t could be more likely with 3 CPU dies + 1 I/O die for Ryzen AM4. If it were to happen at all, that is. I don't think it's something AMD would immediately be interested in, but they might pursue it eventually if they have a lot of dies kicking around down the road and want to use them up. I guess it depends on circumstances and how costly it would be to tweak the arrangement of them to make it possible in the first place. It really is quite interesting that 3 of them could potentially still fit snugly together with the I/O die.