
AMD Patents Chiplet Architecture for Radeon GPUs

Joined
Mar 21, 2016
Messages
2,533 (0.78/day)
Well, smaller chips are cheaper than a larger monolithic design by a wide margin.
 
Joined
May 2, 2017
Messages
7,762 (2.75/day)
Location
Back in Norway
System Name Hotbox
Processor AMD Ryzen 7 5800X, 110/95/110, PBO +150Mhz, CO -7,-7,-20(x6),
Motherboard ASRock Phantom Gaming B550 ITX/ax
Cooling LOBO + Laing DDC 1T Plus PWM + Corsair XR5 280mm + 2x Arctic P14
Memory 32GB G.Skill FlareX 3200c14 @3800c15
Video Card(s) PowerColor Radeon 6900XT Liquid Devil Ultimate, UC@2250MHz max @~200W
Storage 2TB Adata SX8200 Pro
Display(s) Dell U2711 main, AOC 24P2C secondary
Case SSUPD Meshlicious
Audio Device(s) Optoma Nuforce μDAC 3
Power Supply Corsair SF750 Platinum
Mouse Logitech G603
Keyboard Keychron K3/Cooler Master MasterKeys Pro M w/DSA profile caps
Software Windows 10 Pro
Big question is, will it cost more.
The main reason for doing this is to reduce costs. So no. The interposer will obviously not be cheap, but given sufficient production volume the cost of that will make little difference compared to the savings of making smaller dice. See my calculations a few posts up for a rough estimation.
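To illustrate the yield argument, here is a minimal sketch using the simple Poisson yield model (yield = exp(-die area × defect density)). The defect density and die areas are made-up placeholder values, not AMD or foundry figures, and interposer and packaging costs are deliberately left out, so treat it as a rough comparison only.

```python
import math

def poisson_yield(die_area_mm2, defects_per_cm2):
    """Fraction of defect-free dies under the simple Poisson yield model."""
    return math.exp(-(die_area_mm2 / 100.0) * defects_per_cm2)

def fabricated_area_per_good_product(die_area_mm2, dies_per_product, defects_per_cm2):
    """Silicon area (mm^2) that has to be fabricated, on average, to get
    enough good dies for one product. Ignores wafer edge loss and packaging."""
    good_fraction = poisson_yield(die_area_mm2, defects_per_cm2)
    return dies_per_product * die_area_mm2 / good_fraction

D0 = 0.1  # hypothetical defect density in defects per cm^2

# Hypothetical 500 mm^2 monolithic GPU vs. four 125 mm^2 chiplets.
monolithic = fabricated_area_per_good_product(500.0, 1, D0)
chiplets = fabricated_area_per_good_product(125.0, 4, D0)

print(f"monolithic: {monolithic:.0f} mm^2 of silicon per good GPU")
print(f"4 chiplets: {chiplets:.0f} mm^2 of silicon per good GPU")
```

With any non-zero defect density the chiplet option needs less fabricated silicon per good product, and the gap widens as the defect density or the monolithic die size grows.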
 
Joined
Feb 3, 2017
Messages
3,879 (1.33/day)
Processor Ryzen 7800X3D
Motherboard ROG STRIX B650E-F GAMING WIFI
Memory 2x16GB G.Skill Flare X5 DDR5-6000 CL36 (F5-6000J3636F16GX2-FX5)
Video Card(s) INNO3D GeForce RTX™ 4070 Ti SUPER TWIN X2
Storage 2TB Samsung 980 PRO, 4TB WD Black SN850X
Display(s) 42" LG C2 OLED, 27" ASUS PG279Q
Case Thermaltake Core P5
Power Supply Fractal Design Ion+ Platinum 760W
Mouse Corsair Dark Core RGB Pro SE
Keyboard Corsair K100 RGB
VR HMD HTC Vive Cosmos
Cost and chiplet design overhead are also a function of chiplet size and count.
 
Joined
May 2, 2017
Messages
7,762 (2.75/day)
Cost and chiplet design overhead are also a function of chiplet size and count.
True. Designing a cutting-edge chip and getting it mass produced does after all cost from hundreds of millions of USD to billions of USD. If a chiplet design allows them to go from, say, small-medium-large-XL monolithic chips to small+medium chiplets in various combinations, that is a massive R&D and manufacturing savings even when accounting for the R&D needed for interposer development, advanced packaging technologies, etc.
 
Joined
Jan 14, 2019
Messages
13,791 (6.26/day)
Location
Midlands, UK
Processor Various Intel and AMD CPUs
Motherboard Micro-ATX and mini-ITX
Cooling Yes
Memory Overclocking is overrated
Video Card(s) Various Nvidia and AMD GPUs
Storage A lot
Display(s) Monitors and TVs
Case The smaller the better
Audio Device(s) Speakers and headphones
Power Supply 300 to 750 W, bronze to gold
Mouse Wireless
Keyboard Mechanic
VR HMD Not yet
Software Linux gaming master race
Judging by the 20 °C difference between edge temp and hotspot temp on my 5700 XT under load, I imagine it must be easier to cool a bunch of smaller dies than a single big one.
 
Joined
May 2, 2017
Messages
7,762 (2.75/day)
Judging by the 20 °C difference between edge temp and hotspot temp on my 5700 XT under load, I imagine it must be easier to cool a bunch of smaller dies than a single big one.
That depends. Getting a single cold plate to make ideal contact with a collection of individual surfaces will always be more difficult than having it make contact with a single surface. Also, edge/hotspot temperature deltas like that are likely found on all high powered chips, it's just rare for them to have a thermal reporting system that allows users to see both. A smaller die is of course likely to pull less power and might have a smaller distance from edge to hotspot, but the difference isn't likely to be huge. The portion of the chip consuming the power will always be hotter than surrounding regions.
 
Joined
Jul 8, 2019
Messages
169 (0.08/day)
So they decreased the chiplet dependency on the new Ryzen 5000, and they want to introduce a similar thing to GPUs? Why? Haven't they learned about latency...
 
Joined
May 2, 2017
Messages
7,762 (2.75/day)
So they decreased the chiplet dependency on the new Ryzen 5000, and they want to introduce a similar thing to GPUs? Why? Haven't they learned about latency...
Hm? There are exactly the same number of chiplets in Ryzen 5000 as in Ryzen 3000. They reduced the number of CCXes (Core Complexes) per CCD (chiplet, Core Complex Die) from 2 to 1 by doubling the number of cores per CCX, but there are still two CCDs + an IOD in anything with >8 cores and one CCD for anything with <=8 cores.
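A small sketch of that layout, in case it helps: the class and function names are made up for illustration, and the 4 and 8 cores per CCX are the figures for Ryzen 3000 and Ryzen 5000 respectively.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CcdLayout:
    """Hypothetical description of one CCD (chiplet)."""
    ccx_per_ccd: int
    cores_per_ccx: int

    @property
    def cores_per_ccd(self) -> int:
        return self.ccx_per_ccd * self.cores_per_ccx

RYZEN_3000 = CcdLayout(ccx_per_ccd=2, cores_per_ccx=4)
RYZEN_5000 = CcdLayout(ccx_per_ccd=1, cores_per_ccx=8)

def ccd_count(core_count: int) -> int:
    """Number of CCDs on the package (the IOD is separate): one up to 8 cores, two above."""
    return 1 if core_count <= 8 else 2

# Same cores per chiplet in both generations, only the CCX split changed.
print(RYZEN_3000.cores_per_ccd, RYZEN_5000.cores_per_ccd)
print(ccd_count(8), ccd_count(12), ccd_count(16))
```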
 
Joined
Jul 8, 2019
Messages
169 (0.08/day)
Hm? There are exactly the same number of chiplets in Ryzen 5000 as in Ryzen 3000. They reduced the number of CCXes (Core Complexes) per CCD (chiplet, Core Complex Die) from 2 to 1 by doubling the number of cores per CCX, but there are still two CCDs + an IOD in anything with >8 cores and one CCD for anything with <=8 cores.

You are right. I thought they'd ditched the whole Infinity Fabric shtick and made a unified die. They actually didn't.
 
Joined
May 2, 2017
Messages
7,762 (2.75/day)
You are right. I thought they'd ditched the whole Infinity Fabric shtick and made a unified die. They actually didn't.
That's only the APUs. AMD aren't going back to monolithic dice for CPUs, likely not ever. The MCM approach gives them low production costs, high yields, great binning flexibility, easy configurability, and a heap of other advantages. And latency is much improved too, even if monolithic chips are still better in that regard.
 
Joined
Apr 24, 2020
Messages
2,798 (1.61/day)
So they decreased the chiplet dependency on the new Ryzen 5000, and they want to introduce a similar thing to GPUs? Why? Haven't they learned about latency...

Ryzen 5000 I/O die only has 50GBps to each chiplet. GPUs need 500GBps (10x more than CPU bandwidth), but are allowed to have higher latency. The infinity fabric on AMD's CPU needs to be majorly changed to be effective in a GPU architecture.

NVidia's NVLink is closer to a proper chiplet design than anything AMD has made in their GPUs so far. The AMD MI100 Infinity Link system is along the right approach, but only reaches 80GBps. NVidia is pushing 600GBps with the latest generation of NVLink.
 
Joined
May 2, 2017
Messages
7,762 (2.75/day)
Ryzen 5000 I/O die only has 50GBps to each chiplet. GPUs need 500GBps (10x more than CPU bandwidth), but are allowed to have higher latency. The infinity fabric on AMD's CPU needs to be majorly changed to be effective in a GPU architecture.

NVidia's NVLink is closer to a proper chiplet design than anything AMD has made in their GPUs so far. The AMD MI100 Infinity Link system is along the right approach, but only reaches 80GBps. NVidia is pushing 600GBps with the latest generation of NVLink.
IF can scale out much, much wider than its implementation in Ryzen though, so aggregate bandwidth shouldn't be a problem. But still, there's no mention of IF in the patent, so they might be using some other bus for this (or just keeping the patent intentionally vague, obviously).
 
Joined
Feb 3, 2017
Messages
3,879 (1.33/day)
IF can scale out much, much wider than its implementation in Ryzen though, so aggregate bandwidth shouldn't be a problem. But still, there's no mention of IF in the patent, so they might be using some other bus for this (or just keeping the patent intentionally vague, obviously).
Sure IF can scale. The problem isn't scalability, it is probably power at large bandwidth numbers :)
This is not unique to AMD either, Nvidia has the same problem with NVLink.
 
Joined
May 2, 2017
Messages
7,762 (2.75/day)
Sure IF can scale. The problem isn't scalability, it is probably power at large bandwidth numbers :)
This is not unique to AMD either, Nvidia has the same problem with NVLink.
Oh, absolutely. But given that AMD can handle a ton of IF links over relatively long distances through a PCB substrate in TR with about 70W of power for those links + the IOD (including 8 memory controllers and a heap of PCIe), implementing a wide link setup through a silicon interposer for GPUs ought to be manageable in terms of power if we consider a total package power envelope of 250-300W.
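For a back-of-the-envelope feel for link power, here is a minimal sketch that multiplies bandwidth by energy per bit. The 500GBps target is the figure quoted earlier in the thread; the two pJ/bit values are placeholder assumptions chosen only to show the trend (a silicon interposer link should cost less energy per bit than a link through an organic substrate), not measured numbers for IF or NVLink.

```python
def link_power_watts(bandwidth_gb_per_s, picojoules_per_bit):
    """Power drawn by a link: bits per second times energy per bit."""
    bits_per_second = bandwidth_gb_per_s * 1e9 * 8
    return bits_per_second * picojoules_per_bit * 1e-12

TARGET_GBPS = 500.0        # GB/s, figure quoted in the thread
SUBSTRATE_PJ_PER_BIT = 2.0   # hypothetical, link through an organic substrate
INTERPOSER_PJ_PER_BIT = 0.5  # hypothetical, link through a silicon interposer

for name, pj in (("substrate", SUBSTRATE_PJ_PER_BIT),
                 ("interposer", INTERPOSER_PJ_PER_BIT)):
    watts = link_power_watts(TARGET_GBPS, pj)
    print(f"{name}: {watts:.1f} W for {TARGET_GBPS:.0f} GB/s")
```

Under these assumed values the link power stays in the single-digit watt range per 500GBps, which is small next to a 250-300W package budget, but the result scales linearly with both bandwidth and pJ/bit, so real figures could differ a lot.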
 
Joined
Mar 21, 2016
Messages
2,533 (0.78/day)
Chimp innovation at its finest...so advanced you'd swear it's bananas! This chimp copies that chimp who makes those chimps go chimpanzee OMG bananas over it!!!
 
Joined
May 2, 2017
Messages
7,762 (2.75/day)
so, amd just copies every step Intel has already done, or planned to do. yawn
Ah, yes, because nobody has talked about MCM GPUs before Intel ...

My guess, AMD, Nvidia and Intel have all been at work on this tech for 3+ years.
 
Joined
Mar 10, 2010
Messages
11,880 (2.19/day)
Location
Manchester uk
System Name RyzenGtEvo/ Asus strix scar II
Processor Amd R5 5900X/ Intel 8750H
Motherboard Crosshair hero8 impact/Asus
Cooling 360EK extreme rad+ 360$EK slim all push, cpu ek suprim Gpu full cover all EK
Memory Gskill Trident Z 3900cas18 32Gb in four sticks./16Gb/16GB
Video Card(s) Asus tuf RX7900XT /Rtx 2060
Storage Silicon power 2TB nvme/8Tb external/1Tb samsung Evo nvme 2Tb sata ssd/1Tb nvme
Display(s) Samsung UAE28"850R 4k freesync.dell shiter
Case Lianli 011 dynamic/strix scar2
Audio Device(s) Xfi creative 7.1 on board ,Yamaha dts av setup, corsair void pro headset
Power Supply corsair 1200Hxi/Asus stock
Mouse Roccat Kova/ Logitech G wireless
Keyboard Roccat Aimo 120
VR HMD Oculus rift
Software Win 10 Pro
Benchmark Scores laptop Timespy 6506
so, amd just copies every step Intel has already done, or planned to do. yawn
In what way? AMD are laying out a path to their version of a multi-die GPU, and Intel sure as shit were not doing multi-die GPUs before AMD.
Ponte Vecchio was for servers, not consumers.
Interesting angle, actually: from my reading you have master and slave dies with massive bandwidth, but essentially one tile to rule them all and an I/O die in the interposer.

The first GPU does all the scheduling and the first vertex pass on the maths, then hands out work. There may be an efficiency hit on the first designs with few tiles, but if it scales it could serve well as a forward path and be really effective across 8 or more tiles.
 
Joined
May 2, 2017
Messages
7,762 (2.75/day)
This solution is based on 12-inch wafers, and in the future the industry will move to 18-inch wafers, which means higher utilization of the fab, better pricing per wafer and eventually better prices for the end user. Basically, many more dies per wafer. This die-per-wafer calculator shows the various options per wafer size: https://anysilicon.com/die-per-wafer-formula-free-calculators/
Hasn't that been "in the future" for like two decades now, with no real progress being made? Considering the massive fab expansions in the works currently (planned to be ready for mass production between this year and 2025-27), all of which are 300mm, it's going to be a long, long time until 450mm wafers take over high end fabs.
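For reference, the wafer-size effect can be sketched with a commonly used dies-per-wafer approximation (gross wafer area divided by die area, minus an edge-loss term). The 100 mm² die area is an arbitrary example value, and the formula ignores scribe lines, edge exclusion and yield, so the counts are only indicative.

```python
import math

def dies_per_wafer(wafer_diameter_mm, die_area_mm2):
    """Approximate gross dies per wafer: area term minus an edge-loss term."""
    d = wafer_diameter_mm
    area_term = math.pi * (d / 2) ** 2 / die_area_mm2
    edge_loss = math.pi * d / math.sqrt(2 * die_area_mm2)
    return int(area_term - edge_loss)

DIE_AREA_MM2 = 100.0  # arbitrary example die size

d300 = dies_per_wafer(300.0, DIE_AREA_MM2)  # 12-inch wafer
d450 = dies_per_wafer(450.0, DIE_AREA_MM2)  # 18-inch wafer

print(f"300mm: {d300} dies, 450mm: {d450} dies, ratio {d450 / d300:.2f}")
```

The 450mm wafer has 2.25x the area, and the approximation gives slightly more than 2.25x the dies because proportionally less of the wafer is lost at the edge.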
 