AMD Ryzen AI Max+ "Strix Halo" Die Exposed and Annotated

AleksandarK · Feb 18, 2025

AMD's "Strix Halo" APU, marketed as Ryzen AI Max+, has just been exposed in die-shot analysis. Confirming the processor's triple-die architecture, the package showcases a total silicon footprint of 441.72 mm² that integrates advanced CPU, GPU, and AI acceleration capabilities within a single package. The processor's architecture centers on two 67.07 mm² CPU CCDs, each housing eight Zen 5 cores with a dedicated 8 MB L2 cache. A substantial 307.58 mm² I/O complements these die that houses an RDNA 3.5-based integrated GPU featuring 40 CUs and AMD's XDNA 2 NPU. The memory subsystem demonstrates a 256-bit LPDDR5X interface capable of delivering 256 GB/s bandwidth, supported by 32 MB of strategically placed Last Level Cache to optimize data throughput.

The die shots reveal notable optimizations for mobile deployment, including shortened die-to-die interfaces that reduce the interconnect distance by 2 mm compared to desktop implementations. Some through-silicon via structures are present, which suggest potential compatibility with AMD's 3D V-Cache technology, though the company has not officially confirmed plans for such implementations. The I/O die integrates comprehensive connectivity options, including PCIe 4.0 x16 lanes and USB4 support, while also housing dedicated media engines with full AV1 codec support. Initial deployments of the Strix Halo APU will commence with the ASUS ROG Flow Z13 launch on February 25, marking the beginning of what AMD anticipates will be broad adoption across premium mobile computing platforms.

View at TechPowerUp Main Site | Source

Daven · Feb 18, 2025

It looks like a single CCD has an MTr/mm^2 of 128.2. That's the same as the 5090. 4 nm confirmed!

Squared · Feb 18, 2025

AMD told Chips and Cheese that these are literally the same CCDs that go into Ryzen 9000, which is indeed using TSMC N4P.

Bruno Vieira · Feb 18, 2025

Daven said:
It looks like a single CCD has an MTr/mm^2 of 128.2. That's the same as the 5090. 4 nm confirmed!

The process is called 4n, and the N stands for Nvidia. That's just the original 5nm with some Nvidia mods, just like Ada was. Its Nvidia 5nm confirmed.

Squared said:
AMD told Chips and Cheese that these are literally the same CCDs that go into Ryzen 9000, which is indeed using TSMC N4P.

In the same interview, they say this chip uses an organic substrate to connect the CCDs with lower latency and power than the desktop parts.

AlB80 · Feb 18, 2025

The IO looks like a regular GPU. Even USB controllers look fine. Modern graphics cards are equipped with USB-C ports.

Wirko · Feb 18, 2025

Bruno Vieira said:
In the same interview, they say this chip uses an organic substrate to connect the CCDs with lower latency and power than the desktop parts.

Desktop parts use organic substrate, and the huge number of wires is the reason IOD can't be placed closer to CCDs. This APU seems to include something more advanced that enables the chips to sit next to each other. It could be Local Si Interconnect (LSI), which is approximately the TSMC's version of EMIB.

Denver · Feb 19, 2025

AMD absolutely nailed the first generation with minimal setbacks. I never expected an MCM APU with 16 fucking cores to deliver such solid battery life while going toe-to-toe with Apple; despite using an inferior process node and lacking mac OS-level optimization. I can’t help but wonder how powerful the "Halo next-gen" will be, especially given the overwhelmingly positive reception.

Tek-Check · Feb 19, 2025

Bruno Vieira said:
n the same interview, they say this chip uses an organic substrate to connect the CCDs with lower latency and power than the desktop parts.

Yes, a new version of Fanout Infinity Link with organic interposer. They used something similar on Navi 31 to connect MCDs to GCD. Infinity Link brings a lot more density and bandwidth than Infinity Fabric.

Screenshot 2025-02-19 at 00-22-57 AMD ZEN 6 — Next-gen Chiplets & Packaging - YouTube.png

It looks like they are trialing this interconnect on Strix Halo and they will implement it across Zen6 chiplets and IOD as a new high-bandwidth, low latency and high-efficiency interconnect standard. Quite exciting, indeed.

AlB80 said:
The IO looks like a regular GPU. Even USB controllers look fine. Modern graphics cards are equipped with USB-C ports.

No. Regular GPU die has less diverse logic and functions than IOD.
Some modern GPUs have USB-C port as an interface for DP video signal, and not for carrying USB or PCIe data.

On Strix Halo IOD, USB3 and USB4 PHY are additional pieces of logic, as well as NPU, which are not present on GPU die.

Sound_Card · Feb 19, 2025

Imagine they got rid of the NPU and give up 8 more CU's.

Wirko · Feb 19, 2025

Tek-Check said:
Yes, a new version of Fanout Infinity Link with organic interposer. They used something similar on Navi 31 to connect MCDs to GCD. Infinity Link brings a lot more density and bandwidth than Infinity Fabric.

Is this the same as RDL (Re-Distribution Layer)? In my (limited) understanding, RDL is a thin and dense multilayer PCB, which is bonded as a whole, or maybe built layer by layer, on top of the organic substrate, under the chiplets. AMD didn't want to disclose if they used it in Navi 31 or MI300, at least not initially.

Denver said:
AMD absolutely nailed the first generation with minimal setbacks.

It is 1st gen but a lot of experience building MCMs has been funneled into this product. AMD could call it "Instinct AI 101", it would seem right.

Sound_Card said:
Imagine they got rid of the NPU and give up 8 more CU's.

Just an idea: CPU thread scheduling is so hard that no one can get it right these days, and "AI" has the potential to help here. If AMD and MS wanted of course.

Tek-Check · Feb 19, 2025

Sound_Card said:
Imagine they got rid of the NPU and give up 8 more CU's.

Not now. NPU is needed for AI workloads, power management, other productivity tasks, etc. Its size takes space of 12 CUs. They need NPU on a chip like this for further development, as the next gen of NPU should go over 100 TOPS. As the logic shrinks in the next gen, they will be able to add more CUs and IO on a die.

It took them four iterations to come to this maximum size design for the package used. It's better to have such product in the market than wait another year or so for yet another lab chip iteration to be perfected. It's more practical the way it is.

Wirko said:
Is this thread same as RDL (Re-Distribution Layer)? In my (limited) understanding, RDL is a thin and dense multilayer PCB, which is bonded as a whole, or maybe built layer by layer, on top of the organic substrate, under the chiplets. AMD didn't want to disclose if they used it in Navi 31 or MI300, at least not initially.

Final MI300 is CoWoS-S, though they did have CoWoS-R as test chip. Navi 31 uses InFO-R/oS packaging with 4 RDL layers.

AMD MI300 – Taming The Hype – AI Performance, Volume Ramp, Customers, Cost, IO, Networking, Software

Amazing engineering, but what of the path to market? With massive GPU shortages and Nvidia charging ~5x markups versus manufacturing costs, everyone in the industry is desperate for an alternative.…

semianalysis.com

For Strix Halo, we don't know, until we know. It's either InFO-R or InFO-L, or another interation.

ymdhis · Feb 19, 2025

I want this for desktop so much.

AlB80 · Feb 19, 2025

Tek-Check said:
No. Regular GPU die has less diverse logic and functions than IOD.
Some modern GPUs have USB-C port as an interface for DP video signal, and not for carrying USB or PCIe data.

On Strix Halo IOD, USB3 and USB4 PHY are additional pieces of logic, as well as NPU, which are not present on GPU die.

nv tensor cores are not NPU?

Wirko · Feb 19, 2025

ymdhis said:
I want this for desktop so much.

Transplanting that notebook mobo to a tower case should be possible.

TPUnique · Feb 19, 2025

Wirko said:
Transplanting that notebook mobo to a tower case should be possible.

No need, ITX mobos featuring this chip will probably be announced within this year.

Tek-Check · Feb 19, 2025

AlB80 said:
nv tensor cores are not NPU?

Tensor cores are found in SMs. Each SM integrates specialized hardware, including tensor cores, RT cores, texture units, etc., whereas NPU is a separate, dedicated logic on mobility APUs for neural processing. Two different designs.

TPUnique · Feb 19, 2025

Tek-Check said:
Not now. NPU is needed for AI workloads, power management, other productivity tasks, etc. Its size takes space of 12 CUs. They need NPU on a chip like this for further development, as the next gen of NPU should go over 100 TOPS. As the logic shrinks in the next gen, they will be able to add more CUs and IO on a die.

Is the NPU actually, technically needed zo ? AFAIK power management has never needed a dedicated NPU, nor did AI tasks.

It seems to me that it "needs" to be there only due to Microsoft's commercial tantrum.

Carillon · Feb 19, 2025

Wirko said:
Desktop parts use organic substrate, and the huge number of wires is the reason IOD can't be placed closer to CCDs. This APU seems to include something more advanced that enables the chips to sit next to each other. It could be Local Si Interconnect (LSI), which is approximately the TSMC's version of EMIB.

the first picture shows the IFOP on CCDs and IOD perfectly aligned, something that would be impossible with the N6 IOD on desktop. This alone could what allows thedies to be put next to each other

Tek-Check · Feb 19, 2025

TPUnique said:
Is the NPU actually, technically needed zo ? AFAIK power management has never needed a dedicated NPU, nor did AI tasks.
It seems to me that it "needs" to be there only due to Microsoft's commercial tantrum.

It is needed if equivalent AI acceleration, similar in function to tensor cores, cannot be found integrated within GPU cores.

mrnagant · Feb 19, 2025

TPUnique said:
Is the NPU actually, technically needed zo ? AFAIK power management has never needed a dedicated NPU, nor did AI tasks.

It seems to me that it "needs" to be there only due to Microsoft's commercial tantrum.

Jim Kelly has an interview somewhere since he has been at Tenstorrent. I remember the convo being something in regards to him talking to a PSU engineer, and even they could benefit from a micro ML processor that costs a couple bucks. NPUs will eventually be in all kinds of devices.

Do you technically need NPU capabilities? Sure don't. Just like you technically don't need RT capabilities to run RT stuff. You can do it all on a GPU with RT cores or even a CPU. It'll just be slower and use more energy. NPU can do something the CPU could do, but it'll do it much faster and use a lot less power. It's kinda why general processors can have multiple fixed function processors/engines on the same package.

igormp · Feb 19, 2025

Tek-Check said:
It is needed if equivalent AI acceleration, similar in function to tensor cores, cannot be found integrated within GPU cores.

Tensor cores are more general than NPUs tho, as in data types, and they require more power than a bare NPU as well.
AMD's software stack is still lacking when it comes to using their GPUs for ML acceleration, I guess that's why they are shoving an NPU in this product.

Speedyblupi · 2025-04-01T10:51:06+0100

Tek-Check said:
Not now. NPU is needed for AI workloads, power management, other productivity tasks, etc. Its size takes space of 12 CUs. They need NPU on a chip like this for further development, as the next gen of NPU should go over 100 TOPS. As the logic shrinks in the next gen, they will be able to add more CUs and IO on a die.

NPUs have their uses, but it looks to me like Microsoft went crazy for hammers, and now every problem looks like a nail to them. But most of those problems obviously aren't nails, and everyone else is using a different and better tool to solve the problem.
Most NPUs in computers don't do anything because almost nothing uses them. Even Copilot, which is what these NPUs are supposed to be for, doesn't use them (yet?).
NPUs are useful for some AI productivity tasks, but most people who need powerful AI accelerators for productivity buy graphics cards with more powerful capabilities than any integrated NPU.
A typical user's AI needs are more efficiently handled by cloud services which can dynamically distribute load across thousands of NPUs/TPUs, not by having their own NPU that does literally nothing 99%+ of the time.
Where are NPUs being used for power management, and what does an NPU do in those situations that can't be done by a basic microcontroller?

Maybe NPUs will end up great in the long run, but it would require a major shift in both how people use their computers and how software uses NPU hardware. I'm expecting it will end up the same way as "smart"/IOT devices, and that these NPUs in everything will be about as useful as wifi connections for fridges and toasters - maybe useful for a few people with specialised needs, but stupid and irrelevant for everyone else.

Tek-Check · 2025-04-01T18:20:17+0100

Speedyblupi said:
NPUs are useful for some AI productivity tasks, but most people who need powerful AI accelerators for productivity buy graphics cards with more powerful capabilities than any integrated NPU.

NPUs can even accelerate GPU, both in gaming and in productivity. Intel showed this a few days ago on Arrow Lake HX. GPUs can offload some taxing tasks to NPU and this could bring a few percentages in gaming by freeing resources on GPU. We need to see NPU as an enhancer of performance, both for hardware and software.

Speedyblupi said:
A typical user's AI needs are more efficiently handled by cloud services which can dynamically distribute load across thousands of NPUs/TPUs, not by having their own NPU that does literally nothing 99%+ of the time.

We need to have NPU hardware in the first place on die in order to develop software ecosystem that will use it. We are in the process of developing software and this will only get better. At the end of the say, not all CPUs are sold with NPU and that's fine. Mobility chips need it more at the moment, but make no mistake, more powerful NPUs will eventually come to desktop chips too.

Speedyblupi said:
Where are NPUs being used for power management, and what does an NPU do in those situations that can't be done by a basic microcontroller?

Some tasks with parallel processing will save more battery power if offloaded to NPU than done on traditional CPU sequentially.

One example of more flexible data flow through NPU

Speedyblupi said:
Maybe NPUs will end up great in the long run, but it would require a major shift in both how people use their computers and how software uses NPU hardware. I'm expecting it will end up the same way as "smart"/IOT devices, and that these NPUs in everything will be about as useful as wifi connections for fridges and toasters - maybe useful for a few people with specialised needs, but stupid and irrelevant for everyone else.

We will be using NPUs without even been aware of it, unless you look for it into Task Manager. It's software developers who need to improve applications we commonly use in order to make better use of NPUs. It will take some time. Microsoft is deploying Copilot across their applications, some more meaningful than others, but it will be down to individual applications to improve.

For example, during an average Zoom call in future version of it, NPU will handle denoising, improving picture and sound. It will be a better experience, but you will never think it's because there is NPU. Hence this impression that NPU does not do anything obvious and tangible. It will be also taking care of background things that we usually do not think about.

igormp · 2025-04-01T20:21:51+0100

Tek-Check said:
NPUs can even accelerate GPU, both in gaming and in productivity. Intel showed this a few days ago on Arrow Lake HX. GPUs can offload some taxing tasks to NPU and this could bring a few percentages in gaming by freeing resources on GPU. We need to see NPU as an enhancer of performance, both for hardware and software.

It does not enhance performance. Intel's example was about running a game AND an assistant (likely a small LLM) in tandem. The assistant was offloaded to the NPU (exactly what this kind of thing is meant for), leaving the GPU free for the game alone.
But the performance of just running the game does not improve if you suddenly plug an NPU in there. NPUs are meant for inference of models, period.

Tek-Check said:
Some tasks with parallel processing will save more battery power if offloaded to NPU than done on traditional CPU sequentially

Not "some tasks", only model inference, which there aren't many in the desktop world that you run locally at the moment.

Tek-Check · 2025-04-01T22:03:33+0100

igormp said:
It does not enhance performance. Intel's example was about running a game AND an assistant (likely a small LLM) in tandem. The assistant was offloaded to the NPU (exactly what this kind of thing is meant for), leaving the GPU free for the game alone. But the performance of just running the game does not improve if you suddenly plug an NPU in there. NPUs are meant for inference of models, period.

I am not into semantic disputes just because I did not mention an assistant. NPU can and does enhance the performance of other components if tasks can be offloaded to it. There is no doubt about it. Whether additional application needs to be used for such gains, such as assistant, it does not really matter. What matters is end result.

igormp said:
Not "some tasks", only model inference, which there aren't many in the desktop world that you run locally at the moment.

There are not many, you are right. I enjoy LM Studio and there are plenty of models that can be loaded into it. My system is good enough for 32B models. 70B is already slow. I do not have NPU, to help a bit my GPU. NPUs are not only meant for inference of models, as there are other applications gradually enabling more functions, such as specific editing and content creation tools in Photoshop, Premiere Pro, Lightroom, DaVinci, etc. There are plug-ins that already work and others that are being developed. It will take time to develop entire ecosystem.

Processor	i5-6600K
Motherboard	Asus Z170A
Cooling	some cheap Cooler Master Hyper 103 or similar
Memory	16GB DDR4-2400
Video Card(s)	IGP
Storage	Samsung 850 EVO 250GB
Display(s)	2x Oldell 24" 1920x1200
Case	Bitfenix Nova white windowless non-mesh
Audio Device(s)	E-mu 1212m PCI
Power Supply	Seasonic G-360
Mouse	Logitech Marble trackball, never had a mouse
Keyboard	Key Tronic KT2000, no Win key because 1994
Software	Oldwin

Processor	Ryzen 6900HX
Memory	32 GB DDR4LP
Video Card(s)	Radeon 6800m
Display(s)	LG C3 42''
Software	Windows 11 home premium

Processor	i5-6600K
Motherboard	Asus Z170A
Cooling	some cheap Cooler Master Hyper 103 or similar
Memory	16GB DDR4-2400
Video Card(s)	IGP
Storage	Samsung 850 EVO 250GB
Display(s)	2x Oldell 24" 1920x1200
Case	Bitfenix Nova white windowless non-mesh
Audio Device(s)	E-mu 1212m PCI
Power Supply	Seasonic G-360
Mouse	Logitech Marble trackball, never had a mouse
Keyboard	Key Tronic KT2000, no Win key because 1994
Software	Oldwin

Processor	i5-6600K
Motherboard	Asus Z170A
Cooling	some cheap Cooler Master Hyper 103 or similar
Memory	16GB DDR4-2400
Video Card(s)	IGP
Storage	Samsung 850 EVO 250GB
Display(s)	2x Oldell 24" 1920x1200
Case	Bitfenix Nova white windowless non-mesh
Audio Device(s)	E-mu 1212m PCI
Power Supply	Seasonic G-360
Mouse	Logitech Marble trackball, never had a mouse
Keyboard	Key Tronic KT2000, no Win key because 1994
Software	Oldwin

Processor	5950x
Motherboard	B550 ProArt
Cooling	Fuma 2
Memory	4x32GB 3200MHz Corsair LPX
Video Card(s)	2x RTX 3090
Display(s)	LG 42" C2 4k OLED
Power Supply	XPG Core Reactor 850W
Software	I use Arch btw

System Name	Upgraded CyberpowerPC Ultra 5 Elite Gaming PC
Processor	AMD Ryzen 7 5800X3D
Motherboard	MSI B450M Pro-VDH Plus
Cooling	Thermalright Peerless Assassin 120 SE
Memory	CM4X8GD3000C16K4D (OC to CL14)
Video Card(s)	XFX Speedster MERC RX 7800 XT
Storage	TCSunbow X3 1TB, ADATA SU630 240GB, Seagate BarraCuda ST2000DM008 2TB
Display(s)	AOC Agon AG241QX 1440p 144Hz
Case	Cooler Master MasterBox MB520 (CyberpowerPC variant)
Power Supply	600W Cooler Master

AMD Ryzen AI Max+ "Strix Halo" Die Exposed and Annotated

News Editor