Friday, April 19th 2024
AMD "Strix Halo" Zen 5 Mobile Processor Pictured: Chiplet-based, Uses 256-bit LPDDR5X
Enthusiasts on the ChipHell forum scored an alleged image of AMD's upcoming "Strix Halo" mobile processor, and set out to create some highly plausible, though speculative, schematic slides. While "Strix Point" is the mobile processor that succeeds the current "Hawk Point" and "Phoenix" processors, "Strix Halo" is in a category of its own: it aims to offer gaming experiences comparable to discrete GPUs in the ultraportable form factor, where powerful discrete GPUs are generally not possible. "Strix Halo" also goes head-on against Apple's M3 Max and M3 Pro processors powering the latest crop of MacBook Pros. Like the M3 Max, it has the advantages of a single-chip solution.
The "Strix Halo" silicon is a chiplet-based processor, although very different from "Fire Range." The "Fire Range" processor is essentially a BGA version of the desktop "Granite Ridge" processor: it's the same combination of one or two "Zen 5" CCDs that talk to a client I/O die (cIOD), and is meant for performance-through-enthusiast segment notebooks. "Strix Halo," on the other hand, uses the same one or two "Zen 5" CCDs, but pairs them with a large SoC die featuring an oversized iGPU and 256-bit LPDDR5X memory controllers not found on the cIOD. This is key to what AMD is trying to achieve: CPU and graphics performance in the league of the M3 Pro and M3 Max at comparable PCB and power footprints.
The iGPU of the "Strix Halo" processor is based on the RDNA 3+ graphics architecture, and features a massive 40 RDNA compute units. These work out to 2,560 stream processors, 80 AI accelerators, 40 Ray accelerators, 160 TMUs, and an unknown number of ROPs (we predict at least 64). The slide predicts an iGPU engine clock as high as 3.00 GHz.
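The unit counts above follow from simple per-CU multiplication. The sketch below reproduces them, assuming the usual RDNA 3-class layout of 64 stream processors, 2 AI accelerators, 1 Ray accelerator, and 4 TMUs per compute unit. These per-CU figures are our assumption, not something the slides state, and the function names are purely illustrative. The throughput helper further assumes one FMA (2 FLOPs) per stream processor per clock and ignores dual-issue, so treat its result as a rough figure at the rumored 3.00 GHz clock, not a specification.

```python
# Per-CU resource counts ASSUMED for an RDNA 3-class iGPU (not from the slides).
PER_CU = {
    "stream_processors": 64,
    "ai_accelerators": 2,
    "ray_accelerators": 1,
    "tmus": 4,
}


def igpu_resources(compute_units: int) -> dict:
    """Scale the assumed per-CU counts by the number of compute units."""
    return {name: per_cu * compute_units for name, per_cu in PER_CU.items()}


def peak_fp32_tflops(stream_processors: int, clock_ghz: float,
                     flops_per_clock: int = 2) -> float:
    """Rough peak FP32 throughput in TFLOPS.

    ASSUMPTION: each stream processor retires one FMA (2 FLOPs) per clock;
    dual-issue is ignored.
    """
    return stream_processors * flops_per_clock * clock_ghz / 1000.0


if __name__ == "__main__":
    resources = igpu_resources(40)  # 40 CUs, per the leaked slide
    print(resources)
    print(peak_fp32_tflops(resources["stream_processors"], 3.0), "TFLOPS")
```

With 40 CUs this yields the same 2,560 / 80 / 40 / 160 breakdown quoted in the slide.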
Graphics is an extremely memory-sensitive application, and so AMD is using a 256-bit (quad-channel or octa-subchannel) LPDDR5X-8533 memory interface, for an effective cached bandwidth of around 500 GB/s. The memory controllers are cushioned by a 32 MB L4 cache located on the SoC die. The way we understand this cache hierarchy, the CCDs (CPU cores) can treat this as a victim cache, while the iGPU treats it like an L2 cache (similar to the Infinity Cache found in RDNA 3 discrete GPUs).
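For context, the roughly 500 GB/s figure is an effective, cache-assisted number. The raw peak of the DRAM interface itself is lower, and can be estimated from the bus width and transfer rate. A minimal sketch, with a function name of our own choosing and decimal gigabytes assumed:

```python
def peak_dram_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: float) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes).

    bus_width_bits:    total interface width, e.g. 256
    transfer_rate_mts: mega-transfers per second, e.g. 8533 for LPDDR5X-8533
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mts / 1000.0


if __name__ == "__main__":
    # 256-bit LPDDR5X-8533, as described in the leaked slides
    print(round(peak_dram_bandwidth_gbs(256, 8533), 1), "GB/s")
```

For a 256-bit LPDDR5X-8533 interface this works out to about 273 GB/s of raw bandwidth, so the remainder of the quoted effective figure would have to come from hits in the 32 MB cache.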
The iGPU isn't the only logic-heavy and memory-sensitive device on the SoC die; there's also an NPU. From what we gather, this is the exact same NPU model found in "Strix Point" processors, with a performance of around 45-50 AI TOPS, and is based on the XDNA 2 architecture developed by AMD's Xilinx team.
The SoC I/O of "Strix Halo" isn't as comprehensive as that of "Fire Range," because the chip has been designed on the idea that the notebook will use its large iGPU. It has PCIe Gen 5, but only 12 Gen 5 lanes in total: 4 toward an M.2 NVMe slot, and 8 to spare for a discrete GPU (if present), although these can be used to connect any PCIe device, including additional M.2 slots. There's also integrated 40 Gbps USB4, and 20 Gbps USB 3.2 Gen 2.
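To put the lane budget in perspective, here is a rough per-direction bandwidth estimate for the x4 and x8 links. It is a sketch that assumes the standard PCIe Gen 5 signaling rate of 32 GT/s per lane with 128b/130b encoding, and ignores packet and protocol overhead, so real-world throughput will be lower. The helper name is ours.

```python
PCIE5_GT_PER_S = 32.0            # PCIe Gen 5 raw signaling rate per lane
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line encoding


def pcie5_bandwidth_gbs(lanes: int) -> float:
    """Approximate one-direction PCIe Gen 5 bandwidth in GB/s.

    Ignores TLP/DLLP protocol overhead, so this is an upper bound.
    """
    return lanes * PCIE5_GT_PER_S * ENCODING_EFFICIENCY / 8


if __name__ == "__main__":
    # Lane split described in the article: x4 for M.2, x8 for a dGPU or other devices
    for label, lanes in (("M.2 NVMe", 4), ("dGPU / spare", 8), ("total", 12)):
        print(f"{label}: x{lanes} = {pcie5_bandwidth_gbs(lanes):.2f} GB/s")
```

Under these assumptions the x4 link tops out near 15.75 GB/s and the x8 link near 31.5 GB/s in each direction.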
As for the CPU, since "Strix Halo" is using one or two "Zen 5" CCDs, its CPU performance will be similar to "Fire Range." You get up to 16 "Zen 5" CPU cores, with 32 MB of L3 cache per CCD, or 64 MB of total CPU L3 cache. The CCDs are connected to the SoC die either using conventional IFOP (Infinity Fabric over package), just like "Fire Range" and "Granite Ridge," or there's even a possibility that AMD is using Infinity Fanout links like on some of its chiplet-based RDNA 3 discrete GPUs.
Lastly, there are some highly speculative performance predictions for the "Strix Halo" iGPU, which put it in competition with the GeForce RTX 4060M and RTX 4070M.
Sources:
ChipHell Forums, harukaze5719 (Twitter)
109 Comments on AMD "Strix Halo" Zen 5 Mobile Processor Pictured: Chiplet-based, Uses 256-bit LPDDR5X
- The only silicon configuration with all the CUs will be the insanely-expensive flagship variant(s)
- Because they're flagships, they'll only appear in
- Behemoth overpriced gaming laptops with dGPUs, rendering the iGPU pointless
- Impossibly thin, overpriced ultraportables that compromise on cooling to look "sexy and thin" so hard that it throttles hard within 60 seconds of any real GPU load and is therefore unusable for gaming.
- The sort of configuration that will appear in a half-decent, everyday $1000 laptop is going to be a 6-core with 20 CUs and lacking the LPDDR5X because manufacturers are cheapskates and the LPDDR5X is probably "optional"...
I really hope I'm wrong!
"The iGPU of the "Strix Halo" processor is based on the RDNA 3+ graphics architecture, and features a massive 40 RDNA compute units. These work out to 2,560 stream processors, 80 AI accelerators, 40 Ray accelerators, 160 TMUs, and an unknown number of ROPs (we predict at least 64). The slide predicts an iGPU engine clock as high as 3.00 GHz."
Interesting................
Drop all the AI and RT (and transistors and price that goes with it) and make sure it has full media engine and make a nice HTPC chip.
I think if the GPU is downclocked to around 2 GHz, this APU could be put in a handheld gaming console.
AMD's laptop strategy has been schizophrenic and they've really been crowded out by Intel and NV on that front.
For every 1 AMD laptop out there, you'll find 10 Intel/NV options.
It's good to see AMD finally leveraging its IPs to provide a part that neither Intel nor NV can deliver, but I question whether the part will be delivered in volume and whether it would be enough to go against the current brand perceptions.
Could you imagine the performance of Windows written in pure assembly, targeted at a specific range of hardware? Shame Microsoft fired every decent and competent coder they had years ago.
If you remember the days of the Amiga, Windows 11 is like a game programmed in AMOS or BASIC vs a game programmed in assembly. Today our computers are so fast that it's almost impossible to not simply brute force good performance. Today Windows is coded in high-level languages not far off HTML5 code.
It was tragic what 3DFX did to themselves, but it was even sadder when nGreedia killed them off.
Surely, the graphics weren't bad for the time, but as far as I've seen, the whole 3DFX ownership experience wasn't that great because of things like the aforementioned fizzy image or the loop cable, or the fact that the card didn't do 2D, which bumped up the cost of a new build. Sadly, I've never owned one, so I can only comment based on the YouTube nostalgia videos I've seen.
Just a couple of examples....
Ironically performs about as well as a 780M but at 3x the power draw, but 4-5 years earlier. No new drivers though.
(See post #53)
I think I like the shared memory more though, no need to be VRAM limited (I like Skyrim mods). It is very possible to do ~60W in a 13" laptop if OEM is willing and competent.
Phawx on YT has shown (gaming only) the 8840/780M fit into like ~17W, I think it is possible for it to scale into (very) large handheld/smaller laptop.
(I don't buy large APU in Xbox handheld though, there is $ budget/limit as well as power limit/budget)
But not sure if all the extra cores will have high idle drain (i.e. impractical to use unless plugged in). And 60W will be limiting for the CPU either way.
I have some discussion/speculation on my 4070Ti at ~70W video, if interested. It's too expensive / CPU is too "excessive"
Unless you're playing CPU-bound games (basically only sims/>60fps), the benefit of having a better CPU (bin/efficiency) is to use less power and give it to the GPU (think laptop or even handheld).
Instead, why not just give more power/$ (cores/bandwidth) to the GPU?
Also as mentioned above, it is very possible to do ~60W in a 13" laptop if OEM is willing and competent.
As someone who doesn't really care too much about gaming on a laptop at all, I'm interested in what both Strix Point and Halo bring to the market. Zero interest in Fire Range.