Since its reveal last week, AMD has given us a slightly more technical deep-dive on its two upcoming processors: the "Strix Point" silicon powering the Ryzen AI 300 series mobile processors, and the "Granite Ridge" chiplet MCM powering the Ryzen 9000 desktop processors. In this article, we take a closer look at the "Strix Point" SoC. It turns out that "Strix Point" takes a significantly different approach to heterogeneous multicore than "Phoenix 2," and AMD gave us a close look at how this works. AMD builds the "Strix Point" monolithic silicon on the TSMC N4P foundry node, with a die area of around 232 mm².
The "Strix Point" silicon sees the company's Infinity Fabric interconnect as its omnipresent ether. This is a point-to-point interconnect, unlike the ringbus on some Intel processors. The main compute machinery on the "Strix Point" SoC are its two CPU compute complexes (CCX), each with a 32b (read)/16b (write) per cycle data-path to the fabric. The concept of CCX makes a comeback with "Strix Point" after nearly two generations of "Zen." The first CCX contains the chip's four full-sized "Zen 5" CPU cores, which share a 16 MB L3 cache among themselves. The second CCX contains the chip's eight "Zen 5c" cores that share a smaller 8 MB L3 cache. Each of the 12 cores has a 1 MB dedicated L2 cache.
This approach to heterogeneous multicore is significantly different from "Phoenix 2," where the two "Zen 4" and four "Zen 4c" cores were part of a common CCX, with a common 16 MB L3 cache accessible to all six cores.
The "Zen 5" cores on "Strix Point" will be able to sustain high boost frequencies, in excess of 5.00 GHz, and should benefit from the larger 16 MB L3 cache that's shared among just four cores (similar L3 cache per core to "Granite Ridge"). The "Zen 5c" cores, on the other hand, operate at lower base- and boost frequencies than the "Zen 5" cores, and have lesser amounts of available L3 caches. For threads to migrate between the two core types, they will have to go through the fabric, and in some cases, even incur a round-trip to the main memory.
The "Zen 5c" core is about 25% smaller in die area than the "Zen 5" core; for reference, the "Zen 4c" core is about 35% smaller than a regular "Zen 4" core. AMD has worked to slightly improve the maximum boost frequencies of the "Zen 5c" core over its predecessor, so its frequency band sits a little closer to that of the "Zen 5" cores. The lower maximum voltages and boost frequencies of the "Zen 5c" cores give them a significant power-efficiency advantage over the "Zen 5" cores. AMD continues to rely on a software-based scheduling solution to ensure that the right kind of workload lands on the right kind of core; the company says this approach lets it correct "scheduling mistakes" over time.
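Taking the quoted shrink factors at face value, the relative footprints work out as follows (normalized ratios, not measured die sizes):

```python
# Relative core footprints from the quoted shrink percentages.
# Normalized to the full-sized core = 1.00; not actual mm² figures.
zen5_area = 1.00
zen5c_area = zen5_area * (1 - 0.25)   # ~25% smaller than Zen 5 -> 0.75

zen4_area = 1.00
zen4c_area = zen4_area * (1 - 0.35)   # ~35% smaller than Zen 4 -> 0.65

# Area of the 12-core CPU complex in "Zen 5 core" units
# (L3 caches and fabric excluded).
total = 4 * zen5_area + 8 * zen5c_area
print(f"Zen 5c relative area: {zen5c_area:.2f}")
print(f"12-core complex: {total:.1f} Zen 5-core equivalents")   # 10.0
```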
The iGPU is the most bandwidth-hungry device on the fabric, and gets the widest data-path: 4x 32B/cycle. It is based on the RDNA 3.5 graphics architecture, which retains the SIMD engine and IPC of RDNA 3 but brings several improvements to performance per watt. The iGPU features 8 workgroup processors (WGPs), compared to 6 on the current "Phoenix" silicon, which works out to 16 CU, or 1,024 stream processors. It also features 4 RB+ render backends, which work out to 16 ROPs.
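The shader math follows directly from the standard RDNA hierarchy of 2 compute units per WGP and 64 stream processors per CU:

```python
# RDNA 3.5 iGPU shader math from the figures in the article.
WGPS = 8                        # workgroup processors on "Strix Point"
CUS = WGPS * 2                  # 2 compute units per WGP -> 16 CUs
STREAM_PROCESSORS = CUS * 64    # 64 stream processors per CU -> 1,024
ROPS = 16                       # from 4 RB+ units, per the article

print(CUS, STREAM_PROCESSORS, ROPS)   # 16 1024 16
```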
The third most bandwidth-hungry device is the XDNA 2 NPU, with a 32B/cycle data-path comparable in bandwidth to a CCX. The NPU features 32 AI engine tiles arranged in four blocks of eight, for 50 TOPS of AI inferencing throughput, and can be overclocked. It also supports the Block FP16 data format (not to be confused with bfloat16), which offers the precision of FP16 with the performance of FP8.
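Block floating-point formats get that efficiency by storing one shared exponent per block of values, with each element keeping only its own mantissa. The sketch below is a generic, toy illustration of that idea in NumPy; it is not AMD's actual Block FP16 encoding, and the block size and mantissa width are arbitrary choices:

```python
import numpy as np

def block_quantize(x: np.ndarray, block: int = 8, mantissa_bits: int = 8):
    """Toy block floating point: one shared exponent per block of
    `block` values, fixed-point mantissas per element. Illustrative
    only -- not AMD's actual Block FP16 format."""
    x = x.reshape(-1, block)
    # Shared exponent: scale each block by its largest magnitude.
    exp = np.ceil(np.log2(np.abs(x).max(axis=1, keepdims=True) + 1e-30))
    scale = 2.0 ** exp
    # Quantize mantissas to signed fixed point.
    q = np.round(x / scale * (2 ** (mantissa_bits - 1)))
    q = np.clip(q, -(2 ** (mantissa_bits - 1)), 2 ** (mantissa_bits - 1) - 1)
    return q.astype(np.int16), exp

def block_dequantize(q, exp, mantissa_bits: int = 8):
    return q / (2 ** (mantissa_bits - 1)) * (2.0 ** exp)

x = np.random.randn(32).astype(np.float32)
q, exp = block_quantize(x)
err = np.abs(block_dequantize(q, exp).ravel() - x).max()
print(f"max reconstruction error: {err:.4f}")
```

Because the exponent is amortized across the whole block, the per-element storage and multiply hardware shrink toward that of an 8-bit format while precision stays close to FP16, which is the trade-off AMD is advertising.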
Besides these three logic-heavy components, other accelerators are fairly demanding on bandwidth: the Video CoreNext engine that accelerates video encoding and decoding; the audio coprocessor that runs the audio stack even when the system is "powered down," so it can respond to voice commands; the display controller that handles display I/O, including Display Stream Compression when called for; and the SMU, Microsoft Pluton, TPM, and other manageability hardware.
The I/O interfaces of the "Strix Point" SoC include a memory controller that supports 128-bit LPDDR5 and LPDDR5X, as well as dual-channel DDR5 (160-bit). The PCI-Express root complex is slightly truncated compared to the one on "Phoenix": there are a total of 16 PCIe Gen 4 lanes. All 16 should be usable in notebooks that lack a discrete FCH chipset, but the usable lane count should drop to 12 when AMD eventually adapts this silicon to Socket AM5 for desktop APUs. On gaming notebooks with Ryzen AI 300-series HX or H processors, discrete GPUs should get a Gen 4 x8 connection. USB connectivity includes one 40 Gbps USB4 port (or two 20 Gbps USB 3.2 Gen 2x2 ports), two additional 10 Gbps USB 3.2 Gen 2 ports, and three classic USB 2.0 ports.
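Peak theoretical memory bandwidth is simply transfer rate × data-bus width. A quick sketch; the speed grades below are assumptions for illustration, and the DDR5 line uses 128 data bits on the assumption that the 160-bit figure counts non-data (ECC) lines:

```python
def peak_gbs(mt_per_s: float, bus_bits: int) -> float:
    """Peak theoretical bandwidth: transfers/s x bytes per transfer."""
    return mt_per_s * 1e6 * (bus_bits / 8) / 1e9

# Speed grades are assumptions for illustration, not confirmed
# "Strix Point" limits. DDR5 uses 128 data bits here, assuming the
# article's 160-bit figure includes non-data/ECC lines.
print(f"LPDDR5X-7500, 128-bit: {peak_gbs(7500, 128):.0f} GB/s")   # 120 GB/s
print(f"DDR5-5600, 128-bit:    {peak_gbs(5600, 128):.1f} GB/s")   # 89.6 GB/s
```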
View at TechPowerUp Main Site
The "Strix Point" silicon sees the company's Infinity Fabric interconnect as its omnipresent ether. This is a point-to-point interconnect, unlike the ringbus on some Intel processors. The main compute machinery on the "Strix Point" SoC are its two CPU compute complexes (CCX), each with a 32b (read)/16b (write) per cycle data-path to the fabric. The concept of CCX makes a comeback with "Strix Point" after nearly two generations of "Zen." The first CCX contains the chip's four full-sized "Zen 5" CPU cores, which share a 16 MB L3 cache among themselves. The second CCX contains the chip's eight "Zen 5c" cores that share a smaller 8 MB L3 cache. Each of the 12 cores has a 1 MB dedicated L2 cache.
This approach to heterogeneous multicore is significantly different from "Phoenix 2," where the two "Zen 4" and four "Zen 4c" cores were part of a common CCX, with a common 16 MB L3 cache accessible to all six cores.
The "Zen 5" cores on "Strix Point" will be able to sustain high boost frequencies, in excess of 5.00 GHz, and should benefit from the larger 16 MB L3 cache that's shared among just four cores (similar L3 cache per core to "Granite Ridge"). The "Zen 5c" cores, on the other hand, operate at lower base- and boost frequencies than the "Zen 5" cores, and have lesser amounts of available L3 caches. For threads to migrate between the two core types, they will have to go through the fabric, and in some cases, even incur a round-trip to the main memory.
The Zen 5c core is about 25% smaller in die-area than the Zen 5 core. For reference, the Zen 4c core is about 35% smaller than a regular Zen 4 core. AMD has worked to slightly improve the maximum boost frequencies of the Zen 5c core compared to its predecessor, so the frequency band of the Zen 5c cores are a tiny bit closer. The lower maximum voltages and maximum boost frequencies of Zen 5c cores put them at a significant power efficiency advantage over the Zen 5 cores. AMD is continuing to rely on a software based scheduling solution that ensures the right kind of processing workload goes to the right kind of core. The company says that the software based solution lets it correct "scheduling mistakes" over time.
The iGPU is the most bandwidth-hungry device on the fabric, and gets its widest data-path—4x 32B/cycle. Based on the RDNA 3.5 graphics architecture, which retains the SIMD engine and IPC of RDNA 3, but with several improvements to the performance/Watt, this iGPU also features 8 workgroup processors (WGPs), compared to the 6 on the current "Phoenix" silicon. This works out to 16 CU, or 1,024 stream processors. The iGPU also features 4 render backends+, which work out to 16 ROPs.
The third most bandwidth-hungry device is the XDNA 2 NPU, with a 32B/cycle data-path that's of a comparable bandwidth to a CCX. The NPU features four blocks of 8 XDNA 2 arrays, and 32 AI engine tiles; for 50 TOPS of AI inferencing throughput, and can be overclocked. It also supports the Block FP16 data format (not to be confused with bfloat16), which offers the precision of FP16, with the performance of FP8.
Besides the three logic-heavy components, there are other accelerators that are fairly demanding on the bandwidth, such as the Video CoreNext engine that accelerates encoding and decoding; the audio coprocessor that processes the audio stack when the system is "powered down," so it can respond to voice commands; the display controller that handles the display I/O, including display stream compression, if called for; the SMU, Microsoft Pluton, TPM, and other manageability hardware.
The I/O interfaces of the "Strix Point" SoC include a memory controller that supports 128-bit LPDDR5, LPDDR5x, and dual-channel DDR5 (160-bit). The PCI-Express root complex is slightly truncated compared to the one "Phoenix" comes with. There are a total of 16 PCIe Gen 4 lanes. All 16 should be usable in notebooks that lack a discrete FCH chipset, but the usable lane count should drop to 12 when AMD eventually adapts this silicon to Socket AM5 for desktop APUs. On gaming notebooks that use Ryzen AI HX or H 300 series processors, discrete GPUs should have a Gen 4 x8 connection. USB connectivity includes a 40 Gbps USB4, or two 20 Gbps USB 3.2 Gen 2x2, two additional 10 Gbps USB 3.2 Gen 2, and three classic USB 2.0.
View at TechPowerUp Main Site