
AMD Ryzen AI Max 390 "Strix Halo" Surfaces in Geekbench AI Benchmark

btarunr

Editor & Senior Moderator
In case you missed it, AMD's new madcap enthusiast silicon engineering effort, the "Strix Halo," is real, and comes with the Ryzen AI Max 300 series branding. These are chiplet-based mobile processors with one or two "Zen 5" CCDs (the same ones found in "Granite Ridge" desktop processors) paired with a large SoC die that has an oversized iGPU. This arrangement lets AMD give the processor up to 16 full-sized "Zen 5" CPU cores, an iGPU with as many as 40 RDNA 3.5 compute units (2,560 stream processors), and a 256-bit LPDDR5/LPDDR5X memory interface for a unified memory architecture (UMA).
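As a quick sanity check on those headline figures, here is a small Python sketch that derives the stream-processor count from the CU count (64 SPs per RDNA compute unit) and the theoretical bandwidth of a 256-bit LPDDR5/LPDDR5X interface at a few speed grades. The speed grades are illustrative assumptions, not confirmed "Strix Halo" specs.

```python
# Back-of-the-envelope figures for "Strix Halo" (assumed values, not official specs)

CUS = 40                 # RDNA 3.5 compute units on the top iGPU configuration
SPS_PER_CU = 64          # stream processors per RDNA compute unit
BUS_WIDTH_BITS = 256     # UMA memory interface width

print(f"Stream processors: {CUS * SPS_PER_CU}")   # 40 x 64 = 2,560

# Theoretical peak DRAM bandwidth = bus width (bytes) x transfer rate
for mtps in (6400, 7500, 8000, 8533):              # assumed LPDDR5/5X speed grades
    gb_per_s = BUS_WIDTH_BITS / 8 * mtps / 1000    # GB/s
    print(f"LPDDR5/5X-{mtps}: {gb_per_s:.0f} GB/s")
```

At the higher assumed speed grades that works out to well over 250 GB/s of shared bandwidth, which is the whole point of the oversized 256-bit interface.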

"Strix Halo" is designed for ultraportable gaming notebooks or mobile workstations where low PCB footprint is of the essence, and discrete GPU is not an option. For enthusiast gaming notebooks with discrete GPUs, AMD is designing the "Fire Range" processor, which is essentially a mobile BGA version of "Granite Ridge," and a successor to the Ryzen 7045 series "Dragon Range." The Ryzen AI Max series has three models based on CPU and iGPU CU counts—the Ryzen AI Max 395+ (16-core/32-thread with 40 CU), the Ryzen AI Max 390 (12-core/24-thread with 40 CU), and the Ryzen AI Max 385 (8-core/16-thread, 32 CU). An alleged Ryzen AI Max 390 engineering sample surfaced on the Geekbench AI benchmark online database.



The online database entry for this Geekbench AI benchmark submission mentions a processor that identifies itself as "AMD Eng Sample: 100-000001421-50_Y," which corresponds to the Ryzen AI Max 390 (12-core/24-thread, 40 CU). The processor has a CPU base frequency of 3.20 GHz and a maximum boost frequency of 5.00 GHz, at least for this engineering sample (the retail chip could differ). This processor is driving a prototype HP ZBook Ultra 14 G1a mobile workstation, and is wired to 64 GB of memory.

The processor yielded a single-precision Geekbench AI score of 4,733 points, a half-precision score of 4,944 points, and a quantized score of 13,944 points. HotHardware notes that this is a rather large 60% deficit compared to the desktop Ryzen 9 9900X processor. There could be several reasons behind this. The screenshot shows that the notebook is running on the Balanced power plan, and the benchmark uses 256-bit AVX2 SIMD instructions rather than the newer AVX-512. The "Zen 5" cores on "Strix Halo" are carried over from "Granite Ridge" and EPYC "Turin," since they share the same 8-core CCD, and feature full 512-bit FP data-paths. This is unlike the "Zen 5" cores on the monolithic "Strix Point" silicon, which are restricted to a dual-pumped 256-bit FP data-path even when executing AVX-512 or VNNI instructions. Therefore, AI benchmarks that use AVX-512/VNNI could yield different results. Then there's the fact that this is an engineering sample, and AMD could be deliberately nerfing its performance.
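To put the data-path difference in perspective, here is a rough per-core peak-throughput sketch. The FMA pipe count and the dual-pump penalty are assumptions for illustration (the 5.00 GHz clock is taken from this Geekbench AI entry); none of these are measured or official figures.

```python
# Rough per-core FP32 peak-throughput sketch (assumed figures, not official AMD specs).
# Illustrates why an AVX-512/VNNI-aware benchmark could land differently on a full
# 512-bit "Zen 5" FP data-path than on a dual-pumped 256-bit one.

def peak_gflops_per_core(vector_bits, fma_pipes, passes_per_op, clock_ghz):
    """FP32 lanes x 2 ops per FMA x FMA pipes / passes per op x clock (GHz)."""
    lanes = vector_bits // 32
    return lanes * 2 * fma_pipes / passes_per_op * clock_ghz

CLOCK_GHZ = 5.0   # boost clock reported in this Geekbench AI entry
FMA_PIPES = 2     # assumed number of FMA-capable FP pipes per core

configs = {
    "AVX2, 256-bit ops (what this run used)":          (256, 1),
    "AVX-512 on a full 512-bit path (desktop CCD)":    (512, 1),
    "AVX-512 dual-pumped over 256 bits (Strix Point)": (512, 2),
}
for name, (bits, passes) in configs.items():
    gflops = peak_gflops_per_core(bits, FMA_PIPES, passes, CLOCK_GHZ)
    print(f"{name}: {gflops:.0f} GFLOPS/core")
```

Under these assumptions the dual-pumped path tops out at roughly the same peak as plain AVX2, while the full 512-bit path doubles it, so an AVX-512/VNNI-aware benchmark could paint a different picture.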

View at TechPowerUp Main Site | Source
 
How to throw away the best product your company has had in years: call it "AI Max".
 
I am excited to see these chips in a small form factor PC, something like a Minisforum or Beelink.
I just hope the rollout isn't slow, with units showing up 9 months later.

Presumably it will cost a lot: 16-core CPU + GPU + fast integrated RAM, cooling, plus a ~220 W power adapter. $1,600 easy.

Mac Studio replacement.
 
This is interesting for gaming as the communication between CPU/GPU/RAM will be very low latency.
 
How to throw away the best product your company has had in years: call it "AI Max".
It makes the investors/shareholders happy, so why not?
For the average Joe it's still about "My budget is X".
 
How to throw away the best product your company has had in years: call it "AI Max".
No one cares, give it a rest.

In other news: Gigabyte PSU explodes, Windows Vista RTM is a resource hog, some people used drugs at the Woodstock festival in 1969, and the Anglo-Saxon king Harold Godwinson has died at the Battle of Hastings.
 
This is interesting for gaming as the communication between CPU/GPU/RAM will be very low latency.
LPDDR5 memory actually has higher latencies compared to your regular DDR5 sticks.

That CPU is also chiplet-based, so you have your regular desktop CCDs communicating with the IO Die in a similar fashion to Ryzen 9000.
 
LPDDR5 memory actually has higher latencies compared to your regular DDR5 sticks.

That CPU is also chiplet-based, so you have your regular desktop CCDs communicating with the IO Die in a similar fashion to Ryzen 9000.

Higher latency but bigger bandwidth. Small enough latency for general computing and boosted bandwidth so those 40 compute units don't go to waste.
 
LPDDR5 memory actually has higher latencies compared to your regular DDR5 sticks.

That CPU is also chiplet-based, so you have your regular desktop CCDs communicating with the IO Die in a similar fashion to Ryzen 9000.
Doesn't that also imply a 64GB/s bandwidth limit between each CCD and the IOD, if they keep the same IF architecture there? That adds up to much less than half the theoretical bandwidth of a 256-bit LPDDR5 interface, although that is probably still a long way above current AM5 offerings, provided there are no bottlenecks elsewhere.

Either way, iGPU/NPU offload is definitely going to be needed for the workload it is expected to do.

Mobile chips starting to outstrip non-HEDT desktop CPU performance. What a world we live in.
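A rough back-of-the-envelope sketch of those numbers, where the per-CCD Infinity Fabric link width and the LPDDR5X speed grade are both assumptions rather than confirmed "Strix Halo" specs:

```python
# Assumed figures, not confirmed specs: ~64 GB/s read bandwidth per CCD-to-IOD
# Infinity Fabric link (desktop-style IF) and a 256-bit LPDDR5X-8000 memory pool.

BUS_WIDTH_BITS = 256       # UMA memory interface width
TRANSFER_RATE_MTPS = 8000  # assumed LPDDR5X speed grade
CCD_LINK_GBPS = 64         # assumed per-CCD read bandwidth over Infinity Fabric
NUM_CCDS = 2               # 12-core Ryzen AI Max 390 = two "Zen 5" CCDs

dram_bw = BUS_WIDTH_BITS / 8 * TRANSFER_RATE_MTPS / 1000   # GB/s
ccd_bw = CCD_LINK_GBPS * NUM_CCDS                          # GB/s

print(f"Theoretical LPDDR5X bandwidth: {dram_bw:.0f} GB/s")
print(f"Combined CCD read bandwidth:   {ccd_bw:.0f} GB/s")
print(f"CPU cores could tap roughly {ccd_bw / dram_bw:.0%} of the memory bandwidth")
```

Under these assumed numbers the CPU side would top out around half of the pool (and on desktop parts the IF write path is narrower than the read path, so mixed traffic would land lower still), leaving the bulk of the bandwidth for the iGPU and NPU.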
 
Higher latency but bigger bandwidth. Small enough latency for general computing and boosted bandwidth so those 40 compute units don't go to waste.
The GPU is going to love it for sure. The CPU, however, not so much (but it's far from awful, don't get me wrong).
Doesn't that also imply a 64GB/s bandwidth limit between each CCD and the IOD, if they keep the same IF architecture there? That adds up to much less than half the theoretical bandwidth of a 256-bit LPDDR5 interface, although that is probably still a long way above current AM5 offerings, provided there are no bottlenecks elsewhere.

Either way, iGPU/NPU offload is definitely going to be needed for the workload it is expected to do.

Mobile chips starting to outstrip non-HEDT desktop CPU performance. What a world we live in.
Yeah, that's likely going to be the case. But I believe that high bandwidth is meant to keep the GPU/NPU fed (as our colleague said above), and not the CPU itself, so it still makes sense if that's indeed the case.
 
LPDDR5 memory actually has higher latencies compared to your regular DDR5 sticks.

That CPU is also chiplet-based, so you have your regular desktop CCDs communicating with the IO Die in a similar fashion to Ryzen 9000.
Yes it is, but all the components and memory are much closer than on a normal system.
 
Yes it is, but all the components and memory are much closer than on a normal system.
I don't think Strix Halo will have on-package RAM (like Apple chips or Lunar Lake), just soldered RAM, so it's not that close.

Anyhow, even though it's closer, latencies are still bad. Just take a look at Apple chips: even with the memory on the package, their latencies are still in the 100s of nanoseconds (worse than even desktop Ryzen).
 
Yes it is, but all the components and memory are much closer than on a normal system.
Between the CPU and GPU, yes, since the memory is unified; and between the GPU and RAM too, since GDDR6/X has greater bandwidth but even worse latencies.
 