
AMD Ryzen AI Max 390 "Strix Halo" Surfaces in Geekbench AI Benchmark

btarunr

Editor & Senior Moderator
In case you missed it, AMD's new madcap enthusiast silicon engineering effort, the "Strix Halo," is real, and comes with the Ryzen AI Max 300 series branding. These are chiplet-based mobile processors with one or two "Zen 5" CCDs (the same ones found in "Granite Ridge" desktop processors) paired with a large SoC die that has an oversized iGPU. This arrangement lets AMD give the processor up to 16 full-sized "Zen 5" CPU cores, an iGPU with as many as 40 RDNA 3.5 compute units (2,560 stream processors), and a 256-bit LPDDR5/LPDDR5X memory interface for a unified memory architecture (UMA).
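As a quick sanity check on those headline figures, here is a small Python sketch that derives the stream-processor count from the CU count (64 SPs per RDNA compute unit) and the theoretical bandwidth of a 256-bit LPDDR5/LPDDR5X interface at a few speed grades. The speed grades are illustrative assumptions, not confirmed "Strix Halo" specs.

```python
# Back-of-the-envelope figures for "Strix Halo" (assumed values, not official specs)

CUS = 40                 # RDNA 3.5 compute units on the top iGPU configuration
SPS_PER_CU = 64          # stream processors per RDNA compute unit
BUS_WIDTH_BITS = 256     # UMA memory interface width

print(f"Stream processors: {CUS * SPS_PER_CU}")   # 40 x 64 = 2,560

# Theoretical peak DRAM bandwidth = bus width (bytes) x transfer rate
for mtps in (6400, 7500, 8000, 8533):              # assumed LPDDR5/5X speed grades
    gb_per_s = BUS_WIDTH_BITS / 8 * mtps / 1000    # GB/s
    print(f"LPDDR5/5X-{mtps}: {gb_per_s:.0f} GB/s")
```

At the higher assumed speed grades that works out to well over 250 GB/s of shared bandwidth, which is the whole point of the oversized 256-bit interface.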

"Strix Halo" is designed for ultraportable gaming notebooks or mobile workstations where low PCB footprint is of the essence, and discrete GPU is not an option. For enthusiast gaming notebooks with discrete GPUs, AMD is designing the "Fire Range" processor, which is essentially a mobile BGA version of "Granite Ridge," and a successor to the Ryzen 7045 series "Dragon Range." The Ryzen AI Max series has three models based on CPU and iGPU CU counts—the Ryzen AI Max 395+ (16-core/32-thread with 40 CU), the Ryzen AI Max 390 (12-core/24-thread with 40 CU), and the Ryzen AI Max 385 (8-core/16-thread, 32 CU). An alleged Ryzen AI Max 390 engineering sample surfaced on the Geekbench AI benchmark online database.



The online database entry for this Geekbench AI benchmark submission mentions a processor that identifies itself as "AMD Eng Sample: 100-000001421-50_Y," which corresponds to the Ryzen AI Max 390 (12-core/24-thread, 40 CU). The processor has a CPU base frequency of 3.20 GHz and a maximum boost frequency of 5.00 GHz, at least for this engineering sample (the retail chip could differ). This processor is driving a prototype HP ZBook Ultra 14 G1a mobile workstation, and is wired to 64 GB of memory.

The processor yielded a single-precision Geekbench AI score of 4,733 points, a half-precision score of 4,944 points, and a quantized score of 13,944 points. HotHardware notes that this is a rather large 60% deficit compared to the desktop Ryzen 9 9900X processor. There could be several reasons behind this. The screenshot shows that the notebook is running on the Balanced power plan, and the benchmark uses 256-bit AVX2 SIMD instructions rather than the newer AVX-512. The "Zen 5" cores on "Strix Halo" are carried over from "Granite Ridge" and EPYC "Turin," since they share the same 8-core CCD, and feature full 512-bit FP data-paths. This is unlike the "Zen 5" cores on the monolithic "Strix Point" silicon, which are restricted to a dual-pumped 256-bit FP data-path even when executing AVX-512 or VNNI instructions. Therefore, AI benchmarks that use AVX-512/VNNI could yield different results. Then there's the fact that this is an engineering sample, and AMD could be deliberately nerfing its performance.
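To put the data-path difference in perspective, here is a rough per-core peak-throughput sketch. The FMA pipe count and the dual-pump penalty are assumptions for illustration (the 5.00 GHz clock is taken from this Geekbench AI entry); none of these are measured or official figures.

```python
# Rough per-core FP32 peak-throughput sketch (assumed figures, not official AMD specs).
# Illustrates why an AVX-512/VNNI-aware benchmark could land differently on a full
# 512-bit "Zen 5" FP data-path than on a dual-pumped 256-bit one.

def peak_gflops_per_core(vector_bits, fma_pipes, passes_per_op, clock_ghz):
    """FP32 lanes x 2 ops per FMA x FMA pipes / passes per op x clock (GHz)."""
    lanes = vector_bits // 32
    return lanes * 2 * fma_pipes / passes_per_op * clock_ghz

CLOCK_GHZ = 5.0   # boost clock reported in this Geekbench AI entry
FMA_PIPES = 2     # assumed number of FMA-capable FP pipes per core

configs = {
    "AVX2, 256-bit ops (what this run used)":          (256, 1),
    "AVX-512 on a full 512-bit path (desktop CCD)":    (512, 1),
    "AVX-512 dual-pumped over 256 bits (Strix Point)": (512, 2),
}
for name, (bits, passes) in configs.items():
    gflops = peak_gflops_per_core(bits, FMA_PIPES, passes, CLOCK_GHZ)
    print(f"{name}: {gflops:.0f} GFLOPS/core")
```

Under these assumptions the dual-pumped path tops out at roughly the same peak as plain AVX2, while the full 512-bit path doubles it, so an AVX-512/VNNI-aware benchmark could paint a different picture.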

View at TechPowerUp Main Site | Source
 
How to throw away the best product your company has had in years: call it "AI Max".
 
I am excited to see these chips in a small form factor PC, something like a Minisforum or Beelink.
I just hope the rollout isn't slow, with units showing up 9 months later.

Presumably it will cost a lot: 16-core CPU + GPU + fast integrated RAM, cooling, plus a ~220 W power adapter. $1,600 easy.

Mac Studio replacement.
 
This is interesting for gaming as the communication between CPU/GPU/RAM will be very low latency.
 
How to throw away the best product your company has had in years: call it "AI Max".
It makes the investors/shareholders happy, so why not?
For the average Joe it's still about "My budget is X".
 
How to throw away the best product your company has had in years: call it "AI Max".
No one cares, give it a rest.

In other news: Gigabyte PSU explodes, Windows Vista RTM is a resource hog, some people used drugs at the Woodstock festival in 1969, and the Anglo-Saxon king Harold Godwinson has died at the Battle of Hastings.
 
This is interesting for gaming as the communication between CPU/GPU/RAM will be very low latency.
LPDDR5 memory actually has higher latencies compared to your regular DDR5 sticks.

That CPU is also chiplet-based, so you have your regular desktop CCDs communicating with the IO Die in a similar fashion to Ryzen 9000.
 
LPDDR5 memory actually has higher latencies compared to your regular DDR5 sticks.

That CPU is also chiplet-based, so you have your regular desktop CCDs communicating with the IO Die in a similar fashion to Ryzen 9000.

Higher latency but bigger bandwidth. Small enough latency for general computing and boosted bandwidth so those 40 compute units don't go to waste.
 
LPDDR5 memory actually has higher latencies compared to your regular DDR5 sticks.

That CPU is also chiplet-based, so you have your regular desktop CCDs communicating with the IO Die in a similar fashion to Ryzen 9000.
Doesn't that also imply a 64GB/s bandwidth limit between each CCD and the IOD, if they keep the same IF architecture there? That adds up to much less than half the theoretical bandwidth of a 256-bit LPDDR5 interface, although that is probably still a long way above current AM5 offerings, provided there are no bottlenecks elsewhere.

Either way, iGPU/NPU offload is definitely going to be needed for the workload it is expected to do.

Mobile chips starting to outstrip non-HEDT desktop CPU performance. What a world we live in.
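A rough back-of-the-envelope sketch of those numbers, where the per-CCD Infinity Fabric link width and the LPDDR5X speed grade are both assumptions rather than confirmed "Strix Halo" specs:

```python
# Assumed figures, not confirmed specs: ~64 GB/s read bandwidth per CCD-to-IOD
# Infinity Fabric link (desktop-style IF) and a 256-bit LPDDR5X-8000 memory pool.

BUS_WIDTH_BITS = 256       # UMA memory interface width
TRANSFER_RATE_MTPS = 8000  # assumed LPDDR5X speed grade
CCD_LINK_GBPS = 64         # assumed per-CCD read bandwidth over Infinity Fabric
NUM_CCDS = 2               # 12-core Ryzen AI Max 390 = two "Zen 5" CCDs

dram_bw = BUS_WIDTH_BITS / 8 * TRANSFER_RATE_MTPS / 1000   # GB/s
ccd_bw = CCD_LINK_GBPS * NUM_CCDS                          # GB/s

print(f"Theoretical LPDDR5X bandwidth: {dram_bw:.0f} GB/s")
print(f"Combined CCD read bandwidth:   {ccd_bw:.0f} GB/s")
print(f"CPU cores could tap roughly {ccd_bw / dram_bw:.0%} of the memory bandwidth")
```

Under these assumed numbers the CPU side would top out around half of the pool (and on desktop parts the IF write path is narrower than the read path, so mixed traffic would land lower still), leaving the bulk of the bandwidth for the iGPU and NPU.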
 
Higher latency but bigger bandwidth. Small enough latency for general computing and boosted bandwidth so those 40 compute units don't go to waste.
The GPU is going to love it for sure. The CPU, however, not so much (but it's far from awful, don't get me wrong).
Doesn't that also imply a 64GB/s bandwidth limit between each CCD and the IOD, if they keep the same IF architecture there? That adds up to much less than half the theoretical bandwidth of a 256-bit LPDDR5 interface, although that is probably still a long way above current AM5 offerings, provided there are no bottlenecks elsewhere.

Either way, iGPU/NPU offload is definitely going to be needed for the workload it is expected to do.

Mobile chips starting to outstrip non-HEDT desktop CPU performance. What a world we live in.
Yeah, that's likely going to be the case. But I believe that high bandwidth is meant to keep the GPU/NPU fed (as our colleague said above), and not the CPU itself, so it still makes sense if that's indeed the case.
 
LPDDR5 memory actually has higher latencies compared to your regular DDR5 sticks.

That CPU is also chiplet-based, so you have your regular desktop CCDs communicating with the IO Die in a similar fashion to Ryzen 9000.
Yes it is, but all the components and memory are much closer than on a normal system.
 
Yes it is, but all the components and memory are much closer than on a normal system.
I don't think Strix Halo will have on-package RAM (like Apple chips or Lunar Lake), just soldered RAM, so it's not that close.

Anyhow, even though it's closer, latencies are still bad. Just take a look at Apple chips: even with the memory on the package, their latencies are still in the 100s of nanoseconds (worse than even desktop Ryzen).
 
Yes it is, but all the components and memory are much closer than on a normal system.
Between the CPU and GPU, yes, since the memory is unified; and between the GPU and RAM too, since GDDR6/X has greater bandwidth but even worse latencies.
 