Monday, March 17th 2025

AMD's Ryzen AI MAX+ 395 Delivers up to 12x AI LLM Performance Compared to Intel's "Lunar Lake"
AMD's latest flagship APU, the Ryzen AI MAX+ 395 "Strix Halo," demonstrates impressive performance advantages over Intel's "Lunar Lake" processors in large language model (LLM) inference workloads, according to recent benchmarks on AMD's blog. Featuring 16 Zen 5 CPU cores, 40 RDNA 3.5 compute units, and over 50 AI TOPS via its XDNA 2 NPU, the processor achieves up to 12.2x faster response times than Intel's Core Ultra 258V in specific LLM scenarios. Notably, Intel's Lunar Lake has four E-cores and four P-cores, for a total of eight cores: half the CPU core count of the Ryzen AI MAX+ 395, yet the performance difference is far more pronounced than the 2x core gap would suggest. The performance delta grows with model complexity, particularly with 14-billion-parameter models approaching the limit of what standard 32 GB laptops can handle.
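The 32 GB ceiling mentioned above comes down to simple arithmetic: a model's weight footprint is roughly its parameter count times bits per weight. A minimal sketch of that estimate (ignoring KV cache and runtime overhead, which add more on top):

```python
def weights_gib(params_billion, bits_per_weight):
    """Approximate weight memory in GiB for a quantized LLM.

    params_billion: parameter count in billions (e.g. 14 for a 14B model)
    bits_per_weight: quantization width (16 = FP16, 4 = 4-bit quant)
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30

# A 14B model at FP16 needs ~26 GiB for weights alone, crowding a
# 32 GB laptop; 4-bit quantization cuts that to ~6.5 GiB.
print(round(weights_gib(14, 16), 1))  # ~26.1
print(round(weights_gib(14, 4), 1))   # ~6.5
```

This is only a weights estimate; real deployments also budget memory for the KV cache (which scales with context length) and the inference runtime itself.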
In LM Studio benchmarks using an ASUS ROG Flow Z13 with 64 GB unified memory, the integrated Radeon 8060S GPU delivered 2.2x higher token throughput than Intel's Arc 140V across various model architectures. Time-to-first-token metrics revealed a 4x advantage in smaller models like Llama 3.2 3B Instruct, expanding to 9.1x with 7-8B parameter models such as DeepSeek R1 Distill variants. AMD's architecture particularly excels in multimodal vision tasks, where the Ryzen AI MAX+ 395 processed complex visual inputs up to 7x faster in IBM Granite Vision 3.2 3B and 6x faster in Google Gemma 3 12B compared to Intel's offering. The platform's support for AMD Variable Graphics Memory allows allocating up to 96 GB as VRAM from systems equipped with 128 GB unified memory, enabling the deployment of state-of-the-art models like Google Gemma 3 27B Vision. The processor's performance advantages extend to practical AI applications, including medical image analysis and coding assistance via higher-precision 6-bit quantization in the DeepSeek R1 Distill Qwen 32B model.
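Time-to-first-token and token throughput, the two metrics quoted above, can be measured with a simple timing harness around any streaming generation API. A hedged sketch (the token iterator is a stand-in for whatever streaming interface your runtime exposes):

```python
import time

def measure_stream(token_stream):
    """Return (time-to-first-token in seconds, decode tokens/sec)
    for any iterator that yields generated tokens."""
    start = time.perf_counter()
    ttft = None
    count = 0
    for _tok in token_stream:
        count += 1
        if ttft is None:
            # First token latency is dominated by prompt prefill.
            ttft = time.perf_counter() - start
    total = time.perf_counter() - start
    # Decode throughput excludes the prefill-bound first token.
    tps = (count - 1) / (total - ttft) if count > 1 and total > ttft else 0.0
    return ttft, tps
```

Running the same prompt and model through this harness on two machines and taking the ratio of the results is essentially how the 2.2x throughput and 4x-9.1x TTFT figures in vendor comparisons like this one are derived.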
Source:
AMD Blog
23 Comments on AMD's Ryzen AI MAX+ 395 Delivers up to 12x AI LLM Performance Compared to Intel's "Lunar Lake"
Why, you may ask yourself...
Cause da AI already has all your personal & financial info, so just go for it !
If not, those will be in really different segments. Strix Halo will be present in laptops that closely resemble gaming laptops, way chonkier and with way higher power consumption.
Lunar Lake at most competes with Strix Point (at least there are notebooks in the same form factor with both options), and more realistically with Krackan Point given the low-power focus. I believe this was just a rumor. LNL is a one-off thing; it won't see any other improvements or a next iteration.
On a serious note, it would be really nice to cut the dumb NPU "cores" and add more CPU cores/GPU CUs on both Strix Halo and Strix Point.
However, an M4 Max is also a tad more expensive than a Strix Halo config. Strix Halo is closer to an M4 Pro in performance when it comes to LLM stuff.
This is a chip I want in my next laptop, but none are available to buy. Not even the mini-PC options are available yet! That NPU can do specific tasks a lot more efficiently than tapping the GPU compute units; it's just that at the moment its main task is running the stupid Microsoft Copilot. If developers start tapping the NPU to do more useful stuff, it's just another processing unit. If it stays reserved for Microsoft, then yeah, waste of sand.
AMD is also comparing their highest-end full-power laptop chip to Intel's best thin-and-light chip. A more honest comparison would have been with the 285HX. I'm sure the Strix Halo would still win since it has the superior memory and GPU, but I bet it would be closer.
Do you wish that only cell phones had NPUs and not PCs?