T0@st
News Editor
AMD has launched a new open-source project called, GAIA (pronounced /ˈɡaɪ.ə/), an awesome application that leverages the power of Ryzen AI Neural Processing Unit (NPU) to run private and local large language models (LLMs). In this blog, we'll dive into the features and benefits of GAIA, while introducing how you can take advantage of GAIA's open-source project to adopt into your own applications.
Introduction to GAIA
GAIA is a generative AI application designed to run local, private LLMs on Windows PCs, optimized for AMD Ryzen AI hardware (AMD Ryzen AI 300 Series processors). This integration allows for faster, more efficient processing at lower power, while keeping your data local and secure. On Ryzen AI PCs, GAIA runs models seamlessly on the NPU and iGPU using the open-source Lemonade (LLM-Aid) SDK from ONNX TurnkeyML for LLM inference. GAIA supports a variety of local LLMs optimized for Ryzen AI PCs; popular models such as Llama and Phi derivatives can be tailored for different use cases, including Q&A, summarization, and complex reasoning tasks.


Getting Started with GAIA
You can get started with GAIA in under 10 minutes: follow the instructions to download and install it on your Ryzen AI PC. Once installed, you can launch GAIA and begin exploring its various agents and capabilities. There are two versions of GAIA:
- 1) GAIA Installer - this will run on any Windows PC; however, performance may be slower.
- 2) GAIA Hybrid Installer - this package is optimized to run on Ryzen AI PCs and uses the NPU and iGPU for better performance.
The Agent RAG Pipeline
One of the standout features of GAIA is its agent Retrieval-Augmented Generation (RAG) pipeline. This pipeline combines an LLM with a knowledge base, enabling the agent to retrieve relevant information, reason, plan, and use external tools within an interactive chat environment. This results in more accurate and contextually aware responses.
The current GAIA agents enable the following capabilities:
- Simple Prompt Completion: no agent; direct model interaction for testing and evaluation.
- Chaty: an LLM chatbot with history that engages in conversation with the user.
- Clip: an agentic RAG agent for YouTube search and Q&A.
- Joker: a simple joke generator using RAG to bring humor to the user.
Additional agents are currently in development, and developers are encouraged to create and contribute their own agent to GAIA.
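As a rough mental model of how an application might route queries to agents like those above, here is a minimal dispatch sketch. This is purely illustrative and not GAIA's actual implementation; the agent names come from the article, but the handler logic and function names are hypothetical.

```python
# Hypothetical agent registry -- not GAIA's actual code. Agent names are from
# the article; everything else is illustrative placeholder logic.

def no_agent(query):
    # Simple Prompt Completion: pass the prompt straight to the model.
    return f"[model completion for: {query}]"

def chaty(query, history=None):
    # Chatbot with history: prior turns would be prepended to the prompt.
    history = history or []
    return f"[chat reply after {len(history)} turns to: {query}]"

def clip(query):
    # Agentic RAG over YouTube content (retrieval step omitted here).
    return f"[YouTube Q&A answer for: {query}]"

def joker(query):
    # RAG-backed joke generator.
    return f"[joke about: {query}]"

AGENTS = {"none": no_agent, "Chaty": chaty, "Clip": clip, "Joker": joker}

def dispatch(agent, query):
    # Unknown agent names fall back to direct prompt completion.
    handler = AGENTS.get(agent, no_agent)
    return handler(query)
```

A contributed agent would then only need a handler function and a registry entry.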
How Does GAIA Work?
The left side of Figure 2 (GAIA Overview Diagram) illustrates the functionality of the Lemonade SDK from TurnkeyML. The Lemonade SDK provides tools for LLM-specific tasks such as prompting, accuracy measurement, and serving across multiple runtimes (e.g., Hugging Face, ONNX Runtime GenAI API) and hardware (CPU, iGPU, and NPU).

Lemonade exposes an LLM web service that communicates with the GAIA application (on the right) via an OpenAI-compatible REST API. GAIA consists of three key components:
- 1) LLM Connector - Bridges the NPU service's Web API with the LlamaIndex-based RAG pipeline.
- 2) LlamaIndex RAG Pipeline - Includes a query engine and vector memory, which processes and stores relevant external information.
- 3) Agent Web Server - Connects to the GAIA UI via WebSocket, enabling user interaction.
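Because the Lemonade service speaks an OpenAI-compatible REST API, a client talks to it the same way it would talk to any OpenAI-style chat endpoint. The sketch below builds such a request with the standard library; the base URL, port, and model name are assumptions for illustration, not documented GAIA defaults, so check your local Lemonade configuration before using them.

```python
import json
import urllib.request

# Assumed local endpoint and model name -- adjust to your Lemonade setup.
BASE_URL = "http://localhost:8000/api/v0"

def build_chat_request(messages, model="llama-3.2-1b", base_url=BASE_URL):
    """Return (url, payload) for an OpenAI-style chat-completions call.

    Builds the request without sending it, so it can be inspected or tested
    offline.
    """
    url = f"{base_url}/chat/completions"
    payload = {"model": model, "messages": messages, "stream": False}
    return url, payload

def send(url, payload):
    # POST the request; requires a running OpenAI-compatible server.
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

url, payload = build_chat_request([{"role": "user", "content": "Hello"}])
```

Separating payload construction from transport keeps the client testable without a live server.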
On the right side of the figure, GAIA acts as an AI-powered agent that retrieves and processes data. It vectorizes external content (e.g., GitHub, YouTube, text files) and stores it in a local vector index. When a user submits a query, the following process occurs:
- 1) The query is sent to GAIA, where it is transformed into an embedding vector.
- 2) The vectorized query is used to retrieve relevant context from the indexed data.
- 3) The retrieved context is passed to the web service, where it is embedded into the LLM's prompt.
- 4) The LLM generates a response, which is streamed back through the GAIA web service and displayed in the UI.
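The four steps above can be sketched end to end with a toy retriever. The bag-of-words "embedding" and cosine similarity here are stand-ins for a real embedding model and vector index (GAIA uses a LlamaIndex-based pipeline in practice); only the shape of the flow matches the description.

```python
import math
from collections import Counter

# Toy stand-in for a real embedding model: bag-of-words term counts.
def embed(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Step 0: external content vectorized into a local index ahead of time.
DOCS = [
    "the npu accelerates low power llm inference",
    "gaia retrieves context from a local vector index",
]
INDEX = [(doc, embed(doc)) for doc in DOCS]

def build_prompt(query, top_k=1):
    q = embed(query)                                       # 1) embed the query
    ranked = sorted(INDEX, key=lambda d: cosine(q, d[1]), reverse=True)
    context = "\n".join(doc for doc, _ in ranked[:top_k])  # 2) retrieve context
    # 3) embed the retrieved context into the LLM's prompt; in GAIA, step 4)
    # sends this prompt to the LLM and streams the response back to the UI.
    return f"Context:\n{context}\n\nQuestion: {query}"
```

The key point is that the LLM never sees the raw query alone; it always receives the retrieved context alongside it.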
This process ensures that user queries are enhanced with relevant context before being processed by the LLM, improving response accuracy and relevance. The final answer is delivered to the user in real-time through the UI.
Benefits of Running LLMs Locally
Running LLMs locally on the NPU offers several benefits:
- Enhanced privacy, as no data needs to leave your machine. This eliminates the need to send sensitive information to the cloud, greatly enhancing data privacy and security while still delivering high-performance AI capabilities.
- Reduced latency, since there's no need to communicate with the cloud.
- Optimized performance with the NPU, leading to faster response times and lower power consumption.
Comparing NPU and iGPU
Running GAIA on the NPU results in improved performance for AI-specific tasks, as it is designed for inference workloads. Beginning with Ryzen AI Software Release 1.3, there is hybrid support for deploying quantized LLMs that utilize both the NPU and the iGPU. By using both components, each can be applied to the tasks and operations they are optimized for.
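To make the hybrid idea concrete, here is a purely illustrative partitioner that routes operation types to the device they are notionally better suited for. The op-to-device preferences below are assumptions for illustration; in practice the NPU/iGPU split is decided by Ryzen AI Software, not by user code like this.

```python
# Illustrative only: a toy op-to-device router. The preference table is an
# assumption; Ryzen AI Software performs the real NPU/iGPU assignment.
PREFERENCE = {
    "matmul": "NPU",        # assumed: compute-dense ops favor the NPU
    "attention": "NPU",
    "elementwise": "iGPU",  # assumed: lighter ops favor the iGPU
    "softmax": "iGPU",
}

def assign_devices(ops):
    """Map each op name to a device, defaulting to CPU for unknown ops."""
    return {op: PREFERENCE.get(op, "CPU") for op in ops}
```

The benefit of such a split is that each component only runs the workloads it is optimized for.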
Applications and Industries
This setup could benefit industries that require high performance and privacy, such as healthcare, finance, and enterprise applications where data privacy is critical. It can also be applied in fields like content creation and customer-service automation, where generative AI models are becoming essential. Finally, it helps in environments without reliable connectivity, since there is no need to send data to the cloud and wait for responses; all processing is done locally.
Conclusion
GAIA, an open-source AMD application, uses the power of the Ryzen AI NPU to deliver efficient, private, high-performance LLMs. By running LLMs locally, GAIA provides enhanced privacy, reduced latency, and optimized performance, making it well suited to industries that prioritize data security and rapid response times.
Ready to try GAIA yourself? Our video provides a brief overview and installation demo of GAIA.
Check out and contribute to the GAIA repo at github.com/amd/gaia. For feedback or questions, please reach out to us at GAIA@amd.com.
View at TechPowerUp Main Site | Source