
AMD Recommends EPYC Processors for Everyday AI Server Tasks

T0@st

News Editor
Joined
Mar 7, 2023
Messages
2,646 (3.59/day)
Location
South East, UK
System Name The TPU Typewriter
Processor AMD Ryzen 5 5600 (non-X)
Motherboard GIGABYTE B550M DS3H Micro ATX
Cooling DeepCool AS500
Memory Kingston Fury Renegade RGB 32 GB (2 x 16 GB) DDR4-3600 CL16
Video Card(s) PowerColor Radeon RX 7800 XT 16 GB Hellhound OC
Storage Samsung 980 Pro 1 TB M.2-2280 PCIe 4.0 X4 NVME SSD
Display(s) Lenovo Legion Y27q-20 27" QHD IPS monitor
Case GameMax Spark M-ATX (re-badged Jonsbo D30)
Audio Device(s) FiiO K7 Desktop DAC/Amp + Philips Fidelio X3 headphones, or ARTTI T10 Planar IEMs
Power Supply ADATA XPG CORE Reactor 650 W 80+ Gold ATX
Mouse Roccat Kone Pro Air
Keyboard Cooler Master MasterKeys Pro L
Software Windows 10 64-bit Home Edition
Ask a typical IT professional today whether they're leveraging AI, and there's a good chance they'll say yes - after all, they have reputations to protect! Kidding aside, many will report that their teams use web-based tools like ChatGPT, or even run internal chatbots that serve employees on the company intranet, but that beyond this, not much AI is really being implemented at the infrastructure level. As it turns out, the true answer is a bit different. AI tools and techniques have embedded themselves firmly into standard enterprise workloads and are a more common, everyday phenomenon than even many IT people realize. Assembly line operations now include computer vision-powered inspections. Supply chains use AI demand forecasting to move goods faster. And of course, AI note-taking and meeting summaries are embedded in virtually every variant of collaboration and meeting software.

Increasingly, critical enterprise software tools incorporate built-in recommendation systems, virtual agents, or some other form of AI-enabled assistance. AI is truly becoming a pervasive, complementary tool for everyday business. At the same time, today's enterprises are navigating a hybrid landscape where traditional, mission-critical workloads coexist with innovative AI-driven tasks. This "mixed enterprise and AI" workload environment calls for infrastructure that can handle both types of processing seamlessly. Robust, general-purpose CPUs like AMD EPYC processors are designed to be powerful, secure, and flexible to address this need. They handle everyday tasks (running databases, web servers, and ERP systems) and offer strong security features crucial for enterprise operations augmented with AI workloads. In essence, modern enterprise infrastructure is about creating a balanced ecosystem. AMD EPYC CPUs play a pivotal role in creating this balance, delivering high performance, efficiency, and security features that underpin both traditional enterprise workloads and advanced AI operations.




When CPU inference makes sense
Determining what workloads are good fits for CPU inference comes down to four potential use case characteristics:
  • High Memory Capacity: Increased memory capacity for larger models and more extensive state information to be maintained during inference.
  • Low Latency: Small and medium models with real time, sporadic or low concurrent inference requests
  • Batch/Offline Processing: Unbounded latency or where batch processing can be leveraged to handle high volume workloads
  • Cost and Energy Efficiency: Sensitivity to energy consumption and cost, both CAPEX and OPEX
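The four characteristics above can be reduced to a simple screening heuristic. The sketch below is purely illustrative: the field names and the 80 GB memory threshold are invented for the example, not AMD sizing guidance.

```python
# Hypothetical screening heuristic for CPU-inference fit.
# Field names and thresholds are illustrative assumptions only.

def cpu_inference_fit(workload: dict) -> bool:
    """Return True if a workload matches at least one of the four
    characteristics that favour CPU inference."""
    return any([
        workload.get("model_memory_gb", 0) > 80,          # high memory capacity
        workload.get("sporadic_low_concurrency", False),  # low-latency, sporadic requests
        workload.get("batch_offline", False),             # batch/offline, unbounded latency
        workload.get("cost_sensitive", False),            # CAPEX/OPEX sensitivity
    ])

print(cpu_inference_fit({"batch_offline": True}))   # True
print(cpu_inference_fit({"model_memory_gb": 16}))   # False
```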

These characteristics make the 5th Gen AMD EPYC processors a strategic choice for handling AI inference. This is no coincidence: with the highest core counts among x86 CPUs in the industry, these CPUs can support the parallelized architectures fundamental to AI models. Additionally, the proximity, speed, and total capacity of memory give AI models quick and easy access to the key-value cache, helping models run efficiently. It's also no surprise that AMD EPYC CPUs have won hundreds of performance and efficiency world records, demonstrating leadership across a wide array of general-purpose computing tasks.



Workloads that fit CPU inference
As we've seen, a workload's characteristics determine whether it is well suited to a CPU. The most common types of workloads repeatedly run on CPUs are classical machine learning, recommendation systems, natural language processing, generative AI (such as language models), and collaborative prompt-based pre-processing. Let's take a deeper look at each of these and why they are a good fit for inference on 5th Gen AMD EPYC processors.

Classical Machine Learning
Common examples of classical machine learning models are decision trees and linear regression. These algorithms typically have a more sequential structure than deep neural networks, relying on matrix operations and rule-based logic, and CPUs are well suited to efficiently handling scalar operations and branching logic. Additionally, classical ML algorithms work on structured datasets that fit in memory, where the CPU's low memory access latency and large memory capacity deliver strong performance.
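The branchy, scalar control flow that makes classical ML CPU-friendly can be seen in a toy decision tree. The tree, feature names, and thresholds below are entirely made up for illustration:

```python
# Minimal decision-tree inference in pure Python: rule-based branching
# rather than dense matrix math. Tree and thresholds are hypothetical.

def predict_churn(tenure_months: int, monthly_spend: float) -> str:
    """Walk a tiny hand-written decision tree to classify churn risk."""
    if tenure_months < 6:                      # new customers churn more
        return "high_risk" if monthly_spend < 20 else "medium_risk"
    if monthly_spend < 10:                     # long tenure, low engagement
        return "medium_risk"
    return "low_risk"

print(predict_churn(3, 15.0))   # high_risk
print(predict_churn(24, 50.0))  # low_risk
```

Each prediction is a handful of comparisons and branches, exactly the kind of work a CPU core retires in nanoseconds.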

Recommendation Systems
Consider how social media feeds and online shopping are curated with recommendations. These systems use diverse algorithms like collaborative filtering and content-based filtering and require processing a wide variety of data sets from item features, user demographics and interaction history. Supporting this wide variety requires flexibility, for which CPUs are ideal. Recommendation systems also require large, low latency memory access to optimally store entire datasets and embedding tables in memory for fast, frequent access, which CPUs are well suited for.
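The memory-resident hot path of such a system can be sketched in a few lines. The embedding table, item names, and three-dimensional vectors below are invented for illustration; real tables hold millions of rows, which is exactly why large in-memory capacity matters:

```python
# Toy recommender hot path: an embedding table held entirely in RAM,
# scored against a user vector with a dot product. All data is made up.

ITEM_EMBEDDINGS = {            # in production: millions of rows in memory
    "laptop":  [0.9, 0.1, 0.3],
    "monitor": [0.8, 0.2, 0.1],
    "coffee":  [0.1, 0.9, 0.7],
}

def recommend(user_vec, k=2):
    """Rank items by dot-product similarity to the user vector."""
    scored = sorted(
        ITEM_EMBEDDINGS,
        key=lambda item: sum(u * v for u, v in zip(user_vec, ITEM_EMBEDDINGS[item])),
        reverse=True,
    )
    return scored[:k]

print(recommend([1.0, 0.0, 0.2]))  # ['laptop', 'monitor']
```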

Natural Language Processing
Chatbots and text-to-speech or speech-to-text applications often run natural language processing models. These models are compact and intended for real-time conversational scenarios. Since human response time is measured in seconds, these applications do not, from a compute standpoint, require sub-millisecond responses, so they make great fits for CPU inference. Furthermore, on high-core-count 5th Gen AMD EPYC CPUs, multiple concurrent instances can fit on a single CPU, delivering compelling price-performance efficiency for these workloads.
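The multi-instance argument is back-of-the-envelope arithmetic. The core counts, the eight-cores-per-instance split, and the 500 ms conversational budget below are illustrative assumptions, not measured figures:

```python
# Rough sizing for conversational NLP serving on one CPU socket.
# Core counts and the latency budget are illustrative assumptions.

def instances_per_socket(total_cores: int, cores_per_instance: int) -> int:
    """How many model instances can be pinned to one socket."""
    return total_cores // cores_per_instance

def meets_conversational_budget(latency_ms: float, budget_ms: float = 500.0) -> bool:
    """Human conversation tolerates hundreds of milliseconds, not sub-ms."""
    return latency_ms <= budget_ms

# e.g. a 128-core part with 8 cores pinned per model instance
print(instances_per_socket(128, 8))        # 16
print(meets_conversational_budget(120.0))  # True
```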

Generative AI Including Language Models
Many enterprise applications that started as small chatbot deployments now use generative models to streamline and speed up content creation. The most common type of generative model is the language model. Small and medium language models run efficiently on CPUs. The high core count and memory capacity of the 5th Gen AMD EPYC processors can support real-time inference that is responsive enough for the most common use cases, like chatbots or search engines, and are ideal for batch/offline inference with relaxed response-time needs. AMD EPYC-optimized libraries can provide additional parallelism and options to run multiple instances, enhancing throughput.
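The batch/offline pattern mentioned above amounts to chunking a queue of prompts through the model at whatever pace the hardware sustains. In this sketch, `fake_generate` is a stand-in for a real model call; the batching loop is the only part that matters:

```python
# Batch/offline inference sketch: group queued prompts into chunks for a
# small CPU-hosted model. fake_generate stands in for a real model call.

def fake_generate(prompt: str) -> str:
    """Placeholder for model inference; a real system would call a model here."""
    return prompt.upper()

def batch_infer(prompts, batch_size=4):
    """Process prompts in fixed-size batches, collecting all outputs."""
    results = []
    for i in range(0, len(prompts), batch_size):
        results.extend(fake_generate(p) for p in prompts[i:i + batch_size])
    return results

print(batch_infer(["summarise q1", "draft email"]))
# ['SUMMARISE Q1', 'DRAFT EMAIL']
```

Because latency is unbounded, throughput (prompts per hour per socket) is the only metric that matters, which plays to the CPU's core count.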

Collaborative Prompt Based Pre-Processing
Collaborative models are a newer category of very small, efficient models that pre-process data or the user's prompt to streamline the inference work for a larger model downstream. These small models, used in retrieval-augmented generation (RAG) and speculative decoding solutions, are great fits for running on a CPU and often operate in a "mixed" scenario: inference runs on the host CPU, supporting the GPUs that run the large inference workloads.
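The CPU-side retrieval step of a RAG pipeline can be illustrated with a toy scorer. The corpus and the word-overlap ranking below are purely illustrative; production systems use learned embedding models, but the shape of the work (cheap scoring of many candidates before the big model runs) is the same:

```python
# Toy RAG retrieval step: a lightweight CPU-side scorer picks the most
# relevant passages before the large downstream model sees them.
# Corpus and word-overlap scoring are purely illustrative.

CORPUS = [
    "EPYC processors target server workloads",
    "Condensation forms on cold surfaces",
    "CPU inference suits small language models",
]

def retrieve(query: str, k: int = 1):
    """Rank passages by how many words they share with the query."""
    q = set(query.lower().split())
    return sorted(
        CORPUS,
        key=lambda doc: len(q & set(doc.lower().split())),
        reverse=True,
    )[:k]

print(retrieve("cpu inference for language models"))
# ['CPU inference suits small language models']
```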

These workloads span a wide variety and are each used in multiple applications across industry segments. The set of end applications where they fit is nearly endless, making the applications for CPU-based inference nearly endless as well. Whether it's streamlining supply chains with demand forecasting powered by time-series and classical machine learning models, reducing carbon footprints with predictive models like XGBoost to forecast emissions, or improving customer experience with in-store deal and coupon delivery, CPUs power everyday AI inference. While each of these workload types can comfortably live on a CPU, in each example the high core count, the high-memory-capacity architecture built to balance serialized and parallelized workloads, and the flexibility to support multiple workloads and data types make 5th Gen AMD EPYC processors the ideal choice for CPU inference.



Speaking of flexibility, once you do start using accelerators, high-frequency 5th Gen AMD EPYC processors are also the best host processor. Compared with the Intel Xeon 8592+, AMD EPYC 9575F processors boast a 28% higher max core frequency (5.0 GHz vs 3.9 GHz), up to 50% more memory bandwidth from additional memory channels (12 vs 8), and 1.6x the high-speed PCIe Gen 5 lanes (128 vs 80 in single-socket configurations) for data movement.

To top this off, AMD offers a full portfolio of products to choose from, including the AMD Instinct GPU, for the ideal mix of compute engines. At the same time, a growing number of AMD EPYC CPU-based servers are certified to run NVIDIA GPUs, giving you the choice to run the infrastructure you want.

Solutions for the Evolving Spectrum of AI
AMD EPYC processors can give you the headroom to grow and evolve. Not only do they help consolidate legacy servers in your data center to free up space and power, they also offer flexibility to meet your AI workload needs, regardless of size and scale. For smaller-scale AI deployments, 5th Gen AMD EPYC CPUs deliver exceptional price-performance efficiency, and for large-scale deployments, whether they require one GPU or hundreds of thousands, they help extract maximum throughput from your AI workload.

Progress doesn't stand still. The future is opaque, so whether models get smaller and more efficient, or larger and more capable (or both!), 5th Gen AMD EPYC CPUs offer the flexibility to adapt to the evolving AI landscape. To offer your customers the best products and services at the right price, you must be able to adapt. An AMD EPYC CPU-based server will be able to adapt with you. Get started running AI on AMD EPYC with our out-of-the-box support for PyTorch models, and see how we can help optimize your performance with ZenDNN.

View at TechPowerUp Main Site | Source
 
Joined
Nov 27, 2023
Messages
3,058 (6.48/day)
System Name The Workhorse
Processor AMD Ryzen R9 5900X
Motherboard Gigabyte Aorus B550 Pro
Cooling CPU - Noctua NH-D15S Case - 3 Noctua NF-A14 PWM at the bottom, 2 Fractal Design 180mm at the front
Memory GSkill Trident Z 3200CL14
Video Card(s) NVidia GTX 1070 MSI QuickSilver
Storage Adata SX8200Pro
Display(s) LG 32GK850G
Case Fractal Design Torrent (Solid)
Audio Device(s) FiiO E-10K DAC/Amp, Samson Meteorite USB Microphone
Power Supply Corsair RMx850 (2018)
Mouse Razer Viper (Original) on a X-Raypad Equate Plus V2
Keyboard Cooler Master QuickFire Rapid TKL keyboard (Cherry MX Black)
Software Windows 11 Pro (24H2)
I suppose we are at the point where companies recommend their own products via press releases they send out to tech outlets to post as “news” and this is just… a thing we do now. Mm.
 
Joined
Jun 8, 2017
Messages
37 (0.01/day)
I suppose we are at the point where companies recommend their own products via press releases they send out to tech outlets to post as “news” and this is just… a thing we do now. Mm.

?
A press release has always been news, so why wouldn't TPU cover it?
 
Joined
Jan 14, 2023
Messages
893 (1.13/day)
System Name Lenovo slim 5 16'
Processor AMD 8845hs
Motherboard Lenovo motherboard
Cooling 2 fans
Memory 64gb 5600mhz cl40
Video Card(s) 4070 laptop
Storage 16tb, x2 8tb SSD
Display(s) 16in 16:10 (1920x1200) 144hz
Power Supply 270w psu
I've heard news that if couples are trying to have a kid, they should wear a condom, it will protect the couple from STDs.

*I have no idea where I'm going with this on a tech site.
 
Joined
Nov 14, 2021
Messages
151 (0.12/day)
I mean, yeah. You don't need amazing gear to do things that fit within your scope.


At work, I run custom AI image recognition using CPU-based TensorFlow. 21 photos every 15 minutes. It will use up the 32GB of RAM allotted to it while it processes the photos and only takes a minute to process all the images.

I run this on a VM contained within our Hypervisor stack and it is allotted 8 processors. Saves us a ton of money whether you look for a dedicated box, a dedicated AI box, or paying business prices for a GPU to add to the host when it isn't needed.
 
Joined
Aug 2, 2012
Messages
2,184 (0.47/day)
Location
Netherlands
System Name TheDeeGee's PC
Processor Intel Core i7-11700
Motherboard ASRock Z590 Steel Legend
Cooling Noctua NH-D15S
Memory Crucial Ballistix 3200/C16 32GB
Video Card(s) Nvidia RTX 4070 Ti 12GB
Storage Crucial P5 Plus 2TB / Crucial P3 Plus 2TB / Crucial P3 Plus 4TB
Display(s) EIZO CX240
Case Lian-Li O11 Dynamic Evo XL / Noctua NF-A12x25 fans
Audio Device(s) Creative Sound Blaster ZXR / AKG K601 Headphones
Power Supply Seasonic PRIME Fanless TX-700
Mouse Logitech G500S
Keyboard Keychron Q6
Software Windows 10 Pro 64-Bit
Benchmark Scores None, as long as my games runs smooth.
Which CPU do i need if i don't use AI?

:rolleyes:
 
Joined
Jan 14, 2019
Messages
15,033 (6.68/day)
Location
Midlands, UK
System Name My second and third PCs are Intel + Nvidia
Processor AMD Ryzen 7 7800X3D
Motherboard MSi Pro B650M-A Wifi
Cooling be quiet! Dark Rock 4
Memory 2x 24 GB Corsair Vengeance EXPO DDR5-6000 CL36
Video Card(s) PowerColor Reaper Radeon RX 9070 XT
Storage 2 TB Corsair MP600 GS, 4 TB Seagate Barracuda
Display(s) Dell S3422DWG 34" 1440 UW 144 Hz
Case Kolink Citadel Mesh
Audio Device(s) Logitech Z333 2.1 speakers, AKG Y50 headphones
Power Supply 750 W Seasonic Prime GX
Mouse Logitech MX Master 2S
Keyboard Logitech G413 SE
Software Bazzite (Fedora Linux) KDE Plasma
And I recommend myself for everyday coffee-drinking and pizza-eating tasks.
 