Intel Announces Deployment of Gaudi 3 Accelerators on IBM Cloud

btarunr · Aug 29, 2024

IBM and Intel announced a global collaboration to deploy Intel Gaudi 3 AI accelerators as a service on IBM Cloud. This offering, which is expected to be available in early 2025, aims to help more cost-effectively scale enterprise AI and drive innovation underpinned with security and resiliency. This collaboration will also enable support for Gaudi 3 within IBM's watsonx AI and data platform. IBM Cloud is the first cloud service provider (CSP) to adopt Gaudi 3, and the offering will be available for both hybrid and on-premise environments.

"Unlocking the full potential of AI requires an open and collaborative ecosystem that provides customers with choice and accessible solutions. By integrating Gaudi 3 AI accelerators and Xeon CPUs with IBM Cloud, we are creating new AI capabilities and meeting the demand for affordable, secure and innovative AI computing solutions," said Justin Hotard, Intel executive vice president and general manager of the Data Center and AI Group.

While generative AI has the potential to accelerate transformation, the required compute power needed emphasizes the importance of availability, performance, cost, energy efficiency and security as top priorities for enterprises. Through this collaboration, Intel and IBM aim to lower the total cost of ownership to leverage and scale AI, while enhancing performance. Gaudi 3, integrated with 5th Gen Xeon, supports enterprise AI workloads in the cloud and in data centers, providing customers with visibility and control over their software stack, simplifying workload and application management. IBM Cloud and Gaudi 3 aim to help customers more cost-effectively scale enterprise AI workloads, while prioritizing performance, security and resiliency.

For generative AI inferencing workloads, IBM plans to enable support for Gaudi 3 within IBM's watsonx AI and data platform, providing watsonx clients with additional AI infrastructure resources for scaling their AI workloads across hybrid cloud environments, helping to optimize model inferencing price/performance.

"IBM is committed to helping our clients drive AI and hybrid cloud innovation by offering solutions to meet their business needs. Our dedication to security and resiliency with IBM Cloud has helped fuel IBM's hybrid cloud and AI strategy for our enterprise clients," said Alan Peacock, general manager of IBM Cloud. "Leveraging Intel's Gaudi 3 accelerators on IBM Cloud will provide our clients access to a flexible enterprise AI solution that aims to optimize cost performance. We are unlocking potential new AI business opportunities, designed for clients to more cost-effectively test, innovate and deploy AI inferencing solutions."

IBM and Intel are collaborating to provide a Gaudi 3 service capability to support clients leveraging AI. To help clients across industries, including those that are heavily regulated, IBM and Intel intend to leverage IBM Cloud's security and compliance capabilities.

Scalability and Flexibility: IBM Cloud and Intel offer scalable and flexible solutions that allow clients to adjust computing resources as needed, which has the potential to lead to cost savings and operational efficiency.
Enhanced Performance and Security: Integrating Gaudi 3 into IBM Cloud Virtual Servers for VPC will help enable x86-based enterprises to run applications faster and more securely than before the integration, enhancing user experiences.

Intel and IBM have a long-standing collaboration, from the development of the IBM PC to the creation of enterprise AI solutions with Gaudi 3. IBM Cloud with Gaudi 3 offerings will be generally available at the beginning of 2025. Stay tuned for more updates from Intel and IBM in the coming months. To learn more, visit the Intel Gaudi 3 AI accelerators website.

View at TechPowerUp Main Site

Daven · Aug 29, 2024

I've been confused about this for a while. Is the Intel data center GPU line dead replaced by the Gaudi product line?

thesmokingman · Aug 29, 2024

This is like bottom tier here...

ncrs · Aug 29, 2024

Daven said:
I've been confused about this for a while. Is the Intel data center GPU line dead replaced by the Gaudi product line?

Intel currently has two data center GPU lines: GPU Max and GPU Flex.
The former was based on the Xe-HPC variant of Intel's graphics IP. The final product code-named Ponte Vecchio was an ambitious 47 tile + EMIB + Foveros design that did so well its next generation was canceled.
GPU Flex is similar to mainstream Arc GPUs, but with added enterprise features and support.
Gaudi is a line of AI accelerators from Habana Labs which Intel bought. At the moment it has indeed replaced the GPU Max series at least for the AI/ML market.
Intel plans to combine Gaudi and Arc IPs into a new design that will also integrate x86 CPU tiles, similar to AMD Instinct MI300A. Its code-name is Falcon Shores and targeted release around 2025.

Daven · Aug 29, 2024

ncrs said:
Intel currently has two data center GPU lines: GPU Max and GPU Flex.
The former was based on the Xe-HPC variant of Intel's graphics IP. The final product code-named Ponte Vecchio was an ambitious 47 tile + EMIB + Foveros design that did so well its next generation was canceled.
GPU Flex is similar to mainstream Arc GPUs, but with added enterprise features and support.
Gaudi is a line of AI accelerators from Habana Labs which Intel bought. At the moment it has indeed replaced the GPU Max series at least for the AI/ML market.
Intel plans to combine Gaudi and Arc IPs into a new design that will also integrate x86 CPU tiles, similar to AMD Instinct MI300A. Its code-name is Falcon Shores and targeted release around 2025.

I heard Falcon Shores is no longer an XPU and now just a GPU.

Intel abandons XPU plan to cram CPU, GPU, memory into one package

AMD now has clear runway to conquer datacenter APU market

www.theregister.com

Wirko · Aug 29, 2024

IBM can design and make ultra-crazy tech like the Z mainframe with their own main processors as well as their own AI processors inside, but they still can't make a competitive AI accelerator for the cloud?

ncrs · Aug 30, 2024

Daven said:
I heard Falcon Shores is no longer an XPU and now just a GPU.

Intel abandons XPU plan to cram CPU, GPU, memory into one package

AMD now has clear runway to conquer datacenter APU market

www.theregister.com

I must've missed that when it got published, thanks.
That's unfortunate and gives clear advantage to AMD and NVIDIA due to tight integration of their respective solutions.

Wirko said:
IBM can design and make ultra-crazy tech like the Z mainframe with their own main processors as well as their own AI processors inside, but they still can't make a competitive AI accelerator for the cloud?

I guess it's a financial matter for them. IBM's cloud is not really a market leader - depending on how you measure their cloud's market share is around 2% (for example by Statista). They simply might not see the benefit of having to design a dedicated AI accelerator, especially if targeting a high end manufacturing process which brings the cost to tens or hundreds of millions. If you look at Telum II's AI accelerator size (as reported on STH) it almost looks like it was an afterthought that got bolted on

Perhaps they just licensed some external IP block for it. This year's HotChips was full of new dedicated AI chips, I'm surprised TechPowerUp didn't report on any of it.
This collaboration makes more sense since Intel is desperately trying to increase their AI accelerator market share, and might be offering hefty rebates.
Intel has been offering really attractive pricing for Gaudi hardware in the past. If your use-case can benefit for it, then buying Gaudi is a steal compared to NVIDIA (which is usually bogged down by wait times anyway).

Wirko · Aug 30, 2024

ncrs said:
I guess it's a financial matter for them. IBM's cloud is not really a market leader - depending on how you measure their cloud's market share is around 2% (for example by Statista). They simply might not see the benefit of having to design a dedicated AI accelerator, especially if targeting a high end manufacturing process which brings the cost to tens or hundreds of millions. If you look at Telum II's AI accelerator size (as reported on STH) it almost looks like it was an afterthought that got bolted on Perhaps they just licensed some external IP block for it. This year's HotChips was full of new dedicated AI chips, I'm surprised TechPowerUp didn't report on any of it.

They have something more powerful to show, the Spyre accelerator chip. Still not among the top AI chips as it uses LPDDR5 memory and fits on a PCIe card with 75W slot power.

ncrs · Aug 30, 2024

Wirko said:
They have something more powerful to show, the Spyre accelerator chip. Still not among the top AI chips as it uses LPDDR5 memory and fits on a PCIe card with 75W slot power.

Spyre is 32x of what they put into Telum. That's for inference, so not really usable for training, it's not comparable to Gaudi which targets training. As far as I know IBM has nothing specifically for training, and IBM's own announcement of Spyre admits it:

Teams are working to move beyond inferencing, to find effective and robust ways to do fine-tuning and even potentially training models, on mainframes.

As a rule of thumb if you see "TOPS" as a metric it's for inference, as in running existing AI models. When you see TFLOPS it's for training, as in modifying or creating AI models. Sometimes it gets more complicated by the fact that for example NVIDIA's tensor cores can run in INT8 mode and they specify the performance also as TOPS, but FP8 metrics are back to FLOPS.

System Name	RBMK-1000
Processor	AMD Ryzen 7 5700G
Motherboard	Gigabyte B550 AORUS Elite V2
Cooling	DeepCool Gammax L240 V2
Memory	2x 16GB DDR4-3200
Video Card(s)	Galax RTX 4070 Ti EX
Storage	Samsung 990 1TB
Display(s)	BenQ 1440p 60 Hz 27-inch
Case	Corsair Carbide 100R
Audio Device(s)	ASUS SupremeFX S1220A
Power Supply	Cooler Master MWE Gold 650W
Mouse	ASUS ROG Strix Impact
Keyboard	Gamdias Hermes E2
Software	Windows 11 Pro

Processor	AMD 5900x
Motherboard	Asus x570 Strix-E
Cooling	Hardware Labs
Memory	G.Skill 4000c17 2x16gb
Video Card(s)	RTX 3090
Storage	Sabrent
Display(s)	Samsung G9
Case	Phanteks 719
Audio Device(s)	Fiio K5 Pro
Power Supply	EVGA 1000 P2
Mouse	Logitech G600
Keyboard	Corsair K95

Processor	i5-6600K
Motherboard	Asus Z170A
Cooling	some cheap Cooler Master Hyper 103 or similar
Memory	16GB DDR4-2400
Video Card(s)	IGP
Storage	Samsung 850 EVO 250GB
Display(s)	2x Oldell 24" 1920x1200
Case	Bitfenix Nova white windowless non-mesh
Audio Device(s)	E-mu 1212m PCI
Power Supply	Seasonic G-360
Mouse	Logitech Marble trackball, never had a mouse
Keyboard	Key Tronic KT2000, no Win key because 1994
Software	Oldwin

Processor	i5-6600K
Motherboard	Asus Z170A
Cooling	some cheap Cooler Master Hyper 103 or similar
Memory	16GB DDR4-2400
Video Card(s)	IGP
Storage	Samsung 850 EVO 250GB
Display(s)	2x Oldell 24" 1920x1200
Case	Bitfenix Nova white windowless non-mesh
Audio Device(s)	E-mu 1212m PCI
Power Supply	Seasonic G-360
Mouse	Logitech Marble trackball, never had a mouse
Keyboard	Key Tronic KT2000, no Win key because 1994
Software	Oldwin

Intel Announces Deployment of Gaudi 3 Accelerators on IBM Cloud

Editor & Senior Moderator