Thursday, August 29th 2024

Intel Announces Deployment of Gaudi 3 Accelerators on IBM Cloud

IBM and Intel announced a global collaboration to deploy Intel Gaudi 3 AI accelerators as a service on IBM Cloud. This offering, which is expected to be available in early 2025, aims to help more cost-effectively scale enterprise AI and drive innovation underpinned with security and resiliency. This collaboration will also enable support for Gaudi 3 within IBM's watsonx AI and data platform. IBM Cloud is the first cloud service provider (CSP) to adopt Gaudi 3, and the offering will be available for both hybrid and on-premises environments.

"Unlocking the full potential of AI requires an open and collaborative ecosystem that provides customers with choice and accessible solutions. By integrating Gaudi 3 AI accelerators and Xeon CPUs with IBM Cloud, we are creating new AI capabilities and meeting the demand for affordable, secure and innovative AI computing solutions," said Justin Hotard, Intel executive vice president and general manager of the Data Center and AI Group.

While generative AI has the potential to accelerate transformation, the compute power it requires makes availability, performance, cost, energy efficiency and security top priorities for enterprises. Through this collaboration, Intel and IBM aim to lower the total cost of ownership of leveraging and scaling AI, while enhancing performance. Gaudi 3, integrated with 5th Gen Xeon, supports enterprise AI workloads in the cloud and in data centers, providing customers with visibility and control over their software stack, simplifying workload and application management. IBM Cloud and Gaudi 3 aim to help customers more cost-effectively scale enterprise AI workloads, while prioritizing performance, security and resiliency.

For generative AI inferencing workloads, IBM plans to enable support for Gaudi 3 within IBM's watsonx AI and data platform, providing watsonx clients with additional AI infrastructure resources for scaling their AI workloads across hybrid cloud environments, helping to optimize model inferencing price/performance.

"IBM is committed to helping our clients drive AI and hybrid cloud innovation by offering solutions to meet their business needs. Our dedication to security and resiliency with IBM Cloud has helped fuel IBM's hybrid cloud and AI strategy for our enterprise clients," said Alan Peacock, general manager of IBM Cloud. "Leveraging Intel's Gaudi 3 accelerators on IBM Cloud will provide our clients access to a flexible enterprise AI solution that aims to optimize cost performance. We are unlocking potential new AI business opportunities, designed for clients to more cost-effectively test, innovate and deploy AI inferencing solutions."

IBM and Intel are collaborating to provide a Gaudi 3 service capability to support clients leveraging AI. To help clients across industries, including those that are heavily regulated, IBM and Intel intend to leverage IBM Cloud's security and compliance capabilities.
  • Scalability and Flexibility: IBM Cloud and Intel offer scalable and flexible solutions that allow clients to adjust computing resources as needed, which has the potential to lead to cost savings and operational efficiency.
  • Enhanced Performance and Security: Integrating Gaudi 3 into IBM Cloud Virtual Servers for VPC will help enable x86-based enterprises to run applications faster and more securely than before the integration, enhancing user experiences.

Intel and IBM have a long-standing collaboration, from the development of the IBM PC to the creation of enterprise AI solutions with Gaudi 3. IBM Cloud with Gaudi 3 offerings will be generally available at the beginning of 2025. Stay tuned for more updates from Intel and IBM in the coming months. To learn more, visit the Intel Gaudi 3 AI accelerators website.

8 Comments on Intel Announces Deployment of Gaudi 3 Accelerators on IBM Cloud

#1
Daven
I've been confused about this for a while. Is the Intel data center GPU line dead, replaced by the Gaudi product line?
#3
ncrs
Daven wrote: "I've been confused about this for a while. Is the Intel data center GPU line dead, replaced by the Gaudi product line?"
Intel currently has two data center GPU lines: GPU Max and GPU Flex.
The former was based on the Xe-HPC variant of Intel's graphics IP. The final product, code-named Ponte Vecchio, was an ambitious 47-tile + EMIB + Foveros design that did so well its next generation was canceled.
GPU Flex is similar to mainstream Arc GPUs, but with added enterprise features and support.
Gaudi is a line of AI accelerators from Habana Labs, which Intel bought. At the moment it has indeed replaced the GPU Max series, at least for the AI/ML market.
Intel plans to combine Gaudi and Arc IPs into a new design that will also integrate x86 CPU tiles, similar to AMD Instinct MI300A. Its code name is Falcon Shores, with a release targeted around 2025.
#4
Daven
ncrs wrote: "Intel currently has two data center GPU lines: GPU Max and GPU Flex. The former was based on the Xe-HPC variant of Intel's graphics IP. The final product, code-named Ponte Vecchio, was an ambitious 47-tile + EMIB + Foveros design that did so well its next generation was canceled. GPU Flex is similar to mainstream Arc GPUs, but with added enterprise features and support. Gaudi is a line of AI accelerators from Habana Labs, which Intel bought. At the moment it has indeed replaced the GPU Max series, at least for the AI/ML market. Intel plans to combine Gaudi and Arc IPs into a new design that will also integrate x86 CPU tiles, similar to AMD Instinct MI300A. Its code name is Falcon Shores, with a release targeted around 2025."
I heard Falcon Shores is no longer an XPU and now just a GPU.

www.theregister.com/AMP/2023/05/22/intel_abandons_xpu/
#5
Wirko
IBM can design and make ultra-crazy tech like the Z mainframe with their own main processors as well as their own AI processors inside, but they still can't make a competitive AI accelerator for the cloud?
#6
ncrs
Daven wrote: "I heard Falcon Shores is no longer an XPU and now just a GPU." www.theregister.com/AMP/2023/05/22/intel_abandons_xpu/
I must've missed that when it got published, thanks.
That's unfortunate and gives a clear advantage to AMD and NVIDIA due to the tight integration of their respective solutions.
Wirko wrote: "IBM can design and make ultra-crazy tech like the Z mainframe with their own main processors as well as their own AI processors inside, but they still can't make a competitive AI accelerator for the cloud?"
I guess it's a financial matter for them. IBM's cloud is not really a market leader: depending on how you measure it, their cloud's market share is around 2% (for example, according to Statista). They simply might not see the benefit of designing a dedicated AI accelerator, especially if targeting a high-end manufacturing process, which brings the cost to tens or hundreds of millions. If you look at Telum II's AI accelerator size (as reported on STH) it almost looks like it was an afterthought that got bolted on ;) Perhaps they just licensed some external IP block for it. This year's Hot Chips was full of new dedicated AI chips; I'm surprised TechPowerUp didn't report on any of it.
This collaboration makes more sense since Intel is desperately trying to increase their AI accelerator market share, and might be offering hefty rebates.
Intel has offered really attractive pricing for Gaudi hardware in the past. If your use case can benefit from it, then buying Gaudi is a steal compared to NVIDIA (which is usually bogged down by wait times anyway).
#7
Wirko
ncrs wrote: "I guess it's a financial matter for them. IBM's cloud is not really a market leader: depending on how you measure it, their cloud's market share is around 2% (for example, according to Statista). They simply might not see the benefit of designing a dedicated AI accelerator, especially if targeting a high-end manufacturing process, which brings the cost to tens or hundreds of millions. If you look at Telum II's AI accelerator size (as reported on STH) it almost looks like it was an afterthought that got bolted on ;) Perhaps they just licensed some external IP block for it. This year's Hot Chips was full of new dedicated AI chips; I'm surprised TechPowerUp didn't report on any of it."
They have something more powerful to show, the Spyre accelerator chip. Still not among the top AI chips as it uses LPDDR5 memory and fits on a PCIe card with 75W slot power.
#8
ncrs
Wirko wrote: "They have something more powerful to show, the Spyre accelerator chip. Still not among the top AI chips as it uses LPDDR5 memory and fits on a PCIe card with 75W slot power."
Spyre is 32x what they put into Telum. It's built for inference, so it's not really usable for training and not comparable to Gaudi, which targets training. As far as I know, IBM has nothing specifically for training, and IBM's own announcement of Spyre admits it:
"Teams are working to move beyond inferencing, to find effective and robust ways to do fine-tuning and even potentially training models, on mainframes."
As a rule of thumb, if you see "TOPS" as a metric, it's for inference, as in running existing AI models. When you see TFLOPS, it's for training, as in modifying or creating AI models. It gets more complicated because, for example, NVIDIA's tensor cores can run in INT8 mode, where performance is also specified in TOPS, while FP8 metrics are back in FLOPS.