News Posts matching #Gaudi2


Intel Gaudi 2 Remains Only Benchmarked Alternative to NV H100 for Generative AI Performance

Today, MLCommons published results of the industry-standard MLPerf v4.0 benchmark for inference. Intel's results for Intel Gaudi 2 accelerators and 5th Gen Intel Xeon Scalable processors with Intel Advanced Matrix Extensions (Intel AMX) reinforce the company's commitment to bring "AI Everywhere" with a broad portfolio of competitive solutions. The Intel Gaudi 2 AI accelerator remains the only benchmarked alternative to Nvidia H100 for generative AI (GenAI) performance and provides strong performance-per-dollar. Further, Intel remains the only server CPU vendor to submit MLPerf results. Intel's 5th Gen Xeon results improved by an average of 1.42x compared with 4th Gen Intel Xeon processors' results in MLPerf Inference v3.1.

"We continue to improve AI performance on industry-standard benchmarks across our portfolio of accelerators and CPUs. Today's results demonstrate that we are delivering AI solutions that deliver to our customers' dynamic and wide-ranging AI requirements. Both Intel Gaudi and Xeon products provide our customers with options that are ready to deploy and offer strong price-to-performance advantages," said Zane Ball, Intel corporate vice president and general manager, DCAI Product Management.

Intel Gaudi2 Accelerator Beats NVIDIA H100 at Stable Diffusion 3 by 55%

Stability AI, the developers behind the popular Stable Diffusion generative AI model, have run first-party performance benchmarks for Stable Diffusion 3 using popular data-center AI accelerators, including the NVIDIA H100 "Hopper" 80 GB, A100 "Ampere" 80 GB, and Intel's Gaudi2 96 GB accelerator. Unlike the H100, which is a super-scalar CUDA+Tensor core GPU, the Gaudi2 is purpose-built to accelerate generative AI and LLMs. Stability AI published its performance findings in a blog post, which reveals that the Intel Gaudi2 96 GB posts roughly 56% higher performance than the H100 80 GB.

With two nodes, 16 accelerators, and a constant batch size of 16 per accelerator (256 in all), the Intel Gaudi2 array generates 927 images per second, compared to 595 images per second for the H100 array and 381 images per second for the A100 array, with accelerator and node counts held constant. Scaling up to 32 nodes and 256 accelerators, still at a batch size of 16 per accelerator (total batch size of 4,096), the Gaudi2 array posts 12,654 images per second, or 49.4 images per second per device, compared to 3,992 images per second, or 15.6 images per second per device, for the older-gen A100 "Ampere" array.
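
For readers checking the math, the per-device figures follow directly from dividing aggregate throughput by the accelerator count. The short Python sketch below is purely illustrative and uses only the numbers quoted above from Stability AI's post.

```python
# Illustrative arithmetic only, using the figures quoted above from
# Stability AI's Stable Diffusion 3 benchmark post.

def per_device(images_per_second: float, accelerators: int) -> float:
    """Aggregate throughput divided by the number of accelerators."""
    return images_per_second / accelerators

ACCELERATORS = 256  # 32 nodes x 8 accelerators per node

gaudi2 = per_device(12_654, ACCELERATORS)  # ~49.4 images/s per device
a100 = per_device(3_992, ACCELERATORS)     # ~15.6 images/s per device

print(f"Gaudi2: {gaudi2:.1f} img/s per device")
print(f"A100:   {a100:.1f} img/s per device")
print(f"Ratio:  {gaudi2 / a100:.1f}x")     # ~3.2x at this scale
```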

Intel Gaudi 2 AI Accelerator Powers Through Llama 2 Text Generation

Intel's "AI Everywhere" hype campaign has generated the most noise in mainstream and enterprise segments. Team Blue's Gaudi—a family of deep learning accelerators—does not hit the headlines all that often. Their current generation model, Gaudi 2, is overshadowed by Team Green and Red alternatives—according to Intel's official marketing spiel: "it performs competitively on deep learning training and inference, with up to 2.4x faster performance than NVIDIA A100." Habana, an Intel subsidiary, has been working on optimizing Large Language Model (LLM) inference on Gaudi 1 and 2 for a while—their co-operation with Hugging Face has produced impressive results, as of late February. Siddhant Jagtap, an Intel Data Scientist, has demonstrated: "how easy it is to generate text with the Llama 2 family of models (7b, 13b and 70b) using Optimum Habana and a custom pipeline class."

Jagtap reckons that folks will be able to "run the models with just a few lines of code" on Gaudi 2 accelerators, and Intel's hardware accepts both single and multiple prompts. The custom pipeline class "has been designed to offer great flexibility and ease of use. Moreover, it provides a high level of abstraction and performs end-to-end text-generation which involves pre-processing and post-processing." His blog post outlines the prerequisites and methods for getting Llama 2 text generation up and running on Gaudi 2. Jagtap concludes that Habana/Intel has "presented a custom text-generation pipeline on Intel Gaudi 2 AI accelerator that accepts single or multiple prompts as input. This pipeline offers great flexibility in terms of model size as well as parameters affecting text-generation quality. Furthermore, it is also very easy to use and to plug into your scripts, and is compatible with LangChain." Hugging Face reckons that Gaudi 2 delivers roughly twice the throughput of an NVIDIA A100 80 GB in both training and inference scenarios. Intel has also teased third-generation Gaudi accelerators; industry watchers believe the next-gen solutions are designed to compete with Team Green's H100 AI GPUs.
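
The blog post describes the pipeline only at a high level, so the snippet below is a minimal sketch of the usage pattern it outlines rather than the actual example code: the GaudiTextGenerationPipeline class name, its import path, its constructor arguments, and the model checkpoint are assumptions modeled on the Optimum Habana text-generation examples and may differ from the published script.

```python
# Minimal sketch of the text-generation flow described above.
# ASSUMPTIONS: GaudiTextGenerationPipeline, its import path, and its
# constructor arguments are modeled on the optimum-habana text-generation
# example this article refers to; the real names/signatures may differ.
from argparse import Namespace

from pipeline import GaudiTextGenerationPipeline  # hypothetical module from the example folder

# Generation settings; Llama 2 7b, 13b, and 70b checkpoints are all said to work.
args = Namespace(
    model_name_or_path="meta-llama/Llama-2-7b-hf",
    max_new_tokens=128,
    temperature=0.5,
    do_sample=True,
    use_hpu_graphs=True,  # HPU graph mode reduces host-side launch overhead on Gaudi
    use_kv_cache=True,
)

# The pipeline wraps pre-processing, generation on the Gaudi 2 HPU,
# and post-processing end to end.
pipe = GaudiTextGenerationPipeline(args)

# Single or multiple prompts are accepted, per the blog post.
prompts = [
    "Explain what a deep learning accelerator does.",
    "Summarize the benefits of running LLM inference on dedicated hardware.",
]
for prompt in prompts:
    print(pipe(prompt))
```

Because the pipeline follows a familiar callable convention, the post notes it can also be plugged into existing scripts and LangChain workflows.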

Intel Preparing Habana "Gaudi2C" SKU for the Chinese AI Market

Intel's software team has added support in its open-source Linux drivers for an unannounced Habana "Gaudi2C" AI accelerator variant. Little is documented about the mysterious Gaudi2C, which appears to share its core identity with Intel's flagship Gaudi2 data center training and inference chip that is already broadly available. The new revision is distinguished only by a PCI ID of "3" in the latest patch set for Linux 6.8. Speculation circulates that Gaudi2C may be a version tailored to meet China-specific demands, similar to Intel's Gaudi2 HL-225B SKU launched in July with reduced interconnect links. With US export bans restricting sales of advanced hardware to China, including Intel's leading Gaudi2 products, creating reduced-capability spinoffs that meet export regulations lets Intel maintain crucial Chinese revenue.

Meanwhile, Intel's upstream Linux contributions remain focused on hardening Gaudi/Gaudi2 support, now considered "very stable" by lead driver developer Oded Gabbay; the minor new additions reflect maturity, not instability. The open-source foundations contrast with NVIDIA's proprietary driver model, a key competitive argument Intel makes to developers building services on Habana Labs hardware. With the SynapseAI software suite reaching stability, some enterprises could consider Gaudi accelerators as an alternative to NVIDIA, and with Gaudi3 arriving next year, the ecosystem should become more competitive as performance targets increase.

Intel Advances Scientific Research and Performance for New Wave of Supercomputers

At SC23, Intel showcased AI-accelerated high performance computing (HPC) with leadership performance for HPC and AI workloads across Intel Data Center GPU Max Series, Intel Gaudi 2 AI accelerators and Intel Xeon processors. In partnership with Argonne National Laboratory, Intel shared progress on the Aurora generative AI (genAI) project, including an update on the 1 trillion parameter GPT-3 LLM on the Aurora supercomputer that is made possible by the unique architecture of the Max Series GPU and the system capabilities of the Aurora supercomputer. Intel and Argonne demonstrated the acceleration of science with applications from the Aurora Early Science Program (ESP) and the Exascale Computing Project. The company also showed the path to Intel Gaudi 3 AI accelerators and Falcon Shores.

"Intel has always been committed to delivering innovative technology solutions to meet the needs of the HPC and AI community. The great performance of our Xeon CPUs along with our Max GPUs and CPUs help propel research and science. That coupled with our Gaudi accelerators demonstrate our full breadth of technology to provide our customers with compelling choices to suit their diverse workloads," said Deepak Patil, Intel corporate vice president and general manager of Data Center AI Solutions.

Intel Innovation 2023: Bringing AI Everywhere

As the world experiences a generational shift to artificial intelligence, each of us is participating in a new era of global expansion enabled by silicon. It's the "Siliconomy," where systems powered by AI are imbued with autonomy and agency, assisting us across both knowledge-based and physical-based tasks as part of our everyday environments.

At Intel Innovation, the company unveiled technologies to bring AI everywhere and to make it more accessible across all workloads - from client and edge to network and cloud. These include easy access to AI solutions in the cloud, better price performance for Intel data center AI accelerators than the competition offers, tens of millions of new AI-enabled Intel PCs shipping in 2024 and tools for securely powering AI deployments at the edge.

Intel Shows Strong AI Inference Performance

Today, MLCommons published results of its MLPerf Inference v3.1 performance benchmark for GPT-J, the 6 billion parameter large language model, as well as computer vision and natural language processing models. Intel submitted results for Habana Gaudi 2 accelerators, 4th Gen Intel Xeon Scalable processors, and Intel Xeon CPU Max Series. The results show Intel's competitive performance for AI inference and reinforce the company's commitment to making artificial intelligence more accessible at scale across the continuum of AI workloads - from client and edge to the network and cloud.

"As demonstrated through the recent MLCommons results, we have a strong, competitive AI product portfolio, designed to meet our customers' needs for high-performance, high-efficiency deep learning inference and training, for the complete spectrum of AI models - from the smallest to the largest - with leading price/performance." -Sandra Rivera, Intel executive vice president and general manager of the Data Center and AI Group

Intel Brings Gaudi2 Accelerator to China, to Fill Gap Created By NVIDIA Export Limitations

Intel has responded to the high demand for advanced chips in mainland China by bringing its processor, the Gaudi2, to the market. This move comes as the country grapples with US export restrictions, leading to a thriving market for smuggled NVIDIA GPUs. At a press conference in Beijing, Intel presented the Gaudi2 processor as an alternative to NVIDIA's A100 GPU, widely used for training AI systems. Despite US export controls, Intel recognizes the importance of the Chinese market, with 27 percent of its 2022 revenue generated from China. NVIDIA has also tried to comply with restrictions by offering modified versions of its GPUs, but limited supplies have driven the demand for smuggled GPUs. Intel's Gaudi2 aims to provide Chinese companies with various hardware options and bolster their ability to deploy AI through cloud and smart-edge technologies. By partnering with Inspur Group, a major AI server manufacturer, Intel plans to build Gaudi2-powered machines tailored explicitly for the Chinese market.

China's AI ambitions face potential challenges as the US government considers restricting Chinese companies' access to American cloud computing services. This move could impede the use of advanced AI chips by major providers like Amazon Web Services and Microsoft on behalf of their Chinese clients. Additionally, there are reports of a potential expansion of the US export ban to include NVIDIA's A800 GPU. As China continues to push forward with its AI development projects, Intel's introduction of the Gaudi2 processor helps meet the country's demand for advanced chips. Balancing export controls and technological requirements within this complex trade landscape remains a crucial task for both the companies and governments involved in the Chinese AI industry.

MLCommons Shares Intel Habana Gaudi2 and 4th Gen Intel Xeon Scalable AI Benchmark Results

Today, MLCommons published results of its industry AI performance benchmark, MLPerf Training 3.0, in which both the Habana Gaudi2 deep learning accelerator and the 4th Gen Intel Xeon Scalable processor delivered impressive training results.

"The latest MLPerf results published by MLCommons validates the TCO value Intel Xeon processors and Intel Gaudi deep learning accelerators provide to customers in the area of AI. Xeon's built-in accelerators make it an ideal solution to run volume AI workloads on general-purpose processors, while Gaudi delivers competitive performance for large language models and generative AI. Intel's scalable systems with optimized, easy-to-program open software lowers the barrier for customers and partners to deploy a broad array of AI-based solutions in the data center from the cloud to the intelligent edge." - Sandra Rivera, Intel executive vice president and general manager of the Data Center and AI Group