News Posts matching #InfiniBand

Return to Keyword Browsing

NVIDIA's Next-Gen "Rubin" AI GPU Development 6 Months Ahead of Schedule: Report

The "Rubin" architecture succeeds NVIDIA's current "Blackwell," which powers the company's AI GPUs, as well as the upcoming GeForce RTX 50-series gaming GPUs. NVIDIA will likely not build gaming GPUs with "Rubin," just like it didn't with "Hopper," and for the most part, "Volta." NVIDIA's AI GPU product roadmap put out at SC'24 puts "Blackwell" firmly in charge of the company's AI GPU product stack throughout 2025, with "Rubin" only succeeding it in the following year, for a two-year run in the market, being capped off with a "Rubin Ultra" larger GPU slated for 2027. A new report by United Daily News (UDN), a Taiwan-based publication, says that the development of "Rubin" is running 6 months ahead of schedule.

Being 6 months ahead of schedule doesn't necessarily mean that the product will launch sooner. It would give NVIDIA headroom to get "Rubin" better evaluated in the industry, and make last-minute changes to the product if needed; or even advance the launch if it wants to. The first AI GPU powered by "Rubin" will feature 8-high HBM4 memory stacks. The company will also introduce the "Vera" CPU, the long-awaited successor to "Grace." It will also introduce the X1600 InfiniBand/Ethernet network processor. According to the SC'24 roadmap by NVIDIA, these three would've seen a 2026 launch. Then in 2027, the company would follow up with an even larger AI GPU based on the same "Rubin" architecture, codenamed "Rubin Ultra." This features 12-high HBM4 stacks. NVIDIA's current GB200 "Blackwell" is a tile-based GPU, with two dies that have full cache-coherence. "Rubin" is rumored to feature four tiles.

Marvell Unveils Industry's First 3nm 1.6 Tbps PAM4 Interconnect Platform to Scale Accelerated Infrastructure

Marvell Technology, Inc., a leader in data infrastructure semiconductor solutions, today introduced Marvell Ara, the industry's first 3 nm 1.6 Tbps PAM4 interconnects platform featuring 200 Gbps electrical and optical interfaces. Building on the success of the Nova 2 DSP, the industry's first 5 nm 1.6 Tbps PAM4 DSP with 200 Gbps electrical and optical interfaces, Ara leverages the comprehensive Marvell 3 nm platform with industry-leading 200 Gbps SerDes and integrated optical modulator drivers, to reduce 1.6 Tbps optical module power by over 20%. The energy efficiency improvement reduces operational costs and enables new AI server and networking architectures to address the need for higher bandwidth and performance for AI workloads, within the significant power constraints of the data center.

Ara, the industry's first 3 nm PAM4 optical DSP, builds on six generations of Marvell leadership in PAM4 optical DSP technology. It integrates eight 200 Gbps electrical lanes to the host and eight 200 Gbps optical lanes, enabling 1.6 Tbps in a compact, standardized module form factor. Leveraging 3 nm technology and laser driver integration, Ara reduces module design complexity, power consumption and cost, setting a new benchmark for next-generation AI and cloud infrastructure.

NVIDIA and Microsoft Showcase Blackwell Preview, Omniverse Industrial AI and RTX AI PCs at Microsoft Ignite

NVIDIA and Microsoft today unveiled product integrations designed to advance full-stack NVIDIA AI development on Microsoft platforms and applications. At Microsoft Ignite, Microsoft announced the launch of the first cloud private preview of the Azure ND GB200 V6 VM series, based on the NVIDIA Blackwell platform. The Azure ND GB200 v6 will be a new AI-optimized virtual machine (VM) series and combines the NVIDIA GB200 NVL72 rack design with NVIDIA Quantum InfiniBand networking.

In addition, Microsoft revealed that Azure Container Apps now supports NVIDIA GPUs, enabling simplified and scalable AI deployment. Plus, the NVIDIA AI platform on Azure includes new reference workflows for industrial AI and an NVIDIA Omniverse Blueprint for creating immersive, AI-powered visuals. At Ignite, NVIDIA also announced multimodal small language models (SLMs) for RTX AI PCs and workstations, enhancing digital human interactions and virtual assistants with greater realism.

TOP500: El Capitan Achieves Top Spot, Frontier and Aurora Follow Behind

The 64th edition of the TOP500 reveals that El Capitan has achieved the top spot and is officially the third system to reach exascale computing after Frontier and Aurora. Both systems have since moved down to No. 2 and No. 3 spots, respectively. Additionally, new systems have found their way onto the Top 10.

The new El Capitan system at the Lawrence Livermore National Laboratory in California, U.S.A., has debuted as the most powerful system on the list with an HPL score of 1.742 EFlop/s. It has 11,039,616 combined CPU and GPU cores and is based on AMD 4th generation EPYC processors with 24 cores at 1.8 GHz and AMD Instinct MI300A accelerators. El Capitan relies on a Cray Slingshot 11 network for data transfer and achieves an energy efficiency of 58.89 GigaFLOPS/watt. This power efficiency rating helped El Capitan achieve No. 18 on the GREEN500 list as well.

NVIDIA B200 "Blackwell" Records 2.2x Performance Improvement Over its "Hopper" Predecessor

We know that NVIDIA's latest "Blackwell" GPUs are fast, but how much faster are they over the previous generation "Hopper"? Thanks to the latest MLPerf Training v4.1 results, NVIDIA's HGX B200 Blackwell platform has demonstrated massive performance gains, measuring up to 2.2x improvement per GPU compared to its HGX H200 Hopper. The latest results, verified by MLCommons, reveal impressive achievements in large language model (LLM) training. The Blackwell architecture, featuring HBM3e high-bandwidth memory and fifth-generation NVLink interconnect technology, achieved double the performance per GPU for GPT-3 pre-training and a 2.2x boost for Llama 2 70B fine-tuning compared to the previous Hopper generation. Each benchmark system incorporated eight Blackwell GPUs operating at a 1,000 W TDP, connected via NVLink Switch for scale-up.

The network infrastructure utilized NVIDIA ConnectX-7 SuperNICs and Quantum-2 InfiniBand switches, enabling high-speed node-to-node communication for distributed training workloads. While previous Hopper-based systems required 256 GPUs to optimize performance for the GPT-3 175B benchmark, Blackwell accomplished the same task with just 64 GPUs, leveraging its larger HBM3e memory capacity and bandwidth. One thing to look out for is the upcoming GB200 NVL72 system, which promises even more significant gains past the 2.2x. It features expanded NVLink domains, higher memory bandwidth, and tight integration with NVIDIA Grace CPUs, complemented by ConnectX-8 SuperNIC and Quantum-X800 switch technologies. With faster switching and better data movement with Grace-Blackwell integration, we could see even more software optimization from NVIDIA to push the performance envelope.

NVIDIA Ethernet Networking Accelerates World's Largest AI Supercomputer, Built by xAI

NVIDIA today announced that xAI's Colossus supercomputer cluster comprising 100,000 NVIDIA Hopper GPUs in Memphis, Tennessee, achieved this massive scale by using the NVIDIA Spectrum-X Ethernet networking platform, which is designed to deliver superior performance to multi-tenant, hyperscale AI factories using standards-based Ethernet, for its Remote Direct Memory Access (RDMA) network.

Colossus, the world's largest AI supercomputer, is being used to train xAI's Grok family of large language models, with chatbots offered as a feature for X Premium subscribers. xAI is in the process of doubling the size of Colossus to a combined total of 200,000 NVIDIA Hopper GPUs.

Supermicro Adds New Petascale JBOF All-Flash Storage Solution Integrating NVIDIA BlueField-3 DPU for AI Data Pipeline Acceleration

Supermicro, Inc., a Total IT Solution Provider for AI, Cloud, Storage, and 5G/Edge, is launching a new optimized storage system for high performance AI training, inference and HPC workloads. This JBOF (Just a Bunch of Flash) system utilizes up to four NVIDIA BlueField-3 data processing units (DPUs) in a 2U form factor to run software-defined storage workloads. Each BlueField-3 DPU features 400 Gb Ethernet or InfiniBand networking and hardware acceleration for high computation storage and networking workloads such as encryption, compression and erasure coding, as well as AI storage expansion. The state-of-the-art, dual port JBOF architecture enables active-active clustering ensuring high availability for scale up mission critical storage applications as well as scale-out storage such as object storage and parallel file systems.

"Supermicro's new high performance JBOF Storage System is designed using our Building Block approach which enables support for either E3.S or U.2 form-factor SSDs and the latest PCIe Gen 5 connectivity for the SSDs and the DPU networking and storage platform," said Charles Liang, president and CEO of Supermicro. "Supermicro's system design supports 24 or 36 SSD's enabling up to 1.105PB of raw capacity using 30.71 TB SSDs. Our balanced network and storage I/O design can saturate the full 400 Gb/s BlueField-3 line-rate realizing more than 250 GB/s bandwidth of the Gen 5 SSDs."

Marvell Demonstrates Industry-Leading 3 nm PCIe Gen 7 Connectivity

Marvell Technology, Inc., a leader in data infrastructure semiconductor solutions, demonstrates its industry-leading 3 nm PCIe Gen 7 connectivity at the OCP Global Summit, October 15-17, at the San Jose Convention Center. PCIe Gen 7 doubles data transfer speeds, enabling continued scaling of compute fabrics inside accelerated server platforms, general-purpose servers, CXL systems and disaggregated infrastructure. Building on its widely deployed PAM4 technology and utilizing its industry-leading accelerated infrastructure silicon platform, Marvell has developed the industry's most comprehensive interconnect portfolio addressing all high-bandwidth optical and copper connections in AI data centers. This innovative portfolio empowers cloud data center operators to optimize their infrastructure for their specific architectures and workloads to meet the exponential demands of AI.

Marvell pioneered PAM4 technology over a decade ago and leads the industry in PAM4 interconnect shipments. Today, most of the optical interconnects used in data center backend and frontend networks are based on PAM4 technology. As compared to PCIe Gen 5, which was based on NRZ modulation, PCIe Gen 6 and 7 require the use of PAM4 modulation. With its recent PCIe Gen 6 retimer announcement and this PCIe Gen 7 demonstration, Marvell extends its industry-leading PAM4-based optical and copper interconnect portfolio beyond Ethernet and InfiniBand into copper and optical PCIe, CXL and proprietary compute fabric links.

Supermicro Adds New Max-Performance Intel-Based X14 Servers

Supermicro, Inc. a Total IT Solution Provider for AI/ML, HPC, Cloud, Storage, and 5G/Edge, today adds new maximum performance GPU, multi-node, and rackmount systems to the X14 portfolio, which are based on the Intel Xeon 6900 Series Processors with P-Cores (formerly codenamed Granite Rapids-AP). The new industry-leading selection of workload-optimized servers addresses the needs of modern data centers, enterprises, and service providers. Joining the efficiency-optimized X14 servers leveraging the Xeon 6700 Series Processors with E-cores launched in June 2024, today's additions bring maximum compute density and power to the Supermicro X14 lineup to create the industry's broadest range of optimized servers supporting a wide variety of workloads from demanding AI, HPC, media, and virtualization to energy-efficient edge, scale-out cloud-native, and microservices applications.

"Supermicro X14 systems have been completely re-engineered to support the latest technologies including next-generation CPUs, GPUs, highest bandwidth and lowest latency with MRDIMMs, PCIe 5.0, and EDSFF E1.S and E3.S storage," said Charles Liang, president and CEO of Supermicro. "Not only can we now offer more than 15 families, but we can also use these designs to create customized solutions with complete rack integration services and our in-house developed liquid cooling solutions."

Oracle Offers First Zettascale Cloud Computing Cluster

Oracle today announced the first zettascale cloud computing clusters accelerated by the NVIDIA Blackwell platform. Oracle Cloud Infrastructure (OCI) is now taking orders for the largest AI supercomputer in the cloud—available with up to 131,072 NVIDIA Blackwell GPUs.

"We have one of the broadest AI infrastructure offerings and are supporting customers that are running some of the most demanding AI workloads in the cloud," said Mahesh Thiagarajan, executive vice president, Oracle Cloud Infrastructure. "With Oracle's distributed cloud, customers have the flexibility to deploy cloud and AI services wherever they choose while preserving the highest levels of data and AI sovereignty."

Marvell Expands Connectivity Portfolio With New PCIe Gen 6 Retimer Product Line

Marvell Technology, a leader in data infrastructure semiconductor solutions, today expanded its connectivity portfolio with the launch of the new Alaska P PCIe retimer product line built to scale data center compute fabrics inside accelerated servers, general-purpose servers, CXL systems and disaggregated infrastructure. The first two products, 8- and 16-lane PCIe Gen 6 retimers, connect AI accelerators, GPUs, CPUs and other components inside server systems.

Artificial intelligence (AI) and machine learning (ML) applications are driving data flows and connections inside server systems at significantly higher bandwidth, necessitating PCIe retimers to meet the required connection distances at faster speeds. PCIe is the industry standard for inside-server-system connections between AI accelerators, GPUs, CPUs and other server components. AI models are doubling their computation requirements every six months1 and are now the primary driver of the PCIe roadmap, with PCIe Gen 6 becoming a requirement.

Blackwell Shipments Imminent, Total CoWoS Capacity Expected to Surge by Over 70% in 2025

TrendForce reports that NVIDIA's Hopper H100 began to see a reduction in shortages in 1Q24. The new H200 from the same platform is expected to gradually ramp in Q2, with the Blackwell platform entering the market in Q3 and expanding to data center customers in Q4. However, this year will still primarily focus on the Hopper platform, which includes the H100 and H200 product lines. The Blackwell platform—based on how far supply chain integration has progressed—is expected to start ramping up in Q4, accounting for less than 10% of the total high-end GPU market.

The die size of Blackwell platform chips like the B100 is twice that of the H100. As Blackwell becomes mainstream in 2025, the total capacity of TSMC's CoWoS is projected to grow by 150% in 2024 and by over 70% in 2025, with NVIDIA's demand occupying nearly half of this capacity. For HBM, the NVIDIA GPU platform's evolution sees the H100 primarily using 80 GB of HBM3, while the 2025 B200 will feature 288 GB of HBM3e—a 3-4 fold increase in capacity per chip. The three major manufacturers' expansion plans indicate that HBM production volume will likely double by 2025.

NVIDIA Announces Financial Results for 1Q Fiscal 2025, Data Center Revenue Now 10x Gaming Revenue

NVIDIA (NASDAQ: NVDA) today reported revenue for the first quarter ended April 28, 2024, of $26.0 billion, up 18% from the previous quarter and up 262% from a year ago. For the quarter, GAAP earnings per diluted share was $5.98, up 21% from the previous quarter and up 629% from a year ago. Non-GAAP earnings per diluted share was $6.12, up 19% from the previous quarter and up 461% from a year ago. "The next industrial revolution has begun—companies and countries are partnering with NVIDIA to shift the trillion-dollar traditional data centers to accelerated computing and build a new type of data center—AI factories—to produce a new commodity: artificial intelligence," said Jensen Huang, founder and CEO of NVIDIA. "AI will bring significant productivity gains to nearly every industry and help companies be more cost- and energy-efficient, while expanding revenue opportunities.

"Our data center growth was fueled by strong and accelerating demand for generative AI training and inference on the Hopper platform. Beyond cloud service providers, generative AI has expanded to consumer internet companies, and enterprise, sovereign AI, automotive and healthcare customers, creating multiple multibillion-dollar vertical markets. "We are poised for our next wave of growth. The Blackwell platform is in full production and forms the foundation for trillion-parameter-scale generative AI. Spectrum-X opens a brand-new market for us to bring large-scale AI to Ethernet-only data centers. And NVIDIA NIM is our new software offering that delivers enterprise-grade, optimized generative AI to run on CUDA everywhere—from the cloud to on-prem data centers and RTX AI PCs—through our expansive network of ecosystem partners."

TOP500: Frontier Keeps Top Spot, Aurora Officially Becomes the Second Exascale Machine

The 63rd edition of the TOP500 reveals that Frontier has once again claimed the top spot, despite no longer being the only exascale machine on the list. Additionally, a new system has found its way into the Top 10.

The Frontier system at Oak Ridge National Laboratory in Tennessee, USA remains the most powerful system on the list with an HPL score of 1.206 EFlop/s. The system has a total of 8,699,904 combined CPU and GPU cores, an HPE Cray EX architecture that combines 3rd Gen AMD EPYC CPUs optimized for HPC and AI with AMD Instinct MI250X accelerators, and it relies on Cray's Slingshot 11 network for data transfer. On top of that, this machine has an impressive power efficiency rating of 52.93 GFlops/Watt - putting Frontier at the No. 13 spot on the GREEN500.

NVIDIA Grace Hopper Ignites New Era of AI Supercomputing

Driving a fundamental shift in the high-performance computing industry toward AI-powered systems, NVIDIA today announced nine new supercomputers worldwide are using NVIDIA Grace Hopper Superchips to speed scientific research and discovery. Combined, the systems deliver 200 exaflops, or 200 quintillion calculations per second, of energy-efficient AI processing power.

New Grace Hopper-based supercomputers coming online include EXA1-HE, in France, from CEA and Eviden; Helios at Academic Computer Centre Cyfronet, in Poland, from Hewlett Packard Enterprise (HPE); Alps at the Swiss National Supercomputing Centre, from HPE; JUPITER at the Jülich Supercomputing Centre, in Germany; DeltaAI at the National Center for Supercomputing Applications at the University of Illinois Urbana-Champaign; and Miyabi at Japan's Joint Center for Advanced High Performance Computing - established between the Center for Computational Sciences at the University of Tsukuba and the Information Technology Center at the University of Tokyo.

Dell Expands Generative AI Solutions Portfolio, Selects NVIDIA Blackwell GPUs

Dell Technologies is strengthening its collaboration with NVIDIA to help enterprises adopt AI technologies. By expanding the Dell Generative AI Solutions portfolio, including with the new Dell AI Factory with NVIDIA, organizations can accelerate integration of their data, AI tools and on-premises infrastructure to maximize their generative AI (GenAI) investments. "Our enterprise customers are looking for an easy way to implement AI solutions—that is exactly what Dell Technologies and NVIDIA are delivering," said Michael Dell, founder and CEO, Dell Technologies. "Through our combined efforts, organizations can seamlessly integrate data with their own use cases and streamline the development of customized GenAI models."

"AI factories are central to creating intelligence on an industrial scale," said Jensen Huang, founder and CEO, NVIDIA. "Together, NVIDIA and Dell are helping enterprises create AI factories to turn their proprietary data into powerful insights."

Microsoft and NVIDIA Announce Major Integrations to Accelerate Generative AI for Enterprises Everywhere

At GTC on Monday, Microsoft Corp. and NVIDIA expanded their longstanding collaboration with powerful new integrations that leverage the latest NVIDIA generative AI and Omniverse technologies across Microsoft Azure, Azure AI services, Microsoft Fabric and Microsoft 365.

"Together with NVIDIA, we are making the promise of AI real, helping to drive new benefits and productivity gains for people and organizations everywhere," said Satya Nadella, Chairman and CEO, Microsoft. "From bringing the GB200 Grace Blackwell processor to Azure, to new integrations between DGX Cloud and Microsoft Fabric, the announcements we are making today will ensure customers have the most comprehensive platforms and tools across every layer of the Copilot stack, from silicon to software, to build their own breakthrough AI capability."

"AI is transforming our daily lives - opening up a world of new opportunities," said Jensen Huang, founder and CEO of NVIDIA. "Through our collaboration with Microsoft, we're building a future that unlocks the promise of AI for customers, helping them deliver innovative solutions to the world."

NVIDIA Launches Blackwell-Powered DGX SuperPOD for Generative AI Supercomputing at Trillion-Parameter Scale

NVIDIA today announced its next-generation AI supercomputer—the NVIDIA DGX SuperPOD powered by NVIDIA GB200 Grace Blackwell Superchips—for processing trillion-parameter models with constant uptime for superscale generative AI training and inference workloads.

Featuring a new, highly efficient, liquid-cooled rack-scale architecture, the new DGX SuperPOD is built with NVIDIA DGX GB200 systems and provides 11.5 exaflops of AI supercomputing at FP4 precision and 240 terabytes of fast memory—scaling to more with additional racks.

NVIDIA Blackwell Platform Arrives to Power a New Era of Computing

Powering a new era of computing, NVIDIA today announced that the NVIDIA Blackwell platform has arrived—enabling organizations everywhere to build and run real-time generative AI on trillion-parameter large language models at up to 25x less cost and energy consumption than its predecessor.

The Blackwell GPU architecture features six transformative technologies for accelerated computing, which will help unlock breakthroughs in data processing, engineering simulation, electronic design automation, computer-aided drug design, quantum computing and generative AI—all emerging industry opportunities for NVIDIA.

NVIDIA Calls for Global Investment into Sovereign AI

Nations have long invested in domestic infrastructure to advance their economies, control their own data and take advantage of technology opportunities in areas such as transportation, communications, commerce, entertainment and healthcare. AI, the most important technology of our time, is turbocharging innovation across every facet of society. It's expected to generate trillions of dollars in economic dividends and productivity gains. Countries are investing in sovereign AI to develop and harness such benefits on their own. Sovereign AI refers to a nation's capabilities to produce artificial intelligence using its own infrastructure, data, workforce and business networks.

Why Sovereign AI Is Important
The global imperative for nations to invest in sovereign AI capabilities has grown since the rise of generative AI, which is reshaping markets, challenging governance models, inspiring new industries and transforming others—from gaming to biopharma. It's also rewriting the nature of work, as people in many fields start using AI-powered "copilots." Sovereign AI encompasses both physical and data infrastructures. The latter includes sovereign foundation models, such as large language models, developed by local teams and trained on local datasets to promote inclusiveness with specific dialects, cultures and practices. For example, speech AI models can help preserve, promote and revitalize indigenous languages. And LLMs aren't just for teaching AIs human languages, but for writing software code, protecting consumers from financial fraud, teaching robots physical skills and much more.

NVIDIA Grace Hopper Systems Gather at GTC

The spirit of software pioneer Grace Hopper will live on at NVIDIA GTC. Accelerated systems using powerful processors - named in honor of the pioneer of software programming - will be on display at the global AI conference running March 18-21, ready to take computing to the next level. System makers will show more than 500 servers in multiple configurations across 18 racks, all packing NVIDIA GH200 Grace Hopper Superchips. They'll form the largest display at NVIDIA's booth in the San Jose Convention Center, filling the MGX Pavilion.

MGX Speeds Time to Market
NVIDIA MGX is a blueprint for building accelerated servers with any combination of GPUs, CPUs and data processing units (DPUs) for a wide range of AI, high performance computing and NVIDIA Omniverse applications. It's a modular reference architecture for use across multiple product generations and workloads. GTC attendees can get an up-close look at MGX models tailored for enterprise, cloud and telco-edge uses, such as generative AI inference, recommenders and data analytics. The pavilion will showcase accelerated systems packing single and dual GH200 Superchips in 1U and 2U chassis, linked via NVIDIA BlueField-3 DPUs and NVIDIA Quantum-2 400 Gb/s InfiniBand networks over LinkX cables and transceivers. The systems support industry standards for 19- and 21-inch rack enclosures, and many provide E1.S bays for nonvolatile storage.

Intel and Ohio Supercomputer Center Double AI Processing Power with New HPC Cluster

A collaboration including Intel, Dell Technologies, Nvidia and the Ohio Supercomputer Center (OSC), today introduces Cardinal, a cutting-edge high-performance computing (HPC) cluster. Purpose-built to meet the increasing demand for HPC resources in Ohio across research, education and industry innovation, particularly in artificial intelligence (AI).

AI and machine learning are integral tools in scientific, engineering and biomedical fields for solving complex research inquiries. As these technologies continue to demonstrate efficacy, academic domains such as agricultural sciences, architecture and social studies are embracing their potential. Cardinal is equipped with the hardware capable of meeting the demands of expanding AI workloads. In both capabilities and capacity, the new cluster will be a substantial upgrade from the system it will replace, the Owens Cluster launched in 2016.

NVIDIA Unveils "Eos" to Public - a Top Ten Supercomputer

Providing a peek at the architecture powering advanced AI factories, NVIDIA released a video that offers the first public look at Eos, its latest data-center-scale supercomputer. An extremely large-scale NVIDIA DGX SuperPOD, Eos is where NVIDIA developers create their AI breakthroughs using accelerated computing infrastructure and fully optimized software. Eos is built with 576 NVIDIA DGX H100 systems, NVIDIA Quantum-2 InfiniBand networking and software, providing a total of 18.4 exaflops of FP8 AI performance. Revealed in November at the Supercomputing 2023 trade show, Eos—named for the Greek goddess said to open the gates of dawn each day—reflects NVIDIA's commitment to advancing AI technology.

Eos Supercomputer Fuels Innovation
Each DGX H100 system is equipped with eight NVIDIA H100 Tensor Core GPUs. Eos features a total of 4,608 H100 GPUs. As a result, Eos can handle the largest AI workloads to train large language models, recommender systems, quantum simulations and more. It's a showcase of what NVIDIA's technologies can do, when working at scale. Eos is arriving at the perfect time. People are changing the world with generative AI, from drug discovery to chatbots to autonomous machines and beyond. To achieve these breakthroughs, they need more than AI expertise and development skills. They need an AI factory—a purpose-built AI engine that's always available and can help ramp their capacity to build AI models at scale Eos delivers. Ranked No. 9 in the TOP 500 list of the world's fastest supercomputers, Eos pushes the boundaries of AI technology and infrastructure.

NVIDIA Turbocharges Generative AI Training in MLPerf Benchmarks

NVIDIA's AI platform raised the bar for AI training and high performance computing in the latest MLPerf industry benchmarks. Among many new records and milestones, one in generative AI stands out: NVIDIA Eos - an AI supercomputer powered by a whopping 10,752 NVIDIA H100 Tensor Core GPUs and NVIDIA Quantum-2 InfiniBand networking - completed a training benchmark based on a GPT-3 model with 175 billion parameters trained on one billion tokens in just 3.9 minutes. That's a nearly 3x gain from 10.9 minutes, the record NVIDIA set when the test was introduced less than six months ago.

The benchmark uses a portion of the full GPT-3 data set behind the popular ChatGPT service that, by extrapolation, Eos could now train in just eight days, 73x faster than a prior state-of-the-art system using 512 A100 GPUs. The acceleration in training time reduces costs, saves energy and speeds time-to-market. It's heavy lifting that makes large language models widely available so every business can adopt them with tools like NVIDIA NeMo, a framework for customizing LLMs. In a new generative AI test ‌this round, 1,024 NVIDIA Hopper architecture GPUs completed a training benchmark based on the Stable Diffusion text-to-image model in 2.5 minutes, setting a high bar on this new workload. By adopting these two tests, MLPerf reinforces its leadership as the industry standard for measuring AI performance, since generative AI is the most transformative technology of our time.

IBM Unleashes the Potential of Data and AI with its Next-Generation IBM Storage Scale System 6000

Today, IBM introduced the new IBM Storage Scale System 6000, a cloud-scale global data platform designed to meet today's data intensive and AI workload demands, and the latest offering in the IBM Storage for Data and AI portfolio.

For the seventh consecutive year and counting, IBM is a 2022 Gartner Magic Quadrant for Distributed File Systems and Object Storage Leader, recognized for its vision and execution. The new IBM Storage Scale System 6000 seeks to build on IBM's leadership position with an enhanced high performance parallel file system designed for data intensive use-cases. It provides up to 7M IOPs and up to 256 GB/s throughput for read only workloads per system in a 4U (four rack units) footprint.
Return to Keyword Browsing
Dec 21st, 2024 21:26 EST change timezone

New Forum Posts

Popular Reviews

Controversial News Posts