News Posts matching #H200

Return to Keyword Browsing

NVIDIA Blackwell Takes Pole Position in Latest MLPerf Inference Results

In the latest MLPerf Inference V5.0 benchmarks, which reflect some of the most challenging inference scenarios, the NVIDIA Blackwell platform set records - and marked NVIDIA's first MLPerf submission using the NVIDIA GB200 NVL72 system, a rack-scale solution designed for AI reasoning. Delivering on the promise of cutting-edge AI takes a new kind of compute infrastructure, called AI factories. Unlike traditional data centers, AI factories do more than store and process data - they manufacture intelligence at scale by transforming raw data into real-time insights. The goal for AI factories is simple: deliver accurate answers to queries quickly, at the lowest cost and to as many users as possible.

The complexity of pulling this off is significant and takes place behind the scenes. As AI models grow to billions and trillions of parameters to deliver smarter replies, the compute required to generate each token increases. This requirement reduces the number of tokens that an AI factory can generate and increases cost per token. Keeping inference throughput high and cost per token low requires rapid innovation across every layer of the technology stack, spanning silicon, network systems and software.

Quantum Machines Announces NVIDIA DGX Quantum Early Access Program

Quantum Machines (QM), the leading provider of advanced quantum control solutions, has recently announced the NVIDIA DGX Quantum Early Customer Program, with a cohort of six leading research groups and quantum computer builders. NVIDIA DGX Quantum, a reference architecture jointly developed by NVIDIA and QM, is the first tightly integrated quantum-classical computing solution, designed to unlock new frontiers in quantum computing research and development. As quantum computers scale, their reliance on classical resources for essential operations, such as quantum error correction (QEC) and parameter drift compensation, grows exponentially. NVIDIA DGX Quantum provides access to the classical acceleration needed to support this progress, advancing the path toward practical quantum supercomputers.

NVIDIA DGX Quantum leverages OPX1000, the best-in-class, modular high-density hybrid control platform, seamlessly interfacing with NVIDIA GH200 Grace Hopper Superchips. This solution brings accelerated computing into the heart of the quantum computing stack for the first time, achieving an ultra-low round-trip latency of less than 4 µs between quantum control and AI supercomputers - faster than any other approach. The NVIDIA DGX Quantum Early Customer Program is now underway, with selected leading academic institutions, national labs, and commercial quantum computer builders participating. These include the Engineering Quantum Systems group (equs.mit.edu) led by MIT Professor William D. Oliver, the Israeli Quantum Computing Center (IQCC), quantum hardware developer Diraq, the Quantum Circuit group (led by Ecole Normale Supérieure de Lyon Professor Benjamin Huard), and more.

Lenovo Announces Hybrid AI Advantage with NVIDIA Blackwell Support

Today, at NVIDIA GTC, Lenovo unveiled new Lenovo Hybrid AI Advantage with NVIDIA solutions designed to accelerate AI adoption and boost business productivity by fast-tracking agentic AI that can reason, plan and take action to reach goals faster. The validated, full-stack AI solutions enable enterprises to quickly build and deploy AI agents for a broad range of high-demand use cases, increasing productivity, agility and trust while accelerating the next wave of AI reasoning for the new era of agentic AI.

New global IDC research commissioned by Lenovo reveals that ROI remains the greatest AI adoption barrier, despite a three-fold spend increase. AI agents are revolutionizing enterprise workflows and lowering barriers to ROI by supporting employees with complex problem-solving, coding, and multistep planning that drives speed, innovation and productivity. As CIOs and business leaders seek tangible return on AI investment, Lenovo is delivering hybrid AI solutions that unleash and customize agentic AI at every scale.

Supermicro Expands Enterprise AI Portfolio With Support for Upcoming NVIDIA RTX PRO 6000 Blackwell Server Edition and NVIDIA H200 NVL Platform

Supermicro, Inc., a Total IT Solution Provider for AI/ML, HPC, Cloud, Storage, and 5G/Edge, today announced support for the new NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs on a range of workload-optimized GPU servers and workstations. Specifically optimized for the NVIDIA Blackwell generation of PCIe GPUs, the broad range of Supermicro servers will enable more enterprises to leverage accelerated computing for LLM-inference and fine-tuning, agentic AI, visualization, graphics & rendering, and virtualization. Many Supermicro GPU-optimized systems are NVIDIA Certified, guaranteeing compatibility and support for NVIDIA AI Enterprise to simplify the process of developing and deploying production AI.

"Supermicro leads the industry with its broad portfolio of application optimized GPU servers that can be deployed in a wide range of enterprise environments with very short lead times," said Charles Liang, president and CEO of Supermicro. "Our support for the NVIDIA RTX PRO 6000 Blackwell Server Edition GPU adds yet another dimension of performance and flexibility for customers looking to deploy the latest in accelerated computing capabilities from the data center to the intelligent edge. Supermicro's broad range of PCIe GPU-optimized products also support NVIDIA H200 NVL in 2-way and 4-way NVIDIA NVLink configurations to maximize inference performance for today's state-of-the-art AI models, as well as accelerating HPC workloads."

Dell Technologies Accelerates Enterprise AI Innovation from PC to Data Center with NVIDIA 

Marking one year since the launch of the Dell AI Factory with NVIDIA, Dell Technologies (NYSE: DELL) announces new AI PCs, infrastructure, software and services advancements to accelerate enterprise AI innovation at any scale. Successful AI deployments are vital for enterprises to remain competitive, but challenges like system integration and skill gaps can delay the value enterprises realize from AI. More than 75% of organizations want their infrastructure providers to deliver capabilities across all aspects of the AI adoption journey, driving customer demand for simplified AI deployments that can scale.

As the top provider of AI centric infrastructure, Dell Technologies - in collaboration with NVIDIA - provides a consistent experience across AI infrastructure, software and services, offering customers a one-stop shop to scale AI initiatives from deskside to large-scale data center deployments.

NVIDIA Accelerates Science and Engineering With CUDA-X Libraries Powered by GH200 and GB200 Superchips

Scientists and engineers of all kinds are equipped to solve tough problems a lot faster with NVIDIA CUDA-X libraries powered by NVIDIA GB200 and GH200 superchips. Announced today at the NVIDIA GTC global AI conference, developers can now take advantage of tighter automatic integration and coordination between CPU and GPU resources - enabled by CUDA-X working with these latest superchip architectures - resulting in up to 11x speedups for computational engineering tools and 5x larger calculations compared with using traditional accelerated computing architectures.

This greatly accelerates and improves workflows in engineering simulation, design optimization and more, helping scientists and researchers reach groundbreaking results faster. NVIDIA released CUDA in 2006, opening up a world of applications to the power of accelerated computing. Since then, NVIDIA has built more than 900 domain-specific NVIDIA CUDA-X libraries and AI models, making it easier to adopt accelerated computing and driving incredible scientific breakthroughs. Now, CUDA-X brings accelerated computing to a broad new set of engineering disciplines, including astronomy, particle physics, quantum physics, automotive, aerospace and semiconductor design.

Global Top 10 IC Design Houses See 49% YoY Growth in 2024, NVIDIA Commands Half the Market

TrendForce reveals that the combined revenue of the world's top 10 IC design houses reached approximately US$249.8 billion in 2024, marking a 49% YoY increase. The booming AI industry has fueled growth across the semiconductor sector, with NVIDIA leading the charge, posting an astonishing 125% revenue growth, widening its lead over competitors, and solidifying its dominance in the IC industry.

Looking ahead to 2025, advancements in semiconductor manufacturing will further enhance AI computing power, with LLMs continuing to emerge. Open-source models like DeepSeek could lower AI adoption costs, accelerating AI penetration from servers to personal devices. This shift positions edge AI devices as the next major growth driver for the semiconductor industry.

ASUS Showcases Servers Based on Intel Xeon 6, Intel Gaudi 3 at CloudFest 2025

ASUS today announced its showcase of comprehensive AI infrastructure solutions at CloudFest 2025, bringing together cutting-edge hardware powered by Intel Xeon 6 processors, NVIDIA GPUs and AMD EPYC processors. The company will also highlight its integrated software platforms, reinforcing its position as a total AI solution provider for enterprises seeking seamless AI deployments from edge to cloud.

Intel Xeon 6-based AI solutions and Gaudi 3 Acceleration for generative AI inferencing and fine tuning training
ASUS Intel Xeon 6-based servers leverage the Data Center Modular Hardware System (DC-MHS) architecture, providing unparalleled scalability, cost-efficiency and simplified maintenance. ASUS will showcase a comprehensive Intel Xeon 6 family of processors at CloudFest 2025, including the RS700-E12, RS720Q-E12. and ESC8000-E12P-series servers. The ESC800-E12P-series servers will debut the Intel Gaudi 3 AI accelerator PCIe card. This lineup underscores the ASUS commitment to delivering comprehensive AI solutions that integrate cutting-edge hardware with enterprise-grade software platforms for seamless, scalable AI deployments, highlighting Intel's latest innovations for high-performance AI training, inference, and cloud-native workloads.

GIGABYTE Showcases Future-Ready AI and HPC Technologies for High-Efficiency Computing at SCA 2025

Giga Computing, a subsidiary of GIGABYTE and a pioneer in AI-driven enterprise computing, is set to make a significant impact at Supercomputing Asia 2025 (SCA25) in Singapore (March 11-13). At booth #D5, GIGABYTE showcases its latest advancements in liquid cooling, solutions for AI training and high-performance computing (HPC). The booth highlights GIGABYTE's innovative technology and comprehensive direct liquid cooling (DLC) strategies, reinforcing its commitment to energy-efficient, high-performance computing.

Revolutionizing AI Training with DLC
A key highlight of GIGABYTE's showcase is the NVIDIA HGX H200 platform, a next-generation solution for AI workloads. GIGABYTE is presenting both its liquid-cooled G4L3-SD1 server and its air-cooled G893 series, providing businesses with advanced cooling solutions tailored for high-performance demands. The G4L3-SD1 server, equipped with CoolIT Systems' cold plates, effectively cools Intel Xeon CPUs and eight NVIDIA H200 GPUs, ensuring optimal performance with enhanced energy efficiency.

AMD's Pain Point is ROCm Software, NVIDIA's CUDA Software is Still Superior for AI Development: Report

The battle of AI acceleration in the data center is, as most readers are aware, insanely competitive, with NVIDIA offering a top-tier software stack. However, AMD has tried in recent years to capture a part of the revenue that hyperscalers and OEMs are willing to spend with its Instinct MI300X accelerator lineup for AI and HPC. Despite having decent hardware, the company is not close to bridging the gap software-wise with its competitor, NVIDIA. According to the latest report from SemiAnalysis, a research and consultancy firm, they have run a five-month experiment using Instinct MI300X for training and benchmark runs. And the findings were surprising: even with better hardware, AMD's software stack, including ROCm, has massively degraded AMD's performance.

"When comparing NVIDIA's GPUs to AMD's MI300X, we found that the potential on paper advantage of the MI300X was not realized due to a lack within AMD public release software stack and the lack of testing from AMD," noted SemiAnalysis, breaking down arguments in the report further, adding that "AMD's software experience is riddled with bugs rendering out of the box training with AMD is impossible. We were hopeful that AMD could emerge as a strong competitor to NVIDIA in training workloads, but, as of today, this is unfortunately not the case. The CUDA moat has yet to be crossed by AMD due to AMD's weaker-than-expected software Quality Assurance (QA) culture and its challenging out-of-the-box experience."

Microsoft Acquired Nearly 500,000 NVIDIA "Hopper" GPUs This Year

Microsoft is heavily investing in enabling its company and cloud infrastructure to support the massive AI expansion. The Redmond giant has acquired nearly half a million of the NVIDIA "Hopper" family of GPUs to support this effort. According to market research company Omdia, Microsoft was the biggest hyperscaler, with data center CapEx and GPU expenditure reaching a record high. The company acquired precisely 485,000 NVIDIA "Hopper" GPUs, including H100, H200, and H20, resulting in more than $30 billion spent on servers alone. To put things into perspective, this is about double that of the next-biggest GPU purchaser, Chinese ByteDance, who acquired about 230,000 sanction-abiding H800 GPUs and regular H100s sources from third parties.

Regarding US-based companies, the only ones that have come close to the GPU acquisition rate are Meta, Tesla/xAI, Amazon, and Google. They have acquired around 200,000 GPUs on average while significantly boosting their in-house chip design efforts. "NVIDIA GPUs claimed a tremendously high share of the server capex," Vlad Galabov, director of cloud and data center research at Omdia, noted, adding, "We're close to the peak." Hyperscalers like Amazon, Google, and Meta have been working on their custom solutions for AI training and inference. For example, Google has its TPU, Amazon has its Trainium and Inferentia chips, and Meta has its MTIA. Hyperscalers are eager to develop their in-house solutions, but NVIDIA's grip on the software stack paired with timely product updates seems hard to break. The latest "Blackwell" chips are projected to get even bigger orders, so only the sky (and the local power plant) is the limit.

Advantech Introduces Its GPU Server SKY-602E3 With NVIDIA H200 NVL

Advantech, a leading global provider of industrial edge AI solutions, is excited to introduce its GPU server SKY-602E3 equipped with the NVIDIA H200 NVL platform. This powerful combination is set to accelerate the offline LLM for manufacturing, providing unprecedented levels of performance and efficiency. The NVIDIA H200 NVL, requiring 600 W passive cooling, is fully supported by the compact and efficient SKY-602E3 GPU server, making it an ideal solution for demanding edge AI applications.

Core of Factory LLM Deployment: AI Vision
The SKY-602E3 GPU server excels in supporting large language models (LLMs) for AI inference and training. It features four PCIe 5.0 x16 slots, delivering high bandwidth for intensive tasks, and four PCIe 5.0 x8 slots, providing enhanced flexibility for GPU and frame grabber card expansion. The half-width design of the SKY-602E3 makes it an excellent choice for workstation environments. Additionally, the server can be equipped with the NVIDIA H200 NVL platform, which offers 1.7x more performance than the NVIDIA H100 NVL, freeing up additional PCIe slots for other expansion needs.

NVIDIA and Microsoft Showcase Blackwell Preview, Omniverse Industrial AI and RTX AI PCs at Microsoft Ignite

NVIDIA and Microsoft today unveiled product integrations designed to advance full-stack NVIDIA AI development on Microsoft platforms and applications. At Microsoft Ignite, Microsoft announced the launch of the first cloud private preview of the Azure ND GB200 V6 VM series, based on the NVIDIA Blackwell platform. The Azure ND GB200 v6 will be a new AI-optimized virtual machine (VM) series and combines the NVIDIA GB200 NVL72 rack design with NVIDIA Quantum InfiniBand networking.

In addition, Microsoft revealed that Azure Container Apps now supports NVIDIA GPUs, enabling simplified and scalable AI deployment. Plus, the NVIDIA AI platform on Azure includes new reference workflows for industrial AI and an NVIDIA Omniverse Blueprint for creating immersive, AI-powered visuals. At Ignite, NVIDIA also announced multimodal small language models (SLMs) for RTX AI PCs and workstations, enhancing digital human interactions and virtual assistants with greater realism.

NVIDIA Announces Hopper H200 NVL PCIe GPU Availability at SC24, Promising 1.3x HPC Performance Over H100 NVL

Since its introduction, the NVIDIA Hopper architecture has transformed the AI and high-performance computing (HPC) landscape, helping enterprises, researchers and developers tackle the world's most complex challenges with higher performance and greater energy efficiency. During the Supercomputing 2024 conference, NVIDIA announced the availability of the NVIDIA H200 NVL PCIe GPU - the latest addition to the Hopper family. H200 NVL is ideal for organizations with data centers looking for lower-power, air-cooled enterprise rack designs with flexible configurations to deliver acceleration for every AI and HPC workload, regardless of size.

According to a recent survey, roughly 70% of enterprise racks are 20kW and below and use air cooling. This makes PCIe GPUs essential, as they provide granularity of node deployment, whether using one, two, four or eight GPUs - enabling data centers to pack more computing power into smaller spaces. Companies can then use their existing racks and select the number of GPUs that best suits their needs. Enterprises can use H200 NVL to accelerate AI and HPC applications, while also improving energy efficiency through reduced power consumption. With a 1.5x memory increase and 1.2x bandwidth increase over NVIDIA H100 NVL, companies can use H200 NVL to fine-tune LLMs within a few hours and deliver up to 1.7x faster inference performance. For HPC workloads, performance is boosted up to 1.3x over H100 NVL and 2.5x over the NVIDIA Ampere architecture generation.

NVIDIA B200 "Blackwell" Records 2.2x Performance Improvement Over its "Hopper" Predecessor

We know that NVIDIA's latest "Blackwell" GPUs are fast, but how much faster are they over the previous generation "Hopper"? Thanks to the latest MLPerf Training v4.1 results, NVIDIA's HGX B200 Blackwell platform has demonstrated massive performance gains, measuring up to 2.2x improvement per GPU compared to its HGX H200 Hopper. The latest results, verified by MLCommons, reveal impressive achievements in large language model (LLM) training. The Blackwell architecture, featuring HBM3e high-bandwidth memory and fifth-generation NVLink interconnect technology, achieved double the performance per GPU for GPT-3 pre-training and a 2.2x boost for Llama 2 70B fine-tuning compared to the previous Hopper generation. Each benchmark system incorporated eight Blackwell GPUs operating at a 1,000 W TDP, connected via NVLink Switch for scale-up.

The network infrastructure utilized NVIDIA ConnectX-7 SuperNICs and Quantum-2 InfiniBand switches, enabling high-speed node-to-node communication for distributed training workloads. While previous Hopper-based systems required 256 GPUs to optimize performance for the GPT-3 175B benchmark, Blackwell accomplished the same task with just 64 GPUs, leveraging its larger HBM3e memory capacity and bandwidth. One thing to look out for is the upcoming GB200 NVL72 system, which promises even more significant gains past the 2.2x. It features expanded NVLink domains, higher memory bandwidth, and tight integration with NVIDIA Grace CPUs, complemented by ConnectX-8 SuperNIC and Quantum-X800 switch technologies. With faster switching and better data movement with Grace-Blackwell integration, we could see even more software optimization from NVIDIA to push the performance envelope.

Cisco Unveils Plug-and-Play AI Solutions Powered by NVIDIA H100 and H200 Tensor Core GPUs

Today, Cisco announced new additions to its data center infrastructure portfolio: an AI server family purpose-built for GPU-intensive AI workloads with NVIDIA accelerated computing, and AI PODs to simplify and de-risk AI infrastructure investment. They give organizations an adaptable and scalable path to AI, supported by Cisco's industry-leading networking capabilities.

"Enterprise customers are under pressure to deploy AI workloads, especially as we move toward agentic workflows and AI begins solving problems on its own," said Jeetu Patel, Chief Product Officer, Cisco. "Cisco innovations like AI PODs and the GPU server strengthen the security, compliance, and processing power of those workloads as customers navigate their AI journeys from inferencing to training."

Intel Won't Compete Against NVIDIA's High-End AI Dominance Soon, Starts Laying Off Over 2,200 Workers Across US

Intel's taking a different path with its Gaudi 3 accelerator chips. It's staying away from the high-demand market for training big AI models, which has made NVIDIA so successful. Instead, Intel wants to help businesses that need cheaper AI solutions to train and run smaller specific models and open-source options. At a recent event, Intel talked up Gaudi 3's "price performance advantage" over NVIDIA's H100 GPU for inference tasks. Intel says Gaudi 3 is faster and more cost-effective than the H100 when running Llama 3 and Llama 2 models of different sizes.

Intel also claims that Gaudi 3 is as power-efficient as the H100 for large language model (LLM) inference with small token outputs and does even better with larger outputs. The company even suggests Gaudi 3 beats NVIDIA's newer H200 in LLM inference throughput for large token outputs. However, Gaudi 3 doesn't match up to the H100 in overall floating-point operation throughput for 16-bit and 8-bit formats. For bfloat16 and 8-bit floating-point precision matrix math, Gaudi 3 hits 1,835 TFLOPS in each format, while the H100 reaches 1,979 TFLOPS for BF16 and 3,958 TFLOPS for FP8.

GIGABYTE Announces New Liquid Cooled Solutions for NVIDIA HGX H200

Giga Computing, a subsidiary of GIGABYTE and an industry leader in generative AI servers and advanced cooling technologies, today announced new flagship GIGABYTE G593 series servers supporting direct liquid cooling (DLC) technology to advance green data centers using NVIDIA HGX H200 GPU. As DLC technology is becoming a necessity for many data centers, GIGABYTE continues to increase its product portfolio with new DLC solutions for GPU and CPU technologies, and for these new G593 servers the cold plates are made by CoolIT Systems.

G593 Series - Tailored Cooling
The GPU-centric G593 series is custom engineered to house an 8-GPU baseboard, and its design had foresight for both air and liquid cooling. The compact 5U chassis leads the industry in its readily scalable nature, fitting up to sixty-four GPUs in a single rack and supporting 100kW of IT hardware. This helps to consolidate the IT hardware, and in turn, decrease the data center footprint. The G593 series servers for DLC are in response to the rising customer demand for greater energy efficiency. Liquids have a higher thermal conductivity than air, so they can rapidly and effectively remove heat from hot components to maintain lower operating temperatures. And by relying on water and heat exchangers, the overall energy consumption of the data center is reduced.

ASUS Announces ESC N8-E11 AI Server with NVIDIA HGX H200

ASUS today announced the latest marvel in the groundbreaking lineup of ASUS AI servers - ESC N8-E11, featuring the intensely powerful NVIDIA HGX H200 platform. With this AI titan, ASUS has secured its first industry deal, showcasing the exceptional performance, reliability and desirability of ESC N8-E11 with HGX H200, as well as the ability of ASUS to move first and fast in creating strong, beneficial partnerships with forward-thinking organizations seeking the world's most powerful AI solutions.

Shipments of the ESC N8-E11 with NVIDIA HGX H200 are scheduled to begin in early Q4 2024, marking a new milestone in the ongoing ASUS commitment to excellence. ASUS has been actively supporting clients by assisting in the development of cooling solutions to optimize overall PUE, guaranteeing that every ESC N8-E11 unit delivers top-tier efficiency and performance - ready to power the new era of AI.

NVIDIA Blackwell Sets New Standard for Generative AI in MLPerf Inference Benchmark

As enterprises race to adopt generative AI and bring new services to market, the demands on data center infrastructure have never been greater. Training large language models is one challenge, but delivering LLM-powered real-time services is another. In the latest round of MLPerf industry benchmarks, Inference v4.1, NVIDIA platforms delivered leading performance across all data center tests. The first-ever submission of the upcoming NVIDIA Blackwell platform revealed up to 4x more performance than the NVIDIA H100 Tensor Core GPU on MLPerf's biggest LLM workload, Llama 2 70B, thanks to its use of a second-generation Transformer Engine and FP4 Tensor Cores.

The NVIDIA H200 Tensor Core GPU delivered outstanding results on every benchmark in the data center category - including the latest addition to the benchmark, the Mixtral 8x7B mixture of experts (MoE) LLM, which features a total of 46.7 billion parameters, with 12.9 billion parameters active per token. MoE models have gained popularity as a way to bring more versatility to LLM deployments, as they're capable of answering a wide variety of questions and performing more diverse tasks in a single deployment. They're also more efficient since they only activate a few experts per inference - meaning they deliver results much faster than dense models of a similar size.

NVIDIA's New B200A Targets OEM Customers; High-End GPU Shipments Expected to Grow 55% in 2025

Despite recent rumors speculating on NVIDIA's supposed cancellation of the B100 in favor of the B200A, TrendForce reports that NVIDIA is still on track to launch both the B100 and B200 in the 2H24 as it aims to target CSP customers. Additionally, a scaled-down B200A is planned for other enterprise clients, focusing on edge AI applications.

TrendForce reports that NVIDIA will prioritize the B100 and B200 for CSP customers with higher demand due to the tight production capacity of CoWoS-L. Shipments are expected to commence after 3Q24. In light of yield and mass production challenges with CoWoS-L, NVIDIA is also planning the B200A for other enterprise clients, utilizing CoWoS-S packaging technology.

NVIDIA MLPerf Training Results Showcase Unprecedented Performance and Elasticity

The full-stack NVIDIA accelerated computing platform has once again demonstrated exceptional performance in the latest MLPerf Training v4.0 benchmarks. NVIDIA more than tripled the performance on the large language model (LLM) benchmark, based on GPT-3 175B, compared to the record-setting NVIDIA submission made last year. Using an AI supercomputer featuring 11,616 NVIDIA H100 Tensor Core GPUs connected with NVIDIA Quantum-2 InfiniBand networking, NVIDIA achieved this remarkable feat through larger scale - more than triple that of the 3,584 H100 GPU submission a year ago - and extensive full-stack engineering.

Thanks to the scalability of the NVIDIA AI platform, Eos can now train massive AI models like GPT-3 175B even faster, and this great AI performance translates into significant business opportunities. For example, in NVIDIA's recent earnings call, we described how LLM service providers can turn a single dollar invested into seven dollars in just four years running the Llama 3 70B model on NVIDIA HGX H200 servers. This return assumes an LLM service provider serving Llama 3 70B at $0.60/M tokens, with an HGX H200 server throughput of 24,000 tokens/second.

Blackwell Shipments Imminent, Total CoWoS Capacity Expected to Surge by Over 70% in 2025

TrendForce reports that NVIDIA's Hopper H100 began to see a reduction in shortages in 1Q24. The new H200 from the same platform is expected to gradually ramp in Q2, with the Blackwell platform entering the market in Q3 and expanding to data center customers in Q4. However, this year will still primarily focus on the Hopper platform, which includes the H100 and H200 product lines. The Blackwell platform—based on how far supply chain integration has progressed—is expected to start ramping up in Q4, accounting for less than 10% of the total high-end GPU market.

The die size of Blackwell platform chips like the B100 is twice that of the H100. As Blackwell becomes mainstream in 2025, the total capacity of TSMC's CoWoS is projected to grow by 150% in 2024 and by over 70% in 2025, with NVIDIA's demand occupying nearly half of this capacity. For HBM, the NVIDIA GPU platform's evolution sees the H100 primarily using 80 GB of HBM3, while the 2025 B200 will feature 288 GB of HBM3e—a 3-4 fold increase in capacity per chip. The three major manufacturers' expansion plans indicate that HBM production volume will likely double by 2025.

TOP500: Frontier Keeps Top Spot, Aurora Officially Becomes the Second Exascale Machine

The 63rd edition of the TOP500 reveals that Frontier has once again claimed the top spot, despite no longer being the only exascale machine on the list. Additionally, a new system has found its way into the Top 10.

The Frontier system at Oak Ridge National Laboratory in Tennessee, USA remains the most powerful system on the list with an HPL score of 1.206 EFlop/s. The system has a total of 8,699,904 combined CPU and GPU cores, an HPE Cray EX architecture that combines 3rd Gen AMD EPYC CPUs optimized for HPC and AI with AMD Instinct MI250X accelerators, and it relies on Cray's Slingshot 11 network for data transfer. On top of that, this machine has an impressive power efficiency rating of 52.93 GFlops/Watt - putting Frontier at the No. 13 spot on the GREEN500.

NVIDIA Grace Hopper Ignites New Era of AI Supercomputing

Driving a fundamental shift in the high-performance computing industry toward AI-powered systems, NVIDIA today announced nine new supercomputers worldwide are using NVIDIA Grace Hopper Superchips to speed scientific research and discovery. Combined, the systems deliver 200 exaflops, or 200 quintillion calculations per second, of energy-efficient AI processing power.

New Grace Hopper-based supercomputers coming online include EXA1-HE, in France, from CEA and Eviden; Helios at Academic Computer Centre Cyfronet, in Poland, from Hewlett Packard Enterprise (HPE); Alps at the Swiss National Supercomputing Centre, from HPE; JUPITER at the Jülich Supercomputing Centre, in Germany; DeltaAI at the National Center for Supercomputing Applications at the University of Illinois Urbana-Champaign; and Miyabi at Japan's Joint Center for Advanced High Performance Computing - established between the Center for Computational Sciences at the University of Tsukuba and the Information Technology Center at the University of Tokyo.
Return to Keyword Browsing
Apr 3rd, 2025 21:19 EDT change timezone

New Forum Posts

Popular Reviews

Controversial News Posts