News Posts matching #GB200

NVIDIA's Next-Gen "Rubin" AI GPU Development 6 Months Ahead of Schedule: Report

The "Rubin" architecture succeeds NVIDIA's current "Blackwell," which powers the company's AI GPUs, as well as the upcoming GeForce RTX 50-series gaming GPUs. NVIDIA will likely not build gaming GPUs with "Rubin," just like it didn't with "Hopper," and for the most part, "Volta." NVIDIA's AI GPU product roadmap put out at SC'24 puts "Blackwell" firmly in charge of the company's AI GPU product stack throughout 2025, with "Rubin" only succeeding it in the following year, for a two-year run in the market, being capped off with a "Rubin Ultra" larger GPU slated for 2027. A new report by United Daily News (UDN), a Taiwan-based publication, says that the development of "Rubin" is running 6 months ahead of schedule.

Being 6 months ahead of schedule doesn't necessarily mean that the product will launch sooner. It would give NVIDIA headroom to get "Rubin" better evaluated in the industry, and make last-minute changes to the product if needed; or even advance the launch if it wants to. The first AI GPU powered by "Rubin" will feature 8-high HBM4 memory stacks. The company will also introduce the "Vera" CPU, the long-awaited successor to "Grace." It will also introduce the X1600 InfiniBand/Ethernet network processor. According to the SC'24 roadmap by NVIDIA, these three were slated for a 2026 launch. Then in 2027, the company would follow up with an even larger AI GPU based on the same "Rubin" architecture, codenamed "Rubin Ultra," which features 12-high HBM4 stacks. NVIDIA's current GB200 "Blackwell" is a tile-based GPU, with two dies that have full cache-coherence. "Rubin" is rumored to feature four tiles.

NVIDIA and Microsoft Showcase Blackwell Preview, Omniverse Industrial AI and RTX AI PCs at Microsoft Ignite

NVIDIA and Microsoft today unveiled product integrations designed to advance full-stack NVIDIA AI development on Microsoft platforms and applications. At Microsoft Ignite, Microsoft announced the launch of the first cloud private preview of the Azure ND GB200 v6 VM series, based on the NVIDIA Blackwell platform. The Azure ND GB200 v6 will be a new AI-optimized virtual machine (VM) series that combines the NVIDIA GB200 NVL72 rack design with NVIDIA Quantum InfiniBand networking.

In addition, Microsoft revealed that Azure Container Apps now supports NVIDIA GPUs, enabling simplified and scalable AI deployment. Plus, the NVIDIA AI platform on Azure includes new reference workflows for industrial AI and an NVIDIA Omniverse Blueprint for creating immersive, AI-powered visuals. At Ignite, NVIDIA also announced multimodal small language models (SLMs) for RTX AI PCs and workstations, enhancing digital human interactions and virtual assistants with greater realism.

NVIDIA Prepares GB200 NVL4: Four "Blackwell" GPUs and Two "Grace" CPUs in a 5,400 W Server

At SC24, NVIDIA announced its latest compute-dense AI accelerators in the form of GB200 NVL4, a single-server solution that expands the company's "Blackwell" series portfolio. The new platform features an impressive combination of four "Blackwell" GPUs and two "Grace" CPUs on a single board. The GB200 NVL4 boasts remarkable specifications for a single-server system, including 768 GB of HBM3E memory across its four Blackwell GPUs, delivering a combined memory bandwidth of 32 TB/s. The system's two Grace CPUs have 960 GB of LPDDR5X memory, making it a powerhouse for demanding AI workloads. A key feature of the NVL4 design is its NVLink interconnect technology, which enables communication between all processors on the board. This integration is important for maintaining optimal performance across the system's multiple processing units, especially during large training runs or inferencing a multi-trillion parameter model.

Performance comparisons with previous generations show significant improvements, with NVIDIA claiming the GB200 GPUs deliver 2.2x faster overall performance and 1.8x quicker training capabilities compared to their GH200 NVL4 predecessor. The system's power consumption reaches 5,400 watts, which effectively doubles the 2,700-watt requirement of the GB200 NVL2 model, its smaller sibling that features two GPUs instead of four. NVIDIA is working closely with OEM partners to bring various Blackwell solutions to market, including the DGX B200, GB200 Grace Blackwell Superchip, GB200 Grace Blackwell NVL2, GB200 Grace Blackwell NVL4, and GB200 Grace Blackwell NVL72. Fitting 5,400 W of TDP in a single server will require liquid cooling for optimal performance, and the GB200 NVL4 is expected to go inside server racks for hyperscaler customers, which usually have custom liquid cooling systems inside their data centers.
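
As a rough sanity check on the GB200 NVL4 figures quoted above, the following Python sketch divides the system totals evenly across the four GPUs and compares the two power figures. The even split is an assumption made for illustration, and the constant and function names are hypothetical rather than anything published by NVIDIA.

```python
# Figures quoted in the article for the GB200 NVL4 and NVL2.
NVL4_GPUS = 4
NVL4_HBM3E_GB = 768        # total HBM3E across all four GPUs
NVL4_BANDWIDTH_TBPS = 32   # combined memory bandwidth
NVL4_POWER_W = 5400
NVL2_POWER_W = 2700


def per_gpu(total: float, gpus: int) -> float:
    """Split a system-wide total evenly across GPUs (an assumption)."""
    return total / gpus


hbm_per_gpu_gb = per_gpu(NVL4_HBM3E_GB, NVL4_GPUS)
bandwidth_per_gpu_tbps = per_gpu(NVL4_BANDWIDTH_TBPS, NVL4_GPUS)
power_ratio = NVL4_POWER_W / NVL2_POWER_W

print(f"HBM3E per GPU: {hbm_per_gpu_gb:.0f} GB")
print(f"Memory bandwidth per GPU: {bandwidth_per_gpu_tbps:.0f} TB/s")
print(f"NVL4 / NVL2 power: {power_ratio:.1f}x")
```

Under that even-split assumption, each GPU would account for 192 GB of HBM3E and 8 TB/s of bandwidth, and the power ratio comes out at exactly 2x, matching the "effectively doubles" wording.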

Supermicro Delivers Direct-Liquid-Optimized NVIDIA Blackwell Solutions

Supermicro, Inc., a Total IT Solution Provider for AI, Cloud, Storage, and 5G/Edge, is announcing the highest-performing SuperCluster, an end-to-end AI data center solution featuring the NVIDIA Blackwell platform for the era of trillion-parameter-scale generative AI. The new SuperCluster will significantly increase the number of NVIDIA HGX B200 8-GPU systems in a liquid-cooled rack, resulting in a large increase in GPU compute density compared to Supermicro's current industry-leading liquid-cooled NVIDIA HGX H100 and H200-based SuperClusters. In addition, Supermicro is enhancing the portfolio of its NVIDIA Hopper systems to address the rapid adoption of accelerated computing for HPC applications and mainstream enterprise AI.

"Supermicro has the expertise, delivery speed, and capacity to deploy the largest liquid-cooled AI data center projects in the world, containing 100,000 GPUs, which Supermicro and NVIDIA contributed to and recently deployed," said Charles Liang, president and CEO of Supermicro. "These Supermicro SuperClusters reduce power needs due to DLC efficiencies. We now have solutions that use the NVIDIA Blackwell platform. Using our Building Block approach allows us to quickly design servers with NVIDIA HGX B200 8-GPU, which can be either liquid-cooled or air-cooled. Our SuperClusters provide unprecedented density, performance, and efficiency, and pave the way toward even more dense AI computing solutions in the future. The Supermicro clusters use direct liquid cooling, resulting in higher performance, lower power consumption for the entire data center, and reduced operational expenses."

ASRock Rack Brings End-to-End AI and HPC Server Portfolio to SC24

ASRock Rack Inc., a leading innovative server company, today announces its presence at SC24, held at the Georgia World Congress Center in Atlanta from November 18-21. At booth #3609, ASRock Rack will showcase a comprehensive high-performance portfolio of server boards, systems, and rack solutions with NVIDIA accelerated computing platforms, helping address the needs of enterprises, organizations, and data centers.

Artificial intelligence (AI) and high-performance computing (HPC) continue to reshape technology. ASRock Rack is presenting a complete suite of solutions spanning edge, on-premise, and cloud environments, engineered to meet the demand of AI and HPC. The 2U short-depth MECAI, incorporating the NVIDIA GH200 Grace Hopper Superchip, is developed to supercharge accelerated computing and generative AI in space-constrained environments. The 4U10G-TURIN2 and 4UXGM-GNR2, supporting ten and eight NVIDIA H200 NVL PCIe GPUs respectively, are aiming to help enterprises and researchers tackle every AI and HPC challenge with enhanced performance and greater energy efficiency. NVIDIA H200 NVL is ideal for lower-power, air-cooled enterprise rack designs that require flexible configurations, delivering acceleration for AI and HPC workloads regardless of size.

GIGABYTE Showcases a Leading AI and Enterprise Portfolio at Supercomputing 2024

Giga Computing, a subsidiary of GIGABYTE and an industry leader in generative AI servers and advanced cooling technologies, shows off at SC24 how the GIGABYTE enterprise portfolio provides solutions for all applications, from cloud computing to AI to enterprise IT, including energy-efficient liquid-cooling technologies. This portfolio is made more complete by long-term collaborations with leading technology companies and emerging industry leaders, which will be showcased at GIGABYTE booth #3123 at SC24 (Nov. 19-21) in Atlanta. The booth is sectioned to put the spotlight on strategic technology collaborations, as well as direct liquid cooling partners.

The GIGABYTE booth will showcase an array of NVIDIA platforms built to keep up with the diversity of workloads and degrees of demands in applications of AI & HPC hardware. For a rack-scale AI solution using the NVIDIA GB200 NVL72 design, GIGABYTE displays how seventy-two GPUs can be in one rack with eighteen GIGABYTE servers each housing two NVIDIA Grace CPUs and four NVIDIA Blackwell GPUs. Another platform at the GIGABYTE booth is the NVIDIA HGX H200 platform. GIGABYTE exhibits both its liquid-cooling G4L3-SD1 server and an air-cooled version, G593-SD1.
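
The rack composition described above can be checked with a few lines of Python. This is only a sketch of the arithmetic using the counts quoted in the text; the function and constant names are hypothetical.

```python
# Counts quoted in the article for a GB200 NVL72 rack built from GIGABYTE servers.
SERVERS_PER_RACK = 18
GPUS_PER_SERVER = 4   # NVIDIA Blackwell GPUs
CPUS_PER_SERVER = 2   # NVIDIA Grace CPUs


def rack_totals(servers: int, gpus_per_server: int, cpus_per_server: int) -> tuple[int, int]:
    """Return (total GPUs, total CPUs) for one rack."""
    return servers * gpus_per_server, servers * cpus_per_server


total_gpus, total_cpus = rack_totals(SERVERS_PER_RACK, GPUS_PER_SERVER, CPUS_PER_SERVER)
print(f"GPUs per rack: {total_gpus}")
print(f"CPUs per rack: {total_cpus}")
```

Eighteen servers with four GPUs each give the seventy-two GPUs that the NVL72 name refers to, alongside thirty-six Grace CPUs.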

ASUS Presents Next-Gen Infrastructure Solutions With Advanced Cooling Portfolio at SC24

ASUS today announced its next-generation infrastructure solutions at SC24, unveiling an extensive server lineup and advanced cooling solutions, all designed to propel the future of AI. The product showcase will reveal how ASUS is working with NVIDIA and Ubitus/Ubilink to prove the immense computational power of supercomputers, using AI-powered avatar and robot demonstrations that leverage the newly-inaugurated data center. It is Taiwan's largest supercomputing facility, constructed by ASUS, and is also notable for offering flexible green-energy options to customers that desire them. As a total solution provider with a proven track record in pioneering AI supercomputing, ASUS continuously drives maximized value for customers.

To fuel digital transformation in enterprise through high-performance computing (HPC) and AI-driven architecture, ASUS provides a full line-up of server systems—ready for every scenario. ASUS AI POD, a complete rack solution equipped with NVIDIA GB200 NVL72 platform, integrates GPUs, CPUs and switches in seamless, high-speed direct communication, enhancing the training of trillion-parameter LLMs and enabling real-time inference. It features the NVIDIA GB200 Grace Blackwell Superchip and fifth-generation NVIDIA NVLink technology, while offering both liquid-to-air and liquid-to-liquid cooling options to maximize AI computing performance.

CoolIT Announces the World's Highest Density Liquid-to-Liquid Coolant Distribution Unit

CoolIT Systems (CoolIT), the world leader in liquid cooling systems for AI and high-performance computing, introduces the CHx1000, the world's highest-density liquid-to-liquid coolant distribution unit (CDU). Designed for mission-critical applications, the CHx1000 is purpose-built to cool the NVIDIA Blackwell platform and other demanding AI workloads where liquid cooling is now necessary.

"CoolIT created the CHx1000 to provide the high capacity and pressure delivery required to direct liquid cool NVIDIA Blackwell and future generations of high-performance AI accelerators," said Patrick McGinn, CoolIT's COO. "Besides exceptional performance, serviceability and reliability are central to the CHx1000's design. The single rack-sized unit is fully front and back serviceable with hot-swappable critical components. Precision coolant controls and multiple levels of redundancy provide for steady, uninterrupted operation."

NVIDIA "Blackwell" NVL72 Servers Reportedly Require Redesign Amid Overheating Problems

According to The Information, NVIDIA's latest "Blackwell" processors are reportedly encountering significant thermal management issues in high-density server configurations, potentially affecting deployment timelines for major tech companies. The challenges emerge specifically in GB200 NVL72 racks housing 72 Blackwell GPUs, which can consume up to 120 kilowatts of power per rack and weigh a "mere" 3,000 pounds (or about 1.5 tons). These thermal concerns have prompted NVIDIA to revisit and modify its server rack designs multiple times to prevent performance degradation and potential hardware damage. Hyperscalers like Google, Meta, and Microsoft, who rely heavily on NVIDIA GPUs for training their advanced language models, have allegedly expressed concerns about possible delays in their data center deployment schedules.
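
To put the rack-level numbers above in perspective, here is a small Python sketch that spreads the quoted 120 kW over the 72 processors and converts the quoted weight. The per-processor figure is only an average share of rack power (it also covers CPUs, switches, and other components), and all names in the snippet are hypothetical.

```python
# Rack-level figures quoted in the article for a GB200 NVL72 rack.
RACK_POWER_KW = 120
PROCESSORS_PER_RACK = 72
RACK_WEIGHT_LB = 3000

LB_PER_SHORT_TON = 2000   # US short ton
KG_PER_LB = 0.45359237    # exact definition of the avoirdupois pound

# Average share of rack power per processor, not a measured GPU TDP.
power_share_kw = RACK_POWER_KW / PROCESSORS_PER_RACK
weight_short_tons = RACK_WEIGHT_LB / LB_PER_SHORT_TON
weight_kg = RACK_WEIGHT_LB * KG_PER_LB

print(f"Average rack power per processor: {power_share_kw:.2f} kW")
print(f"Rack weight: {weight_short_tons:.1f} short tons ({weight_kg:.0f} kg)")
```

The "about 1.5 tons" figure in the text corresponds to US short tons; in metric terms the same rack is roughly 1,360 kg.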

The thermal management issues follow earlier setbacks related to a design flaw in the Blackwell production process. The problem stemmed from the complex CoWoS-L packaging technology, which connects dual chiplets using an RDL interposer and LSI bridges. Thermal expansion mismatches between various components led to warping issues, requiring modifications to the GPU's metal layers and bump structures. A company spokesperson characterized these modifications as part of the standard development process, noting that a new photomask resolved this issue. The Information states that mass production of the revised Blackwell GPUs began in late October, with shipments expected to commence in late January. However, these timelines are unconfirmed by NVIDIA, and some server makers like Dell confirmed that these GB200 NVL72 liquid-cooled systems are shipping now, not in January, with GPU cloud provider CoreWeave as a customer. The original report could be using older information, as Dell is one of NVIDIA's most significant partners and among the first in the supply chain to gain access to new GPU batches.

NVIDIA B200 "Blackwell" Records 2.2x Performance Improvement Over its "Hopper" Predecessor

We know that NVIDIA's latest "Blackwell" GPUs are fast, but how much faster are they than the previous generation "Hopper"? Thanks to the latest MLPerf Training v4.1 results, NVIDIA's HGX B200 Blackwell platform has demonstrated massive performance gains, measuring up to 2.2x improvement per GPU compared to its HGX H200 Hopper predecessor. The latest results, verified by MLCommons, reveal impressive achievements in large language model (LLM) training. The Blackwell architecture, featuring HBM3e high-bandwidth memory and fifth-generation NVLink interconnect technology, achieved double the performance per GPU for GPT-3 pre-training and a 2.2x boost for Llama 2 70B fine-tuning compared to the previous Hopper generation. Each benchmark system incorporated eight Blackwell GPUs operating at a 1,000 W TDP, connected via NVLink Switch for scale-up.

The network infrastructure utilized NVIDIA ConnectX-7 SuperNICs and Quantum-2 InfiniBand switches, enabling high-speed node-to-node communication for distributed training workloads. While previous Hopper-based systems required 256 GPUs to optimize performance for the GPT-3 175B benchmark, Blackwell accomplished the same task with just 64 GPUs, leveraging its larger HBM3e memory capacity and bandwidth. One thing to look out for is the upcoming GB200 NVL72 system, which promises even more significant gains past the 2.2x. It features expanded NVLink domains, higher memory bandwidth, and tight integration with NVIDIA Grace CPUs, complemented by ConnectX-8 SuperNIC and Quantum-X800 switch technologies. With faster switching and better data movement with Grace-Blackwell integration, we could see even more software optimization from NVIDIA to push the performance envelope.

NVIDIA Ships Over One Billion RISC-V Cores This Year Inside Its Accelerators, Up to 40 Cores Per Chip

During the 2024 RISC-V Summit in Santa Clara, California, NVIDIA was one of the presenting members. RISC-V, being a free and open-source instruction set architecture, is an interesting choice for many companies looking to develop custom solutions. NVIDIA designs accelerators for AI and graphics processing, all of which are equipped with up to tens of thousands of cores. To manage these cores, NVIDIA has developed a custom RISC-V processor called "NV-RISCV," which is a replacement for its predecessor "Falcon." Unlike Falcon, NV-RISCV is based on an open-source ISA and is customized much more deeply, with features like more customized caches and special instructions. Initially, the company reported better performance over its Falcon GPU System Processor (GSP), and NV-RISCV is now running in millions of NVIDIA chips.

Thanks to a post on X by Nick Brown, we learn that NVIDIA is shipping roughly one billion RISC-V cores in the year 2024. Each NVIDIA chip includes between 10 and 40 RISC-V cores, depending on the chip size and complexity. Some more complex designs, like GB200, require massive data coordination, meaning that more cores are needed to handle these requests and distribute them. This includes chip-to-chip interfaces, context switching, memory controller, camera handling, video codecs, display output, resource management, power management, and more. NVIDIA has developed a total of over 20 custom extensions for RISC-V cores, which all serve their specific use cases.
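
The two figures above (roughly one billion cores shipped, 10 to 40 cores per chip) imply a range for the number of chips involved. The Python sketch below works out that range; it assumes the extreme cases where every chip carries either the minimum or the maximum core count, and its names are hypothetical.

```python
# Figures quoted in the article; "roughly one billion" is treated as exact here.
CORES_SHIPPED = 1_000_000_000
MIN_CORES_PER_CHIP = 10
MAX_CORES_PER_CHIP = 40


def implied_chip_range(total_cores: int, min_per_chip: int, max_per_chip: int) -> tuple[int, int]:
    """Return (fewest, most) chips that could account for total_cores."""
    fewest = total_cores // max_per_chip   # every chip has the maximum core count
    most = total_cores // min_per_chip     # every chip has the minimum core count
    return fewest, most


fewest_chips, most_chips = implied_chip_range(
    CORES_SHIPPED, MIN_CORES_PER_CHIP, MAX_CORES_PER_CHIP
)
print(f"Implied chips shipped: {fewest_chips:,} to {most_chips:,}")
```

By this estimate, one billion cores corresponds to somewhere between 25 million and 100 million chips, depending on the mix of simpler and more complex designs.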

Micron SSDs Qualified for Recommended Vendor List on NVIDIA GB200 NVL72

Micron Technology, Inc., today announced that its 9550 PCIe Gen 5 E1.S data center SSDs have been added to the NVIDIA recommended vendor list (RVL) for the NVIDIA GB200 NVL72 system and its derivatives. The GB200 NVL72 uses the GB200 Grace Blackwell Superchip to deliver rack-scale, energy-efficient AI infrastructure. The enablement of PCIe Gen 5 storage in the system makes the Micron 9550 SSD an ideal fit for optimizing performance and power efficiency in AI workloads like large-scale training of AI models, real-time trillion-parameter language model inference and high-performance computing (HPC) tasks.

Micron 9550 delivers world-class AI workload performance and power efficiency:
Compared with other industry offerings, the 9550 SSD delivers up to 34% higher throughput for NVIDIA Magnum IO GPUDirect (GDS) and up to 33% faster workload completion times in graph neural network (GNN) training with Big Accelerator Memory (BaM). The Micron 9550 SSD saves energy and sets new sustainability benchmarks by consuming 81% less SSD energy per 1 TB transferred than other SSD offerings with NVIDIA Magnum IO GDS and up to 43% lower SSD power in GNN training with BaM.

Meta Shows Open-Architecture NVIDIA "Blackwell" GB200 System for Data Center

During the Open Compute Project (OCP) Summit 2024, Meta, one of the prime members of the OCP, showed its NVIDIA "Blackwell" GB200 systems for its massive data centers. We previously covered Microsoft's Azure server rack with GB200 GPUs featuring one-third of the rack space for computing and two-thirds for cooling. A few days later, Google showed off its smaller GB200 system, and today, Meta is showing off its GB200 system, the smallest of the bunch. To train a dense transformer large language model with 405B parameters and a context window of up to 128k tokens, like the Llama 3.1 405B, Meta must redesign its data center infrastructure to run a distributed training job on two 24,000-GPU clusters. That is 48,000 GPUs used for training a single AI model.

Called "Catalina," the system is built on the NVIDIA Blackwell platform, emphasizing modularity and adaptability while incorporating the latest NVIDIA GB200 Grace Blackwell Superchip. To address the escalating power requirements of GPUs, Catalina introduces the Orv3, a high-power rack capable of delivering up to 140 kW. The comprehensive liquid-cooled setup encompasses a power shelf supporting various components, including a compute tray, switch tray, the Orv3 HPR, Wedge 400 fabric switch with 12.8 Tbps switching capacity, management switch, battery backup, and a rack management controller. Interestingly, Meta also upgraded its "Grand Teton" system for internal usage, such as deep learning recommendation models (DLRMs) and content understanding, with AMD Instinct MI300X. Those accelerators are used to run inference on internal models, and MI300X appears to provide the best performance per dollar for inference. According to Meta, the computational demand stemming from AI will continue to increase exponentially, so more NVIDIA and AMD GPUs are needed, and we can't wait to see what the company builds.

SK hynix Showcases Memory Solutions at the 2024 OCP Global Summit

SK hynix is showcasing its leading AI and data center memory products at the 2024 Open Compute Project (OCP) Global Summit held October 15-17 in San Jose, California. The annual summit brings together industry leaders to discuss advancements in open source hardware and data center technologies. This year, the event's theme is "From Ideas to Impact," which aims to foster the realization of theoretical concepts into real-world technologies.

In addition to presenting its advanced memory products at the summit, SK hynix is also strengthening key industry partnerships and sharing its AI memory expertise through insightful presentations. This year, the company is holding eight sessions—up from five in 2023—on topics including HBM and CMS.

MSI Unveils AI Servers Powered by NVIDIA MGX at OCP 2024

MSI, a leading global provider of high-performance server solutions, proudly announced it is showcasing new AI servers powered by the NVIDIA MGX platform—designed to address the increasing demand for scalable, energy-efficient AI workloads in modern data centers—at the OCP Global Summit 2024, booth A6. This collaboration highlights MSI's continued commitment to advancing server solutions, focusing on cutting-edge AI acceleration and high-performance computing (HPC).

The NVIDIA MGX platform offers a flexible architecture that enables MSI to deliver purpose-built solutions optimized for AI, HPC, and LLMs. By leveraging this platform, MSI's AI server solutions provide exceptional scalability, efficiency, and enhanced GPU density—key factors in meeting the growing computational demands of AI workloads. Tapping into MSI's engineering expertise and NVIDIA's advanced AI technologies, these AI servers based on the MGX architecture deliver unparalleled compute power, positioning data centers to maximize performance and power efficiency while paving the way for the future of AI-driven infrastructure.

Lenovo Announces New Liquid Cooled Servers for Intel Xeon and NVIDIA Blackwell Platforms

At Lenovo Tech World 2024, we announced new Supercomputing servers for HPC and AI workloads. These new water-cooled servers use the latest processor and accelerator technology from Intel and NVIDIA.

ThinkSystem SC750 V4
Engineered for large-scale cloud infrastructures and High Performance Computing (HPC), the Lenovo ThinkSystem SC750 V4 Neptune excels in intensive simulations and complex modeling. It's designed to handle technical computing, grid deployments, and analytics workloads in various fields such as research, life sciences, energy, engineering, and financial simulation.

Google Shows Production NVIDIA "Blackwell" GB200 NVL System for Cloud

Last week, we got a preview of Microsoft's Azure production-ready NVIDIA "Blackwell" GB200 system, showing that only a third of the rack that goes in the data center is actually holding the compute elements, with the other two-thirds holding the cooling compartment to cool down the immense heat output from tens of GB200 GPUs. Today, Google is showing off a part of its own infrastructure ahead of the Google Cloud App Dev & Infrastructure Summit, taking place digitally on October 30. Shown below are two racks standing side by side, connecting NVIDIA "Blackwell" GB200 NVL cards with the rest of the Google infrastructure. Unlike Microsoft Azure, Google Cloud uses a different data center design in its facilities.

There is one rack with power distribution units, networking switches, and cooling distribution units, all connected to the compute rack, which houses power supplies, GPUs, and CPU servers. Networking equipment is present, and it connects to Google's "global" data center network, which is Google's own data center fabric. We are not sure which fabric connects these racks; for optimal performance, NVIDIA recommends InfiniBand (which it gained through the Mellanox acquisition). However, given that Google's infrastructure is set up differently, there may be Ethernet switches present. Interestingly, Google's design of GB200 racks differs from Azure's, as it uses additional rack space to distribute the coolant to its local heat exchangers, i.e., coolers. We are curious to see if Google releases more information on infrastructure, as it has been known as the infrastructure king because of its ability to scale and keep everything organized.

Supermicro's Liquid-Cooled SuperClusters for AI Data Centers Powered by NVIDIA GB200 NVL72 and NVIDIA HGX B200 Systems

Supermicro, Inc., a Total IT Solution Provider for AI, Cloud, Storage, and 5G/Edge, is accelerating the industry's transition to liquid-cooled data centers with the NVIDIA Blackwell platform to deliver a new paradigm of energy-efficiency for the rapidly heightened energy demand of new AI infrastructures. Supermicro's industry-leading end-to-end liquid-cooling solutions are powered by the NVIDIA GB200 NVL72 platform for exascale computing in a single rack and have started sampling to select customers for full-scale production in late Q4. In addition, the recently announced Supermicro X14 and H14 4U liquid-cooled systems and 10U air-cooled systems are production-ready for the NVIDIA HGX B200 8-GPU system.

"We're driving the future of sustainable AI computing, and our liquid-cooled AI solutions are rapidly being adopted by some of the most ambitious AI Infrastructure projects in the world with over 2000 liquid-cooled racks shipped since June 2024," said Charles Liang, president and CEO of Supermicro. "Supermicro's end-to-end liquid-cooling solution, with the NVIDIA Blackwell platform, unlocks the computational power, cost-effectiveness, and energy-efficiency of the next generation of GPUs, such as those that are part of the NVIDIA GB200 NVL72, an exascale computer contained in a single rack. Supermicro's extensive experience in deploying liquid-cooled AI infrastructure, along with comprehensive on-site services, management software, and global manufacturing capacity, provides customers a distinct advantage in transforming data centers with the most powerful and sustainable AI solutions."

NVIDIA Contributes Blackwell Platform Design to Open Hardware Ecosystem, Accelerating AI Infrastructure Innovation

To drive the development of open, efficient and scalable data center technologies, NVIDIA today announced that it has contributed foundational elements of its NVIDIA Blackwell accelerated computing platform design to the Open Compute Project (OCP) and broadened NVIDIA Spectrum-X support for OCP standards.

At this year's OCP Global Summit, NVIDIA will be sharing key portions of the NVIDIA GB200 NVL72 system electro-mechanical design with the OCP community — including the rack architecture, compute and switch tray mechanicals, liquid-cooling and thermal environment specifications, and NVIDIA NVLink cable cartridge volumetrics — to support higher compute density and networking bandwidth.

Western Digital Enterprise SSDs Certified to Support NVIDIA GB200 NVL72 System for Compute-Intensive AI Environments

Western Digital Corp. today announced that its PCIe Gen 5 DC SN861 E1.S enterprise-class NVMe SSDs have been certified to support the NVIDIA GB200 NVL72 rack-scale system.

The rapid rise of AI, ML, and large language models (LLMs) is creating a challenge for companies with two opposing forces. Data generation and consumption are accelerating, while organizations face pressure to quickly derive value from this data. Performance, scalability, and efficiency are essential for AI technology stacks as storage demands rise. Certified to be compatible with the GB200 NVL72 system, Western Digital's enterprise SSD addresses the growing needs of the AI market for high-speed accelerated computing combined with low latency to serve compute-intensive AI environments.

NVIDIA "Blackwell" GPUs are Sold Out for 12 Months, Customers Ordering in 100K GPU Quantities

NVIDIA's "Blackwell" series of GPUs, including B100, B200, and GB200, are reportedly sold out for the next 12 months. This means that if a new customer orders a Blackwell GPU now, there is a 12-month waitlist to get that GPU. Morgan Stanley analyst Joe Moore reported that, in a meeting with NVIDIA and its investors, NVIDIA executives confirmed that the demand for "Blackwell" is so great that there is a 12-month backlog to fulfill first before shipping to anyone else. We expect that this includes customers like Amazon, Meta, Microsoft, Google, Oracle, and others, who are ordering GPUs in insane quantities to keep up with the demand from their customers.

The previous generation of "Hopper" GPUs was ordered in tens of thousands of GPUs, while this "Blackwell" generation was ordered in hundreds of thousands of GPUs simultaneously. For NVIDIA, that is excellent news, as that demand is expected to continue. The only bottleneck standing in the way of customers is TSMC, which manufactures these GPUs as fast as possible to meet demand. NVIDIA is one of TSMC's largest customers, so wafer allocation at TSMC's facilities is only expected to grow. We are now officially in the era of the million-GPU data centers, and we can only question at what point this massive growth stops or if it will stop at all in the near future.

NVIDIA Might Consider Major Design Shift for Future 300 GPU Series

NVIDIA is reportedly considering a significant design change for its GPU products, shifting from the current on-board solution to an independent GPU socket design following the GB200 shipment in Q4, according to reports from MoneyDJ and the Economic Daily News quoted by TrendForce. This move is not new in the industry: AMD already introduced a socket design in 2023 with its MI300A series via Supermicro dedicated servers. The B300 series, expected to become NVIDIA's mainstream product in the second half of 2025, is rumored to be the main beneficiary of this design change that could improve yield rates, though it may come with some performance trade-offs.

According to the Economic Daily News, the socket design will simplify after-sales service and server board maintenance, allowing users to replace or upgrade the GPUs quickly. The report also pointed out that based on the slot design, boards will contain up to four NVIDIA GPUs and a CPU, with each GPU having its dedicated slot. This will bring benefits for Taiwanese manufacturers like Foxconn and LOTES, who will supply different components and connectors. The move seems logical since with the current on-board design, once a GPU becomes faulty, the entire motherboard needs to be replaced, leading to significant downtime and high operational and maintenance costs.

NVIDIA "Blackwell" GB200 Server Dedicates Two-Thirds of Space to Cooling at Microsoft Azure

Late Tuesday, Microsoft Azure shared an interesting picture on the social media platform X, showcasing the pinnacle of GPU-accelerated servers: NVIDIA "Blackwell" GB200-powered AI systems. Microsoft is one of NVIDIA's largest customers, and the company often receives products first to integrate into its cloud and company infrastructure. Even NVIDIA listens to feedback from companies like Microsoft about designing future products, especially those like the now-canceled NVL36x2 system. The picture below shows a massive cluster that roughly divides the compute area into one-third of the entire system, with a gigantic two-thirds of the system dedicated to closed-loop liquid cooling.

The entire system is connected using InfiniBand networking, a standard for GPU-accelerated systems due to its lower latency in packet transfer. While the details of the system are scarce, we can see that the integrated closed-loop liquid cooling allows the GPU racks to be in a 1U form factor for increased density. Given that these systems will go into the wider Microsoft Azure data centers, a system needs to be easily maintained and cooled. There are indeed limits in power and heat output that Microsoft's data centers can handle, so these types of systems often fit inside internal specifications that Microsoft designs. There are more compute-dense systems, of course, like NVIDIA's NVL72, but hyperscalers usually opt for other custom solutions that fit into their data center specifications. Finally, Microsoft noted that we can expect to see more details at the upcoming Microsoft Ignite conference in November and learn more about its GB200-powered AI systems.

Foxconn to Build Taiwan's Fastest AI Supercomputer With NVIDIA Blackwell

NVIDIA and Foxconn are building Taiwan's largest supercomputer, marking a milestone in the island's AI advancement. The project, Hon Hai Kaohsiung Super Computing Center, revealed Tuesday at Hon Hai Tech Day, will be built around NVIDIA's groundbreaking Blackwell architecture and feature the GB200 NVL72 platform, with a total of 64 racks and 4,608 Tensor Core GPUs. With expected AI performance of over 90 exaflops, the machine would easily be considered the fastest in Taiwan.
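
The GPU count quoted above follows directly from the rack count, and the headline performance figure can be broken down per GPU. The Python sketch below shows that arithmetic; it assumes 72 GPUs per NVL72 rack and treats "over 90 exaflops" as exactly 90, and the names used are hypothetical.

```python
# Figures quoted in the article for the Hon Hai Kaohsiung Super Computing Center.
RACKS = 64
GPUS_PER_NVL72_RACK = 72       # assumed from the NVL72 designation
AI_PERFORMANCE_EXAFLOPS = 90   # article says "over 90", so this is a lower bound

PETAFLOPS_PER_EXAFLOP = 1000

total_gpus = RACKS * GPUS_PER_NVL72_RACK
petaflops_per_gpu = AI_PERFORMANCE_EXAFLOPS * PETAFLOPS_PER_EXAFLOP / total_gpus

print(f"Total GPUs: {total_gpus:,}")
print(f"Implied AI performance per GPU: {petaflops_per_gpu:.2f} petaflops")
```

Sixty-four racks of 72 GPUs give the 4,608 GPUs stated in the text, and the quoted total works out to roughly 19.5 petaflops of AI performance per GPU.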

Foxconn plans to use the supercomputer, once operational, to power breakthroughs in cancer research, large language model development and smart city innovations, positioning Taiwan as a global leader in AI-driven industries. Foxconn's "three-platform strategy" focuses on smart manufacturing, smart cities and electric vehicles. The new supercomputer will play a pivotal role in supporting Foxconn's ongoing efforts in digital twins, robotic automation and smart urban infrastructure, bringing AI-assisted services to urban areas like Kaohsiung.

ASUS Presents Comprehensive AI Server Lineup

ASUS today announced its ambitious All in AI initiative, marking a significant leap into the server market with a complete AI infrastructure solution, designed to meet the evolving demands of AI-driven applications from edge, inference and generative AI to the new, unparalleled wave of AI supercomputing. ASUS has proven its expertise lies in striking the perfect balance between hardware and software, including infrastructure and cluster architecture design, server installation, testing, onboarding, remote management and cloud services - positioning the ASUS brand and AI server solutions to lead the way in driving innovation and enabling the widespread adoption of AI across industries.

Meeting diverse AI needs
In partnership with NVIDIA, Intel and AMD, ASUS offers comprehensive AI-infrastructure solutions with robust software platforms and services, from entry-level AI servers and machine-learning solutions to full racks and data centers for large-scale supercomputing. At the forefront is the ESC AI POD with NVIDIA GB200 NVL72, a cutting-edge rack designed to accelerate trillion-token LLM training and real-time inference operations. Complemented by the latest NVIDIA Blackwell GPUs, NVIDIA Grace CPUs and 5th Gen NVIDIA NVLink technology, ASUS servers ensure unparalleled computing power and efficiency.
Dec 21st, 2024 20:21 EST