Monday, March 18th 2024

NVIDIA Blackwell Platform Arrives to Power a New Era of Computing

Powering a new era of computing, NVIDIA today announced that the NVIDIA Blackwell platform has arrived—enabling organizations everywhere to build and run real-time generative AI on trillion-parameter large language models at up to 25x less cost and energy consumption than its predecessor.

The Blackwell GPU architecture features six transformative technologies for accelerated computing, which will help unlock breakthroughs in data processing, engineering simulation, electronic design automation, computer-aided drug design, quantum computing and generative AI—all emerging industry opportunities for NVIDIA.
"For three decades we've pursued accelerated computing, with the goal of enabling transformative breakthroughs like deep learning and AI," said Jensen Huang, founder and CEO of NVIDIA. "Generative AI is the defining technology of our time. Blackwell is the engine to power this new industrial revolution. Working with the most dynamic companies in the world, we will realize the promise of AI for every industry."

Among the many organizations expected to adopt Blackwell are Amazon Web Services, Dell Technologies, Google, Meta, Microsoft, OpenAI, Oracle, Tesla and xAI.

Sundar Pichai, CEO of Alphabet and Google: "Scaling services like Search and Gmail to billions of users has taught us a lot about managing compute infrastructure. As we enter the AI platform shift, we continue to invest deeply in infrastructure for our own products and services, and for our Cloud customers. We are fortunate to have a longstanding partnership with NVIDIA, and look forward to bringing the breakthrough capabilities of the Blackwell GPU to our Cloud customers and teams across Google, including Google DeepMind, to accelerate future discoveries."

Andy Jassy, president and CEO of Amazon: "Our deep collaboration with NVIDIA goes back more than 13 years, when we launched the world's first GPU cloud instance on AWS. Today we offer the widest range of GPU solutions available anywhere in the cloud, supporting the world's most technologically advanced accelerated workloads. It's why the new NVIDIA Blackwell GPU will run so well on AWS and the reason that NVIDIA chose AWS to co-develop Project Ceiba, combining NVIDIA's next-generation Grace Blackwell Superchips with the AWS Nitro System's advanced virtualization and ultra-fast Elastic Fabric Adapter networking, for NVIDIA's own AI research and development. Through this joint effort between AWS and NVIDIA engineers, we're continuing to innovate together to make AWS the best place for anyone to run NVIDIA GPUs in the cloud."

Michael Dell, founder and CEO of Dell Technologies: "Generative AI is critical to creating smarter, more reliable and efficient systems. Dell Technologies and NVIDIA are working together to shape the future of technology. With the launch of Blackwell, we will continue to deliver the next-generation of accelerated products and services to our customers, providing them with the tools they need to drive innovation across industries."

Demis Hassabis, cofounder and CEO of Google DeepMind: "The transformative potential of AI is incredible, and it will help us solve some of the world's most important scientific problems. Blackwell's breakthrough technological capabilities will provide the critical compute needed to help the world's brightest minds chart new scientific discoveries."

Mark Zuckerberg, founder and CEO of Meta: "AI already powers everything from our large language models to our content recommendations, ads, and safety systems, and it's only going to get more important in the future. We're looking forward to using NVIDIA's Blackwell to help train our open-source Llama models and build the next generation of Meta AI and consumer products."

Satya Nadella, executive chairman and CEO of Microsoft: "We are committed to offering our customers the most advanced infrastructure to power their AI workloads. By bringing the GB200 Grace Blackwell processor to our datacenters globally, we are building on our long-standing history of optimizing NVIDIA GPUs for our cloud, as we make the promise of AI real for organizations everywhere."

Sam Altman, CEO of OpenAI: "Blackwell offers massive performance leaps, and will accelerate our ability to deliver leading-edge models. We're excited to continue working with NVIDIA to enhance AI compute."

Larry Ellison, chairman and CTO of Oracle: "Oracle's close collaboration with NVIDIA will enable qualitative and quantitative breakthroughs in AI, machine learning and data analytics. In order for customers to uncover more actionable insights, an even more powerful engine like Blackwell is needed, which is purpose-built for accelerated computing and generative AI."

Elon Musk, CEO of Tesla and xAI: "There is currently nothing better than NVIDIA hardware for AI."

Named in honor of David Harold Blackwell—a mathematician who specialized in game theory and statistics, and the first Black scholar inducted into the National Academy of Sciences—the new architecture succeeds the NVIDIA Hopper architecture, launched two years ago.

Blackwell Innovations to Fuel Accelerated Computing and Generative AI
Blackwell's six revolutionary technologies, which together enable AI training and real-time LLM inference for models scaling up to 10 trillion parameters, include:
  • World's Most Powerful Chip—Packed with 208 billion transistors, Blackwell-architecture GPUs are manufactured using a custom-built 4NP TSMC process with two-reticle limit GPU dies connected by 10 TB/second chip-to-chip link into a single, unified GPU.
  • Second-Generation Transformer Engine—Fueled by new micro-tensor scaling support and NVIDIA's advanced dynamic range management algorithms integrated into NVIDIA TensorRT-LLM and NeMo Megatron frameworks, Blackwell will support double the compute and model sizes with new 4-bit floating point AI inference capabilities.
  • Fifth-Generation NVLink—To accelerate performance for multitrillion-parameter and mixture-of-experts AI models, the latest iteration of NVIDIA NVLink delivers groundbreaking 1.8 TB/s bidirectional throughput per GPU, ensuring seamless high-speed communication among up to 576 GPUs for the most complex LLMs.
  • RAS Engine—Blackwell-powered GPUs include a dedicated engine for reliability, availability and serviceability. Additionally, the Blackwell architecture adds capabilities at the chip level to utilize AI-based preventative maintenance to run diagnostics and forecast reliability issues. This maximizes system uptime and improves resiliency for massive-scale AI deployments to run uninterrupted for weeks or even months at a time and to reduce operating costs.
  • Secure AI—Advanced confidential computing capabilities protect AI models and customer data without compromising performance, with support for new native interface encryption protocols, which are critical for privacy-sensitive industries like healthcare and financial services.
  • Decompression Engine—A dedicated decompression engine supports the latest formats, accelerating database queries to deliver the highest performance in data analytics and data science. In the coming years, data processing, on which companies spend tens of billions of dollars annually, will be increasingly GPU-accelerated.
A Massive Superchip
The NVIDIA GB200 Grace Blackwell Superchip connects two NVIDIA B200 Tensor Core GPUs to the NVIDIA Grace CPU over a 900 GB/s ultra-low-power NVLink chip-to-chip interconnect.

For the highest AI performance, GB200-powered systems can be connected with the NVIDIA Quantum-X800 InfiniBand and Spectrum-X800 Ethernet platforms, also announced today, which deliver advanced networking at speeds up to 800 Gb/s.

The GB200 is a key component of the NVIDIA GB200 NVL72, a multi-node, liquid-cooled, rack-scale system for the most compute-intensive workloads. It combines 36 Grace Blackwell Superchips, which include 72 Blackwell GPUs and 36 Grace CPUs interconnected by fifth-generation NVLink. Additionally, GB200 NVL72 includes NVIDIA BlueField-3 data processing units to enable cloud network acceleration, composable storage, zero-trust security and GPU compute elasticity in hyperscale AI clouds. The GB200 NVL72 provides up to a 30x performance increase compared to the same number of NVIDIA H100 Tensor Core GPUs for LLM inference workloads, and reduces cost and energy consumption by up to 25x.

The platform acts as a single GPU with 1.4 exaflops of AI performance and 30 TB of fast memory, and is a building block for the newest DGX SuperPOD.
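The rack-level figures quoted above can be cross-checked with simple arithmetic. The sketch below is illustrative only: it uses the numbers stated in this announcement (36 superchips, two GPUs and one CPU per superchip, 1.8 TB/s NVLink per GPU, 1.4 exaflops per rack). The function name `nvl72_summary`, the naive per-GPU summation of NVLink bandwidth, and the even per-GPU split of rack performance are assumptions made for illustration, not NVIDIA specifications.

```python
# Figures taken from the announcement text; everything derived from them
# below is a back-of-the-envelope estimate, not an official specification.
SUPERCHIPS_PER_RACK = 36
GPUS_PER_SUPERCHIP = 2       # two B200 GPUs per GB200 Superchip
CPUS_PER_SUPERCHIP = 1       # one Grace CPU per GB200 Superchip
NVLINK_TBPS_PER_GPU = 1.8    # bidirectional throughput per GPU
RACK_AI_EXAFLOPS = 1.4       # quoted AI performance of one GB200 NVL72


def nvl72_summary():
    """Return derived rack-level numbers (hypothetical helper)."""
    gpus = SUPERCHIPS_PER_RACK * GPUS_PER_SUPERCHIP
    cpus = SUPERCHIPS_PER_RACK * CPUS_PER_SUPERCHIP
    # Assumption: aggregate bandwidth is the plain sum over all GPUs.
    aggregate_nvlink_tbps = gpus * NVLINK_TBPS_PER_GPU
    # Assumption: rack performance divides evenly across GPUs.
    petaflops_per_gpu = RACK_AI_EXAFLOPS * 1000 / gpus
    return {
        "gpus": gpus,
        "cpus": cpus,
        "aggregate_nvlink_tbps": aggregate_nvlink_tbps,
        "petaflops_per_gpu": petaflops_per_gpu,
    }


if __name__ == "__main__":
    for key, value in nvl72_summary().items():
        print(f"{key}: {value}")
```

Under these assumptions the GPU and CPU counts match the 72 and 36 stated above, the summed NVLink throughput comes to roughly 130 TB/s, and the even split works out to a little over 19 petaflops per GPU.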

NVIDIA offers the HGX B200, a server board that links eight B200 GPUs through NVLink to support x86-based generative AI platforms. HGX B200 supports networking speeds up to 400 Gb/s through the NVIDIA Quantum-2 InfiniBand and Spectrum-X Ethernet networking platforms.

Global Network of Blackwell Partners
Blackwell-based products will be available from partners starting later this year.

AWS, Google Cloud, Microsoft Azure and Oracle Cloud Infrastructure will be among the first cloud service providers to offer Blackwell-powered instances, as will NVIDIA Cloud Partner program companies Applied Digital, CoreWeave, Crusoe, IBM Cloud and Lambda. Sovereign AI clouds will also provide Blackwell-based cloud services and infrastructure, including Indosat Ooredoo Hutchinson, Nebius, Nexgen Cloud, Oracle EU Sovereign Cloud, the Oracle US, UK, and Australian Government Clouds, Scaleway, Singtel, Northern Data Group's Taiga Cloud, Yotta Data Services' Shakti Cloud and YTL Power International.

GB200 will also be available on NVIDIA DGX Cloud, an AI platform co-engineered with leading cloud service providers that gives enterprise developers dedicated access to the infrastructure and software needed to build and deploy advanced generative AI models. AWS, Google Cloud and Oracle Cloud Infrastructure plan to host new NVIDIA Grace Blackwell-based instances later this year.

Cisco, Dell, Hewlett Packard Enterprise, Lenovo and Supermicro are expected to deliver a wide range of servers based on Blackwell products, as are Aivres, ASRock Rack, ASUS, Eviden, Foxconn, GIGABYTE, Inventec, Pegatron, QCT, Wistron, Wiwynn and ZT Systems.

Additionally, a growing network of software makers, including Ansys, Cadence and Synopsys—global leaders in engineering simulation—will use Blackwell-based processors to accelerate their software for designing and simulating electrical, mechanical and manufacturing systems and parts. Their customers can use generative AI and accelerated computing to bring products to market faster, at lower cost and with higher energy efficiency.

NVIDIA Software Support
The Blackwell product portfolio is supported by NVIDIA AI Enterprise, the end-to-end operating system for production-grade AI. NVIDIA AI Enterprise includes NVIDIA NIM inference microservices—also announced today—as well as AI frameworks, libraries and tools that enterprises can deploy on NVIDIA-accelerated clouds, data centers and workstations.
Source: NVIDIA

20 Comments on NVIDIA Blackwell Platform Arrives to Power a New Era of Computing

#1
Readlight
100 Billion transistor gaming card.
#2
Philaphlous
The scale of increase over what we've had in just a couple of years is staggering....

If Elon says it's the best... I believe it
#3
RUSerious
Readlight wrote:
> 100 Billion transistor gaming card.
Hopefully, for at least the 5090. Still, nothing about consumer gfx cards yet.

Also, I hope AMD's 400 series AI/ML GPUs will be available by the end of this year - for their sake.
#4
xrli
The performance/transistor improvement of B100 over H100 is surprisingly not a lot. I guess they are taking smaller steps to get familiar with high-speed die-to-die interconnect. Each half of the die is also only working with a 4096-bit memory bus rather than the 5120-bit of H100... The actual selling point is the higher NVLink speed and the FP4 support. Together with NVSwitch, you will have hundreds of TBs of data running around behind a rack per second, absolutely insane!
#5
FoulOnWhite
Readlight wrote:
> 100 Billion transistor gaming card.
Have a custom $25,000 video card built :p /s

Lots of big corps salivating over this and putting gold trim on Jensen's Bugatti vey
#6
sLowEnd
Looks like Nvidia is on the chiplet train now too


According to Anandtech, it's because of reticle size limits
The first thing to note is that the Blackwell GPU is going to be big. Literally. The B200 modules that it will go into will feature two GPU dies on a single package. That’s right, NVIDIA has finally gone chiplet with their flagship accelerator. While they are not disclosing the size of the individual dies, we’re told that they are “reticle-sized” dies, which should put them somewhere over 800 mm² each. The GH100 die itself was already approaching TSMC’s 4nm reticle limits, so there’s very little room for NVIDIA to grow here – at least without staying within a single die
www.anandtech.com/show/21310/nvidia-blackwell-architecture-and-b200b100-accelerators-announced-going-bigger-with-smaller-data
#7
Denver
The data Nvidia shows in the slides is skewed to self-aggrandizement, showing performance in sparsity operations only. The performance gain per GPU was only 13-14% versus the H100 in normal operations.

#8
Fourstaff
Can't wait to see benchmarks. On one hand it's running on the same node as its predecessors; on the other hand Nvidia has done a lot of optimisation to match the workload. They are sacrificing FP64 perf to increase INT8 perf.
#9
Wirko
sLowEnd wrote:
> According to Anandtech, it's because of reticle size limits
Sure. Reticle size is 26 x 33 mm and has been a limiting factor in chip design since long before chiplets. It would probably be possible to stitch EUV and DUV exposures and make larger dies (with lower yields), but damn Cerebras, they hold all the patents!
#10
Minus Infinity
RUSerious wrote:
> Hopefully, for at least the 5090. Still, nothing about consumer gfx cards yet.
>
> Also, I hope AMD's 400 series AI/ML GPUs will be available by the end of this year - for their sake.
desktop isn't coming for 9-12 months and it's 5090 first.
#11
N/A
Readlight wrote:
> 100 Billion transistor gaming card.
only 33% more than 4090? that's a bit underwhelming. and 5090 is on the N3 node with +60% density.
#12
AnotherReader
Denver wrote:
> The data Nvidia shows in the slides is skewed to self-aggrandizement, showing performance in sparsity operations only. The performance gain per GPU was only 13-14% versus the H100 in normal operations.

That's because the process node used to manufacture these is essentially the same as Hopper.
#13
Crackong
Doesn't translate into gaming performance, since our games run on general compute, not AI Tensor cores.
#14
Bwaze
Impressive.

A long article about next gen GPU technology, and "gaming" is mentioned exactly once, and even that is not PC gaming related:

"Named in honor of David Harold Blackwell—a mathematician who specialized in game theory and statistics"

Will there even be a gaming GPU level of products?

:p
#15
Denver
N/A wrote:
> only 33% more than 4090? that's a bit underwhelming. and 5090 is on the N3 node with +60% density.
What... Where does this information come from? 60% density gain in what, an arm chip with almost no cache ? Even these only achieve a 42% improvement in density. :')
#16
Onasi
Bwaze wrote:
> Impressive.
>
> A long article about next gen GPU technology, and "gaming" is mentioned exactly once, and even that is not PC gaming related:
>
> "Named in honor of David Harold Blackwell—a mathematician who specialized in game theory and statistics"
>
> Will there even be a gaming GPU level of products?
>
> :p
HPC hardware announced at an HPC focused event for enterprise datacenter clients doesn’t mention gaming in its press materials? You don’t say.
I have a sneaking suspicion that people just see NVidia and GPU in one sentence and it triggers some primal response to talk about gaming when it’s literally irrelevant here.
Don’t worry, gaming GPUs will come, probably around the end of the year, as was expected and as usually the case with NV. September or October is my prediction.
#17
Bwaze
Onasi wrote:
> HPC hardware announced at an HPC focused event for enterprise datacenter clients doesn’t mention gaming in its press materials? You don’t say.
> I have a sneaking suspicion that people just see NVidia and GPU in one sentence and it triggers some primal response to talk about gaming when it’s literally irrelevant here.
> Don’t worry, gaming GPUs will come, probably around the end of the year, as was expected and as usually the case with NV. September or October is my prediction.
GTC = GPU Technology Conference.

In the 2022 Nvidia GTC they talked about what Ada Technology will bring to Gaming, although it was always primarily Data Center oriented.

But yeah, at that time Data Center and Gaming had about the same revenue. Now? Data Center 18.4 billion dollars, Gaming 2.9 billion dollars per quarter.

And even that Gaming figure is highly doubtful, considering that cards remain on shelves for months.
#18
Onasi
Bwaze wrote:
> GTC = GPU Technology Conference.
>
> In the 2022 Nvidia GTC they talked about what Ada Technology will bring to Gaming, although it was always primarily Data Center oriented.
>
> But yeah, at that time Data Center and Gaming had about the same revenue. Now? Data Center 18.4 billion dollars, Gaming 2.9 billion dollars per quarter.
>
> And even that Gaming figure is highly doubtful, considering that cards remain on shelves for months.
GTC hasn’t been solely about gaming since its inception, if you actually look at the topics presented. Even the talks focused on 3D rendering are usually broad and not fully gaming centered. And every time there is a Spring session it is HPC focused exclusively. 2022? You mean the Autumn one, probably. I am literally right now looking at the list of sessions from Spring and I see exactly nothing major that would be gaming related. In Autumn though? Yeah, as I said, September-October is when they will actually talk about the consumer cards. Again, this is NV as usual, nothing new - consumer cards are an end-of-year thing, expecting them to talk gaming in Spring is useless, they don’t do that typically, not for many years.
#19
ThrashZone
Hi,
The only power AI has is the power of spam delivery.
#20
RUSerious
Minus Infinity wrote:
> desktop isn't coming for 9-12 months and it's 5090 first.
That's longer than I would have guessed. But, AI/ML Blackwell is the priority and competition from AMD won't be much of a factor with RDNA4. Not important, but it won't affect me - I'll probably be buying a RTX 50X0 toward the end of its cycle.