
Google Announces A3 Supercomputers with NVIDIA H100 GPUs, Purpose-built for AI

GFreeman

News Editor
Staff member
Implementing state-of-the-art artificial intelligence (AI) and machine learning (ML) models requires large amounts of computation, both to train the underlying models, and to serve those models once they're trained. Given the demands of these workloads, a one-size-fits-all approach is not enough - you need infrastructure that's purpose-built for AI.

Together with our partners, we offer a wide range of compute options for ML use cases such as large language models (LLMs), generative AI, and diffusion models. Recently, we announced G2 VMs, becoming the first cloud to offer the new NVIDIA L4 Tensor Core GPUs for serving generative AI workloads. Today, we're expanding that portfolio with the private preview launch of the next-generation A3 GPU supercomputer. Google Cloud now offers a complete range of GPU options for training and inference of ML models.



Google Compute Engine A3 supercomputers are purpose-built to train and serve the most demanding AI models that power today's generative AI and large language model innovation. Our A3 VMs combine NVIDIA H100 Tensor Core GPUs and Google's leading networking advancements to serve customers of all sizes:

  • A3 is the first GPU instance to use our custom-designed 200 Gbps IPUs, with GPU-to-GPU data transfers bypassing the CPU host and flowing over separate interfaces from other VM networks and data traffic. This enables up to 10x more network bandwidth compared to our A2 VMs, with low tail latencies and high bandwidth stability.
  • Our industry-unique intelligent Jupiter data center networking fabric scales to tens of thousands of highly interconnected GPUs and allows for full-bandwidth reconfigurable optical links that can adjust the topology on demand. For almost every workload structure, we achieve workload bandwidth that is indistinguishable from more expensive off-the-shelf non-blocking network fabrics, resulting in a lower TCO.
  • The A3 supercomputer's scale provides up to 26 exaFlops of AI performance, which considerably improves the time and costs for training large ML models.

As companies transition from training to serving their ML models, A3 VMs are also a strong fit for inference workloads, delivering up to a 30x inference performance boost compared to our A2 VMs, which are powered by the NVIDIA A100 Tensor Core GPU*.

Purpose-built for performance and scale
A3 GPU VMs were purpose-built to deliver the highest-performance training for today's ML workloads, complete with modern CPU, improved host memory, next-generation NVIDIA GPUs and major network upgrades. Here are the key features of the A3:

  • 8 H100 GPUs utilizing NVIDIA's Hopper architecture, delivering 3x the compute throughput of the previous-generation A100
  • 3.6 TB/s bisection bandwidth between A3's 8 GPUs via NVIDIA NVSwitch and NVLink 4.0
  • Next-generation 4th Gen Intel Xeon Scalable processors
  • 2 TB of host memory via 4800 MHz DDR5 DIMMs
  • 10x greater networking bandwidth powered by our hardware-enabled IPUs, specialized inter-server GPU communication stack and NCCL optimizations
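The 3.6 TB/s bisection figure in the list above can be sanity-checked with quick arithmetic. This sketch assumes NVIDIA's published per-GPU figure of 900 GB/s total NVLink 4.0 bandwidth for the H100, which is not stated in the article itself:

```python
# Sanity check of the 3.6 TB/s bisection-bandwidth figure for an A3 VM.
# Assumption (not from the article): each H100 exposes 900 GB/s of total
# NVLink 4.0 bandwidth, NVIDIA's published per-GPU figure.

NVLINK_BW_PER_GPU_GBPS = 900  # GB/s per H100 over NVLink 4.0 (assumed)
GPUS_PER_VM = 8

# Bisection bandwidth: cut the 8-GPU fabric into two halves of 4 GPUs.
# Through a non-blocking NVSwitch fabric, each GPU in one half can drive
# its full NVLink bandwidth across the cut.
bisection_gbps = (GPUS_PER_VM // 2) * NVLINK_BW_PER_GPU_GBPS
print(f"{bisection_gbps / 1000:.1f} TB/s")  # prints "3.6 TB/s"
```

Four GPUs on each side of the cut, each contributing 900 GB/s, yields the quoted 3.6 TB/s.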

A3 GPU VMs are a step forward for customers developing the most advanced ML models. By considerably speeding up the training and inference of ML models, A3 VMs enable businesses to train more complex models quickly, creating an opportunity for our customers to build large language models (LLMs), generative AI, and diffusion models that help optimize operations and stay ahead of the competition.

This announcement builds on our partnership with NVIDIA to offer a full range of GPU options for training and inference of ML models to our customers.

"Google Cloud's A3 VMs, powered by next-generation NVIDIA H100 GPUs, will accelerate training and serving of generative AI applications," said Ian Buck, vice president of hyperscale and high performance computing at NVIDIA. "On the heels of Google Cloud's recently launched G2 instances, we're proud to continue our work with Google Cloud to help transform enterprises around the world with purpose-built AI infrastructure."

Fully-managed AI infrastructure optimized for performance and cost
Customers looking to develop complex ML models without maintenance overhead can deploy A3 VMs on Vertex AI, an end-to-end platform for building ML models on fully-managed infrastructure that's purpose-built for low-latency serving and high-performance training. Today, at Google I/O 2023, we're pleased to build on these offerings by opening generative AI support in Vertex AI to more customers and by introducing new features and foundation models.

Customers looking to architect their own custom software stack can also deploy A3 VMs on Google Kubernetes Engine (GKE) and Compute Engine, training and serving the latest foundation models while enjoying support for autoscaling, workload orchestration, and automatic upgrades.

"Google Cloud's A3 VM instances provide us with the computational power and scale for our most demanding training and inference workloads. We're looking forward to taking advantage of their expertise in the AI space and leadership in large-scale infrastructure to deliver a strong platform for our ML workloads." -Noam Shazeer, CEO, Character.AI

At Google Cloud, AI is in our DNA. We've applied decades of experience running global scale computing for AI. We designed that infrastructure to scale and be optimized for running a wide variety of AI workloads - and now, we're making it available to you.

View at TechPowerUp Main Site | Source
 
Nvidia FTW!
 
Good to see so much time and money sunk into an entirely unproductive application.
This "unproductive application" will replace around 10,000 to 20,000 workplaces in the next year.

That's how it works.
 
Translate game language with correct voices, for all world languages.
 