
Google Announces A3 Supercomputers with NVIDIA H100 GPUs, Purpose-built for AI

GFreeman

News Editor
Staff member
Implementing state-of-the-art artificial intelligence (AI) and machine learning (ML) models requires large amounts of computation, both to train the underlying models, and to serve those models once they're trained. Given the demands of these workloads, a one-size-fits-all approach is not enough - you need infrastructure that's purpose-built for AI.

Together with our partners, we offer a wide range of compute options for ML use cases such as large language models (LLMs), generative AI, and diffusion models. Recently, we announced G2 VMs, becoming the first cloud to offer the new NVIDIA L4 Tensor Core GPUs for serving generative AI workloads. Today, we're expanding that portfolio with the private preview launch of the next-generation A3 GPU supercomputer. Google Cloud now offers a complete range of GPU options for training and inference of ML models.



Google Compute Engine A3 supercomputers are purpose-built to train and serve the most demanding AI models that power today's generative AI and large language model innovation. Our A3 VMs combine NVIDIA H100 Tensor Core GPUs and Google's leading networking advancements to serve customers of all sizes:

  • A3 is the first GPU instance to use our custom-designed 200 Gbps IPUs, with GPU-to-GPU data transfers bypassing the CPU host and flowing over separate interfaces from other VM networks and data traffic. This enables up to 10x more network bandwidth compared to our A2 VMs, with low tail latencies and high bandwidth stability.
  • Our industry-unique intelligent Jupiter data center networking fabric scales to tens of thousands of highly interconnected GPUs and allows for full-bandwidth reconfigurable optical links that can adjust the topology on demand. For almost every workload structure, we achieve workload bandwidth that is indistinguishable from more expensive off-the-shelf non-blocking network fabrics, resulting in a lower TCO.
  • The A3 supercomputer's scale provides up to 26 exaFlops of AI performance, which considerably improves the time and costs for training large ML models.

As companies transition from training to serving their ML models, A3 VMs are also a strong fit for inference workloads, delivering up to a 30x inference performance boost compared to our A2 VMs, which are powered by the NVIDIA A100 Tensor Core GPU*.

Purpose-built for performance and scale
A3 GPU VMs were purpose-built to deliver the highest-performance training for today's ML workloads, complete with modern CPU, improved host memory, next-generation NVIDIA GPUs and major network upgrades. Here are the key features of the A3:

  • 8 H100 GPUs utilizing NVIDIA's Hopper architecture, delivering 3x the compute throughput of the previous-generation A100
  • 3.6 TB/s bisection bandwidth between A3's 8 GPUs via NVIDIA NVSwitch and NVLink 4.0
  • Next-generation 4th Gen Intel Xeon Scalable processors
  • 2 TB of host memory via 4800 MHz DDR5 DIMMs
  • 10x greater networking bandwidth powered by our hardware-enabled IPUs, specialized inter-server GPU communication stack and NCCL optimizations
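The 3.6 TB/s bisection figure in the list above can be sanity-checked with quick arithmetic. This sketch assumes NVIDIA's published per-GPU figure of 900 GB/s total NVLink 4.0 bandwidth for the H100, which is not stated in the article itself:

```python
# Sanity check of the 3.6 TB/s bisection-bandwidth figure for an A3 VM.
# Assumption (not from the article): each H100 exposes 900 GB/s of total
# NVLink 4.0 bandwidth, NVIDIA's published per-GPU figure.

NVLINK_BW_PER_GPU_GBPS = 900  # GB/s per H100 over NVLink 4.0 (assumed)
GPUS_PER_VM = 8

# Bisection bandwidth: cut the 8-GPU fabric into two halves of 4 GPUs.
# Through a non-blocking NVSwitch fabric, each GPU in one half can drive
# its full NVLink bandwidth across the cut.
bisection_gbps = (GPUS_PER_VM // 2) * NVLINK_BW_PER_GPU_GBPS
print(f"{bisection_gbps / 1000:.1f} TB/s")  # prints "3.6 TB/s"
```

Four GPUs on each side of the cut, each contributing 900 GB/s, yields the quoted 3.6 TB/s.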

A3 GPU VMs are a step forward for customers developing the most advanced ML models. By considerably speeding up the training and inference of ML models, A3 VMs enable businesses to train more complex models quickly, creating an opportunity for our customers to build large language models (LLMs), generative AI, and diffusion models that help optimize operations and stay ahead of the competition.

This announcement builds on our partnership with NVIDIA to offer a full range of GPU options for training and inference of ML models to our customers.

"Google Cloud's A3 VMs, powered by next-generation NVIDIA H100 GPUs, will accelerate training and serving of generative AI applications," said Ian Buck, vice president of hyperscale and high performance computing at NVIDIA. "On the heels of Google Cloud's recently launched G2 instances, we're proud to continue our work with Google Cloud to help transform enterprises around the world with purpose-built AI infrastructure."

Fully-managed AI infrastructure optimized for performance and cost
Customers looking to develop complex ML models without maintenance overhead can deploy A3 VMs on Vertex AI, an end-to-end platform for building ML models on fully-managed infrastructure that's purpose-built for low-latency serving and high-performance training. Today, at Google I/O 2023, we're pleased to build on these offerings by opening generative AI support in Vertex AI to more customers and by introducing new features and foundation models.

Customers looking to architect their own custom software stack can also deploy A3 VMs on Google Kubernetes Engine (GKE) and Compute Engine, training and serving the latest foundation models while enjoying support for autoscaling, workload orchestration, and automatic upgrades.

"Google Cloud's A3 VM instances provide us with the computational power and scale for our most demanding training and inference workloads. We're looking forward to taking advantage of their expertise in the AI space and leadership in large-scale infrastructure to deliver a strong platform for our ML workloads." -Noam Shazeer, CEO, Character.AI

At Google Cloud, AI is in our DNA. We've applied decades of experience running global scale computing for AI. We designed that infrastructure to scale and be optimized for running a wide variety of AI workloads - and now, we're making it available to you.

View at TechPowerUp Main Site | Source
 
Nvidia FTW!
 
Good to see so much time and money sunk into an entirely unproductive application.
This "unproductive application" will replace around 10,000 to 20,000 workplaces in the next year.

That's how it works.
 
Translate game language with correct voices, for all world languages.
 