News Posts matching #NVIDIA H100

Supermicro Expands AI Solutions with the Upcoming NVIDIA HGX H200 and MGX Grace Hopper Platforms Featuring HBM3e Memory

Supermicro, Inc., a Total IT Solution Provider for AI, Cloud, Storage, and 5G/Edge, is expanding its AI reach with upcoming support for the new NVIDIA HGX H200 built with H200 Tensor Core GPUs. Supermicro's industry-leading AI platforms, including 8U and 4U Universal GPU Systems, are drop-in ready for the HGX H200 in 8-GPU and 4-GPU configurations, with nearly 2x the capacity and 1.4x higher bandwidth HBM3e memory compared to the NVIDIA H100 Tensor Core GPU. In addition, the broadest portfolio of Supermicro NVIDIA MGX systems supports the upcoming NVIDIA Grace Hopper Superchip with HBM3e memory. With unprecedented performance, scalability, and reliability, Supermicro's rack-scale AI solutions accelerate computationally intensive generative AI, large language model (LLM) training, and HPC applications while meeting the evolving demands of growing model sizes. Using its building-block architecture, Supermicro can quickly bring new technology to market, enabling customers to become more productive sooner.
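
As a rough sanity check on those multipliers, the following minimal Python sketch compares commonly published memory specs for the two GPUs; the 141 GB / 4.8 TB/s (H200) and 80 GB / 3.35 TB/s (H100 SXM) figures are assumptions taken from public spec sheets, not from this announcement.

```python
# Rough sanity check of the "nearly 2x capacity, 1.4x bandwidth" claim above.
# The spec values below are assumed published figures, not taken from this announcement.
h100 = {"hbm_gb": 80,  "bandwidth_tb_s": 3.35}   # H100 SXM: 80 GB HBM3, ~3.35 TB/s
h200 = {"hbm_gb": 141, "bandwidth_tb_s": 4.8}    # H200: 141 GB HBM3e, ~4.8 TB/s

capacity_ratio = h200["hbm_gb"] / h100["hbm_gb"]                    # ~1.76x -> "nearly 2x"
bandwidth_ratio = h200["bandwidth_tb_s"] / h100["bandwidth_tb_s"]   # ~1.43x -> "~1.4x"

print(f"Capacity: {capacity_ratio:.2f}x, bandwidth: {bandwidth_ratio:.2f}x")
```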

Supermicro is also introducing the industry's highest-density server with NVIDIA HGX H100 8-GPU systems in a liquid-cooled 4U chassis, utilizing the latest Supermicro liquid cooling solution. The industry's most compact high-performance GPU server enables data center operators to reduce footprints and energy costs while offering the highest-performance AI training capacity available in a single rack. With the highest-density GPU systems, organizations can reduce their TCO by leveraging cutting-edge liquid cooling solutions.

GIGABYTE Demonstrates the Future of Computing at Supercomputing 2023 with Advanced Cooling and Scaled Data Centers

GIGABYTE Technology: Giga Computing, a subsidiary of GIGABYTE and an industry leader in high-performance servers, server motherboards, and workstations, continues to lead in cooling IT hardware efficiently and in developing diverse server platforms for Arm and x86 processors, as well as AI accelerators. At SC23, GIGABYTE (booth #355) will showcase some standout platforms, including those for the NVIDIA GH200 Grace Hopper Superchip and the next-gen AMD Instinct APU. To better introduce its extensive lineup of servers, GIGABYTE will address the most important needs in supercomputing data centers, such as how to cool high-performance IT hardware efficiently and how to power AI capable of real-time analysis and fast time to results.

Advanced Cooling
For many data centers, it is becoming apparent that their cooling infrastructure must radically shift to keep pace with new IT hardware that continues to generate more heat and requires rapid heat transfer. Because of this, GIGABYTE has launched advanced cooling solutions that allow IT hardware to maintain ideal performance while being more energy-efficient and maintaining the same data center footprint. At SC23, its booth will have a single-phase immersion tank, the A1P0-EA0, which offers a one-stop immersion cooling solution. GIGABYTE is experienced in implementing immersion cooling with immersion-ready servers, immersion tanks, oil, tools, and services spanning the globe. Another cooling solution showcased at SC23 will be direct liquid cooling (DLC), and in particular, the new GIGABYTE cold plates and cooling modules for the NVIDIA Grace CPU Superchip, NVIDIA Grace Hopper Superchip, AMD EPYC 9004 processor, and 4th Gen Intel Xeon processor.

NVIDIA Turbocharges Generative AI Training in MLPerf Benchmarks

NVIDIA's AI platform raised the bar for AI training and high performance computing in the latest MLPerf industry benchmarks. Among many new records and milestones, one in generative AI stands out: NVIDIA Eos - an AI supercomputer powered by a whopping 10,752 NVIDIA H100 Tensor Core GPUs and NVIDIA Quantum-2 InfiniBand networking - completed a training benchmark based on a GPT-3 model with 175 billion parameters trained on one billion tokens in just 3.9 minutes. That's a nearly 3x gain from 10.9 minutes, the record NVIDIA set when the test was introduced less than six months ago.

The benchmark uses a portion of the full GPT-3 data set behind the popular ChatGPT service that, by extrapolation, Eos could now train in just eight days, 73x faster than a prior state-of-the-art system using 512 A100 GPUs. The acceleration in training time reduces costs, saves energy and speeds time-to-market. It's heavy lifting that makes large language models widely available so every business can adopt them with tools like NVIDIA NeMo, a framework for customizing LLMs. In a new generative AI test ‌this round, 1,024 NVIDIA Hopper architecture GPUs completed a training benchmark based on the Stable Diffusion text-to-image model in 2.5 minutes, setting a high bar on this new workload. By adopting these two tests, MLPerf reinforces its leadership as the industry standard for measuring AI performance, since generative AI is the most transformative technology of our time.
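
For context, the headline speedup follows directly from the two times quoted above; a quick back-of-the-envelope sketch:

```python
# Simple arithmetic check of the MLPerf GPT-3 figures quoted above.
prev_record_min = 10.9   # record when the test was introduced (~6 months earlier)
new_record_min = 3.9     # Eos result with 10,752 H100 GPUs
tokens_trained = 1e9     # benchmark trains on one billion tokens

speedup = prev_record_min / new_record_min
throughput_tokens_s = tokens_trained / (new_record_min * 60)

print(f"Speedup vs. previous record: {speedup:.1f}x")                        # ~2.8x, i.e. "nearly 3x"
print(f"Cluster-wide throughput: ~{throughput_tokens_s / 1e6:.1f}M tokens/s")  # ~4.3M tokens/s
```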

NVIDIA NeMo: Designers Tap Generative AI for a Chip Assist

A research paper released this week describes ways generative AI can assist one of the most complex engineering efforts: designing semiconductors. The work demonstrates how companies in highly specialized fields can train large language models (LLMs) on their internal data to build assistants that increase productivity.

Few pursuits are as challenging as semiconductor design. Under a microscope, a state-of-the-art chip like an NVIDIA H100 Tensor Core GPU (above) looks like a well-planned metropolis, built with tens of billions of transistors, connected on streets 10,000x thinner than a human hair. Multiple engineering teams coordinate for as long as two years to construct one of these digital mega cities. Some groups define the chip's overall architecture, some craft and place a variety of ultra-small circuits, and others test their work. Each job requires specialized methods, software programs and computer languages.

Alphacool Unveils Single-Slot ES H100 80GB HBM PCIe Water Block for NVIDIA H100 GPU

Alphacool expands its Enterprise Solutions series of GPU water coolers and presents the new ES H100 80 GB HBM PCIe. To dissipate the enormous waste heat of this GPU generation as effectively as possible, the cooler is positioned as close as possible to the components to be cooled. Alphacool uses only copper for all water-bearing parts. Copper has almost twice the thermal conductivity of aluminium, making it a much better material choice for a water cooler. The fully chrome-plated copper base makes the cooler resistant to acids, scratches, and damage, while the matte carbon finish gives it a classy appearance. At the same time, this makes it interesting for private users who want to do without ARGB lighting.

The water cooler is specially designed for use in narrow server cases. To save space in width and height, the connections have been moved to the back, which also allows for easier hose routing inside the server rack. Thanks to the very compact design, only one slot is required for mounting the cooler in the server rack instead of 1.5 slots as before. This further reduction in required space is a strong argument for using the ES Copper/Carbon GPU water cooler. Smart and efficient.

NVIDIA Partners With Foxconn to Build Factories and Systems for the AI Industrial Revolution

NVIDIA today announced that it is collaborating with Hon Hai Technology Group (Foxconn) to accelerate the AI industrial revolution. Foxconn will integrate NVIDIA technology to develop a new class of data centers powering a wide range of applications—including digitalization of manufacturing and inspection workflows, development of AI-powered electric vehicle and robotics platforms, and a growing number of language-based generative AI services.

Announced in a fireside chat with NVIDIA founder and CEO Jensen Huang and Foxconn Chairman and CEO Young Liu at Hon Hai Tech Day, in Taipei, the collaboration starts with the creation of AI factories—an NVIDIA GPU computing infrastructure specially built for processing, refining and transforming vast amounts of data into valuable AI models and tokens—based on the NVIDIA accelerated computing platform, including the latest NVIDIA GH200 Grace Hopper Superchip and NVIDIA AI Enterprise software.

EK Fluid Works Enhances Portfolio with NVIDIA H100 GPU Integration

EK, the leading PC liquid cooling solutions provider, has expanded its hardware support for the EK Fluid Works systems by integrating the state-of-the-art NVIDIA H100 PCIe Tensor Core GPU. NVIDIA's latest release, acclaimed for its unprecedented performance, scalability, and security across diverse workloads, has discovered its ultimate home in EK Fluid Works servers and workstations.

Notably, EK's commitment to sustainability transforms these systems into eco-friendlier platforms, unlocking the full potential of Large Language Models (LLM), machine learning, and AI model training. EK Fluid Works systems emerge as the top choice for those seeking the unleashed power of NVIDIA H100 Tensor Core GPUs, offering an impressive array of efficiency benefits, including:
  • Unparalleled returns on investment
  • The lowest total cost of operation (TCO/OpEx)
  • Minimal additional capital expenditure (CapEx)

Dell Technologies Expands Generative AI Portfolio

Dell Technologies expands its Dell Generative AI Solutions portfolio, helping businesses transform how they work along every step of their generative AI (GenAI) journeys. "To maximize AI efforts and support workloads across public clouds, on-premises environments and at the edge, companies need a robust data foundation with the right infrastructure, software and services," said Jeff Boudreau, chief AI officer, Dell Technologies. "That's what we are building with our expanded validated designs, professional services, modern data lakehouse and the world's broadest GenAI solutions portfolio."

Customizing GenAI models to maximize proprietary data
The Dell Validated Design for Generative AI with NVIDIA for Model Customization offers pre-trained models that extract intelligence from data without building models from scratch. This solution provides best practices for customizing and fine-tuning GenAI models based on desired outcomes while helping keep information secure and on-premises. With a scalable blueprint for customization, organizations now have multiple ways to tailor GenAI models to accomplish specific tasks with their proprietary data. Its modular and flexible design supports a wide range of computational requirements and use cases, spanning diffusion model training, transfer learning, and prompt tuning.

EK Launches New EK-PRO Line of GPU Water Blocks for H100 GPUs

EK, the leading provider of cutting-edge computer cooling solutions, is introducing an enterprise-level GPU water block tailored for NVIDIA H100 Tensor Core PCIe data center GPUs. The EK-Pro GPU WB H100 Rack - Ni + Inox is a high-performance water block meticulously engineered to achieve an ultra-compact design, allowing it to occupy just a single PCIe slot compared to the stock 2-slot cooling system. This premium water block features a rack-style terminal, significantly reducing assembly height and enhancing compatibility with various chassis types. By spanning the entire PCB, it efficiently cools the GPU, HBM VRAM, and the VRM (voltage regulation module), with cooling liquid channeled directly over these critical components.

NVIDIA H100 Tensor Core GPUs provide a giant leap in computing power, perfect for accelerated computing. This ground-breaking increase in performance delivers up to 30x gains in certain applications, such as large language models for AI, and up to a 7x boost in HPC workloads such as genome sequencing.

NVIDIA GH200 Superchip Aces MLPerf Inference Benchmarks

In its debut on the MLPerf industry benchmarks, the NVIDIA GH200 Grace Hopper Superchip ran all data center inference tests, extending the leading performance of NVIDIA H100 Tensor Core GPUs. The overall results showed the exceptional performance and versatility of the NVIDIA AI platform from the cloud to the network's edge. Separately, NVIDIA announced inference software that will give users leaps in performance, energy efficiency and total cost of ownership.

GH200 Superchips Shine in MLPerf
The GH200 links a Hopper GPU with a Grace CPU in one superchip. The combination provides more memory, bandwidth and the ability to automatically shift power between the CPU and GPU to optimize performance. Separately, NVIDIA HGX H100 systems that pack eight H100 GPUs delivered the highest throughput on every MLPerf Inference test in this round. Grace Hopper Superchips and H100 GPUs led across all MLPerf's data center tests, including inference for computer vision, speech recognition and medical imaging, in addition to the more demanding use cases of recommendation systems and the large language models (LLMs) used in generative AI.

Google Introduces Cloud TPU v5e and Announces A3 Instance Availability

We're at a once-in-a-generation inflection point in computing. The traditional ways of designing and building computing infrastructure are no longer adequate for the exponentially growing demands of workloads like generative AI and LLMs. In fact, the number of parameters in LLMs has increased by 10x per year over the past five years. As a result, customers need AI-optimized infrastructure that is both cost effective and scalable.

For two decades, Google has built some of the industry's leading AI capabilities: from the creation of Google's Transformer architecture that makes gen AI possible, to our AI-optimized infrastructure, which is built to deliver the global scale and performance required by Google products that serve billions of users like YouTube, Gmail, Google Maps, Google Play, and Android. We are excited to bring decades of innovation and research to Google Cloud customers as they pursue transformative opportunities in AI. We offer a complete solution for AI, from computing infrastructure optimized for AI to the end-to-end software and services that support the full lifecycle of model training, tuning, and serving at global scale.

Google Cloud and NVIDIA Expand Partnership to Advance AI Computing, Software and Services

Google Cloud Next—Google Cloud and NVIDIA today announced new AI infrastructure and software for customers to build and deploy massive models for generative AI and speed data science workloads.

In a fireside chat at Google Cloud Next, Google Cloud CEO Thomas Kurian and NVIDIA founder and CEO Jensen Huang discussed how the partnership is bringing end-to-end machine learning services to some of the largest AI customers in the world—including by making it easy to run AI supercomputers with Google Cloud offerings built on NVIDIA technologies. The new hardware and software integrations utilize the same NVIDIA technologies employed over the past two years by Google DeepMind and Google research teams.

NVIDIA H100 Tensor Core GPU Used on New Azure Virtual Machine Series Now Available

Microsoft Azure users can now turn to the latest NVIDIA accelerated computing technology to train and deploy their generative AI applications. Available today, the Microsoft Azure ND H100 v5 VMs using NVIDIA H100 Tensor Core GPUs and NVIDIA Quantum-2 InfiniBand networking enable scaling of generative AI, high performance computing (HPC) and other applications with a click from a browser. Available to customers across the U.S., the new instance arrives as developers and researchers are using large language models (LLMs) and accelerated computing to uncover new consumer and business use cases.

The NVIDIA H100 GPU delivers supercomputing-class performance through architectural innovations, including fourth-generation Tensor Cores, a new Transformer Engine for accelerating LLMs and the latest NVLink technology that lets GPUs talk to each other at 900 GB/s. The inclusion of NVIDIA Quantum-2 CX7 InfiniBand with 3,200 Gbps cross-node bandwidth ensures seamless performance across the GPUs at massive scale, matching the capabilities of top-performing supercomputers globally.
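
To compare the two bandwidth figures above on the same footing, here is a minimal sketch that simply converts the InfiniBand figure from gigabits to gigabytes per second:

```python
# Put the two bandwidth figures quoted above in the same units.
nvlink_gb_s = 900                        # GB/s GPU-to-GPU via NVLink, intra-node
infiniband_gbit = 3200                   # Gbit/s Quantum-2 CX7, cross-node
infiniband_gb_s = infiniband_gbit / 8    # -> 400 GB/s per VM across nodes

print(f"Cross-node InfiniBand: {infiniband_gb_s:.0f} GB/s per VM")
print(f"Intra-node NVLink is ~{nvlink_gb_s / infiniband_gb_s:.2f}x that figure")
```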

GIGABYTE Leads MLPerf Training v3.0 Benchmarks with Top-Performing Accelerators in GIGABYTE Servers

GIGABYTE Technology: The latest MLPerf Training v3.0 benchmark results are out, and the GIGABYTE G593-SD0 server has emerged as a leader in this round of testing. Going head-to-head against impressive systems, GIGABYTE's servers secured top positions in various categories, showcasing their prowess in handling real-world machine learning use cases. With an unparalleled focus on performance, efficiency, and reliability, GIGABYTE has once again proven its commitment to driving progress in the field of AI.

GIGABYTE, one of the founding members of MLCommons, has been actively contributing to the organization's efforts in designing and planning systems to benchmark fairly. Understanding the importance of replicating real-world scenarios in AI development, GIGABYTE's collaboration with MLCommons has been instrumental in shaping the benchmark tasks to encompass critical use cases such as image recognition, object detection, speech-to-text, natural language processing, and recommendation engines. By actively engaging with end applications, GIGABYTE ensures that its servers are designed to meet the highest standards, delivering supreme performance, and facilitating meaningful comparisons between different ML systems.

Dell Technologies Expands AI Offerings, in Collaboration with NVIDIA

Dell Technologies introduces new offerings to help customers quickly and securely build generative AI (GenAI) models on-premises to accelerate improved outcomes and drive new levels of intelligence. The new Dell Generative AI Solutions, expanding upon May's Project Helix announcement, span IT infrastructure, PCs and professional services to simplify the adoption of full-stack GenAI with large language models (LLMs), meeting organizations wherever they are in their GenAI journey. These solutions help organizations of all sizes and across industries securely transform and deliver better outcomes.

"Generative AI represents an inflection point that is driving fundamental change in the pace of innovation while improving the customer experience and enabling new ways to work," Jeff Clarke, vice chairman and co-chief operating officer, Dell Technologies, said on a recent investor call. "Customers, big and small, are using their own data and business context to train, fine-tune and inference on Dell infrastructure solutions to incorporate advanced AI into their core business processes effectively and efficiently."

NVIDIA H100 GPUs Now Available on AWS Cloud

AWS users can now access the leading performance demonstrated in industry benchmarks of AI training and inference. The cloud giant officially switched on a new Amazon EC2 P5 instance powered by NVIDIA H100 Tensor Core GPUs. The service lets users scale generative AI, high performance computing (HPC) and other applications with a click from a browser.

The news comes in the wake of AI's iPhone moment. Developers and researchers are using large language models (LLMs) to uncover new applications for AI almost daily. Bringing these new use cases to market requires the efficiency of accelerated computing. The NVIDIA H100 GPU delivers supercomputing-class performance through architectural innovations including fourth-generation Tensor Cores, a new Transformer Engine for accelerating LLMs and the latest NVLink technology that lets GPUs talk to each other at 900 GB/sec.

Comino Launches Water Block for NVIDIA H100 PCIe Accelerator Card

A relatively new player in the water cooling industry, Comino, has recently introduced its latest product: a water block for the NVIDIA H100 PCIe accelerator card. The new block provides full coverage, cooling the GPU, memory, and VRM. In the design, Comino uses only non-corrosive materials such as copper, stainless steel, aluminium, and plastic. The core of the block uses copper, while the frame and backplate use aluminium. The company claims that at a coolant temperature of 20°C, the GH100 chip with Comino water blocks will run at 30-40°C.

Comino uses "deformational cutting" technology to create copper fins as thin as 0.1 mm with a 0.1 mm channel and 3 mm height. In Comino water blocks, the micro fins are optimized for a low pressure drop, with a fin thickness of 0.25 mm, a 0.25 mm channel, and a 2.7 mm height. The block itself is a single-slot solution with fitting adapters on the back and a 90° adapter option for workstation implementation. More information is available on the Comino website.

Inflection AI Builds Supercomputer with 22,000 NVIDIA H100 GPUs

The AI hype continues to push hardware shipments, especially for servers with GPUs, which are in very high demand. Another example is the latest feat of AI startup Inflection AI. Building foundational AI models, the Inflection AI crew has secured an order of 22,000 NVIDIA H100 GPUs and is building a supercomputer with them. Assuming a configuration of a single Intel Xeon CPU with eight GPUs per node, almost 700 four-node racks should go into the supercomputer. Scaling and connecting 22,000 GPUs is easier than acquiring them, as NVIDIA's H100 GPUs are selling out everywhere due to the enormous demand for AI applications both on and off premises.

Getting 22,000 H100 GPUs is the biggest challenge here, and Inflection AI managed it by having NVIDIA as an investor in the startup. The supercomputer is estimated to cost around one billion USD and consume 31 megawatts of power. Inflection AI is valued at 1.5 billion USD at the time of writing.
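
The rack count mentioned above follows from simple division; a rough sketch of that arithmetic, assuming the eight-GPUs-per-node and four-nodes-per-rack figures from the text (the per-GPU cost and power numbers are derived, not quoted):

```python
# Back-of-the-envelope arithmetic for the cluster described above.
gpus = 22_000
gpus_per_node = 8         # assumed single-Xeon node with eight GPUs (per the text)
nodes_per_rack = 4        # four-node racks (per the text)
cost_usd = 1e9            # estimated total cost
power_mw = 31             # estimated power draw

nodes = gpus / gpus_per_node      # 2,750 nodes
racks = nodes / nodes_per_rack    # ~688 racks, i.e. "almost 700"

print(f"{nodes:.0f} nodes in ~{racks:.0f} racks")
print(f"~${cost_usd / gpus:,.0f} and ~{power_mw * 1e6 / gpus:,.0f} W per GPU, all-in")
```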

NVIDIA H100 GPUs Set Standard for Generative AI in Debut MLPerf Benchmark

In a new industry-standard benchmark, a cluster of 3,584 H100 GPUs at cloud service provider CoreWeave trained a massive GPT-3-based model in just 11 minutes. Leading users and industry-standard benchmarks agree: NVIDIA H100 Tensor Core GPUs deliver the best AI performance, especially on the large language models (LLMs) powering generative AI.

H100 GPUs set new records on all eight tests in the latest MLPerf training benchmarks released today, excelling on a new MLPerf test for generative AI. That excellence is delivered both per-accelerator and at-scale in massive servers. For example, on a commercially available cluster of 3,584 H100 GPUs co-developed by startup Inflection AI and operated by CoreWeave, a cloud service provider specializing in GPU-accelerated workloads, the system completed the massive GPT-3-based training benchmark in less than eleven minutes.

NVIDIA H100 Hopper GPU Tested for Gaming, Slower Than Integrated GPU

NVIDIA's H100 Hopper GPU is a device designed for pure AI and other compute workloads, with the least amount of consideration for gaming workloads that involve graphics processing. However, it is still interesting to see how this 30,000 USD GPU fares in comparison to gaming GPUs, and whether it is even possible to run games on it. It turns out that it is technically feasible but makes little practical sense, as the Chinese YouTube channel Geekerwan notes. Based on the GH100 silicon with 14,592 CUDA cores, the H100 PCIe version tested here can achieve 204.9 TeraFLOPS at FP16, 51.22 TeraFLOPS at FP32, and 25.61 TeraFLOPS at FP64, with its natural strength lying in accelerating AI workloads.

However, how does it fare in gaming benchmarks? Not very well, as the testing shows. It scored 2681 points in 3DMark Time Spy, lower than AMD's integrated Radeon 680M, which managed 2710 points. Interestingly, the GH100 has only 24 ROPs (render output units), while the gaming-oriented GA102 (NVIDIA's highest-end gaming GPU silicon of the previous generation) has 112 ROPs, which helps explain why the H100 GPU is used for compute only. Since it doesn't have any display outputs, the system needed another regular GPU to provide the display output, while the computation happened on the H100 GPU.
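
As a rough cross-check of the compute figures quoted above, the FP32 number can be reproduced from the CUDA core count and boost clock; this sketch assumes the commonly cited ~1,755 MHz boost clock for the H100 PCIe, which is not stated in the article:

```python
# Cross-check the FP32 figure quoted above: FLOPS = CUDA cores x 2 FLOPs/clock x clock.
# The ~1,755 MHz boost clock is an assumed spec value and is not stated in the article.
cuda_cores = 14_592
boost_clock_hz = 1_755e6

fp32_tflops = cuda_cores * 2 * boost_clock_hz / 1e12
print(f"FP32: {fp32_tflops:.2f} TFLOPS")        # ~51.2, matching the quoted figure
print(f"FP64: {fp32_tflops / 2:.2f} TFLOPS")    # Hopper FP64 runs at half the FP32 rate
# The quoted FP16 figure (204.9 TFLOPS) is 4x the FP32 rate.
```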

ASUS Unveils ESC N8-E11, an HGX H100 Eight-GPU Server

ASUS today announced ESC N8-E11, its most advanced HGX H100 eight-GPU AI server, along with a comprehensive PCI Express (PCIe) GPU server portfolio—the ESC8000 and ESC4000 series empowered by Intel and AMD platforms to support higher CPU and GPU TDPs to accelerate the development of AI and data science.

ASUS is one of the few HPC solution providers with its own all-dimensional resources that consist of the ASUS server business unit, Taiwan Web Service (TWS) and ASUS Cloud—all part of the ASUS group. This uniquely positions ASUS to deliver in-house AI server design, data-center infrastructure, and AI software-development capabilities, plus a diverse ecosystem of industrial hardware and software partners.

Giga Computing Goes Big with Green Computing and HPC and AI at Computex

Giga Computing, a subsidiary of GIGABYTE and an industry leader in high-performance servers, server motherboards, and workstations, today announced a major presence at Computex 2023, held May 30 to June 2, with a GIGABYTE booth showcasing more than fifty servers that span GIGABYTE's comprehensive enterprise portfolio, including green computing solutions that feature liquid-cooled servers and immersion cooling technology. The international computer expo attracts over 100,000 visitors annually, and GIGABYTE will be ready with a spacious and attractive booth to draw in curious minds, along with plenty of knowledgeable staff to answer questions about how its products are being used today.

The slogan for Computex 2023 is "Together we create." And just like parts that make a whole, GIGABYTE's slogan of "Future of COMPUTING" embodies all its distinct computing products, from consumer to enterprise applications. For the enterprise business unit, there will be sections with three themes: "Win Big with AI HPC," "Advance Data Centers," and "Embrace Sustainability." Each theme will show off cutting-edge technologies that span x86 and Arm platforms, with great attention placed on solutions that address the challenges that come with more powerful computing.

NVIDIA Grace Drives Wave of New Energy-Efficient Arm Supercomputers

NVIDIA today announced a supercomputer built on the NVIDIA Grace CPU Superchip, adding to a wave of new energy-efficient supercomputers based on the Arm Neoverse platform. The Isambard 3 supercomputer to be based at the Bristol & Bath Science Park, in the U.K., will feature 384 Arm-based NVIDIA Grace CPU Superchips to power medical and scientific research, and is expected to deliver 6x the performance and energy efficiency of Isambard 2, placing it among Europe's most energy-efficient systems.

It will achieve about 2.7 petaflops of FP64 peak performance and consume less than 270 kilowatts of power, ranking it among the world's three greenest non-accelerated supercomputers. The project is being led by the University of Bristol, as part of the research consortium the GW4 Alliance, together with the universities of Bath, Cardiff and Exeter.
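
Dividing the two headline figures gives the energy efficiency behind the "greenest" claim; simple arithmetic on the numbers quoted above:

```python
# Energy efficiency implied by the Isambard 3 figures quoted above.
peak_fp64_flops = 2.7e15   # ~2.7 petaflops FP64 peak
power_watts = 270e3        # <270 kW

gflops_per_watt = peak_fp64_flops / power_watts / 1e9
print(f"~{gflops_per_watt:.0f} GFLOPS per watt (FP64 peak)")   # ~10 GFLOPS/W
```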

Supermicro Launches Industry's First NVIDIA HGX H100 8-GPU and 4-GPU Servers with Liquid Cooling

Supermicro, Inc., a Total IT Solution Provider for Cloud, AI/ML, Storage, and 5G/Edge, continues to expand its data center offerings with liquid-cooled NVIDIA HGX H100 rack-scale solutions. Advanced liquid cooling technologies developed entirely by Supermicro reduce the lead time for a complete installation, increase performance, and result in lower operating expenses while significantly reducing the PUE of data centers. Power savings for a data center are estimated at 40% when using Supermicro liquid cooling solutions compared to an air-cooled data center, and a reduction of up to 86% in direct cooling costs compared to existing data centers may be realized.

"Supermicro continues to lead the industry supporting the demanding needs of AI workloads and modern data centers worldwide," said Charles Liang, president, and CEO of Supermicro. "Our innovative GPU servers that use our liquid cooling technology significantly lower the power requirements of data centers. With the amount of power required to enable today's rapidly evolving large scale AI models, optimizing TCO and the Total Cost to Environment (TCE) is crucial to data center operators. We have proven expertise in designing and building entire racks of high-performance servers. These GPU systems are designed from the ground up for rack scale integration with liquid cooling to provide superior performance, efficiency, and ease of deployments, allowing us to meet our customers' requirements with a short lead time."

Google Announces A3 Supercomputers with NVIDIA H100 GPUs, Purpose-built for AI

Implementing state-of-the-art artificial intelligence (AI) and machine learning (ML) models requires large amounts of computation, both to train the underlying models, and to serve those models once they're trained. Given the demands of these workloads, a one-size-fits-all approach is not enough - you need infrastructure that's purpose-built for AI.

Together with our partners, we offer a wide range of compute options for ML use cases such as large language models (LLMs), generative AI, and diffusion models. Recently, we announced G2 VMs, becoming the first cloud to offer the new NVIDIA L4 Tensor Core GPUs for serving generative AI workloads. Today, we're expanding that portfolio with the private preview launch of the next-generation A3 GPU supercomputer. Google Cloud now offers a complete range of GPU options for training and inference of ML models.