News Posts matching #NVIDIA H100

NVIDIA Unveils "Eos" to Public - a Top Ten Supercomputer

Providing a peek at the architecture powering advanced AI factories, NVIDIA released a video that offers the first public look at Eos, its latest data-center-scale supercomputer. An extremely large-scale NVIDIA DGX SuperPOD, Eos is where NVIDIA developers create their AI breakthroughs using accelerated computing infrastructure and fully optimized software. Eos is built with 576 NVIDIA DGX H100 systems, NVIDIA Quantum-2 InfiniBand networking and software, providing a total of 18.4 exaflops of FP8 AI performance. Revealed in November at the Supercomputing 2023 trade show, Eos—named for the Greek goddess said to open the gates of dawn each day—reflects NVIDIA's commitment to advancing AI technology.

Eos Supercomputer Fuels Innovation
Each DGX H100 system is equipped with eight NVIDIA H100 Tensor Core GPUs, giving Eos a total of 4,608 H100 GPUs. As a result, Eos can handle the largest AI workloads to train large language models, recommender systems, quantum simulations and more. It's a showcase of what NVIDIA's technologies can do when working at scale. Eos is arriving at the perfect time. People are changing the world with generative AI, from drug discovery to chatbots to autonomous machines and beyond. To achieve these breakthroughs, they need more than AI expertise and development skills. They need an AI factory: a purpose-built AI engine that's always available and can help ramp their capacity to build AI models at scale. Eos delivers. Ranked No. 9 in the TOP500 list of the world's fastest supercomputers, Eos pushes the boundaries of AI technology and infrastructure.
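
The GPU count quoted above follows from the system count, and the same figures give a rough per-GPU FP8 number. A minimal sketch of that arithmetic follows; the variable names are illustrative, and the per-GPU value is derived here from the article's totals rather than quoted from NVIDIA.

```python
# Figures quoted in the article
dgx_systems = 576
gpus_per_system = 8
total_fp8_exaflops = 18.4

# Total GPU count: systems times GPUs per system
total_gpus = dgx_systems * gpus_per_system

# Derived (not quoted): average FP8 petaflops per GPU, using 1 exaflop = 1,000 petaflops
fp8_petaflops_per_gpu = total_fp8_exaflops * 1000 / total_gpus

print(f"Total H100 GPUs: {total_gpus:,}")
print(f"FP8 petaflops per GPU (derived): {fp8_petaflops_per_gpu:.2f}")
```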

Lenovo HPC Infrastructure Powers Pre-Exascale Supercomputer MareNostrum 5 to Enable New Scientific Advances and Solve Global Challenges

Lenovo (HKSE: 992) (ADR: LNVGY) has today announced that the General Purpose Partition of the MareNostrum 5, a new pre-exascale supercomputer running on Lenovo's HPC infrastructure, has been classified as the top x86 general-purpose cluster on the recently published TOP500 list of the most powerful supercomputers globally.

Officially inaugurated at Barcelona Supercomputing Center on December 21st, MareNostrum 5 has been built for the European High Performance Computing Joint Undertaking (EuroHPC JU). The pre-exascale supercomputer will bolster the EU's mission to provide Europe with the most advanced supercomputing technology and accelerate the capacity for artificial intelligence (AI) research, enabling new scientific advances that will help solve global challenges. It aims to empower a wide range of complex HPC-specific applications, from climate research and engineering to material science and earth sciences, adeptly handling tasks that extend beyond the capabilities of cloud computing.

OpenAI Reportedly Talking to TSMC About Custom Chip Venture

OpenAI is reported to be initiating R&D on a proprietary AI processing solution. The research organization's CEO, Sam Altman, has commented on the inefficient operation of datacenters running NVIDIA H100 and A100 GPUs. He foresees a future scenario where his company becomes less reliant on Team Green's off-the-shelf AI-crunchers, with a deployment of bespoke AI processors. A short Reuters interview also underlined Altman's desire to find alternative sources of power: "It motivates us to go invest more in (nuclear) fusion." The growth of artificial intelligence industries has put an unprecedented strain on energy providers, so tech firms could be semi-forced into seeking out frugal enterprise hardware.

The Financial Times has followed up on last week's Bloomberg report of OpenAI courting investment partners in the Middle East. FT's news piece alleges that Altman is in talks with billionaire businessman Sheikh Tahnoon bin Zayed al-Nahyan, a very well connected member of the United Arab Emirates Royal Family. OpenAI's leadership is reportedly negotiating with TSMC—The Financial Times alleges that Taiwan's top chip foundry is an ideal manufacturing partner. This revelation contradicts Bloomberg's recent reports of a potential custom OpenAI AI chip venture involving purpose-built manufacturing facilities. The whole project is said to be at an early stage of development, so Altman and his colleagues are most likely exploring a variety of options.

Meta Will Acquire 350,000 H100 GPUs Worth More Than 10 Billion US Dollars

Mark Zuckerberg has shared some interesting insights about Meta's AI infrastructure buildout, which is on track to include an astonishing number of NVIDIA H100 Tensor Core GPUs. In a post on Instagram, Meta's CEO noted the following: "We're currently training our next-gen model Llama 3, and we're building massive compute infrastructure to support our future roadmap, including 350k H100s by the end of this year -- and overall almost 600k H100s equivalents of compute if you include other GPUs." That means the company will add 350,000 H100 GPUs on top of its other GPUs, which together are equivalent to roughly 250,000 H100s in computing power, for a total of almost 600,000 H100-equivalent GPUs.

The raw number of GPUs installed comes at a steep price. With the average selling price of an H100 GPU nearing 30,000 US dollars, Meta's investment will set the company back around $10.5 billion. Other GPUs should be in the infrastructure, but most will come from the NVIDIA Hopper family. Additionally, Meta is currently training the Llama 3 AI model, which will be much more capable than the existing Llama 2 family and will include better reasoning, coding, and math-solving capabilities. These models will be open-source. Later down the pipeline, as artificial general intelligence (AGI) comes into play, Zuckerberg has noted that "Our long term vision is to build general intelligence, open source it responsibly, and make it widely available so everyone can benefit." So, expect to see these models in GitHub repositories in the future.
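
The cost estimate above is a straightforward product of GPU count and unit price. A short sketch, assuming the article's round figures (350,000 GPUs at roughly $30,000 each); the variable names are illustrative, and the "other GPUs" figure is simply the difference between the two totals Zuckerberg quoted.

```python
# Figures from the article (the unit price is an approximate average selling price)
h100_count = 350_000
approx_price_usd = 30_000
total_h100_equivalents = 600_000  # "almost 600k", per the quoted post

# Estimated spend on the H100s alone
estimated_spend_usd = h100_count * approx_price_usd

# H100-equivalent compute contributed by the other GPUs
other_gpu_equivalents = total_h100_equivalents - h100_count

print(f"Estimated H100 spend: ${estimated_spend_usd / 1e9:.1f} billion")
print(f"Other GPUs, in H100 equivalents: about {other_gpu_equivalents:,}")
```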

Samsung and Naver Developing an AI Chip Claiming to be 8x More Power Efficient than NVIDIA H100

Naver, the firm behind the HyperCLOVA X large language model (LLM), has been working with Samsung Electronics toward the development of power-efficient AI accelerators. The collaboration combines Naver's expertise with Samsung's vast system IP for silicon design, its ability to build complex SoCs, its semiconductor fabrication, and its plethora of DRAM technologies. The two recently designed a proof of concept for an upcoming AI chip, which they iterated on an FPGA. Naver claims the AI chip it is co-developing with Samsung will be 8 times more energy efficient than an NVIDIA H100 AI accelerator, but did not elaborate on its actual throughput. Its solution, among other things, leverages energy-efficient LPDDR memory from Samsung. The two companies have been working on this project since December 2022.

Dell Partners with Imbue on New AI Compute Cluster Using Nearly 10,000 NVIDIA H100 GPUs

Dell Technologies and Imbue, an independent AI research company, have entered into a $150 million agreement to build a new high-performance computing cluster for training foundation models optimized for reasoning. Imbue is one of the few independent AI labs that develops its own foundation models, and trains them to have more advanced reasoning capabilities—like knowing when to ask for more information, analyzing and critiquing their own outputs, or breaking down a difficult goal into a plan and then executing on it. Imbue trains AI agents on top of those models that can do work for people across diverse fields in ways that are robust, safe, and useful. Imbue's goal is to create practical tools for building agents that could enable workers across a broad set of domains, including helping engineers write new code, analysts understand and draft complex policy proposals, and much more.

TOP500 Update: Frontier Remains No.1 With Aurora Coming in at No. 2

The 62nd edition of the TOP500 reveals that the Frontier system retains its top spot and is still the only exascale machine on the list. However, five new or upgraded systems have shaken up the Top 10.

Housed at the Oak Ridge National Laboratory (ORNL) in Tennessee, USA, Frontier leads the pack with an HPL score of 1.194 EFlop/s - unchanged from the June 2023 list. Frontier utilizes AMD EPYC 64C 2GHz processors and is based on the latest HPE Cray EX235a architecture. The system has a total of 8,699,904 combined CPU and GPU cores. Additionally, Frontier has an impressive power efficiency rating of 52.59 GFlops/watt and relies on HPE's Slingshot 11 network for data transfer.
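
The HPL score and the efficiency rating quoted above together imply the power drawn during the benchmark run. A minimal sketch of that derivation; the result is computed from the two quoted figures and is not itself stated in the article.

```python
# Figures quoted in the article
hpl_flops = 1.194e18          # 1.194 EFlop/s
efficiency_flops_per_watt = 52.59e9  # 52.59 GFlops/watt

# Implied power: performance divided by performance-per-watt
implied_power_watts = hpl_flops / efficiency_flops_per_watt
implied_power_mw = implied_power_watts / 1e6

print(f"Implied power during HPL: {implied_power_mw:.1f} MW")
```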

Supermicro Expands AI Solutions with the Upcoming NVIDIA HGX H200 and MGX Grace Hopper Platforms Featuring HBM3e Memory

Supermicro, Inc., a Total IT Solution Provider for AI, Cloud, Storage, and 5G/Edge, is expanding its AI reach with the upcoming support for the new NVIDIA HGX H200 built with H200 Tensor Core GPUs. Supermicro's industry leading AI platforms, including 8U and 4U Universal GPU Systems, are drop-in ready for the HGX H200 in 8-GPU and 4-GPU configurations, with nearly 2x the capacity and 1.4x higher bandwidth from HBM3e memory compared to the NVIDIA H100 Tensor Core GPU. In addition, the broadest portfolio of Supermicro NVIDIA MGX systems supports the upcoming NVIDIA Grace Hopper Superchip with HBM3e memory. With unprecedented performance, scalability, and reliability, Supermicro's rack scale AI solutions accelerate the performance of computationally intensive generative AI, large language model (LLM) training, and HPC applications while meeting the evolving demands of growing model sizes. Using the building block architecture, Supermicro can quickly bring new technology to market, enabling customers to become more productive sooner.

Supermicro is also introducing the industry's highest density server with NVIDIA HGX H100 8-GPU systems in a liquid cooled 4U system, utilizing the latest Supermicro liquid cooling solution. The industry's most compact high performance GPU server enables data center operators to reduce footprints and energy costs while offering the highest performance AI training capacity available in a single rack. With the highest density GPU systems, organizations can reduce their TCO by leveraging cutting-edge liquid cooling solutions.

GIGABYTE Demonstrates the Future of Computing at Supercomputing 2023 with Advanced Cooling and Scaled Data Centers

GIGABYTE Technology, Giga Computing, a subsidiary of GIGABYTE and an industry leader in high-performance servers, server motherboards, and workstations, continues to be a leader in cooling IT hardware efficiently and in developing diverse server platforms for Arm and x86 processors, as well as AI accelerators. At SC23, GIGABYTE (booth #355) will showcase some standout platforms, including for the NVIDIA GH200 Grace Hopper Superchip and next-gen AMD Instinct APU. To better introduce its extensive lineup of servers, GIGABYTE will address the most important needs in supercomputing data centers, such as how to cool high-performance IT hardware efficiently and power AI that is capable of real-time analysis and fast time to results.

Advanced Cooling
For many data centers, it is becoming apparent that their cooling infrastructure must radically shift to keep pace with new IT hardware that continues to generate more heat and requires rapid heat transfer. Because of this, GIGABYTE has launched advanced cooling solutions that allow IT hardware to maintain ideal performance while being more energy-efficient and maintaining the same data center footprint. At SC23, its booth will have a single-phase immersion tank, the A1P0-EA0, which offers a one-stop immersion cooling solution. GIGABYTE is experienced in implementing immersion cooling with immersion-ready servers, immersion tanks, oil, tools, and services spanning the globe. Another cooling solution showcased at SC23 will be direct liquid cooling (DLC), and in particular, the new GIGABYTE cold plates and cooling modules for the NVIDIA Grace CPU Superchip, NVIDIA Grace Hopper Superchip, AMD EPYC 9004 processor, and 4th Gen Intel Xeon processor.

NVIDIA Turbocharges Generative AI Training in MLPerf Benchmarks

NVIDIA's AI platform raised the bar for AI training and high performance computing in the latest MLPerf industry benchmarks. Among many new records and milestones, one in generative AI stands out: NVIDIA Eos - an AI supercomputer powered by a whopping 10,752 NVIDIA H100 Tensor Core GPUs and NVIDIA Quantum-2 InfiniBand networking - completed a training benchmark based on a GPT-3 model with 175 billion parameters trained on one billion tokens in just 3.9 minutes. That's a nearly 3x gain from 10.9 minutes, the record NVIDIA set when the test was introduced less than six months ago.
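
The "nearly 3x" claim and the implied training throughput can be checked from the numbers in the paragraph above. A rough sketch; the throughput values are derived here by simple division and are not figures NVIDIA reported.

```python
# Figures quoted in the article
previous_record_minutes = 10.9
new_record_minutes = 3.9
tokens_trained = 1_000_000_000
gpu_count = 10_752

# Speedup relative to the earlier record
speedup = previous_record_minutes / new_record_minutes

# Derived throughput (averaged over the whole run, including any overhead)
tokens_per_second = tokens_trained / (new_record_minutes * 60)
tokens_per_second_per_gpu = tokens_per_second / gpu_count

print(f"Speedup: {speedup:.2f}x")
print(f"Aggregate throughput: {tokens_per_second:,.0f} tokens/s")
print(f"Per-GPU throughput: {tokens_per_second_per_gpu:.0f} tokens/s")
```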

The benchmark uses a portion of the full GPT-3 data set behind the popular ChatGPT service that, by extrapolation, Eos could now train in just eight days, 73x faster than a prior state-of-the-art system using 512 A100 GPUs. The acceleration in training time reduces costs, saves energy and speeds time-to-market. It's heavy lifting that makes large language models widely available so every business can adopt them with tools like NVIDIA NeMo, a framework for customizing LLMs. In a new generative AI test this round, 1,024 NVIDIA Hopper architecture GPUs completed a training benchmark based on the Stable Diffusion text-to-image model in 2.5 minutes, setting a high bar on this new workload. By adopting these two tests, MLPerf reinforces its leadership as the industry standard for measuring AI performance, since generative AI is the most transformative technology of our time.

NVIDIA NeMo: Designers Tap Generative AI for a Chip Assist

A research paper released this week describes ways generative AI can assist one of the most complex engineering efforts: designing semiconductors. The work demonstrates how companies in highly specialized fields can train large language models (LLMs) on their internal data to build assistants that increase productivity.

Few pursuits are as challenging as semiconductor design. Under a microscope, a state-of-the-art chip like an NVIDIA H100 Tensor Core GPU looks like a well-planned metropolis, built with tens of billions of transistors, connected on streets 10,000x thinner than a human hair. Multiple engineering teams coordinate for as long as two years to construct one of these digital mega cities. Some groups define the chip's overall architecture, some craft and place a variety of ultra-small circuits, and others test their work. Each job requires specialized methods, software programs and computer languages.

Alphacool Unveils Single-Slot ES H100 80GB HBM PCIe Water Block for NVIDIA H100 GPU

Alphacool expands the portfolio of the Enterprise Solutions series for GPU water coolers and presents the new ES H100 80 GB HBM PCIe. In order to dissipate the enormous waste heat of this GPU generation in the best possible way, the cooler is positioned close to the components to be cooled in an exemplary manner. Alphacool uses only copper for all water-bearing parts. Copper has almost twice the thermal conductivity of aluminium, making it a much better choice of material for a water cooler. The fully chrome plated copper base makes it resistant to acids, scratches and damage. The matte carbon finish gives the cooler a classy appearance. At the same time, this makes it interesting for private users who want to do without aRGB lighting.

The water cooler is specially designed for use in narrow server cases. To save space in width and height, the connections have been moved to the back. This also allows for easier hosing inside the server rack. Thanks to the very compact design, only 1 slot is required for mounting the cooler in the server rack instead of 1.5 slots as before. This further reduction of required space is a strong argument for the use of the ES Copper/Carbon GPU watercooler. Smart and efficient.

NVIDIA Partners With Foxconn to Build Factories and Systems for the AI Industrial Revolution

NVIDIA today announced that it is collaborating with Hon Hai Technology Group (Foxconn) to accelerate the AI industrial revolution. Foxconn will integrate NVIDIA technology to develop a new class of data centers powering a wide range of applications—including digitalization of manufacturing and inspection workflows, development of AI-powered electric vehicle and robotics platforms, and a growing number of language-based generative AI services.

Announced in a fireside chat with NVIDIA founder and CEO Jensen Huang and Foxconn Chairman and CEO Young Liu at Hon Hai Tech Day, in Taipei, the collaboration starts with the creation of AI factories—an NVIDIA GPU computing infrastructure specially built for processing, refining and transforming vast amounts of data into valuable AI models and tokens—based on the NVIDIA accelerated computing platform, including the latest NVIDIA GH200 Grace Hopper Superchip and NVIDIA AI Enterprise software.

EK Fluid Works Enhances Portfolio with NVIDIA H100 GPU Integration

EK, the leading PC liquid cooling solutions provider, has expanded its hardware support for the EK Fluid Works systems by integrating the state-of-the-art NVIDIA H100 PCIe Tensor Core GPU. NVIDIA's latest release, acclaimed for its unprecedented performance, scalability, and security across diverse workloads, has discovered its ultimate home in EK Fluid Works servers and workstations.

Notably, EK's commitment to sustainability transforms these systems into eco-friendlier platforms, unlocking the full potential of Large Language Models (LLM), machine learning, and AI model training. EK Fluid Works systems emerge as the top choice for those seeking the unleashed power of NVIDIA H100 Tensor Core GPUs, offering an impressive array of efficiency benefits, including:
  • Unparalleled returns on investment
  • The lowest total cost of operation (TCO/OpEx)
  • Minimal additional capital expenditure (CapEx)

Dell Technologies Expands Generative AI Portfolio

Dell Technologies expands its Dell Generative AI Solutions portfolio, helping businesses transform how they work along every step of their generative AI (GenAI) journeys. "To maximize AI efforts and support workloads across public clouds, on-premises environments and at the edge, companies need a robust data foundation with the right infrastructure, software and services," said Jeff Boudreau, chief AI officer, Dell Technologies. "That's what we are building with our expanded validated designs, professional services, modern data lakehouse and the world's broadest GenAI solutions portfolio."

Customizing GenAI models to maximize proprietary data
The Dell Validated Design for Generative AI with NVIDIA for Model Customization offers pre-trained models that extract intelligence from data without building models from scratch. This solution provides best practices for customizing and fine-tuning GenAI models based on desired outcomes while helping keep information secure and on-premises. With a scalable blueprint for customization, organizations now have multiple ways to tailor GenAI models to accomplish specific tasks with their proprietary data. Its modular and flexible design supports a wide range of computational requirements and use cases, spanning training diffusion, transfer learning and prompt tuning.

EK Launches New EK-PRO Line of GPU Water Blocks for H100 GPUs

EK, the leading provider of cutting-edge computer cooling solutions, is introducing an enterprise-level GPU water block tailored for NVIDIA H100 Tensor Core PCIe data center GPUs. The EK-Pro GPU WB H100 Rack - Ni + Inox is a high-performance water block meticulously engineered to achieve an ultra-compact design, allowing it to occupy just a single PCIe slot compared to the stock 2-slot cooling system. This premium water block features a rack-style terminal, significantly reducing assembly height and enhancing compatibility with various chassis types. By spanning the entire PCB, it efficiently cools the GPU, HBM VRAM, and the VRM (voltage regulation module), with cooling liquid channeled directly over these critical components.

NVIDIA H100 Tensor Core GPUs provide a giant leap in computing power, perfect for accelerated computing. Their ground-breaking increase in performance offers up to 30X more performance in certain applications, such as large language models for AI, and up to a 7X performance boost in HPC workloads like genome sequencing.

NVIDIA GH200 Superchip Aces MLPerf Inference Benchmarks

In its debut on the MLPerf industry benchmarks, the NVIDIA GH200 Grace Hopper Superchip ran all data center inference tests, extending the leading performance of NVIDIA H100 Tensor Core GPUs. The overall results showed the exceptional performance and versatility of the NVIDIA AI platform from the cloud to the network's edge. Separately, NVIDIA announced inference software that will give users leaps in performance, energy efficiency and total cost of ownership.

GH200 Superchips Shine in MLPerf
The GH200 links a Hopper GPU with a Grace CPU in one superchip. The combination provides more memory, bandwidth and the ability to automatically shift power between the CPU and GPU to optimize performance. Separately, NVIDIA HGX H100 systems that pack eight H100 GPUs delivered the highest throughput on every MLPerf Inference test in this round. Grace Hopper Superchips and H100 GPUs led across all MLPerf's data center tests, including inference for computer vision, speech recognition and medical imaging, in addition to the more demanding use cases of recommendation systems and the large language models (LLMs) used in generative AI.

Google Introduces Cloud TPU v5e and Announces A3 Instance Availability

We're at a once-in-a-generation inflection point in computing. The traditional ways of designing and building computing infrastructure are no longer adequate for the exponentially growing demands of workloads like generative AI and LLMs. In fact, the number of parameters in LLMs has increased by 10x per year over the past five years. As a result, customers need AI-optimized infrastructure that is both cost effective and scalable.

For two decades, Google has built some of the industry's leading AI capabilities: from the creation of Google's Transformer architecture that makes gen AI possible, to our AI-optimized infrastructure, which is built to deliver the global scale and performance required by Google products that serve billions of users like YouTube, Gmail, Google Maps, Google Play, and Android. We are excited to bring decades of innovation and research to Google Cloud customers as they pursue transformative opportunities in AI. We offer a complete solution for AI, from computing infrastructure optimized for AI to the end-to-end software and services that support the full lifecycle of model training, tuning, and serving at global scale.

Google Cloud and NVIDIA Expand Partnership to Advance AI Computing, Software and Services

Google Cloud Next—Google Cloud and NVIDIA today announced new AI infrastructure and software for customers to build and deploy massive models for generative AI and speed data science workloads.

In a fireside chat at Google Cloud Next, Google Cloud CEO Thomas Kurian and NVIDIA founder and CEO Jensen Huang discussed how the partnership is bringing end-to-end machine learning services to some of the largest AI customers in the world—including by making it easy to run AI supercomputers with Google Cloud offerings built on NVIDIA technologies. The new hardware and software integrations utilize the same NVIDIA technologies employed over the past two years by Google DeepMind and Google research teams.

NVIDIA H100 Tensor Core GPU Used on New Azure Virtual Machine Series Now Available

Microsoft Azure users can now turn to the latest NVIDIA accelerated computing technology to train and deploy their generative AI applications. Available today, the Microsoft Azure ND H100 v5 VMs, which use NVIDIA H100 Tensor Core GPUs and NVIDIA Quantum-2 InfiniBand networking, enable scaling generative AI, high performance computing (HPC) and other applications with a click from a browser. Available to customers across the U.S., the new instance arrives as developers and researchers are using large language models (LLMs) and accelerated computing to uncover new consumer and business use cases.

The NVIDIA H100 GPU delivers supercomputing-class performance through architectural innovations, including fourth-generation Tensor Cores, a new Transformer Engine for accelerating LLMs and the latest NVLink technology that lets GPUs talk to each other at 900 GB/s. The inclusion of NVIDIA Quantum-2 CX7 InfiniBand with 3,200 Gbps cross-node bandwidth ensures seamless performance across the GPUs at massive scale, matching the capabilities of top-performing supercomputers globally.

GIGABYTE Leads MLPerf Training v3.0 Benchmarks with Top-Performing Accelerators in GIGABYTE Servers

GIGABYTE Technology: The latest MLPerf Training v3.0 benchmark results are out, and the GIGABYTE G593-SD0 server has emerged as a leader in this round of testing. Going head-to-head against impressive systems, GIGABYTE's servers secured top positions in various categories, showcasing their prowess in handling real-world machine learning use cases. With an unparalleled focus on performance, efficiency, and reliability, GIGABYTE has once again proven its commitment to driving progress in the field of AI.

GIGABYTE, one of the founding members of MLCommons, has been actively contributing to the organization's efforts in designing and planning systems to benchmark fairly. Understanding the importance of replicating real-world scenarios in AI development, GIGABYTE's collaboration with MLCommons has been instrumental in shaping the benchmark tasks to encompass critical use cases such as image recognition, object detection, speech-to-text, natural language processing, and recommendation engines. By actively engaging with end applications, GIGABYTE ensures that its servers are designed to meet the highest standards, delivering supreme performance, and facilitating meaningful comparisons between different ML systems.

Dell Technologies Expands AI Offerings, in Collaboration with NVIDIA

Dell Technologies introduces new offerings to help customers quickly and securely build generative AI (GenAI) models on-premises to accelerate improved outcomes and drive new levels of intelligence. New Dell Generative AI Solutions, expanding upon May's Project Helix announcement, span IT infrastructure, PCs and professional services to simplify the adoption of full-stack GenAI with large language models (LLM), meeting organizations wherever they are in their GenAI journey. These solutions help organizations of all sizes and across industries securely transform and deliver better outcomes.

"Generative AI represents an inflection point that is driving fundamental change in the pace of innovation while improving the customer experience and enabling new ways to work," Jeff Clarke, vice chairman and co-chief operating officer, Dell Technologies, said on a recent investor call. "Customers, big and small, are using their own data and business context to train, fine-tune and inference on Dell infrastructure solutions to incorporate advanced AI into their core business processes effectively and efficiently."

NVIDIA H100 GPUs Now Available on AWS Cloud

AWS users can now access the leading performance demonstrated in industry benchmarks of AI training and inference. The cloud giant officially switched on a new Amazon EC2 P5 instance powered by NVIDIA H100 Tensor Core GPUs. The service lets users scale generative AI, high performance computing (HPC) and other applications with a click from a browser.

The news comes in the wake of AI's iPhone moment. Developers and researchers are using large language models (LLMs) to uncover new applications for AI almost daily. Bringing these new use cases to market requires the efficiency of accelerated computing. The NVIDIA H100 GPU delivers supercomputing-class performance through architectural innovations including fourth-generation Tensor Cores, a new Transformer Engine for accelerating LLMs and the latest NVLink technology that lets GPUs talk to each other at 900 GB/sec.

Comino Launches Water Block for NVIDIA H100 PCIe Accelerator Card

A relatively new player in the water cooling industry, Comino, has recently introduced its latest product: a water block for the NVIDIA H100 PCIe accelerator card. The new block provides full-coverage cooling for the GPU, memory, and VRM. In the design, Comino only used non-corrosive materials such as copper, stainless steel, aluminium, and plastic. The core of the block uses copper, while the frame and backplate use aluminium. The company claims that at a coolant temperature of 20°C, the temperature of the GH100 chip with Comino water blocks will be 30-40°C.

Comino uses "deformational cutting" technology, which can create copper fins as thin as 0.1 mm with a 0.1 mm channel and a 3 mm height. In Comino water blocks, the micro fins are optimized for a low pressure drop, with a thickness of 0.25 mm, a channel width of 0.25 mm, and a height of 2.7 mm. The block itself is a single-slot solution with fitting adapters on the back and a 90° adapter option for workstation implementation. More information is available on the Comino website.

Inflection AI Builds Supercomputer with 22,000 NVIDIA H100 GPUs

The AI hype continues to push hardware shipments, especially for servers with GPUs that are in very high demand. Another example is the latest feat of AI startup Inflection AI. Building foundational AI models, the Inflection AI crew has secured an order of 22,000 NVIDIA H100 GPUs and built a supercomputer. Assuming a configuration of a single Intel Xeon CPU with eight GPUs per node, almost 700 four-node racks should go into the supercomputer. Scaling and connecting 22,000 GPUs is easier than it is to acquire them, as NVIDIA's H100 GPUs are selling out everywhere due to the enormous demand for AI applications both on and off premises.

Getting 22,000 H100 GPUs is the biggest challenge here, and Inflection AI managed to get them by having NVIDIA as an investor in the startup. The supercomputer is estimated to cost around one billion USD and consume 31 megawatts of power. Inflection AI is valued at 1.5 billion USD at the time of writing.
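
The rack estimate and the per-GPU share of cost and power can be worked out from the figures in this story. A minimal sketch under the article's stated assumptions (eight GPUs per node, four nodes per rack); the per-GPU values are derived here and cover the whole system (CPUs, networking, cooling), not the GPUs alone.

```python
import math

# Figures and assumptions from the article
gpu_count = 22_000
gpus_per_node = 8      # assumed configuration: one Xeon CPU with eight GPUs
nodes_per_rack = 4
estimated_cost_usd = 1_000_000_000
estimated_power_watts = 31_000_000  # 31 megawatts

# Node and rack counts
nodes = gpu_count / gpus_per_node
racks = math.ceil(nodes / nodes_per_rack)

# Derived whole-system averages per GPU
cost_per_gpu_usd = estimated_cost_usd / gpu_count
power_per_gpu_watts = estimated_power_watts / gpu_count

print(f"Nodes: {nodes:,.0f}, racks: {racks}")
print(f"System cost per GPU: ${cost_per_gpu_usd:,.0f}")
print(f"System power per GPU: {power_per_gpu_watts:,.0f} W")
```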