News Posts matching #H100


TOP500 Update: Frontier Remains No.1 With Aurora Coming in at No. 2

The 62nd edition of the TOP500 reveals that the Frontier system retains its top spot and is still the only exascale machine on the list. However, five new or upgraded systems have shaken up the Top 10.

Housed at the Oak Ridge National Laboratory (ORNL) in Tennessee, USA, Frontier leads the pack with an HPL score of 1.194 EFlop/s - unchanged from the June 2023 list. Frontier utilizes AMD EPYC 64C 2GHz processors and is based on the latest HPE Cray EX235a architecture. The system has a total of 8,699,904 combined CPU and GPU cores. Additionally, Frontier has an impressive power efficiency rating of 52.59 GFlops/watt and relies on HPE's Slingshot 11 network for data transfer.
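As a back-of-the-envelope check (an illustrative calculation, not a figure from the TOP500 report), the HPL score and efficiency rating quoted above imply Frontier's approximate power draw:

```python
# Implied power draw from the quoted HPL score and efficiency figures.
hpl_eflops = 1.194               # HPL score, EFlop/s
efficiency_gflops_per_w = 52.59  # reported efficiency, GFlops/watt

hpl_gflops = hpl_eflops * 1e9    # 1 EFlop/s = 1e9 GFlop/s
power_mw = hpl_gflops / efficiency_gflops_per_w / 1e6

print(f"Implied power draw: {power_mw:.1f} MW")  # about 22.7 MW
```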

Supermicro Expands AI Solutions with the Upcoming NVIDIA HGX H200 and MGX Grace Hopper Platforms Featuring HBM3e Memory

Supermicro, Inc., a Total IT Solution Provider for AI, Cloud, Storage, and 5G/Edge, is expanding its AI reach with upcoming support for the new NVIDIA HGX H200, built with H200 Tensor Core GPUs. Supermicro's industry-leading AI platforms, including 8U and 4U Universal GPU Systems, are drop-in ready for the HGX H200 in 8-GPU and 4-GPU configurations, featuring HBM3e memory with nearly 2x the capacity and 1.4x higher bandwidth compared to the NVIDIA H100 Tensor Core GPU. In addition, the broadest portfolio of Supermicro NVIDIA MGX systems supports the upcoming NVIDIA Grace Hopper Superchip with HBM3e memory. With unprecedented performance, scalability, and reliability, Supermicro's rack-scale AI solutions accelerate the performance of computationally intensive generative AI, large language model (LLM) training, and HPC applications while meeting the evolving demands of growing model sizes. Using its building-block architecture, Supermicro can quickly bring new technology to market, enabling customers to become more productive sooner.

Supermicro is also introducing the industry's highest-density server with NVIDIA HGX H100 8-GPU systems in a liquid-cooled 4U chassis, utilizing the latest Supermicro liquid cooling solution. The industry's most compact high-performance GPU server enables data center operators to reduce footprints and energy costs while offering the highest-performance AI training capacity available in a single rack. With the highest-density GPU systems, organizations can reduce their TCO by leveraging cutting-edge liquid cooling solutions.

GIGABYTE Demonstrates the Future of Computing at Supercomputing 2023 with Advanced Cooling and Scaled Data Centers

Giga Computing, a subsidiary of GIGABYTE Technology and an industry leader in high-performance servers, server motherboards, and workstations, continues to lead in cooling IT hardware efficiently and in developing diverse server platforms for Arm and x86 processors, as well as AI accelerators. At SC23, GIGABYTE (booth #355) will showcase some standout platforms, including those for the NVIDIA GH200 Grace Hopper Superchip and the next-gen AMD Instinct APU. To better introduce its extensive lineup of servers, GIGABYTE will address the most important needs in supercomputing data centers, such as how to cool high-performance IT hardware efficiently and how to power AI that is capable of real-time analysis and fast time to results.

Advanced Cooling
For many data centers, it is becoming apparent that their cooling infrastructure must radically shift to keep pace with new IT hardware that continues to generate more heat and requires rapid heat transfer. Because of this, GIGABYTE has launched advanced cooling solutions that allow IT hardware to maintain ideal performance while being more energy-efficient and maintaining the same data center footprint. At SC23, its booth will have a single-phase immersion tank, the A1P0-EA0, which offers a one-stop immersion cooling solution. GIGABYTE is experienced in implementing immersion cooling with immersion-ready servers, immersion tanks, oil, tools, and services spanning the globe. Another cooling solution showcased at SC23 will be direct liquid cooling (DLC), and in particular, the new GIGABYTE cold plates and cooling modules for the NVIDIA Grace CPU Superchip, NVIDIA Grace Hopper Superchip, AMD EPYC 9004 processor, and 4th Gen Intel Xeon processor.

ASUS Demonstrates AI and Immersion-Cooling Solutions at SC23

ASUS today announced a showcase of the latest AI solutions to empower innovation and push the boundaries of supercomputing, at Supercomputing 2023 (SC23) in Denver, Colorado, from 12-17 November, 2023. ASUS will demonstrate the latest AI advances, including generative-AI solutions and sustainability breakthroughs with Intel, to deliver the latest hybrid immersion-cooling solutions, plus lots more - all at booth number 257.

At SC23, ASUS will showcase the latest NVIDIA-qualified ESC N8A-E12 HGX H100 eight-GPU server, powered by dual-socket AMD EPYC 9004 processors and designed for enterprise-level generative AI with market-leading integrated capabilities. Following NVIDIA's announcement at SC23 of the NVIDIA H200 Tensor Core GPU, the first GPU to offer HBM3e for faster, larger memory to fuel the acceleration of generative AI and large language models, ASUS will offer an update of its H100-based system with an H200-based drop-in replacement in 2024.

NVIDIA Turbocharges Generative AI Training in MLPerf Benchmarks

NVIDIA's AI platform raised the bar for AI training and high performance computing in the latest MLPerf industry benchmarks. Among many new records and milestones, one in generative AI stands out: NVIDIA Eos - an AI supercomputer powered by a whopping 10,752 NVIDIA H100 Tensor Core GPUs and NVIDIA Quantum-2 InfiniBand networking - completed a training benchmark based on a GPT-3 model with 175 billion parameters trained on one billion tokens in just 3.9 minutes. That's a nearly 3x gain from 10.9 minutes, the record NVIDIA set when the test was introduced less than six months ago.

The benchmark uses a portion of the full GPT-3 data set behind the popular ChatGPT service that, by extrapolation, Eos could now train in just eight days, 73x faster than a prior state-of-the-art system using 512 A100 GPUs. The acceleration in training time reduces costs, saves energy and speeds time-to-market. It's heavy lifting that makes large language models widely available so every business can adopt them with tools like NVIDIA NeMo, a framework for customizing LLMs. In a new generative AI test ‌this round, 1,024 NVIDIA Hopper architecture GPUs completed a training benchmark based on the Stable Diffusion text-to-image model in 2.5 minutes, setting a high bar on this new workload. By adopting these two tests, MLPerf reinforces its leadership as the industry standard for measuring AI performance, since generative AI is the most transformative technology of our time.
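The speedup arithmetic quoted above can be reproduced directly from the figures in the article; the "nearly 3x" claim follows:

```python
# Speedup on the GPT-3 training benchmark, per the figures quoted above.
prev_record_min = 10.9   # NVIDIA's result when the test was introduced
eos_result_min = 3.9     # Eos result with 10,752 H100 GPUs

speedup = prev_record_min / eos_result_min
print(f"Speedup: {speedup:.2f}x")  # ~2.79x, i.e. "nearly 3x"
```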

GIGABYTE Announces New Direct Liquid Cooling (DLC) Multi-Node Servers Ahead of SC23

Giga Computing, a subsidiary of GIGABYTE Technology and an industry leader in high-performance servers, server motherboards, and workstations, today announced direct liquid cooling (DLC) multi-node servers for the NVIDIA Grace CPU Superchip and NVIDIA Grace Hopper Superchip. In addition, there is a DLC-ready Intel-based server for the NVIDIA HGX H100 8-GPU platform and a high-density server for AMD EPYC 9004 processors. For the ultimate in efficiency, there is also a new 12U single-phase immersion tank. All of these products will be at GIGABYTE booth #355 at SC23.

The newly announced high-density CPU servers include the Intel Xeon-based H263-S63-LAN1 and the AMD EPYC-based H273-Z80-LAN1. These 2U 4-node servers employ DLC for all eight CPUs, so despite the dense configuration, CPU performance reaches its full potential. In August, GIGABYTE announced new servers for the NVIDIA HGX H100 GPU, and it now adds a DLC version to the G593 series, the G593-SD0-LAX1, for the NVIDIA HGX H100 8-GPU.

NVIDIA NeMo: Designers Tap Generative AI for a Chip Assist

A research paper released this week describes ways generative AI can assist one of the most complex engineering efforts: designing semiconductors. The work demonstrates how companies in highly specialized fields can train large language models (LLMs) on their internal data to build assistants that increase productivity.

Few pursuits are as challenging as semiconductor design. Under a microscope, a state-of-the-art chip like an NVIDIA H100 Tensor Core GPU (above) looks like a well-planned metropolis, built with tens of billions of transistors, connected on streets 10,000x thinner than a human hair. Multiple engineering teams coordinate for as long as two years to construct one of these digital mega cities. Some groups define the chip's overall architecture, some craft and place a variety of ultra-small circuits, and others test their work. Each job requires specialized methods, software programs and computer languages.

Alphacool Unveils Single-Slot ES H100 80GB HBM PCIe Water Block for NVIDIA H100 GPU

Alphacool expands the portfolio of its Enterprise Solutions series of GPU water coolers and presents the new ES H100 80 GB HBM PCIe. To dissipate the enormous waste heat of this GPU generation in the best possible way, the cooler is positioned exceptionally close to the components to be cooled. Alphacool uses only copper for all water-bearing parts. Copper has almost twice the thermal conductivity of aluminium, making it a much better material choice for a water cooler. The fully chrome-plated copper base is resistant to acids, scratches, and damage. The matte carbon finish gives the cooler a classy appearance and, at the same time, makes it appealing to private users who prefer to do without aRGB lighting.

The water cooler is specially designed for use in narrow server cases. To save space in width and height, the connections have been moved to the back, which also allows for easier hose routing inside the server rack. Thanks to the very compact design, only one slot is required for mounting the cooler in the server rack instead of the 1.5 slots required before. This further reduction in required space is a strong argument for the ES Copper/Carbon GPU water cooler. Smart and efficient.

U.S. Restricts Exports of NVIDIA GeForce RTX 4090 to China

The GeForce RTX 4090 gaming graphics card, both as an NVIDIA first-party Founders Edition and as custom designs by AIC partners, undergoes assembly in China. A new U.S. Government trade regulation restricts NVIDIA from selling it in the Chinese domestic market. The enthusiast-segment graphics card joins several other high-performance AI processors, such as the "Hopper" H800 and "Ampere" A800. If you recall, the H800 and A800 are special China-specific variants of the H100 and A100, respectively, which come with performance reductions at the hardware level to fly below the AI processor performance limits set by the U.S. Government. The only reasons we can think of for these chips being on the list are that end-users in China have figured out ways around these performance limiters, or that they are buying at a great enough scale to achieve the desired performance. The fresh trade embargo, released on October 17, covers the A100, A800, H100, H800, L40, L40S, and RTX 4090.

Samsung Notes: HBM4 Memory is Coming in 2025 with New Assembly and Bonding Technology

According to a blog post published by SangJoon Hwang, Executive Vice President and Head of the DRAM Product & Technology Team at Samsung Electronics, High-Bandwidth Memory 4 (HBM4) is coming in 2025. In the recent timeline of HBM development, the first generation of HBM memory appeared in 2015 with the AMD Radeon R9 Fury X. The second-generation HBM2 appeared with the NVIDIA Tesla P100 in 2016, and the third-generation HBM3 saw the light of day with the NVIDIA Hopper GH100 GPU in 2022. Currently, Samsung has developed 9.8 Gbps HBM3E memory, which will start sampling to customers soon.

However, Samsung is more ambitious with its development timeline this time: the company expects to announce HBM4 in 2025, possibly with commercial products in the same calendar year. Interestingly, the HBM4 memory will feature technology optimized for high thermal properties, such as non-conductive film (NCF) assembly and hybrid copper bonding (HCB). NCF is a polymer layer that enhances the stability of micro bumps and TSVs in the chip, protecting the memory dies' solder bumps from shock. Hybrid copper bonding is an advanced semiconductor packaging method that creates direct copper-to-copper connections between semiconductor components, enabling high-density, 3D-like packaging. It offers high I/O density, enhanced bandwidth, and improved power efficiency. It uses a copper layer as the conductor and an oxide insulator instead of regular micro bumps to increase the connection density needed for HBM-like structures.

EK Fluid Works Enhances Portfolio with NVIDIA H100 GPU Integration

EK, the leading PC liquid cooling solutions provider, has expanded hardware support for its EK Fluid Works systems by integrating the state-of-the-art NVIDIA H100 PCIe Tensor Core GPU. NVIDIA's latest release, acclaimed for its unprecedented performance, scalability, and security across diverse workloads, has found its ultimate home in EK Fluid Works servers and workstations.

Notably, EK's commitment to sustainability transforms these systems into eco-friendlier platforms, unlocking the full potential of Large Language Models (LLM), machine learning, and AI model training. EK Fluid Works systems emerge as the top choice for those seeking the unleashed power of NVIDIA H100 Tensor Core GPUs, offering an impressive array of efficiency benefits, including:
  • Unparalleled returns on investment
  • The lowest total cost of operation (TCO/OpEx)
  • Minimal additional capital expenditure (CapEx)

Microsoft to Unveil Custom AI Chips to Fight NVIDIA's Monopoly

According to sources close to The Information, Microsoft is expected to unveil details about its upcoming custom silicon design for accelerating AI workloads. Allegedly, the chip announcement is scheduled for November during Microsoft's annual Ignite conference. Held in Seattle from November 14 to 17, the conference is expected to showcase all of the work that the company has been doing in the field of AI. The alleged launch of an AI chip will undoubtedly take center stage, as demand for AI accelerators has been so great that companies can't get their hands on GPUs. The sector is mainly dominated by NVIDIA, with its H100 and A100 GPUs powering most of the AI infrastructure worldwide.

With the launch of a custom AI chip codenamed Athena, Microsoft hopes to match or beat the performance of NVIDIA's offerings and reduce the cost of AI infrastructure. As the price of an H100 GPU can reach 30,000 US dollars, building a data center filled with H100s can cost hundreds of millions. That cost could be wound down using in-house chips, and Microsoft could become less dependent on NVIDIA to provide the backbone of the AI servers needed in the coming years. Nevertheless, we are excited to see what the company has prepared, and we will report on the Microsoft Ignite announcement in November.
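The cost arithmetic behind "hundreds of millions" is easy to illustrate; the fleet size below is a hypothetical round number, not a figure from the article:

```python
# Illustrative cost arithmetic for a large H100 deployment.
# The GPU count is an assumed round number for illustration only.
price_per_h100_usd = 30_000
assumed_gpu_count = 10_000

total_cost_usd = price_per_h100_usd * assumed_gpu_count
print(f"${total_cost_usd / 1e6:.0f} million")  # $300 million
```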

NVIDIA Announces Collaboration with Anyscale

Large language model development is about to reach supersonic speed thanks to a collaboration between NVIDIA and Anyscale. At its annual Ray Summit developers conference, Anyscale—the company behind the fast growing open-source unified compute framework for scalable computing—announced today that it is bringing NVIDIA AI to Ray open source and the Anyscale Platform. It will also be integrated into Anyscale Endpoints, a new service announced today that makes it easy for application developers to cost-effectively embed LLMs in their applications using the most popular open source models.

These integrations can dramatically speed generative AI development and efficiency while boosting security for production AI, from proprietary LLMs to open models such as Code Llama, Falcon, Llama 2, SDXL and more. Developers will have the flexibility to deploy open-source NVIDIA software with Ray or opt for NVIDIA AI Enterprise software running on the Anyscale Platform for a fully supported and secure production deployment. Ray and the Anyscale Platform are widely used by developers building advanced LLMs for generative AI applications capable of powering intelligent chatbots, coding copilots and powerful search and summarization tools.

Intel Shows Strong AI Inference Performance

Today, MLCommons published results of its MLPerf Inference v3.1 performance benchmark for GPT-J, the 6 billion parameter large language model, as well as computer vision and natural language processing models. Intel submitted results for Habana Gaudi 2 accelerators, 4th Gen Intel Xeon Scalable processors, and Intel Xeon CPU Max Series. The results show Intel's competitive performance for AI inference and reinforce the company's commitment to making artificial intelligence more accessible at scale across the continuum of AI workloads - from client and edge to the network and cloud.

"As demonstrated through the recent MLCommons results, we have a strong, competitive AI product portfolio, designed to meet our customers' needs for high-performance, high-efficiency deep learning inference and training, for the complete spectrum of AI models - from the smallest to the largest - with leading price/performance." -Sandra Rivera, Intel executive vice president and general manager of the Data Center and AI Group

TSMC Prediction: AI Chip Supply Shortage to Last ~18 Months

TSMC Chairman Mark Liu was asked to comment on all things artificial intelligence-related at the SEMICON Taiwan 2023 industry event. According to a Nikkei Asia report, he foresees supply constraints lasting until the tail end of 2024: "It's not the shortage of AI chips. It's the shortage of our chip-on-wafer-on-substrate (CoWoS) capacity...Currently, we can't fulfill 100% of our customers' needs, but we try to support about 80%. We think this is a temporary phenomenon. After our expansion of advanced chip packaging capacity, it should be alleviated in one and a half years." He cites a recent and very "sudden" spike in demand for CoWoS, with numbers tripling within the span of a year. Market leader NVIDIA relies on TSMC's advanced packaging system—most notably with the production of highly-prized A100 and H100 series Tensor Core compute GPUs.

These issues are deemed a "temporary" problem; it could take around 18 months to eliminate production output "bottlenecks." TSMC is racing to bolster its native activities with new facilities—plans for a new $2.9 billion advanced chip packaging plant (in Miaoli County) were disclosed over the summer. Liu reckons that industry-wide innovation is necessary to meet growing demand through new methods to "connect, package and stack chips." Liu elaborated: "We are now putting together many chips into a tightly integrated massive interconnect system. This is a paradigm shift in semiconductor technology integration." The TSMC boss reckons that processing units fielding over one trillion transistors are viable within the next decade: "it's through packaging with multiple chips that this could be possible."

Google Introduces Cloud TPU v5e and Announces A3 Instance Availability

We're at a once-in-a-generation inflection point in computing. The traditional ways of designing and building computing infrastructure are no longer adequate for the exponentially growing demands of workloads like generative AI and LLMs. In fact, the number of parameters in LLMs has increased by 10x per year over the past five years. As a result, customers need AI-optimized infrastructure that is both cost effective and scalable.

For two decades, Google has built some of the industry's leading AI capabilities: from the creation of Google's Transformer architecture that makes gen AI possible, to our AI-optimized infrastructure, which is built to deliver the global scale and performance required by Google products that serve billions of users like YouTube, Gmail, Google Maps, Google Play, and Android. We are excited to bring decades of innovation and research to Google Cloud customers as they pursue transformative opportunities in AI. We offer a complete solution for AI, from computing infrastructure optimized for AI to the end-to-end software and services that support the full lifecycle of model training, tuning, and serving at global scale.

Google Cloud and NVIDIA Expand Partnership to Advance AI Computing, Software and Services

Google Cloud Next—Google Cloud and NVIDIA today announced new AI infrastructure and software for customers to build and deploy massive models for generative AI and speed data science workloads.

In a fireside chat at Google Cloud Next, Google Cloud CEO Thomas Kurian and NVIDIA founder and CEO Jensen Huang discussed how the partnership is bringing end-to-end machine learning services to some of the largest AI customers in the world—including by making it easy to run AI supercomputers with Google Cloud offerings built on NVIDIA technologies. The new hardware and software integrations utilize the same NVIDIA technologies employed over the past two years by Google DeepMind and Google research teams.

Strong Cloud AI Server Demand Propels NVIDIA's FY2Q24 Data Center Business to Surpass 76% for the First Time

NVIDIA's latest financial report for FY2Q24 reveals that its data center business reached US$10.32 billion—a QoQ growth of 141% and YoY increase of 171%. The company remains optimistic about its future growth. TrendForce believes that the primary driver behind NVIDIA's robust revenue growth stems from its data center's AI server-related solutions. Key products include AI-accelerated GPUs and AI server HGX reference architecture, which serve as the foundational AI infrastructure for large data centers.

TrendForce further anticipates that NVIDIA will integrate its software and hardware resources. Utilizing a refined approach, NVIDIA will align its high-end, mid-tier, and entry-level GPU AI accelerator chips with various ODMs and OEMs, establishing a collaborative system certification model. Beyond accelerating the deployment of CSP cloud AI server infrastructures, NVIDIA is also partnering with entities like VMware on solutions including the Private AI Foundation. This strategy extends NVIDIA's reach into the edge enterprise AI server market, underpinning steady growth in its data center business for the next two years.

NVIDIA H100 Tensor Core GPU Used on New Azure Virtual Machine Series Now Available

Microsoft Azure users can now turn to the latest NVIDIA accelerated computing technology to train and deploy their generative AI applications. Available today, the Microsoft Azure ND H100 v5 VMs, using NVIDIA H100 Tensor Core GPUs and NVIDIA Quantum-2 InfiniBand networking, enable scaling of generative AI, high performance computing (HPC) and other applications with a click from a browser. Available to customers across the U.S., the new instance arrives as developers and researchers use large language models (LLMs) and accelerated computing to uncover new consumer and business use cases.

The NVIDIA H100 GPU delivers supercomputing-class performance through architectural innovations, including fourth-generation Tensor Cores, a new Transformer Engine for accelerating LLMs and the latest NVLink technology that lets GPUs talk to each other at 900 GB/s. The inclusion of NVIDIA Quantum-2 CX7 InfiniBand with 3,200 Gbps cross-node bandwidth ensures seamless performance across the GPUs at massive scale, matching the capabilities of top-performing supercomputers globally.
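The two interconnect figures above use different units; putting them on a common scale (a simple unit conversion, using the numbers quoted in the text):

```python
# Comparing intra-node and cross-node bandwidth on a common scale.
# NVLink is quoted in gigabytes/s, InfiniBand in gigabits/s.
nvlink_gb_per_s = 900      # intra-node NVLink, GB/s per GPU
ib_gbit_per_s = 3_200      # cross-node Quantum-2 CX7 InfiniBand, Gbps

ib_gb_per_s = ib_gbit_per_s / 8   # 8 bits per byte
print(f"Cross-node: {ib_gb_per_s:.0f} GB/s; intra-node: {nvlink_gb_per_s} GB/s")
```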

Inventec's C805G6 Data Center Solution Brings Sustainable Efficiency & Advanced Security for Powering AI

Inventec, a global leader in high-powered servers headquartered in Taiwan, is launching its cutting-edge C805G6 server for data centers, based on AMD's newest 4th Gen EPYC platform—a major innovation in computing power that provides double the operating efficiency of previous platforms. These innovations are timely, as the industry worldwide faces conflicting challenges: on one hand, a growing need to reduce carbon footprints and power consumption, and on the other, the push for ever-higher computing power and performance for AI. In fact, in 2022 MIT found that improving a machine learning model tenfold requires a 10,000-fold increase in computational requirements.
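One way to read the MIT figure above: 10,000 = 10^4, so compute scales roughly with the fourth power of model improvement. This is an illustrative reading of the quoted numbers, not a formula from the study itself:

```python
# Fourth-power scaling consistent with the figure quoted above:
# a 10x model improvement implying a 10,000x compute increase.
def compute_multiplier(performance_gain: float, exponent: float = 4.0) -> float:
    """Estimated compute increase for a given performance gain."""
    return performance_gain ** exponent

print(compute_multiplier(10))  # 10000.0
```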

Addressing both pain points, George Lin, VP of Business Unit VI, Inventec Enterprise Business Group (Inventec EBG) notes that, "Our latest C805G6 data center solution represents an innovation both for the present and the future, setting the standard for performance, energy efficiency, and security while delivering top-notch hardware for powering AI workloads."

New AI Accelerator Chips Boost HBM3 and HBM3e to Dominate 2024 Market

TrendForce reports that the HBM (High Bandwidth Memory) market's dominant product for 2023 is HBM2e, employed by the NVIDIA A100/A800, AMD MI200, and most CSPs' (Cloud Service Providers) self-developed accelerator chips. As the demand for AI accelerator chips evolves, manufacturers plan to introduce new HBM3e products in 2024, with HBM3 and HBM3e expected to become mainstream in the market next year.

The distinctions between HBM generations primarily lie in their speed. The industry experienced a proliferation of confusing names when transitioning to the HBM3 generation. TrendForce clarifies that the so-called HBM3 in the current market should be subdivided into two categories based on speed. One category includes HBM3 running at speeds between 5.6 to 6.4 Gbps, while the other features the 8 Gbps HBM3e, which also goes by several names including HBM3P, HBM3A, HBM3+, and HBM3 Gen2.
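The per-pin speeds above translate directly into per-stack bandwidth. A minimal sketch, assuming the standard 1024-bit HBM interface width (an assumption based on the JEDEC HBM design, not stated in the text):

```python
# Per-stack bandwidth implied by the pin speeds quoted above,
# assuming the standard 1024-bit HBM interface width.
BUS_WIDTH_BITS = 1024

def stack_bandwidth_gb_s(pin_speed_gbps: float) -> float:
    """Bandwidth of a single HBM stack in GB/s."""
    return pin_speed_gbps * BUS_WIDTH_BITS / 8

for name, speed in [("HBM3 @ 5.6 Gbps", 5.6),
                    ("HBM3 @ 6.4 Gbps", 6.4),
                    ("HBM3e @ 8 Gbps", 8.0)]:
    print(f"{name}: {stack_bandwidth_gb_s(speed):.1f} GB/s per stack")
```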

NVIDIA H100 GPUs Now Available on AWS Cloud

AWS users can now access the leading performance demonstrated in industry benchmarks of AI training and inference. The cloud giant officially switched on a new Amazon EC2 P5 instance powered by NVIDIA H100 Tensor Core GPUs. The service lets users scale generative AI, high performance computing (HPC) and other applications with a click from a browser.

The news comes in the wake of AI's iPhone moment. Developers and researchers are using large language models (LLMs) to uncover new applications for AI almost daily. Bringing these new use cases to market requires the efficiency of accelerated computing. The NVIDIA H100 GPU delivers supercomputing-class performance through architectural innovations including fourth-generation Tensor Cores, a new Transformer Engine for accelerating LLMs and the latest NVLink technology that lets GPUs talk to each other at 900 GB/sec.

Report Suggests NVIDIA Prioritizing H800 GPU Production For Chinese AI Market

NVIDIA could be adjusting its enterprise-grade GPU production strategy for the Chinese market, according to an article published by MyDriver—despite major sanctions placed on semiconductor imports, Team Green is doing plenty of business with tech firms operating in the region thanks to an uptick in AI-related activities. NVIDIA offers two market-specific accelerator models that have been cut down to conform to rules and regulations: the more powerful and expensive (250K RMB/~$35K) H800 is an adaptation of the western H100 GPU, while the A800 is a legal market alternative to the older A100.

The report proposes that NVIDIA is considering plans to reduce factory output of the A800 (sold for 100K RMB/~$14K per unit), so clients will be semi-forced into purchasing the higher-end H800 model instead (if they require a significant number of GPUs). The A800 seems to be the more popular choice for the majority of companies at the moment, with the heavy hitters—Alibaba, Baidu, Tencent, Jitwei and ByteDance—flexing their spending muscles and splurging on mixed shipments of the two accelerators. By limiting supplies of the lesser A800, Team Green could be generating more profit by prioritizing the more expensive (and readily available) model.

Comino Launches Water Block for NVIDIA H100 PCIe Accelerator Card

A relatively new player in the water cooling industry, Comino, has recently introduced its latest product: a water block for the NVIDIA H100 PCIe accelerator card. The new block provides full-coverage cooling for the GPU, memory, and VRM. In the design, Comino only used non-corrosive materials such as copper, stainless steel, aluminium, and plastic. The core of the block uses copper, while the frame and backplate use aluminium. The company claims that at a coolant temperature of 20°C, the temperature of the GH100 chip with Comino water blocks will be 30-40°C.

Comino uses "deformational cutting" technology to create copper fins as thin as 0.1 mm, with 0.1 mm channels and 3 mm height. In Comino water blocks, the micro fins are optimized for a low pressure drop, with a fin thickness of 0.25 mm, a channel width of 0.25 mm, and a height of 2.7 mm. The block itself is a single-slot solution with fitting adapters on the back and a 90° adapter option for workstation implementation. More information is available on the Comino website.

Inflection AI Builds Supercomputer with 22,000 NVIDIA H100 GPUs

The AI hype continues to push hardware shipments, especially for servers with GPUs, which are in very high demand. The latest example is the feat of AI startup Inflection AI. Building foundational AI models, the Inflection AI crew has secured an order of 22,000 NVIDIA H100 GPUs and is building a supercomputer. Assuming a configuration of a single Intel Xeon CPU with eight GPUs per node, almost 700 four-node racks should go into the supercomputer. Scaling and connecting 22,000 GPUs is easier than acquiring them, as NVIDIA's H100 GPUs are selling out everywhere due to the enormous demand for AI applications both on and off premises.
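The rack-count estimate above can be reproduced under the same assumptions the article makes: eight GPUs per node and four nodes per rack.

```python
import math

# Rack-count estimate for 22,000 GPUs, per the assumptions quoted above.
total_gpus = 22_000
gpus_per_node = 8
nodes_per_rack = 4

nodes = math.ceil(total_gpus / gpus_per_node)   # 2750 nodes
racks = math.ceil(nodes / nodes_per_rack)       # 688 racks, "almost 700"
print(f"{nodes} nodes across roughly {racks} racks")
```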

Getting 22,000 H100 GPUs is the biggest challenge here, and Inflection AI managed to get them by having NVIDIA as an investor in the startup. The supercomputer is estimated to cost around one billion USD and consume 31 megawatts of power. The Inflection AI startup is valued at 1.5 billion USD at the time of writing.