News Posts matching #H100


Frontier Remains As Sole Exaflop Machine on TOP500 List

Frontier improved its HPL score from 1.02 EFlop/s in November 2022 to an impressive 1.194 EFlop/s on this list, breaking the stagnation seen between June 2022 and November 2022. Considering exascale was only a goal to aspire to just a few years ago, a roughly 17% increase here is an enormous success. Additionally, Frontier earned a score of 9.95 EFlop/s on the HPL-MxP benchmark, which measures performance for mixed-precision calculation. This is also an increase over the 7.94 EFlop/s that the system achieved on the previous list, and nearly 10 times the machine's HPL score. Frontier is based on the HPE Cray EX235a architecture and utilizes AMD EPYC 64C 2 GHz processors. It has 8,699,904 cores, an incredible energy efficiency rating of 52.59 Gflops/watt, and relies on the HPE Slingshot-11 interconnect for data transfer.
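
As a quick sanity check on those figures, here is a minimal sketch that verifies the generational gain and the power draw implied by the efficiency rating (assuming the efficiency figure applies to the same HPL run, which TOP500 sometimes reports from a separate power-optimized run):

```python
# Sanity-check Frontier's reported TOP500 figures.
hpl_nov_2022 = 1.02e18   # HPL score in FLOP/s (November 2022 list)
hpl_jun_2023 = 1.194e18  # HPL score in FLOP/s (this list)
efficiency = 52.59e9     # 52.59 Gflops/watt -> FLOP/s per watt

gain = hpl_jun_2023 / hpl_nov_2022 - 1
print(f"HPL improvement: {gain:.1%}")  # ~17.1%, matching the rough 17% above

# Power draw implied by the efficiency rating during an HPL run.
power_mw = hpl_jun_2023 / efficiency / 1e6
print(f"Implied power draw: {power_mw:.1f} MW")  # ~22.7 MW
```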

NVIDIA Grace Drives Wave of New Energy-Efficient Arm Supercomputers

NVIDIA today announced a supercomputer built on the NVIDIA Grace CPU Superchip, adding to a wave of new energy-efficient supercomputers based on the Arm Neoverse platform. The Isambard 3 supercomputer to be based at the Bristol & Bath Science Park, in the U.K., will feature 384 Arm-based NVIDIA Grace CPU Superchips to power medical and scientific research, and is expected to deliver 6x the performance and energy efficiency of Isambard 2, placing it among Europe's most energy-efficient systems.

It will achieve about 2.7 petaflops of FP64 peak performance and consume less than 270 kilowatts of power, ranking it among the world's three greenest non-accelerated supercomputers. The project is being led by the University of Bristol, as part of the research consortium the GW4 Alliance, together with the universities of Bath, Cardiff and Exeter.
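
Those two numbers imply an efficiency figure that is easy to verify; a small sketch, assuming peak FP64 throughput at the quoted power ceiling:

```python
peak_fp64 = 2.7e15  # FLOP/s (2.7 petaflops FP64 peak)
power_w = 270e3     # watts (under 270 kilowatts)
print(f"{peak_fp64 / power_w / 1e9:.1f} Gflops/W")  # 10.0 Gflops/W at peak
```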

Supermicro Launches Industry's First NVIDIA HGX H100 8-GPU and 4-GPU Servers with Liquid Cooling

Supermicro, Inc., a Total IT Solution Provider for Cloud, AI/ML, Storage, and 5G/Edge, continues to expand its data center offerings with liquid-cooled NVIDIA HGX H100 rack-scale solutions. Advanced liquid cooling technologies developed entirely by Supermicro reduce the lead time for a complete installation, increase performance, and lower operating expenses while significantly reducing the power usage effectiveness (PUE) of data centers. Supermicro estimates power savings of 40% for a data center using its liquid cooling solutions compared to an air-cooled data center. In addition, direct cooling costs may be reduced by up to 86% compared to existing data centers.
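
The press release gives no PUE figures, but the relationship between PUE and facility power is simple to model. A minimal sketch with hypothetical PUE values (not Supermicro's numbers) showing how a lower PUE translates into facility-level savings:

```python
def facility_power(it_load_kw: float, pue: float) -> float:
    """Total facility power, since PUE = total power / IT power."""
    return it_load_kw * pue

it_load = 1000.0  # kW of IT equipment (hypothetical)
air = facility_power(it_load, 1.6)     # hypothetical air-cooled PUE
liquid = facility_power(it_load, 1.1)  # hypothetical liquid-cooled PUE
print(f"Overhead saved: {air - liquid:.0f} kW ({1 - liquid / air:.0%} of total)")
```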

"Supermicro continues to lead the industry supporting the demanding needs of AI workloads and modern data centers worldwide," said Charles Liang, president, and CEO of Supermicro. "Our innovative GPU servers that use our liquid cooling technology significantly lower the power requirements of data centers. With the amount of power required to enable today's rapidly evolving large scale AI models, optimizing TCO and the Total Cost to Environment (TCE) is crucial to data center operators. We have proven expertise in designing and building entire racks of high-performance servers. These GPU systems are designed from the ground up for rack scale integration with liquid cooling to provide superior performance, efficiency, and ease of deployments, allowing us to meet our customers' requirements with a short lead time."

Google Announces A3 Supercomputers with NVIDIA H100 GPUs, Purpose-built for AI

Implementing state-of-the-art artificial intelligence (AI) and machine learning (ML) models requires large amounts of computation, both to train the underlying models and to serve those models once they're trained. Given the demands of these workloads, a one-size-fits-all approach is not enough: you need infrastructure that's purpose-built for AI.

Together with our partners, we offer a wide range of compute options for ML use cases such as large language models (LLMs), generative AI, and diffusion models. Recently, we announced G2 VMs, making Google Cloud the first cloud to offer the new NVIDIA L4 Tensor Core GPUs for serving generative AI workloads. Today, we're expanding that portfolio with the private preview launch of the next-generation A3 GPU supercomputer. Google Cloud now offers a complete range of GPU options for training and inference of ML models.

NVIDIA A800 China-Tailored GPU Performance within 70% of A100

The recent growth in demand for training Large Language Models (LLMs) like the Generative Pre-trained Transformer (GPT) has sparked the interest of many companies in the GPU solutions used to train these models. However, US sanctions restrict what can be shipped to countries like China, so NVIDIA has to create custom models that meet US export regulations. Its two China-tailored GPUs, the H800 and A800, are cut-down versions of the original H100 and A100, respectively. We previously reported on the H800, but it has remained as mysterious as the A800 we are covering today. Thanks to MyDrivers, we now have information that the A800's performance is within 70% of the regular A100.

The regular A100 GPU manages 9.7 TeraFLOPS of FP64, 19.5 TeraFLOPS of FP64 Tensor, and up to 624 BF16/FP16 TeraFLOPS with sparsity. Rough napkin math suggests that 70% of the original performance (a 30% cut) would equal 6.8 TeraFLOPS of FP64, 13.7 TeraFLOPS of FP64 Tensor, and 437 BF16/FP16 TeraFLOPS with sparsity. MyDrivers notes that the A800 can be had for 100,000 Yuan, translating to about 14,462 USD at the time of writing. This is not the most capable GPU that Chinese companies can acquire, as the H800 exists; however, we don't have any information about its performance for now.
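
That estimate is simple to reproduce; a quick sketch of the same napkin math, using the A100 figures quoted above:

```python
# A100 baseline throughputs (TeraFLOPS) as cited above.
a100 = {"FP64": 9.7, "FP64 Tensor": 19.5, "BF16/FP16 (sparsity)": 624.0}

scale = 0.70  # A800 reportedly lands within 70% of A100 performance
for metric, tflops in a100.items():
    print(f"{metric}: {tflops * scale:.2f} TFLOPS")
# 6.79, 13.65, and 436.80 TFLOPS, matching the ~6.8 / 13.7 / 437 cited above.
```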

NVIDIA DGX H100 Systems are Now Shipping

Customers from Japan to Ecuador and Sweden are using NVIDIA DGX H100 systems like AI factories to manufacture intelligence. They're creating services that offer AI-driven insights in finance, healthcare, law, IT and telecom—and working to transform their industries in the process. Among the dozens of use cases, one aims to predict how factory equipment will age, so tomorrow's plants can be more efficient.

Called Green Physics AI, it adds information like an object's CO2 footprint, age and energy consumption to SORDI.ai, which claims to be the largest synthetic dataset in manufacturing.

NVIDIA H100 Compared to A100 for Training GPT Large Language Models

NVIDIA's H100 has recently become available via Cloud Service Providers (CSPs), and it was only a matter of time before someone benchmarked its performance and compared it to the previous generation's A100 GPU. Today, thanks to benchmarks from MosaicML, a startup led by Naveen Rao, the ex-CEO of Nervana and former GM of Artificial Intelligence (AI) at Intel, we have a comparison between these two GPUs, along with a fascinating insight into the cost factor. First, MosaicML took Generative Pre-trained Transformer (GPT) models of various sizes and trained them using the bfloat16 and FP8 floating-point precision formats. All training occurred on CoreWeave cloud GPU instances.

Regarding performance, the NVIDIA H100 GPU achieved a speedup of anywhere from 2.2x to 3.3x. However, an interesting finding emerges when comparing the cost of running these GPUs in the cloud. CoreWeave prices the H100 SXM GPUs at $4.76/hr/GPU, while the A100 80 GB SXM gets $2.21/hr/GPU pricing. While the H100 is about 2.2x more expensive per hour, its performance makes up for it, resulting in less time to train a model and a lower overall price for the training run. This inherently makes the H100 more attractive for researchers and companies wanting to train Large Language Models (LLMs), and makes choosing the newer GPU more viable despite the increased cost. Below, you can see tables comparing the two GPUs in training time, speedup, and cost of training.
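
The cost argument follows directly from the two hourly rates and the measured speedup. A small sketch, assuming training time scales inversely with the quoted speedup (the run length is a hypothetical example):

```python
def training_cost(hours_on_a100: float, speedup: float,
                  h100_rate: float = 4.76, a100_rate: float = 2.21):
    """Compare per-GPU cloud cost of one training run on A100 vs H100."""
    a100_cost = hours_on_a100 * a100_rate
    h100_cost = (hours_on_a100 / speedup) * h100_rate
    return a100_cost, h100_cost

for speedup in (2.2, 3.3):  # MosaicML's measured range
    a100_cost, h100_cost = training_cost(1000, speedup)
    print(f"{speedup}x: A100 ${a100_cost:,.0f} vs H100 ${h100_cost:,.0f}")
# At 2.2x the H100 run is already slightly cheaper ($2,164 vs $2,210);
# at 3.3x it is roughly a third cheaper ($1,442 vs $2,210).
```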

Gigabyte Extends Its Leading GPU Portfolio of Servers

Giga Computing, a subsidiary of GIGABYTE and an industry leader in high-performance servers, server motherboards, and workstations, today announced a lineup of powerful GPU-centric servers with the latest AMD and Intel CPUs, including NVIDIA HGX H100 servers with both 4-GPU and 8-GPU modules. With growing interest in HPC and AI applications, specifically generative AI (GAI), this breed of server relies heavily on GPU resources to tackle compute-heavy workloads that handle large amounts of data. With the advent of OpenAI's ChatGPT and other AI chatbots, large GPU clusters are being deployed with system-level optimization to train large language models (LLMs). These LLMs can be processed by GIGABYTE's new design-optimized systems that offer a high level of customization based on users' workloads and requirements.

The GIGABYTE G-series servers are built first and foremost to support dense GPU compute and the latest PCIe technology. Starting with the 2U servers, the new G293 servers can support up to 8 dual-slot GPUs or 16 single-slot GPUs, depending on the server model. For the ultimate in CPU and GPU performance, the 4U G493 servers offer plenty of networking options and storage configurations alongside support for eight (Gen 5 x16) GPUs. And for the highest level of GPU compute for HPC and AI, the G393 & G593 series support NVIDIA H100 Tensor Core GPUs. All these new dual-socket servers are designed for either 4th Gen AMD EPYC processors or 4th Gen Intel Xeon Scalable processors.

NVIDIA H100 AI Performance Receives up to 54% Uplift with Optimizations

On Wednesday, the MLCommons team released the MLPerf 3.0 Inference numbers, and there was an exciting submission from NVIDIA. Reportedly, NVIDIA has used software optimization to improve the already staggering performance of its latest H100 GPU by up to 54%. For reference, NVIDIA's H100 GPU first appeared on MLPerf 2.1 back in September of 2022. In just six months, NVIDIA engineers worked on AI optimizations for the MLPerf 3.0 release, finding that software optimization alone can deliver performance increases anywhere from 7% to 54%. The workloads in the inference suite included RNN-T speech recognition, 3D U-Net medical imaging, RetinaNet object detection, ResNet-50 object classification, DLRM recommendation, and BERT 99/99.9% natural language processing.

What is interesting is how the categories work. Vendors compete in open and closed categories: the closed category requires every submission to run a mathematically equivalent neural network, aiming to provide an "apples-to-apples" hardware comparison, while the open category is flexible and allows vendors to submit results based on optimizations for their hardware. Given that NVIDIA opted for the closed category, hardware-specific optimizations from other vendors such as Intel and Qualcomm are not accounted for here. Still, it is interesting that optimization can lead to a performance increase of up to 54% in NVIDIA's case with its H100 GPU. Another interesting takeaway is that some comparable hardware, like the Qualcomm Cloud AI 100, Intel Xeon Platinum 8480+, and NeuChips' ReccAccel N3000, failed to finish all the workloads. This is shown as "X" on the slides made by NVIDIA, stressing the need for proper ML system software support, which is NVIDIA's strength and a central marketing claim.

Chinese GPU Maker Biren Technology Loses its Co-Founder, Only Months After Revealing New GPUs

Golf Jiao, a co-founder and general manager of Biren Technology, left the company late last month, according to insider sources in China. No official statement has been issued by the executive team at Biren Tech, and Jiao has not provided any details regarding his departure from the fabless semiconductor design company. The Shanghai-based firm is a relatively new startup: it was founded in 2019 by several former NVIDIA, Qualcomm, and Alibaba veterans. Biren Tech received $726.6 million in funding for its debut range of general-purpose graphics processing units (GPGPUs), also described as high-performance computing graphics processing units (HPC GPUs).

The company revealed its ambitions to take on NVIDIA's Ampere A100 and Hopper H100 compute platforms, and last August announced two HPC GPUs in the form of the BR100 and BR104. The specifications and performance charts demonstrated impressive figures, but Biren Tech had to roll back its numbers when it was hit by U.S. government-enforced sanctions in October 2022. The fabless company had contracted with TSMC to produce its Biren range, and the new set of rules resulted in shipments from the Taiwanese foundry being halted. Biren Tech cut its workforce by a third soon after losing its supply chain with TSMC, and the engineering team had to reassess how the BR100 and BR104 would perform on a process node larger than the original 7 nm design. It was decided that a downgrade in transfer rates would appease the legal teams and get the newly redesigned Biren silicon back onto the assembly line.

NVIDIA Prepares H800 Adaptation of H100 GPU for the Chinese Market

NVIDIA's H100 accelerator is one of the most powerful solutions for powering AI workloads, and, of course, every company and government wants to use it. However, shipping US-made goods to countries like China is challenging. With export regulations in place, NVIDIA had to get creative and make a specific version of its H100 GPU for the Chinese market, labeled the H800. Late last year, NVIDIA also created a China-specific version of the A100 called the A800, with the only difference being the chip-to-chip interconnect bandwidth, dropped from 600 GB/s to 400 GB/s.

This year's H800 SKU features similar restrictions, and the company appears to have made similar sacrifices to ship its chips to China. From the 600 GB/s bandwidth of the regular H100 PCIe model, the H800 is cut down to only 300 GB/s of bi-directional chip-to-chip interconnect bandwidth. While we have no data on whether the CUDA or Tensor core counts have been adjusted, sacrificing bandwidth to comply with export regulations will have consequences: with reduced communication speed, training large models will incur higher latency and run slower than on the regular H100 chip, because of the massive amounts of data that must travel from one chip to another. According to Reuters, an NVIDIA spokesperson declined to discuss other differences, stating that "our 800 series products are fully compliant with export control regulations."
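
To get a feel for the impact, here is a toy model (hypothetical payload size, ignoring latency and protocol overhead) of how halved interconnect bandwidth stretches chip-to-chip transfer time:

```python
def transfer_seconds(gigabytes: float, bw_gb_per_s: float) -> float:
    """Idealized time to move a payload over a link of given GB/s bandwidth."""
    return gigabytes / bw_gb_per_s

payload_gb = 160.0  # hypothetical gradient/activation exchange per step
for name, bw in (("H100 (600 GB/s)", 600.0), ("H800 (300 GB/s)", 300.0)):
    print(f"{name}: {transfer_seconds(payload_gb, bw) * 1000:.0f} ms")
# Halving bandwidth doubles every chip-to-chip exchange in this simple model.
```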

NVIDIA Prepares H100 NVL GPUs With More Memory and SLI-Like Capability

NVIDIA has killed SLI on its graphics cards, removing the ability to connect two or more GPUs to harness their combined power for gaming and other workloads. However, SLI is seeing a reincarnation today in the form of a new H100 GPU model that sports higher memory capacity and higher performance. Called the H100 NVL, the GPU is a special-edition design based on the regular H100 PCIe version. What makes the H100 NVL version so special is the boost in memory capacity, up from 80 GB in the standard model to 94 GB per card, for a total of 188 GB of HBM3 memory, each card running on a 6144-bit bus. Being a special-edition SKU, it is sold only in pairs: the two H100 NVL GPUs are connected by three NVLink connectors on top, and installation requires two PCIe slots separated by dual-slot spacing.

The new H100 NVL also closes the performance gap with the H100 SXM version, as the card features a boosted, configurable TDP of up to 400 Watts per card. The H100 NVL uses the same Tensor and CUDA core configuration as the SXM edition, except it sits in a PCIe slot and is connected to a second card. Being sold in pairs, OEMs can outfit their systems with either two or four pairs per certified system. You can see the specification table below, with information filled out by AnandTech. As NVIDIA says, the need for this special-edition SKU stems from the emergence of Large Language Models (LLMs) that require significant computational power to run. "Servers equipped with H100 NVL GPUs increase GPT-175B model performance up to 12X over NVIDIA DGX A100 systems while maintaining low latency in power-constrained data center environments," noted the company.

ASUS Announces NVIDIA-Certified Servers and ProArt Studiobook Pro 16 OLED at GTC

ASUS today announced its participation in NVIDIA GTC, a developer conference for the era of AI and the metaverse. ASUS will offer comprehensive NVIDIA-certified server solutions that support the latest NVIDIA L4 Tensor Core GPU—which accelerates real-time video AI and generative AI—as well as the NVIDIA BlueField-3 DPU, igniting unprecedented innovation for supercomputing infrastructure. ASUS will also launch the new ProArt Studiobook Pro 16 OLED laptop with the NVIDIA RTX 3000 Ada Generation Laptop GPU for mobile creative professionals.

Purpose-built GPU servers for generative AI
Generative AI applications enable businesses to develop better products and services, and deliver original content tailored to the unique needs of customers and audiences. ASUS ESC8000 and ESC4000 are fully certified NVIDIA servers that support up to eight NVIDIA L4 Tensor Core GPUs, which deliver universal acceleration and energy efficiency for AI, with up to 2.7X more generative AI performance than the previous GPU generation. ASUS ESC and RS series servers are engineered for HPC workloads, with support for the NVIDIA BlueField-3 DPU to transform data center infrastructure, as well as NVIDIA AI Enterprise applications for streamlined AI workflows and deployment.

Supermicro Servers Now Featuring NVIDIA HGX and PCIe-Based H100 8-GPU Systems

Supermicro, Inc., a Total IT Solution Provider for AI/ML, Cloud, Storage, and 5G/Edge, today has announced that it has begun shipping its top-of-the-line new GPU servers that feature the latest NVIDIA HGX H100 8-GPU system. Supermicro servers incorporate the new NVIDIA L4 Tensor Core GPU in a wide range of application-optimized servers from the edge to the data center.

"Supermicro offers the most comprehensive portfolio of GPU systems in the industry, including servers in 8U, 6U, 5U, 4U, 2U, and 1U form factors, as well as workstations and SuperBlade systems that support the full range of new NVIDIA H100 GPUs," said Charles Liang, president and CEO of Supermicro. "With our new NVIDIA HGX H100 Delta-Next server, customers can expect 9x performance gains compared to the previous generation for AI training applications. Our GPU servers have innovative airflow designs which reduce fan speeds, lower noise levels, and consume less power, resulting in a reduced total cost of ownership (TCO). In addition, we deliver complete rack-scale liquid-cooling options for customers looking to further future-proof their data centers."

Mitsui and NVIDIA Announce World's First Generative AI Supercomputer for Pharmaceutical Industry

Mitsui & Co., Ltd., one of Japan's largest business conglomerates, is collaborating with NVIDIA on Tokyo-1—an initiative to supercharge the nation's pharmaceutical leaders with technology, including high-resolution molecular dynamics simulations and generative AI models for drug discovery.

Announced today at the NVIDIA GTC global AI conference, the Tokyo-1 project features an NVIDIA DGX AI supercomputer that will be accessible to Japan's pharma companies and startups. The effort is poised to accelerate Japan's $100 billion pharma industry, the world's third largest following the U.S. and China.

NVIDIA Hopper GPUs Expand Reach as Demand for AI Grows

NVIDIA and key partners today announced the availability of new products and services featuring the NVIDIA H100 Tensor Core GPU—the world's most powerful GPU for AI—to address rapidly growing demand for generative AI training and inference. Oracle Cloud Infrastructure (OCI) announced the limited availability of new OCI Compute bare-metal GPU instances featuring H100 GPUs. Additionally, Amazon Web Services announced its forthcoming EC2 UltraClusters of Amazon EC2 P5 instances, which can scale in size up to 20,000 interconnected H100 GPUs. This follows Microsoft Azure's private preview announcement last week for its H100 virtual machine, ND H100 v5.

Additionally, Meta has now deployed its H100-powered Grand Teton AI supercomputer internally for its AI production and research teams. NVIDIA founder and CEO Jensen Huang announced during his GTC keynote today that NVIDIA DGX H100 AI supercomputers are in full production and will be coming soon to enterprises worldwide.

NVIDIA, ASML, TSMC and Synopsys Set Foundation for Next-Generation Chip Manufacturing

NVIDIA today announced a breakthrough that brings accelerated computing to the field of computational lithography, enabling semiconductor leaders like ASML, TSMC and Synopsys to accelerate the design and manufacturing of next-generation chips, just as current production processes are nearing the limits of what physics makes possible.

The new NVIDIA cuLitho software library for computational lithography is being integrated by TSMC, the world's leading foundry, as well as electronic design automation leader Synopsys into their software, manufacturing processes and systems for the latest-generation NVIDIA Hopper architecture GPUs. Equipment maker ASML is working closely with NVIDIA on GPUs and cuLitho, and is planning to integrate support for GPUs into all of its computational lithography software products.

NVIDIA to Lose Two Major HPC Partners in China, Focuses on Complying with Export Control Rules

NVIDIA's presence in high-performance computing has steadily increased, with various workloads benefiting from the company's AI and HPC accelerator GPUs. One of the company's important markets is China, and export regulations are about to complicate NVIDIA's business dealings with the country. NVIDIA's major partners in the Asia-Pacific region are Inspur and Huawei, which make servers powered by A100 and H100 GPU solutions. Amid the latest round of restrictions, the Biden Administration is considering further limits on exports of US-designed goods to Chinese entities. Back in 2019, the US blacklisted Huawei and restricted sales of the latest GPU hardware to the company. Last week, the Biden Administration also blacklisted Inspur, the world's third-largest server maker.

At the Morgan Stanley conference, NVIDIA's Chief Financial Officer Colette Kress noted: "Inspur is a partner for us, when we indicate a partner, they are helping us stand up computing for the end customers. As we work forward, we will probably be working with other partners, for them to stand up compute within the Asia-Pac region or even other parts of the world. But again, our most important focus is focusing on the law and making sure that we follow export controls very closely. So in this case, we will look in terms of other partners to help us." This indicates that NVIDIA stands to lose millions of dollars in revenue due to the inability to sell its GPUs to partners like Inspur. As the company stated, complying with export regulations is its most crucial focus.

Shipments of AI Servers Will Climb at CAGR of 10.8% from 2022 to 2026

According to TrendForce's latest survey of the server market, many cloud service providers (CSPs) have begun large-scale investments in the kinds of equipment that support artificial intelligence (AI) technologies. This development is in response to the emergence of new applications such as self-driving cars, artificial intelligence of things (AIoT), and edge computing since 2018. TrendForce estimates that in 2022, AI servers equipped with general-purpose GPUs (GPGPUs) accounted for almost 1% of annual global server shipments. Moving into 2023, shipments of AI servers are projected to grow by 8% YoY thanks to chatbots and similar applications generating demand across AI-related fields. Furthermore, shipments of AI servers are forecast to increase at a CAGR of 10.8% from 2022 to 2026.
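
For reference, a CAGR compounds multiplicatively over the period; a quick sketch showing what 10.8% per year from 2022 to 2026 amounts to cumulatively:

```python
cagr, years = 0.108, 4  # 10.8% per year, 2022 -> 2026
growth = (1 + cagr) ** years
print(f"Cumulative shipment growth over {years} years: {growth - 1:.0%}")  # ~51%
```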

Next-Generation Dell PowerEdge Servers Deliver Advanced Performance and Energy Efficient Design

Dell Technologies expands the industry's top-selling server portfolio with an additional 13 next-generation Dell PowerEdge servers, designed to accelerate performance and reliability for powerful computing across core data centers, large-scale public clouds, and edge locations. Next-generation rack, tower, and multi-node PowerEdge servers, with 4th Gen Intel Xeon Scalable processors, include Dell software and engineering advancements, such as a new Smart Flow design, to improve energy and cost efficiency. Expanded Dell APEX capabilities will help organizations take an as-a-Service approach, allowing for more effective IT operations that make the most of compute resources while minimizing risk.

"Customers come to Dell for easily managed yet sophisticated and efficient servers with advanced capabilities to power their business-critical workloads," said Jeff Boudreau, president and general manager, Infrastructure Solutions Group, Dell Technologies. "Our next-generation Dell PowerEdge servers offer unmatched innovation that raises the bar in power efficiency, performance and reliability while simplifying how customers can implement a Zero Trust approach for greater security throughout their IT environments."

Hewlett Packard Enterprise Brings HPE Cray EX and HPE Cray XD Supercomputers to Enterprise Customers

Hewlett Packard Enterprise (NYSE: HPE) today announced it is making supercomputing accessible for more enterprises to harness insights, solve problems and innovate faster by delivering its world-leading, energy-efficient supercomputers in a smaller form factor and at a lower price point.

The expanded portfolio includes new HPE Cray EX and HPE Cray XD supercomputers, which are based on HPE's exascale innovation that delivers end-to-end, purpose-built technologies in compute, accelerated compute, interconnect, storage, software, and flexible power and cooling options. The supercomputers provide significant performance and AI-at-scale capabilities to tackle demanding, data-intensive workloads, speed up AI and machine learning initiatives, and accelerate innovation to deliver products and services to market faster.

Meta's Grand Teton Brings NVIDIA Hopper to Its Data Centers

Meta today announced its next-generation AI platform, Grand Teton, including NVIDIA's collaboration on design. Compared to the company's previous generation Zion EX platform, the Grand Teton system packs in more memory, network bandwidth and compute capacity, said Alexis Bjorlin, vice president of Meta Infrastructure Hardware, at the 2022 OCP Global Summit, an Open Compute Project conference.

AI models are used extensively across Facebook for services such as news feed, content recommendations and hate-speech identification, among many other applications. "We're excited to showcase this newest family member here at the summit," Bjorlin said in prepared remarks for the conference, adding her thanks to NVIDIA for its deep collaboration on Grand Teton's design and continued support of OCP.

CORSAIR Has Everything PC Builders Need for AMD Ryzen 7000

CORSAIR, a world leader in enthusiast components for gamers, creators, and PC builders, today announced its comprehensive product readiness for the newly available AMD Ryzen 7000 series of processors and the accompanying X670 and B650 chipset motherboards. From a new dedicated range of CORSAIR DDR5 memory for AMD platforms, to a huge lineup of award-winning CPU coolers, PC power supplies, cases and accessories, CORSAIR has the complete lineup of products to help enthusiasts build their new AMD-powered PC.

New AMD Ryzen 7000 processors and their supporting X670 and B650 chipset motherboards bring with them a huge change from past generations in the adoption of DDR5 memory, substantially increasing memory frequency versus DDR4. CORSAIR has a complete range of performance DDR5 created especially for AMD platforms, including the illustrious DOMINATOR PLATINUM RGB DDR5, the panoramically lit VENGEANCE RGB DDR5, and the minimalist VENGEANCE DDR5. Available in a range of frequencies up to 6,000 MT/s and capacities up to 64 GB, all CORSAIR DDR5 memory for AMD supports the new AMD EXPO (Extended Profiles for Overclocking) standard, offering single-setting setup to ensure owners can easily run their memory at the speed it was created to run at.

NVIDIA Could Launch Hopper H100 PCIe GPU with 120 GB Memory

NVIDIA's high-performance computing hardware stack is now topped by the Hopper H100 GPU. It features 16896 or 14592 CUDA cores, depending on whether it comes in the SXM5 or PCIe variant, with the former being more powerful. Both variants come with a 5120-bit memory interface, with the SXM5 version using HBM3 memory running at 3.0 Gbps and the PCIe version using HBM2E memory running at 2.0 Gbps. Both versions are capped at the same 80 GB capacity. However, that could soon change, with the latest rumor suggesting that NVIDIA could be preparing a PCIe version of the Hopper H100 GPU with 120 GB of an unknown type of memory installed.
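
Those bus widths and per-pin data rates imply the aggregate memory bandwidth directly; a minimal sketch, assuming bandwidth = bus width × per-pin rate (NVIDIA's official figures differ slightly from the rates quoted above):

```python
def hbm_bandwidth_gb_s(bus_bits: int, pin_gbps: float) -> float:
    """Aggregate memory bandwidth in GB/s from bus width and per-pin data rate."""
    return bus_bits * pin_gbps / 8

print(f"SXM5 (HBM3 @ 3.0 Gbps): {hbm_bandwidth_gb_s(5120, 3.0):.0f} GB/s")   # 1920
print(f"PCIe (HBM2E @ 2.0 Gbps): {hbm_bandwidth_gb_s(5120, 2.0):.0f} GB/s")  # 1280
```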

According to the Chinese website "s-ss.cc", the 120 GB variant of the H100 PCIe card will feature a fully unlocked GH100 chip. As the site suggests, this version will improve memory capacity and performance over the regular H100 PCIe SKU. With HPC workloads increasing in size and complexity, larger memory allocations are needed for better performance. With the recent advances in Large Language Models (LLMs), AI workloads use trillions of parameters for training, most of which is done on GPUs like the NVIDIA H100.

NVIDIA Introduces L40 Omniverse Graphics Card

During its GTC 2022 session, NVIDIA introduced its new generation of gaming graphics cards based on the novel Ada Lovelace architecture. Dubbed the NVIDIA GeForce RTX 40 series, it brings various updates like more CUDA cores, a new DLSS 3 version, 4th-generation Tensor cores, 3rd-generation Ray Tracing cores, and much more, which you can read about here. However, today we also got a new Ada Lovelace card intended for the data center. Called the L40, it updates NVIDIA's previous Ampere-based A40 design. While the NVIDIA website provides only sparse details, the new L40 GPU uses 48 GB of GDDR6 memory with ECC error correction, and with NVLink you can get 96 GB of VRAM. NVIDIA has not named the underlying SKU, but we assume it uses AD102 with adjusted frequencies to lower the TDP and allow for passive cooling.

NVIDIA is calling this its Omniverse GPU, as it is part of the push to separate GPUs used for graphics from those used for AI/HPC workloads. The "L" models in the current product stack accelerate graphics, with display outputs installed on the GPU, while the "H" models (H100) accelerate HPC/AI installations where visual output is a secondary task. This further splits the GPU market, with HPC/AI SKUs getting their own architecture while GPUs for graphics processing are built on a new architecture as well. You can see the specifications provided by NVIDIA below.