News Posts matching #H100

NVIDIA A800 China-Tailored GPU Performance within 70% of A100

The recent growth in demand for training Large Language Models (LLMs) like Generative Pre-trained Transformer (GPT) has prompted many companies to invest in the GPU solutions used to train these models. However, countries like China are subject to US sanctions, so NVIDIA has to create custom models that meet US export regulations. The two resulting GPUs, H800 and A800, are cut-down versions of the original H100 and A100, respectively. We reported on the H800; however, it remains as mysterious as the A800 that we are discussing today. Thanks to MyDrivers, we have information that A800 GPU performance reaches about 70% of that of the regular A100.

The regular A100 GPU manages 9.7 TeraFLOPS of FP64, 19.5 TeraFLOPS of FP64 Tensor, and up to 624 BF16/FP16 TeraFLOPS with sparsity. Rough napkin math suggests that 70% of the original's performance (a 30% cut) would equal about 6.8 TeraFLOPS of FP64, 13.7 TeraFLOPS of FP64 Tensor, and 437 BF16/FP16 TeraFLOPS with sparsity. MyDrivers notes that the A800 can be had for 100,000 Yuan, translating to about 14,462 USD at the time of writing. This is not the most capable GPU that Chinese companies can acquire, as the H800 exists. However, we don't have any information about its performance for now.
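
The napkin math above can be reproduced with a short Python sketch. It assumes, as the estimate does, that the 70% figure applies uniformly to every precision format, which MyDrivers has not confirmed; the names `A100_TFLOPS`, `SCALE`, and `scale_specs` are ours, chosen for illustration.

```python
# A100 peak throughput figures quoted in the article, in TFLOPS.
A100_TFLOPS = {
    "FP64": 9.7,
    "FP64 Tensor": 19.5,
    "BF16/FP16 with sparsity": 624.0,
}

# Assumption: the reported 70% applies uniformly to every format.
SCALE = 0.70


def scale_specs(specs, factor):
    """Return a new dict with every throughput figure multiplied by factor."""
    return {name: value * factor for name, value in specs.items()}


a800_estimate = scale_specs(A100_TFLOPS, SCALE)

for name, value in a800_estimate.items():
    print(f"{name}: {value:.2f} TFLOPS (estimated)")
```

Rounded to the precision used in the article, the results line up with the 6.8, 13.7, and 437 TeraFLOPS figures quoted above. Real-world A800 numbers may differ per workload.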

NVIDIA DGX H100 Systems are Now Shipping

Customers from Japan to Ecuador and Sweden are using NVIDIA DGX H100 systems like AI factories to manufacture intelligence. They're creating services that offer AI-driven insights in finance, healthcare, law, IT and telecom—and working to transform their industries in the process. Among the dozens of use cases, one aims to predict how factory equipment will age, so tomorrow's plants can be more efficient.

Called Green Physics AI, it adds information like an object's CO2 footprint, age and energy consumption to SORDI.ai, which claims to be the largest synthetic dataset in manufacturing.

NVIDIA H100 Compared to A100 for Training GPT Large Language Models

NVIDIA's H100 has recently become available to use via Cloud Service Providers (CSPs), and it was only a matter of time before someone decided to benchmark its performance and compare it to the previous generation's A100 GPU. Today, thanks to the benchmarks of MosaicML, a startup company led by the ex-CEO of Nervana and GM of Artificial Intelligence (AI) at Intel, Naveen Rao, we have some comparison between these two GPUs with a fascinating insight about the cost factor. Firstly, MosaicML has taken Generative Pre-trained Transformer (GPT) models of various sizes and trained them using bfloat16 and FP8 Floating Point precision formats. All training occurred on CoreWeave cloud GPU instances.

Regarding performance, the NVIDIA H100 GPU achieved anywhere from a 2.2x to a 3.3x speedup over the A100. However, an interesting finding emerges when comparing the cost of running these GPUs in the cloud. CoreWeave prices the H100 SXM GPUs at $4.76/hr/GPU, while the A100 80 GB SXM is priced at $2.21/hr/GPU. While the H100 is roughly 2.2x more expensive per hour, its performance makes up for it, resulting in less time to train a model and a lower price for the training process. This makes the H100 more attractive for researchers and companies wanting to train Large Language Models (LLMs) and makes choosing the newer GPU more viable, despite the increased hourly cost. Below, you can see tables of comparison between two GPUs in training time, speedup, and cost of training.
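
To make the cost argument concrete, here is a minimal Python sketch using only the CoreWeave prices and the speedup range quoted above. It assumes the same number of GPUs in both runs and that the speedup translates directly into shorter wall-clock time; the function names are hypothetical and this is not MosaicML's own methodology.

```python
# Hourly CoreWeave prices quoted in the article, in USD per GPU.
H100_PRICE_PER_HOUR = 4.76
A100_PRICE_PER_HOUR = 2.21


def relative_training_cost(speedup,
                           new_price=H100_PRICE_PER_HOUR,
                           old_price=A100_PRICE_PER_HOUR):
    """Cost of a run on the H100 relative to the same run on the A100.

    Assumes equal GPU counts and that the H100 finishes `speedup`
    times faster. A value below 1.0 means the H100 run is cheaper.
    """
    return (new_price / old_price) / speedup


def breakeven_speedup(new_price=H100_PRICE_PER_HOUR,
                      old_price=A100_PRICE_PER_HOUR):
    """Speedup at which both GPUs cost the same for a given run."""
    return new_price / old_price


for speedup in (2.2, 3.3):
    ratio = relative_training_cost(speedup)
    print(f"{speedup}x speedup -> H100 run costs {ratio:.2f}x the A100 run")

print(f"Break-even speedup: {breakeven_speedup():.2f}x")
```

Under these assumptions, any speedup above the hourly price ratio makes the H100 the cheaper option for the whole run, which is consistent with the conclusion above.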

Gigabyte Extends Its Leading GPU Portfolio of Servers

Giga Computing, a subsidiary of GIGABYTE and an industry leader in high-performance servers, server motherboards, and workstations, today announced a lineup of powerful GPU-centric servers with the latest AMD and Intel CPUs, including NVIDIA HGX H100 servers with both 4-GPU and 8-GPU modules. With growing interest in HPC and AI applications, specifically generative AI (GAI), this breed of server relies heavily on GPU resources to tackle compute-heavy workloads that handle large amounts of data. With the advent of OpenAI's ChatGPT and other AI chatbots, large GPU clusters are being deployed with system-level optimization to train large language models (LLMs). These LLMs can be processed by GIGABYTE's new design-optimized systems that offer a high level of customization based on users' workloads and requirements.

The GIGABYTE G-series servers are built first and foremost to support dense GPU compute and the latest PCIe technology. Starting with the 2U servers, the new G293 servers can support up to 8 dual-slot GPUs or 16 single-slot GPUs, depending on the server model. For the ultimate in CPU and GPU performance, the 4U G493 servers offer plenty of networking options and storage configurations to go alongside support for eight (Gen 5 x16) GPUs. And for the highest level of GPU compute for HPC and AI, the G393 & G593 series support NVIDIA H100 Tensor Core GPUs. All these new two CPU socket servers are designed for either 4th Gen AMD EPYC processors or 4th Gen Intel Xeon Scalable processors.

NVIDIA H100 AI Performance Receives up to 54% Uplift with Optimizations

On Wednesday, the MLCommons team released the MLPerf 3.0 Inference numbers, and there was an exciting submission from NVIDIA. Reportedly, NVIDIA has used software optimization to improve the already staggering performance of its latest H100 GPU by up to 54%. For reference, NVIDIA's H100 GPU first appeared on MLPerf 2.1 back in September of 2022. In the six months since, NVIDIA engineers worked on AI optimizations for the MLPerf 3.0 release and found that basic software optimization can yield performance increases of anywhere from 7% to 54%. The suite of workloads for measuring inference speed included RNN-T speech recognition, 3D U-Net medical imaging, RetinaNet object detection, ResNet-50 object classification, DLRM recommendation, and BERT 99/99.9% natural language processing.

What is interesting is that NVIDIA's submission is a bit modified. There are open and closed categories that vendors can compete in, where the closed category requires a model that is mathematically equivalent to the reference neural network. In contrast, the open category is flexible and allows vendors to submit results based on optimizations for their hardware. The closed submission aims to provide an "apples-to-apples" hardware comparison. Given that NVIDIA opted to use the closed category, performance optimizations from other vendors such as Intel and Qualcomm are not accounted for here. Still, it is interesting that optimization can lead to a performance increase of up to 54% in NVIDIA's case with its H100 GPU. Another interesting takeaway is that some comparable hardware, like Qualcomm Cloud AI 100, Intel Xeon Platinum 8480+, and NeuChips's RecAccel N3000, failed to finish all the workloads. This is shown as "X" on the slides made by NVIDIA, stressing the need for proper ML system software support, which is NVIDIA's strength and a frequent marketing claim.

Chinese GPU Maker Biren Technology Loses its Co-Founder, Only Months After Revealing New GPUs

Golf Jiao, a co-founder and general manager of Biren Technology, left the company late last month, according to insider sources in China. No official statement has been issued by the executive team at Biren Tech, and Jiao has not provided any details regarding his departure from the fabless semiconductor design company. The Shanghai-based firm is a relatively new startup; it was founded in 2019 by several former NVIDIA, Qualcomm and Alibaba veterans. Biren Tech received $726.6 million in funding for its debut range of general-purpose graphics processing units (GPGPUs), also defined as high-performance computing graphics processing units (HPC GPUs).

The company revealed its ambitions to take on NVIDIA's Ampere A100 and Hopper H100 compute platforms, and last August announced two HPC GPUs in the form of the BR100 and BR104. The specifications and performance charts demonstrated impressive figures, but Biren Tech had to roll back its numbers when it was hit by U.S. Government-enforced sanctions in October 2022. The fabless company had contracted with TSMC to produce its Biren range, and the new set of rules resulted in shipments from the Taiwanese foundry being halted. Biren Tech cut its workforce by a third soon after losing its supply chain with TSMC, and the engineering team had to reassess how the BR100 and BR104 would perform on a process node larger than the original 7 nm design. It was decided that a downgrade in transfer rates would appease the legal teams, and get newly redesigned Biren silicon back onto the assembly line.

NVIDIA Prepares H800 Adaptation of H100 GPU for the Chinese Market

NVIDIA's H100 accelerator is one of the most powerful solutions for powering AI workloads. And, of course, every company and government wants to use it to power its AI workloads. However, in countries like China, shipment of US-made goods is challenging. With export regulations in place, NVIDIA had to get creative and make a specific version of its H100 GPU for the Chinese market, labeled the H800 model. Late last year, NVIDIA also created a China-specific version of the A100 model called A800, with the only difference being that the chip-to-chip interconnect bandwidth was dropped from 600 GB/s to 400 GB/s.

This year's H800 SKU also features similar restrictions, and the company appears to have made similar sacrifices for shipping its chips to China. From the 600 GB/s bandwidth of the regular H100 PCIe model, the H800 is gutted to only 300 GB/s of bi-directional chip-to-chip interconnect bandwidth. While we have no data on whether the CUDA or Tensor core count has been adjusted, the sacrifice of bandwidth to comply with export regulations will have consequences. As communication speed is reduced, training large models will incur higher latency and run slower than on the regular H100 chip. This is due to the massive amount of data that needs to travel from one chip to another. According to Reuters, an NVIDIA spokesperson declined to discuss other differences, stating that "our 800 series products are fully compliant with export control regulations."

NVIDIA Prepares H100 NVL GPUs With More Memory and SLI-Like Capability

NVIDIA has killed SLI on its graphics cards, disabling the possibility of connecting two or more GPUs to harness their power for gaming and other workloads. However, SLI is being reincarnated today in the form of a new H100 GPU model that sports higher memory capacity and higher performance. Called the H100 NVL, the GPU is a special-edition design based on the regular H100 PCIe version. What makes the H100 NVL version so special is the boost in memory capacity, now up from 80 GB in the standard model to 94 GB in the NVL edition SKU, for a total of 188 GB of HBM3 memory per pair, running on a 6144-bit bus. Being a special edition SKU, it is sold only in pairs, as these H100 NVL GPUs are paired together and are connected by three NVLink connectors on top. Installation requires two PCIe slots, separated by dual-slot spacing.

The new H100 NVL closes the performance gap between the H100 PCIe version and the H100 SXM version, as the card features a configurable TDP of up to 400 Watts per card. The H100 NVL uses the same Tensor and CUDA core configuration as the SXM edition, except it is placed on a PCIe slot and connected to another card. Being sold in pairs, OEMs can outfit their systems with either two or four pairs per certified system. You can see the specification table below, with information filled out by AnandTech. According to NVIDIA, the reason for this special edition SKU is the emergence of Large Language Models (LLMs) that require significant computational power to run. "Servers equipped with H100 NVL GPUs increase GPT-175B model performance up to 12X over NVIDIA DGX A100 systems while maintaining low latency in power-constrained data center environments," noted the company.

ASUS Announces NVIDIA-Certified Servers and ProArt Studiobook Pro 16 OLED at GTC

ASUS today announced its participation in NVIDIA GTC, a developer conference for the era of AI and the metaverse. ASUS will offer comprehensive NVIDIA-certified server solutions that support the latest NVIDIA L4 Tensor Core GPU, which accelerates real-time video AI and generative AI, as well as the NVIDIA BlueField-3 DPU, igniting unprecedented innovation for supercomputing infrastructure. ASUS will also launch the new ProArt Studiobook Pro 16 OLED laptop with the NVIDIA RTX 3000 Ada Generation Laptop GPU for mobile creative professionals.

Purpose-built GPU servers for generative AI
Generative AI applications enable businesses to develop better products and services, and deliver original content tailored to the unique needs of customers and audiences. ASUS ESC8000 and ESC4000 are fully certified NVIDIA servers that support up to eight NVIDIA L4 Tensor Core GPUs, which deliver universal acceleration and energy efficiency for AI with up to 2.7X more generative AI performance than the previous GPU generation. ASUS ESC and RS series servers are engineered for HPC workloads, with support for the NVIDIA BlueField-3 DPU to transform data center infrastructure, as well as NVIDIA AI Enterprise applications for streamlined AI workflows and deployment.

Supermicro Servers Now Featuring NVIDIA HGX and PCIe-Based H100 8-GPU Systems

Supermicro, Inc., a Total IT Solution Provider for AI/ML, Cloud, Storage, and 5G/Edge, today announced that it has begun shipping its top-of-the-line new GPU servers that feature the latest NVIDIA HGX H100 8-GPU system. Supermicro servers incorporate the new NVIDIA L4 Tensor Core GPU in a wide range of application-optimized servers from the edge to the data center.

"Supermicro offers the most comprehensive portfolio of GPU systems in the industry, including servers in 8U, 6U, 5U, 4U, 2U, and 1U form factors, as well as workstations and SuperBlade systems that support the full range of new NVIDIA H100 GPUs," said Charles Liang, president and CEO of Supermicro. "With our new NVIDIA HGX H100 Delta-Next server, customers can expect 9x performance gains compared to the previous generation for AI training applications. Our GPU servers have innovative airflow designs which reduce fan speeds, lower noise levels, and consume less power, resulting in a reduced total cost of ownership (TCO). In addition, we deliver complete rack-scale liquid-cooling options for customers looking to further future-proof their data centers."

Mitsui and NVIDIA Announce World's First Generative AI Supercomputer for Pharmaceutical Industry

Mitsui & Co., Ltd., one of Japan's largest business conglomerates, is collaborating with NVIDIA on Tokyo-1—an initiative to supercharge the nation's pharmaceutical leaders with technology, including high-resolution molecular dynamics simulations and generative AI models for drug discovery.

Announced today at the NVIDIA GTC global AI conference, the Tokyo-1 project features an NVIDIA DGX AI supercomputer that will be accessible to Japan's pharma companies and startups. The effort is poised to accelerate Japan's $100 billion pharma industry, the world's third largest following the U.S. and China.

NVIDIA Hopper GPUs Expand Reach as Demand for AI Grows

NVIDIA and key partners today announced the availability of new products and services featuring the NVIDIA H100 Tensor Core GPU—the world's most powerful GPU for AI—to address rapidly growing demand for generative AI training and inference. Oracle Cloud Infrastructure (OCI) announced the limited availability of new OCI Compute bare-metal GPU instances featuring H100 GPUs. Additionally, Amazon Web Services announced its forthcoming EC2 UltraClusters of Amazon EC2 P5 instances, which can scale in size up to 20,000 interconnected H100 GPUs. This follows Microsoft Azure's private preview announcement last week for its H100 virtual machine, ND H100 v5.

Additionally, Meta has now deployed its H100-powered Grand Teton AI supercomputer internally for its AI production and research teams. NVIDIA founder and CEO Jensen Huang announced during his GTC keynote today that NVIDIA DGX H100 AI supercomputers are in full production and will be coming soon to enterprises worldwide.

NVIDIA, ASML, TSMC and Synopsys Set Foundation for Next-Generation Chip Manufacturing

NVIDIA today announced a breakthrough that brings accelerated computing to the field of computational lithography, enabling semiconductor leaders like ASML, TSMC and Synopsys to accelerate the design and manufacturing of next-generation chips, just as current production processes are nearing the limits of what physics makes possible.

The new NVIDIA cuLitho software library for computational lithography is being integrated by TSMC, the world's leading foundry, as well as electronic design automation leader Synopsys into their software, manufacturing processes and systems for the latest-generation NVIDIA Hopper architecture GPUs. Equipment maker ASML is working closely with NVIDIA on GPUs and cuLitho, and is planning to integrate support for GPUs into all of its computational lithography software products.

NVIDIA to Lose Two Major HPC Partners in China, Focuses on Complying with Export Control Rules

NVIDIA's presence in high-performance computing has steadily increased, with various workloads benefiting from the company's AI and HPC accelerator GPUs. One of the important markets for the company is China, and export regulations are about to complicate NVIDIA's business dealings with the country. NVIDIA's major partners in the Asia Pacific region are Inspur and Huawei, which make servers powered by A100 and H100 GPU solutions. Amid the latest Biden Administration complications, the US is considering further limits on the export of US-designed goods to Chinese entities. Back in 2019, the US blacklisted Huawei and restricted the sales of the latest GPU hardware to the company. Last week, the Biden Administration also blacklisted Inspur, the world's third-largest server maker.

At the Morgan Stanley conference, NVIDIA's Chief Financial Officer Colette Kress noted that: "Inspur is a partner for us, when we indicate a partner, they are helping us stand up computing for the end customers. As we work forward, we will probably be working with other partners, for them to stand-up compute within the Asia-Pac region or even other parts of the world. But again, our most important focus is focusing on the law and making sure that we follow export controls very closely. So in this case, we will look in terms of other partners to help us." This indicates that NVIDIA will lose millions of dollars in revenue due to the inability to sell its GPUs to partners like Inspur. As the company stated, complying with the export regulations is the most crucial focus.

Shipments of AI Servers Will Climb at CAGR of 10.8% from 2022 to 2026

According to TrendForce's latest survey of the server market, many cloud service providers (CSPs) have begun large-scale investments in the kinds of equipment that support artificial intelligence (AI) technologies. This development is in response to the emergence of new applications such as self-driving cars, artificial intelligence of things (AIoT), and edge computing since 2018. TrendForce estimates that in 2022, AI servers that are equipped with general-purpose GPUs (GPGPUs) accounted for almost 1% of annual global server shipments. Moving into 2023, shipments of AI servers are projected to grow by 8% YoY thanks to ChatBot and similar applications generating demand across AI-related fields. Furthermore, shipments of AI servers are forecasted to increase at a CAGR of 10.8% from 2022 to 2026.
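
For readers who want to see what a 10.8% CAGR implies, the following Python sketch compounds the rate over the four year-over-year steps from 2022 to 2026. It only illustrates the arithmetic of compound growth; the function name `growth_multiplier` is ours, and the result is a relative multiplier rather than a unit forecast, since TrendForce's absolute shipment numbers are not given here.

```python
def growth_multiplier(annual_rate, years):
    """Total growth factor after compounding annual_rate for `years` years."""
    return (1.0 + annual_rate) ** years


CAGR = 0.108             # 10.8% compound annual growth rate from the article
YEARS = 2026 - 2022      # four year-over-year steps

multiplier = growth_multiplier(CAGR, YEARS)
print(f"2026 shipments relative to 2022: {multiplier:.2f}x")

# Year-by-year view, with 2022 shipments normalized to 1.0.
for offset in range(YEARS + 1):
    print(2022 + offset, f"{growth_multiplier(CAGR, offset):.3f}")
```

In other words, if the forecast holds, AI server shipments in 2026 would be roughly one and a half times their 2022 level.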

Next-Generation Dell PowerEdge Servers Deliver Advanced Performance and Energy Efficient Design

Dell Technologies expands the industry's top selling server portfolio, with an additional 13 next-generation Dell PowerEdge servers, designed to accelerate performance and reliability for powerful computing across core data centers, large-scale public clouds and edge locations. Next-generation rack, tower and multi-node PowerEdge servers, with 4th Gen Intel Xeon Scalable processors, include Dell software and engineering advancements, such as a new Smart Flow design, to improve energy and cost efficiency. Expanded Dell APEX capabilities will help organizations take an as-a-Service approach, allowing for more effective IT operations that make the most of compute resources while minimizing risk.

"Customers come to Dell for easily managed yet sophisticated and efficient servers with advanced capabilities to power their business-critical workloads," said Jeff Boudreau, president and general manager, Infrastructure Solutions Group, Dell Technologies. "Our next-generation Dell PowerEdge servers offer unmatched innovation that raises the bar in power efficiency, performance and reliability while simplifying how customers can implement a Zero Trust approach for greater security throughout their IT environments."

Hewlett Packard Enterprise Brings HPE Cray EX and HPE Cray XD Supercomputers to Enterprise Customers

Hewlett Packard Enterprise (NYSE: HPE) today announced it is making supercomputing accessible for more enterprises to harness insights, solve problems and innovate faster by delivering its world-leading, energy-efficient supercomputers in a smaller form factor and at a lower price point.

The expanded portfolio includes new HPE Cray EX and HPE Cray XD supercomputers, which are based on HPE's exascale innovation that delivers end-to-end, purpose-built technologies in compute, accelerated compute, interconnect, storage, software, and flexible power and cooling options. The supercomputers provide significant performance and AI-at-scale capabilities to tackle demanding, data-intensive workloads, speed up AI and machine learning initiatives, and accelerate innovation to deliver products and services to market faster.

Meta's Grand Teton Brings NVIDIA Hopper to Its Data Centers

Meta today announced its next-generation AI platform, Grand Teton, including NVIDIA's collaboration on design. Compared to the company's previous generation Zion EX platform, the Grand Teton system packs in more memory, network bandwidth and compute capacity, said Alexis Bjorlin, vice president of Meta Infrastructure Hardware, at the 2022 OCP Global Summit, an Open Compute Project conference.

AI models are used extensively across Facebook for services such as news feed, content recommendations and hate-speech identification, among many other applications. "We're excited to showcase this newest family member here at the summit," Bjorlin said in prepared remarks for the conference, adding her thanks to NVIDIA for its deep collaboration on Grand Teton's design and continued support of OCP.

CORSAIR Has Everything PC Builders Need for AMD Ryzen 7000

CORSAIR, a world leader in enthusiast components for gamers, creators, and PC builders, today announced its comprehensive product readiness for the newly available AMD Ryzen 7000 series of processors and the accompanying X670 and B650 chipset motherboards. From a new dedicated range of CORSAIR DDR5 memory for AMD platforms, to a huge lineup of award-winning CPU coolers, PC power supplies, cases and accessories, CORSAIR has the complete lineup of products to help enthusiasts build their new AMD-powered PC.

New AMD Ryzen 7000 processors and their supporting X670 and B650 chipset motherboards bring with them a huge change from past generations in the adoption of DDR5 memory, substantially increasing memory frequency versus DDR4. CORSAIR has a complete range of performance DDR5 created especially for AMD platforms, including the illustrious DOMINATOR PLATINUM RGB DDR5, the panoramically lit VENGEANCE RGB DDR5, or minimalist VENGEANCE DDR5. Available in a range of frequencies up to 6,000 MHz and capacities up to 64 GB, all CORSAIR DDR5 memory for AMD supports the new AMD EXPO (Extended Profiles for Overclocking) standard, offering single-setting-setup to ensure owners can easily run their memory at the speed it was created to run at.

NVIDIA Could Launch Hopper H100 PCIe GPU with 120 GB Memory

NVIDIA's high-performance computing hardware stack is now equipped with the top-of-the-line Hopper H100 GPU. It features 16896 or 14592 CUDA cores, depending on whether it comes in the SXM5 or PCIe variant, with the former being more powerful. Both variants come with a 5120-bit interface, with the SXM5 version using HBM3 memory running at 3.0 Gbps speed and the PCIe version using HBM2E memory running at 2.0 Gbps. Both versions have the same capacity, capped at 80 GB. However, that could soon change with the latest rumor suggesting that NVIDIA could be preparing a PCIe version of the Hopper H100 GPU with 120 GB of an unknown type of memory installed.

According to the Chinese website "s-ss.cc", the 120 GB variant of the H100 PCIe card will feature an entire GH100 chip with everything unlocked. As the site suggests, this version will improve memory capacity and performance over the regular H100 PCIe SKU. With HPC workloads increasing in size and complexity, larger memory capacity is needed for better performance. With the recent advances in Large Language Models (LLMs), AI workloads use trillions of parameters for training, most of which is done on GPUs like NVIDIA H100.

NVIDIA Introduces L40 Omniverse Graphics Card

During its GTC 2022 session, NVIDIA introduced its new generation of gaming graphics cards based on the novel Ada Lovelace architecture. Dubbed NVIDIA GeForce RTX 40 series, it brings various updates like more CUDA cores, a new DLSS 3 version, 4th generation Tensor cores, 3rd generation Ray Tracing cores, and much more, which you can read about here. However, today, we also got a new Ada Lovelace card intended for the data center. Called the L40, it updates NVIDIA's previous Ampere-based A40 design. While the NVIDIA website provides sparse details, the new L40 GPU uses 48 GB GDDR6 memory with ECC error correction. Using NVLink, you can get 96 GB of VRAM. The exact GPU SKU is unknown, but we assume that it uses AD102 with adjusted frequencies to lower the TDP and allow for passive cooling.

NVIDIA is calling this their Omniverse GPU, as it is a part of the push to separate its GPUs used for graphics and AI/HPC models. The "L" model in the current product stack is used to accelerate graphics, with display ports installed on the GPU, while the "H" models (H100) are there to accelerate HPC/AI installments where visual elements are a secondary task. This is a further separation of the entire GPU market, where the HPC/AI SKUs get their own architecture, and GPUs for graphics processing are built on a new architecture as well. You can see the specifications provided by NVIDIA below.

NVIDIA Ada's 4th Gen Tensor Core, 3rd Gen RT Core, and Latest CUDA Core at a Glance

Yesterday, NVIDIA launched its GeForce RTX 40-series, based on the "Ada" graphics architecture. We're yet to receive a technical briefing about the architecture itself, and the various hardware components that make up the silicon; but NVIDIA on its website gave us a first look at what's in store with the key number-crunching components of "Ada," namely the Ada CUDA core, 4th generation Tensor core, and 3rd generation RT core. Besides generational IPC and clock speed improvements, the latest CUDA core benefits from SER (shader execution reordering), an SM or GPC-level feature that reorders execution waves/threads to optimally load each CUDA core and improve parallelism.

Despite using specialized hardware such as the RT cores, the ray tracing pipeline still relies on CUDA cores and the CPU for a handful of tasks, and here NVIDIA claims that SER contributes to a 3X ray tracing performance uplift (the performance contribution of CUDA cores). With traditional raster graphics, SER contributes a meaty 25% performance uplift. With Ada, NVIDIA is introducing its 4th generation of Tensor core (after Volta, Turing, and Ampere). The Tensor cores deployed on Ada are functionally identical to the ones on the Hopper H100 Tensor Core HPC processor, featuring the new FP8 Transformer Engine, which delivers up to 5X the AI inference performance over the previous generation Ampere Tensor Core (which itself delivered a similar leap by leveraging sparsity).

Supermicro Expands Its NVIDIA-Certified Server Portfolio with New NVIDIA H100 Optimized GPU Systems

Super Micro Computer, Inc., a global leader in enterprise computing, GPUs, storage, networking solutions, and green computing technology, is extending its lead in accelerated compute infrastructure again with a full line of new systems optimized for NVIDIA H100 Tensor Core GPUs, encompassing over 20 product options. With a large portfolio of NVIDIA-Certified Systems, Supermicro is now leveraging the new NVIDIA H100 PCI-E and NVIDIA H100 SXM GPUs.

"Today, Supermicro introduced GPU-based servers with the new NVIDIA H100," said Charles Liang, president, and CEO of Supermicro. "We continue to offer the most comprehensive portfolio in the industry today and can deliver these systems in a range of sizes, including 8U, 5U, 4U, 2U, and 1U options. We also offer the latest GPU in our SuperBlades, workstations, and the Universal GPU systems. Customers can expect up to 30x performance gains for AI inferencing compared to previous GPU generations of accelerators for certain AI applications. Our GPU servers' innovative airflow designs result in reduced fan speeds, less power consumption, lower noise levels, and a lower total cost of ownership."

ASUS Servers Announce AI Developments at NVIDIA GTC

ASUS, the leading IT company in server systems, server motherboards and workstations, today announced its presence at NVIDIA GTC - a developer conference for the era of AI and the metaverse. ASUS will focus on three demonstrations outlining its strategic developments in AI, including: the methodology behind ASUS MLPerf Training v2.0 results that achieved multiple breakthrough records; a success story exploring the building of an academic AI data center at King Abdullah University of Science and Technology (KAUST) in Saudi Arabia; and a research AI data center created in conjunction with the National Health Research Institute in Taiwan.

MLPerf benchmark results help advance machine-learning performance and efficiency, allowing researchers to evaluate the efficacy of AI training and inference based on specific server configurations. Since joining MLCommons in 2021, ASUS has gained multiple breakthrough records in the data center closed division across six AI-benchmark tasks in AI training and inferencing MLPerf Training v2.0. At the ASUS GTC session, senior ASUS software engineers will share the methodology for achieving these world-class results—as well as the company's efforts to deliver more efficient AI workflows through machine learning.

NVIDIA Rush-Orders A100 and H100 AI-GPUs with TSMC Before US Sanctions Hit

Early this month, the US Government banned American companies from exporting AI-acceleration GPUs to China and Russia, but these restrictions don't take effect before March 2023. This gives NVIDIA time to take rush-orders from Chinese companies for its AI-accelerators before the sanctions hit. The company has placed "rush orders" for a large quantity of A100 "Ampere" and H100 "Hopper" chips with TSMC, so they could be delivered to firms in China before March 2023, according to a report by Chinese business news publication UDN. The rush-orders for high-margin products such as AI-GPUs could come as a shot in the arm for NVIDIA, which is facing a sudden loss in gaming GPU revenues, as those chips are no longer in demand from crypto-currency miners.