News Posts matching #H200


China Plans to Deploy 115,000 NVIDIA AI GPUs Across 36 Data Centers

by

Thursday, 03:27 Discuss (26 Comments)

China's leading AI companies have unveiled plans to construct a massive network of 36 data centers across China's Western deserts, which will house over 115,000 NVIDIA AI processors. According to a Bloomberg analysis, which gained insight into investment approvals and tender documents, the critical location for this effort is a complex facility situated near Yiwu in the Xinjiang region, selected for its suitable wind and solar resources, coal reserves, and cooler high-altitude climate. Chinese AI labs aim to deploy NVIDIA's flagship H100 and H200 GPUs, which are primarily sourced from third-party suppliers. One Chinese firm proposes an initial phase of 625 H100 servers, equivalent to roughly 2,000 chips, with additional phases to follow.

A key obstacle to this great vision is US export controls that prohibit the sale of NVIDIA's most advanced processors to China without special licenses, which have not been granted. None of the firms or government spokespersons Bloomberg contacted have explained how they will acquire these embargoed GPUs, and experts familiar with trade enforcement and underground AI chip markets express skepticism that such a large volume of cutting-edge chips could be smuggled undetected. If direct access to NVIDIA hardware remains limited, Chinese chipmakers such as Huawei may fill part of the demand. Construction in Yiwu presses on regardless, where a slogan painted on a nearby hillside proclaims "data‑electricity fusion shows great promise," reflecting China's determined push to take a leading role in the next generation of AI innovation, despite the hardware slowdown.


Alphacool Releases New ES 1-Slot GPU Water Cooler

Press Release by

Jun 19th, 2025 03:43 Discuss (0 Comments)

Alphacool International GmbH from Braunschweig is a pioneer in PC water cooling technology. With one of the industry's most comprehensive product portfolios and over 20 years of experience, Alphacool is now expanding its Enterprise Solutions series with the new ES 1-Slot GPU water cooler for the NVIDIA H200 141 GB - a cooling solution specifically designed for professional use.

The cooler impresses with its compact 1-slot design, making it ideal for use in racks and cases with limited space. The space-saving rear-facing port layout simplifies integration into existing water cooling loops - even under demanding installation conditions. The cooling block is made from high-quality, chrome-plated copper, which is significantly more durable than conventional nickel plating. This combination offers reliable protection against corrosion, scratches, and thermal stress - ideal for continuous 24/7 operation in professional environments.



MSI Powers AI's Next Leap for Enterprises at ISC 2025

Press Release by

Jun 11th, 2025 02:21 Discuss (0 Comments)

MSI, a global leader in high-performance server solutions, is showcasing its enterprise-grade, high-performance server platforms at ISC 2025, taking place June 10-12 at booth #E12. Built on standardized and modular architectures, MSI's AI servers are designed to power next-generation AI and accelerated computing workloads, enabling enterprises to rapidly advance their AI innovations.

"As AI workloads continue to grow and evolve toward inference-driven applications, we're seeing a significant shift in how enterprises approach AI deployment," said Danny Hsu, General Manager of Enterprise Platform Solutions at MSI. "With modular and standards-based architectures, enterprise data centers can now adopt AI technologies more quickly and cost-effectively than ever before. This marks a new era where AI is not only powerful but also increasingly accessible to businesses of all sizes."


Report: Customers Show Little Interest in AMD Instinct MI325X Accelerators

by

May 13th, 2025 02:25 Discuss (22 Comments)

AMD's Instinct MI325X accelerator has struggled to gain traction with large customers, according to extensive data from SemiAnalysis. Launched in Q2 2025, the MI325X arrived roughly nine months after NVIDIA's H200 and concurrently with NVIDIA's "Blackwell" mass-production roll-out. That timing proved unfavourable, as many buyers opted instead for Blackwell's superior cost-per-performance ratio. Early interest from Microsoft in 2024 failed to translate into repeat orders. After the initial test purchases, Microsoft did not place any further commitments. In response, AMD reduced its margin expectations in an effort to attract other major clients. Oracle and a handful of additional hyperscalers have since expressed renewed interest, but these purchases remain modest compared with NVIDIA's volume.

A fundamental limitation of the MI325X is its eight-GPU scale-up capacity. By contrast, NVIDIA's rack-scale GB200 NVL72 supports up to 72 GPUs in a single cluster. For large-scale AI inference and frontier-level reasoning workloads, that difference is decisive. AMD positioned the MI325X against NVIDIA's air-cooled HGX B200 NVL8 and HGX B300 NVL16 modules. Even in that non-rack-scale segment, NVIDIA maintains an advantage in both raw performance and total-cost-of-ownership efficiency. Nonetheless, there remains potential for the MI325X in smaller-scale deployments that do not require extensive GPU clusters. Eight-GPU clusters should be sufficient for smaller-model inference, where memory bandwidth and capacity are the primary needs. AMD continues to improve its software ecosystem and maintain competitive pricing, so AI labs developing mid-sized AI models may find the MI325X appealing.

GIGABYTE to Present End-to-End AI Portfolio at COMPUTEX 2025

Press Release by

May 5th, 2025 05:11 Discuss (0 Comments)

GIGABYTE Technology, a global leader in computing innovation, will return to COMPUTEX 2025 from May 20 to 23 under the theme "Omnipresence of Computing: AI Forward." The company will demonstrate how its complete spectrum of solutions spanning the AI lifecycle, from data center training to edge deployment and end-user applications, reshapes infrastructure to meet next-gen AI demands.

As generative AI continues to evolve, so do the demands for handling massive token volumes, real-time data streaming, and high-throughput compute environments. GIGABYTE's end-to-end portfolio - ranging from rack-scale infrastructure to servers, cooling systems, embedded platforms, and personal computing - forms the foundation to accelerate AI breakthroughs across industries.


NVIDIA Blackwell Takes Pole Position in Latest MLPerf Inference Results

Press Release by

Apr 2nd, 2025 10:52 Discuss (1 Comment)

In the latest MLPerf Inference V5.0 benchmarks, which reflect some of the most challenging inference scenarios, the NVIDIA Blackwell platform set records - and marked NVIDIA's first MLPerf submission using the NVIDIA GB200 NVL72 system, a rack-scale solution designed for AI reasoning. Delivering on the promise of cutting-edge AI takes a new kind of compute infrastructure, called AI factories. Unlike traditional data centers, AI factories do more than store and process data - they manufacture intelligence at scale by transforming raw data into real-time insights. The goal for AI factories is simple: deliver accurate answers to queries quickly, at the lowest cost and to as many users as possible.

The complexity of pulling this off is significant and takes place behind the scenes. As AI models grow to billions and trillions of parameters to deliver smarter replies, the compute required to generate each token increases. This requirement reduces the number of tokens that an AI factory can generate and increases cost per token. Keeping inference throughput high and cost per token low requires rapid innovation across every layer of the technology stack, spanning silicon, network systems and software.


Quantum Machines Announces NVIDIA DGX Quantum Early Access Program

Press Release by

Apr 1st, 2025 12:59 Discuss (0 Comments)

Quantum Machines (QM), the leading provider of advanced quantum control solutions, has recently announced the NVIDIA DGX Quantum Early Customer Program, with a cohort of six leading research groups and quantum computer builders. NVIDIA DGX Quantum, a reference architecture jointly developed by NVIDIA and QM, is the first tightly integrated quantum-classical computing solution, designed to unlock new frontiers in quantum computing research and development. As quantum computers scale, their reliance on classical resources for essential operations, such as quantum error correction (QEC) and parameter drift compensation, grows exponentially. NVIDIA DGX Quantum provides access to the classical acceleration needed to support this progress, advancing the path toward practical quantum supercomputers.

NVIDIA DGX Quantum leverages OPX1000, the best-in-class, modular high-density hybrid control platform, seamlessly interfacing with NVIDIA GH200 Grace Hopper Superchips. This solution brings accelerated computing into the heart of the quantum computing stack for the first time, achieving an ultra-low round-trip latency of less than 4 µs between quantum control and AI supercomputers - faster than any other approach. The NVIDIA DGX Quantum Early Customer Program is now underway, with selected leading academic institutions, national labs, and commercial quantum computer builders participating. These include the Engineering Quantum Systems group (equs.mit.edu) led by MIT Professor William D. Oliver, the Israeli Quantum Computing Center (IQCC), quantum hardware developer Diraq, the Quantum Circuit group (led by Ecole Normale Supérieure de Lyon Professor Benjamin Huard), and more.


Lenovo Announces Hybrid AI Advantage with NVIDIA Blackwell Support

Press Release by

Mar 19th, 2025 03:46 Discuss (7 Comments)

Today, at NVIDIA GTC, Lenovo unveiled new Lenovo Hybrid AI Advantage with NVIDIA solutions designed to accelerate AI adoption and boost business productivity by fast-tracking agentic AI that can reason, plan and take action to reach goals faster. The validated, full-stack AI solutions enable enterprises to quickly build and deploy AI agents for a broad range of high-demand use cases, increasing productivity, agility and trust while accelerating the next wave of AI reasoning for the new era of agentic AI.

New global IDC research commissioned by Lenovo reveals that ROI remains the greatest AI adoption barrier, despite a three-fold spend increase. AI agents are revolutionizing enterprise workflows and lowering barriers to ROI by supporting employees with complex problem-solving, coding, and multistep planning that drives speed, innovation and productivity. As CIOs and business leaders seek tangible return on AI investment, Lenovo is delivering hybrid AI solutions that unleash and customize agentic AI at every scale.


Supermicro Expands Enterprise AI Portfolio With Support for Upcoming NVIDIA RTX PRO 6000 Blackwell Server Edition and NVIDIA H200 NVL Platform

Press Release by

Mar 19th, 2025 03:40 Discuss (0 Comments)

Supermicro, Inc., a Total IT Solution Provider for AI/ML, HPC, Cloud, Storage, and 5G/Edge, today announced support for the new NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs on a range of workload-optimized GPU servers and workstations. Specifically optimized for the NVIDIA Blackwell generation of PCIe GPUs, the broad range of Supermicro servers will enable more enterprises to leverage accelerated computing for LLM-inference and fine-tuning, agentic AI, visualization, graphics & rendering, and virtualization. Many Supermicro GPU-optimized systems are NVIDIA Certified, guaranteeing compatibility and support for NVIDIA AI Enterprise to simplify the process of developing and deploying production AI.

"Supermicro leads the industry with its broad portfolio of application optimized GPU servers that can be deployed in a wide range of enterprise environments with very short lead times," said Charles Liang, president and CEO of Supermicro. "Our support for the NVIDIA RTX PRO 6000 Blackwell Server Edition GPU adds yet another dimension of performance and flexibility for customers looking to deploy the latest in accelerated computing capabilities from the data center to the intelligent edge. Supermicro's broad range of PCIe GPU-optimized products also support NVIDIA H200 NVL in 2-way and 4-way NVIDIA NVLink configurations to maximize inference performance for today's state-of-the-art AI models, as well as accelerating HPC workloads."


Dell Technologies Accelerates Enterprise AI Innovation from PC to Data Center with NVIDIA 

Press Release by

Mar 19th, 2025 03:19 Discuss (0 Comments)

Marking one year since the launch of the Dell AI Factory with NVIDIA, Dell Technologies (NYSE: DELL) announces new AI PCs, infrastructure, software and services advancements to accelerate enterprise AI innovation at any scale. Successful AI deployments are vital for enterprises to remain competitive, but challenges like system integration and skill gaps can delay the value enterprises realize from AI. More than 75% of organizations want their infrastructure providers to deliver capabilities across all aspects of the AI adoption journey, driving customer demand for simplified AI deployments that can scale.

As the top provider of AI centric infrastructure, Dell Technologies - in collaboration with NVIDIA - provides a consistent experience across AI infrastructure, software and services, offering customers a one-stop shop to scale AI initiatives from deskside to large-scale data center deployments.


NVIDIA Accelerates Science and Engineering With CUDA-X Libraries Powered by GH200 and GB200 Superchips

Press Release by

Mar 18th, 2025 12:51 Discuss (1 Comment)

Scientists and engineers of all kinds are equipped to solve tough problems a lot faster with NVIDIA CUDA-X libraries powered by NVIDIA GB200 and GH200 superchips. As announced today at the NVIDIA GTC global AI conference, developers can now take advantage of tighter automatic integration and coordination between CPU and GPU resources - enabled by CUDA-X working with these latest superchip architectures - resulting in up to 11x speedups for computational engineering tools and 5x larger calculations compared with using traditional accelerated computing architectures.

This greatly accelerates and improves workflows in engineering simulation, design optimization and more, helping scientists and researchers reach groundbreaking results faster. NVIDIA released CUDA in 2006, opening up a world of applications to the power of accelerated computing. Since then, NVIDIA has built more than 900 domain-specific NVIDIA CUDA-X libraries and AI models, making it easier to adopt accelerated computing and driving incredible scientific breakthroughs. Now, CUDA-X brings accelerated computing to a broad new set of engineering disciplines, including astronomy, particle physics, quantum physics, automotive, aerospace and semiconductor design.


Global Top 10 IC Design Houses See 49% YoY Growth in 2024, NVIDIA Commands Half the Market

Press Release by

Mar 17th, 2025 04:42 Discuss (20 Comments)

TrendForce reveals that the combined revenue of the world's top 10 IC design houses reached approximately US$249.8 billion in 2024, marking a 49% YoY increase. The booming AI industry has fueled growth across the semiconductor sector, with NVIDIA leading the charge, posting an astonishing 125% revenue growth, widening its lead over competitors, and solidifying its dominance in the IC industry.

Looking ahead to 2025, advancements in semiconductor manufacturing will further enhance AI computing power, with LLMs continuing to emerge. Open-source models like DeepSeek could lower AI adoption costs, accelerating AI penetration from servers to personal devices. This shift positions edge AI devices as the next major growth driver for the semiconductor industry.


ASUS Showcases Servers Based on Intel Xeon 6, Intel Gaudi 3 at CloudFest 2025

Press Release by

Mar 13th, 2025 09:03 Discuss (0 Comments)

ASUS today announced its showcase of comprehensive AI infrastructure solutions at CloudFest 2025, bringing together cutting-edge hardware powered by Intel Xeon 6 processors, NVIDIA GPUs and AMD EPYC processors. The company will also highlight its integrated software platforms, reinforcing its position as a total AI solution provider for enterprises seeking seamless AI deployments from edge to cloud.

Intel Xeon 6-based AI solutions and Gaudi 3 Acceleration for generative AI inferencing and fine tuning training
ASUS Intel Xeon 6-based servers leverage the Data Center Modular Hardware System (DC-MHS) architecture, providing unparalleled scalability, cost-efficiency and simplified maintenance. ASUS will showcase a comprehensive lineup of servers based on the Intel Xeon 6 family of processors at CloudFest 2025, including the RS700-E12, RS720Q-E12, and ESC8000-E12P-series servers. The ESC8000-E12P-series servers will debut the Intel Gaudi 3 AI accelerator PCIe card. This lineup underscores the ASUS commitment to delivering comprehensive AI solutions that integrate cutting-edge hardware with enterprise-grade software platforms for seamless, scalable AI deployments, highlighting Intel's latest innovations for high-performance AI training, inference, and cloud-native workloads.


GIGABYTE Showcases Future-Ready AI and HPC Technologies for High-Efficiency Computing at SCA 2025

Press Release by

Mar 11th, 2025 06:41 Discuss (0 Comments)

Giga Computing, a subsidiary of GIGABYTE and a pioneer in AI-driven enterprise computing, is set to make a significant impact at Supercomputing Asia 2025 (SCA25) in Singapore (March 11-13). At booth #D5, GIGABYTE showcases its latest advancements in liquid cooling, solutions for AI training and high-performance computing (HPC). The booth highlights GIGABYTE's innovative technology and comprehensive direct liquid cooling (DLC) strategies, reinforcing its commitment to energy-efficient, high-performance computing.

Revolutionizing AI Training with DLC
A key highlight of GIGABYTE's showcase is the NVIDIA HGX H200 platform, a next-generation solution for AI workloads. GIGABYTE is presenting both its liquid-cooled G4L3-SD1 server and its air-cooled G893 series, providing businesses with advanced cooling solutions tailored for high-performance demands. The G4L3-SD1 server, equipped with CoolIT Systems' cold plates, effectively cools Intel Xeon CPUs and eight NVIDIA H200 GPUs, ensuring optimal performance with enhanced energy efficiency.


AMD's Pain Point is ROCm Software, NVIDIA's CUDA Software is Still Superior for AI Development: Report

by

Dec 23rd, 2024 07:02 Discuss (33 Comments)

The battle of AI acceleration in the data center is, as most readers are aware, insanely competitive, with NVIDIA offering a top-tier software stack. However, AMD has tried in recent years to capture a part of the revenue that hyperscalers and OEMs are willing to spend with its Instinct MI300X accelerator lineup for AI and HPC. Despite having decent hardware, the company is not close to bridging the software gap with its competitor, NVIDIA. SemiAnalysis, a research and consultancy firm, reports that it ran a five-month experiment using the Instinct MI300X for training and benchmark runs. The findings were surprising: even with better hardware on paper, AMD's software stack, including ROCm, massively degraded the MI300X's performance.

"When comparing NVIDIA's GPUs to AMD's MI300X, we found that the potential on paper advantage of the MI300X was not realized due to a lack within AMD public release software stack and the lack of testing from AMD," noted SemiAnalysis, breaking down arguments in the report further, adding that "AMD's software experience is riddled with bugs rendering out of the box training with AMD is impossible. We were hopeful that AMD could emerge as a strong competitor to NVIDIA in training workloads, but, as of today, this is unfortunately not the case. The CUDA moat has yet to be crossed by AMD due to AMD's weaker-than-expected software Quality Assurance (QA) culture and its challenging out-of-the-box experience."


Microsoft Acquired Nearly 500,000 NVIDIA "Hopper" GPUs This Year

by

Dec 19th, 2024 04:22 Discuss (27 Comments)

Microsoft is heavily investing in enabling its company and cloud infrastructure to support the massive AI expansion. The Redmond giant has acquired nearly half a million of the NVIDIA "Hopper" family of GPUs to support this effort. According to market research company Omdia, Microsoft was the biggest hyperscaler, with data center CapEx and GPU expenditure reaching a record high. The company acquired precisely 485,000 NVIDIA "Hopper" GPUs, including H100, H200, and H20, resulting in more than $30 billion spent on servers alone. To put things into perspective, this is about double that of the next-biggest GPU purchaser, China's ByteDance, which acquired about 230,000 GPUs, a mix of sanction-abiding H800s and regular H100s sourced from third parties.

Regarding US-based companies, the only ones that have come close to the GPU acquisition rate are Meta, Tesla/xAI, Amazon, and Google. They have acquired around 200,000 GPUs on average while significantly boosting their in-house chip design efforts. "NVIDIA GPUs claimed a tremendously high share of the server capex," Vlad Galabov, director of cloud and data center research at Omdia, noted, adding, "We're close to the peak." Hyperscalers like Amazon, Google, and Meta have been working on their custom solutions for AI training and inference. For example, Google has its TPU, Amazon has its Trainium and Inferentia chips, and Meta has its MTIA. Hyperscalers are eager to develop their in-house solutions, but NVIDIA's grip on the software stack paired with timely product updates seems hard to break. The latest "Blackwell" chips are projected to get even bigger orders, so only the sky (and the local power plant) is the limit.

Advantech Introduces Its GPU Server SKY-602E3 With NVIDIA H200 NVL

Press Release by

Dec 11th, 2024 06:37 Discuss (0 Comments)

Advantech, a leading global provider of industrial edge AI solutions, is excited to introduce its GPU server SKY-602E3 equipped with the NVIDIA H200 NVL platform. This powerful combination is set to accelerate the offline LLM for manufacturing, providing unprecedented levels of performance and efficiency. The NVIDIA H200 NVL, requiring 600 W passive cooling, is fully supported by the compact and efficient SKY-602E3 GPU server, making it an ideal solution for demanding edge AI applications.

Core of Factory LLM Deployment: AI Vision
The SKY-602E3 GPU server excels in supporting large language models (LLMs) for AI inference and training. It features four PCIe 5.0 x16 slots, delivering high bandwidth for intensive tasks, and four PCIe 5.0 x8 slots, providing enhanced flexibility for GPU and frame grabber card expansion. The half-width design of the SKY-602E3 makes it an excellent choice for workstation environments. Additionally, the server can be equipped with the NVIDIA H200 NVL platform, which offers 1.7x more performance than the NVIDIA H100 NVL, freeing up additional PCIe slots for other expansion needs.


NVIDIA and Microsoft Showcase Blackwell Preview, Omniverse Industrial AI and RTX AI PCs at Microsoft Ignite

Press Release by

Nov 20th, 2024 02:32 Discuss (10 Comments)

NVIDIA and Microsoft today unveiled product integrations designed to advance full-stack NVIDIA AI development on Microsoft platforms and applications. At Microsoft Ignite, Microsoft announced the launch of the first cloud private preview of the Azure ND GB200 v6 VM series, based on the NVIDIA Blackwell platform. The Azure ND GB200 v6 will be a new AI-optimized virtual machine (VM) series and combines the NVIDIA GB200 NVL72 rack design with NVIDIA Quantum InfiniBand networking.

In addition, Microsoft revealed that Azure Container Apps now supports NVIDIA GPUs, enabling simplified and scalable AI deployment. Plus, the NVIDIA AI platform on Azure includes new reference workflows for industrial AI and an NVIDIA Omniverse Blueprint for creating immersive, AI-powered visuals. At Ignite, NVIDIA also announced multimodal small language models (SLMs) for RTX AI PCs and workstations, enhancing digital human interactions and virtual assistants with greater realism.


NVIDIA Announces Hopper H200 NVL PCIe GPU Availability at SC24, Promising 1.3x HPC Performance Over H100 NVL

Press Release by

Nov 18th, 2024 14:33 Discuss (1 Comment)

Since its introduction, the NVIDIA Hopper architecture has transformed the AI and high-performance computing (HPC) landscape, helping enterprises, researchers and developers tackle the world's most complex challenges with higher performance and greater energy efficiency. During the Supercomputing 2024 conference, NVIDIA announced the availability of the NVIDIA H200 NVL PCIe GPU - the latest addition to the Hopper family. H200 NVL is ideal for organizations with data centers looking for lower-power, air-cooled enterprise rack designs with flexible configurations to deliver acceleration for every AI and HPC workload, regardless of size.

According to a recent survey, roughly 70% of enterprise racks are 20kW and below and use air cooling. This makes PCIe GPUs essential, as they provide granularity of node deployment, whether using one, two, four or eight GPUs - enabling data centers to pack more computing power into smaller spaces. Companies can then use their existing racks and select the number of GPUs that best suits their needs. Enterprises can use H200 NVL to accelerate AI and HPC applications, while also improving energy efficiency through reduced power consumption. With a 1.5x memory increase and 1.2x bandwidth increase over NVIDIA H100 NVL, companies can use H200 NVL to fine-tune LLMs within a few hours and deliver up to 1.7x faster inference performance. For HPC workloads, performance is boosted up to 1.3x over H100 NVL and 2.5x over the NVIDIA Ampere architecture generation.


NVIDIA B200 "Blackwell" Records 2.2x Performance Improvement Over its "Hopper" Predecessor

by

Nov 14th, 2024 01:32 Discuss (18 Comments)

We know that NVIDIA's latest "Blackwell" GPUs are fast, but how much faster are they than the previous-generation "Hopper"? In the latest MLPerf Training v4.1 results, NVIDIA's HGX B200 Blackwell platform has demonstrated massive performance gains, measuring up to a 2.2x improvement per GPU compared to the HGX H200 Hopper platform. The latest results, verified by MLCommons, reveal impressive achievements in large language model (LLM) training. The Blackwell architecture, featuring HBM3e high-bandwidth memory and fifth-generation NVLink interconnect technology, achieved double the performance per GPU for GPT-3 pre-training and a 2.2x boost for Llama 2 70B fine-tuning compared to the previous Hopper generation. Each benchmark system incorporated eight Blackwell GPUs operating at a 1,000 W TDP, connected via NVLink Switch for scale-up.

The network infrastructure utilized NVIDIA ConnectX-7 SuperNICs and Quantum-2 InfiniBand switches, enabling high-speed node-to-node communication for distributed training workloads. While previous Hopper-based systems required 256 GPUs to optimize performance for the GPT-3 175B benchmark, Blackwell accomplished the same task with just 64 GPUs, leveraging its larger HBM3e memory capacity and bandwidth. One thing to look out for is the upcoming GB200 NVL72 system, which promises even more significant gains past the 2.2x. It features expanded NVLink domains, higher memory bandwidth, and tight integration with NVIDIA Grace CPUs, complemented by ConnectX-8 SuperNIC and Quantum-X800 switch technologies. With faster switching and better data movement with Grace-Blackwell integration, we could see even more software optimization from NVIDIA to push the performance envelope.

Cisco Unveils Plug-and-Play AI Solutions Powered by NVIDIA H100 and H200 Tensor Core GPUs

Press Release by

Oct 29th, 2024 10:53 Discuss (1 Comment)

Today, Cisco announced new additions to its data center infrastructure portfolio: an AI server family purpose-built for GPU-intensive AI workloads with NVIDIA accelerated computing, and AI PODs to simplify and de-risk AI infrastructure investment. They give organizations an adaptable and scalable path to AI, supported by Cisco's industry-leading networking capabilities.

"Enterprise customers are under pressure to deploy AI workloads, especially as we move toward agentic workflows and AI begins solving problems on its own," said Jeetu Patel, Chief Product Officer, Cisco. "Cisco innovations like AI PODs and the GPU server strengthen the security, compliance, and processing power of those workloads as customers navigate their AI journeys from inferencing to training."


Intel Won't Compete Against NVIDIA's High-End AI Dominance Soon, Starts Laying Off Over 2,200 Workers Across US

by

Oct 17th, 2024 09:42 Discuss (48 Comments)

Intel's taking a different path with its Gaudi 3 accelerator chips. It's staying away from the high-demand market for training big AI models, which has made NVIDIA so successful. Instead, Intel wants to help businesses that need cheaper AI solutions to train and run smaller specific models and open-source options. At a recent event, Intel talked up Gaudi 3's "price performance advantage" over NVIDIA's H100 GPU for inference tasks. Intel says Gaudi 3 is faster and more cost-effective than the H100 when running Llama 3 and Llama 2 models of different sizes.

Intel also claims that Gaudi 3 is as power-efficient as the H100 for large language model (LLM) inference with small token outputs and does even better with larger outputs. The company even suggests Gaudi 3 beats NVIDIA's newer H200 in LLM inference throughput for large token outputs. However, Gaudi 3 doesn't match up to the H100 in overall floating-point operation throughput for 16-bit and 8-bit formats. For bfloat16 and 8-bit floating-point precision matrix math, Gaudi 3 hits 1,835 TFLOPS in each format, while the H100 reaches 1,979 TFLOPS for BF16 and 3,958 TFLOPS for FP8.
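The peak-throughput gap quoted above is easier to judge as a ratio. The following is a minimal sketch that uses only the TFLOPS figures from the paragraph; the dictionary and function names are hypothetical, and the numbers are vendor peak figures rather than measured results.

```python
# Peak matrix-math throughput in TFLOPS, exactly as quoted in the article.
# Names below are illustrative, not part of any vendor tool.
GAUDI3_TFLOPS = {"bf16": 1835.0, "fp8": 1835.0}
H100_TFLOPS = {"bf16": 1979.0, "fp8": 3958.0}


def throughput_ratio(fmt: str) -> float:
    """Return Gaudi 3 peak throughput as a fraction of the H100 figure."""
    return GAUDI3_TFLOPS[fmt] / H100_TFLOPS[fmt]


for fmt in ("bf16", "fp8"):
    print(f"{fmt}: Gaudi 3 reaches {throughput_ratio(fmt):.1%} of the H100 figure")
```

On these numbers Gaudi 3 sits at roughly 93% of the H100's BF16 figure but under half of its FP8 figure, which is why Intel's pitch leans on price-performance and inference throughput rather than raw peak math.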


GIGABYTE Announces New Liquid Cooled Solutions for NVIDIA HGX H200

Press Release by

Sep 4th, 2024 12:07 Discuss (1 Comment)

Giga Computing, a subsidiary of GIGABYTE and an industry leader in generative AI servers and advanced cooling technologies, today announced new flagship GIGABYTE G593 series servers supporting direct liquid cooling (DLC) technology to advance green data centers using NVIDIA HGX H200 GPU. As DLC technology is becoming a necessity for many data centers, GIGABYTE continues to increase its product portfolio with new DLC solutions for GPU and CPU technologies, and for these new G593 servers the cold plates are made by CoolIT Systems.

G593 Series - Tailored Cooling
The GPU-centric G593 series is custom engineered to house an 8-GPU baseboard, and its design anticipates both air and liquid cooling. The compact 5U chassis leads the industry in its readily scalable nature, fitting up to sixty-four GPUs in a single rack and supporting 100 kW of IT hardware. This helps to consolidate the IT hardware, and in turn, decrease the data center footprint. The G593 series servers for DLC are in response to the rising customer demand for greater energy efficiency. Liquids have a higher thermal conductivity than air, so they can rapidly and effectively remove heat from hot components to maintain lower operating temperatures. And by relying on water and heat exchangers, the overall energy consumption of the data center is reduced.
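The rack-density claim can be checked with simple arithmetic. This is a rough sketch using only the figures in the paragraph above (eight GPUs per 5U server, 64 GPUs and 100 kW per rack). The even split of rack power across servers is a simplifying assumption, since a real rack also powers switches and cooling distribution hardware, and all variable names are hypothetical.

```python
# Figures quoted in the press release.
GPUS_PER_RACK = 64       # "up to sixty-four GPUs in a single rack"
GPUS_PER_SERVER = 8      # 8-GPU baseboard
CHASSIS_HEIGHT_U = 5     # 5U chassis
RACK_POWER_KW = 100.0    # supported IT hardware power per rack

# Derived values.
servers_per_rack = GPUS_PER_RACK // GPUS_PER_SERVER
rack_units_used = servers_per_rack * CHASSIS_HEIGHT_U

# Assumption: the full rack budget is divided evenly among the servers.
power_per_server_kw = RACK_POWER_KW / servers_per_rack

print(f"servers per rack:   {servers_per_rack}")
print(f"rack units used:    {rack_units_used}U")
print(f"budget per server:  {power_per_server_kw} kW")
```

Eight servers occupy 40U, which fits in a common 42U or taller rack, and the even split leaves about 12.5 kW per server under the stated assumption.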


ASUS Announces ESC N8-E11 AI Server with NVIDIA HGX H200

Press Release by

Aug 29th, 2024 05:24 Discuss (0 Comments)

ASUS today announced the latest marvel in the groundbreaking lineup of ASUS AI servers - ESC N8-E11, featuring the intensely powerful NVIDIA HGX H200 platform. With this AI titan, ASUS has secured its first industry deal, showcasing the exceptional performance, reliability and desirability of ESC N8-E11 with HGX H200, as well as the ability of ASUS to move first and fast in creating strong, beneficial partnerships with forward-thinking organizations seeking the world's most powerful AI solutions.

Shipments of the ESC N8-E11 with NVIDIA HGX H200 are scheduled to begin in early Q4 2024, marking a new milestone in the ongoing ASUS commitment to excellence. ASUS has been actively supporting clients by assisting in the development of cooling solutions to optimize overall PUE, guaranteeing that every ESC N8-E11 unit delivers top-tier efficiency and performance - ready to power the new era of AI.


NVIDIA Blackwell Sets New Standard for Generative AI in MLPerf Inference Benchmark

Press Release by

Aug 28th, 2024 14:28 Discuss (3 Comments)

As enterprises race to adopt generative AI and bring new services to market, the demands on data center infrastructure have never been greater. Training large language models is one challenge, but delivering LLM-powered real-time services is another. In the latest round of MLPerf industry benchmarks, Inference v4.1, NVIDIA platforms delivered leading performance across all data center tests. The first-ever submission of the upcoming NVIDIA Blackwell platform revealed up to 4x more performance than the NVIDIA H100 Tensor Core GPU on MLPerf's biggest LLM workload, Llama 2 70B, thanks to its use of a second-generation Transformer Engine and FP4 Tensor Cores.

The NVIDIA H200 Tensor Core GPU delivered outstanding results on every benchmark in the data center category - including the latest addition to the benchmark, the Mixtral 8x7B mixture of experts (MoE) LLM, which features a total of 46.7 billion parameters, with 12.9 billion parameters active per token. MoE models have gained popularity as a way to bring more versatility to LLM deployments, as they're capable of answering a wide variety of questions and performing more diverse tasks in a single deployment. They're also more efficient since they only activate a few experts per inference - meaning they deliver results much faster than dense models of a similar size.
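The efficiency argument for mixture-of-experts models follows from the two parameter counts given above. The sketch below computes the active fraction for Mixtral 8x7B from those figures. The estimate of about 2 FLOPs per active parameter per generated token is a common rule of thumb and an assumption here, not something stated in the article, and the variable names are hypothetical.

```python
# Parameter counts quoted in the article, in billions.
TOTAL_PARAMS_B = 46.7    # all experts combined
ACTIVE_PARAMS_B = 12.9   # parameters used for each token

# Share of the model that participates in any single token.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

# Assumption: forward-pass cost is roughly 2 FLOPs per active parameter per token.
FLOPS_PER_PARAM = 2
moe_gflops_per_token = FLOPS_PER_PARAM * ACTIVE_PARAMS_B
dense_gflops_per_token = FLOPS_PER_PARAM * TOTAL_PARAMS_B

print(f"active fraction: {active_fraction:.1%}")
print(f"MoE estimate:    {moe_gflops_per_token:.1f} GFLOPs per token")
print(f"dense estimate:  {dense_gflops_per_token:.1f} GFLOPs per token")
```

Only about 28% of the parameters are active per token, so under this rule of thumb the per-token compute is well under a third of what a dense model of the same total size would need. The full 46.7 billion parameters still have to be held in memory, which is where the H200's larger HBM capacity helps.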
