News Posts matching #Hopper

Return to Keyword Browsing

Thousands of NVIDIA Grace Blackwell GPUs Now Live at CoreWeave

Press Release by

Apr 16th, 2025 02:30 Discuss (0 Comments)

CoreWeave today became one of the first cloud providers to bring NVIDIA GB200 NVL72 systems online for customers at scale, and AI frontier companies Cohere, IBM and Mistral AI are already using them to train and deploy next-generation AI models and applications. CoreWeave, the first cloud provider to make NVIDIA Grace Blackwell generally available, has already shown incredible results in MLPerf benchmarks with NVIDIA GB200 NVL72 - a powerful rack-scale accelerated computing platform designed for reasoning and AI agents. Now, CoreWeave customers are gaining access to thousands of NVIDIA Blackwell GPUs.

"We work closely with NVIDIA to quickly deliver to customers the latest and most powerful solutions for training AI models and serving inference," said Mike Intrator, CEO of CoreWeave. "With new Grace Blackwell rack-scale systems in hand, many of our customers will be the first to see the benefits and performance of AI innovators operating at scale."

Read full story

Quantum Machines Announces NVIDIA DGX Quantum Early Access Program

Press Release by

Apr 1st, 2025 12:59 Discuss (0 Comments)

Quantum Machines (QM), the leading provider of advanced quantum control solutions, has recently announced the NVIDIA DGX Quantum Early Customer Program, with a cohort of six leading research groups and quantum computer builders. NVIDIA DGX Quantum, a reference architecture jointly developed by NVIDIA and QM, is the first tightly integrated quantum-classical computing solution, designed to unlock new frontiers in quantum computing research and development. As quantum computers scale, their reliance on classical resources for essential operations, such as quantum error correction (QEC) and parameter drift compensation, grows exponentially. NVIDIA DGX Quantum provides access to the classical acceleration needed to support this progress, advancing the path toward practical quantum supercomputers.

NVIDIA DGX Quantum leverages OPX1000, the best-in-class, modular high-density hybrid control platform, seamlessly interfacing with NVIDIA GH200 Grace Hopper Superchips. This solution brings accelerated computing into the heart of the quantum computing stack for the first time, achieving an ultra-low round-trip latency of less than 4 µs between quantum control and AI supercomputers - faster than any other approach. The NVIDIA DGX Quantum Early Customer Program is now underway, with selected leading academic institutions, national labs, and commercial quantum computer builders participating. These include the Engineering Quantum Systems group (equs.mit.edu) led by MIT Professor William D. Oliver, the Israeli Quantum Computing Center (IQCC), quantum hardware developer Diraq, the Quantum Circuit group (led by Ecole Normale Supérieure de Lyon Professor Benjamin Huard), and more.

Read full story

NVIDIA H20 AI GPU at Risk in China, Due to Revised Energy-efficiency Guidelines & Supply Problems

by

Mar 27th, 2025 13:22 Discuss (3 Comments)

NVIDIA's supply of Chinese market-exclusive H20 AI GPU faces an uncertain future, due to recently introduced energy-efficiency guidelines. As covered over a year ago, Team Green readied a regional alternative to its "full fat" H800 "Hopper" AI GPU—designed and/or neutered to comply with US sanctions. Despite being less performant than Western siblings, the H20 model proved to be highly popular by mid-2024—industry analysis projected "$12 billion in take-home revenue" for NVIDIA. According to a fresh Reuters news piece, demand for cut-down "Hopper" hardware has surged throughout early 2025. The report cites "a rush to adopt Chinese AI startup DeepSeek's cost-effective AI models" as the main cause behind an increased snap up rate of H20 chips; with the nation's "big three" AI players—Tencent, Alibaba and ByteDance—driving the majority of sales.

The supply of H20 AI GPUs seems to be under threat on several fronts; Reuters points out that "U.S. officials were considering curbs on sales of H20 chips to China" back in January. Returning to the present day, their report sources "unofficial" statements from H3C—one of China's largest server equipment manufacturers and a key OEM partner for NVIDIA. An anonymous company insider outlined a murky outlook: "H20's international supply chain faces significant uncertainties...We were told the chips would be available, but when it came time to actually purchase them, we were informed they had already been sold at higher prices." More (rumored) bad news has arrived in the shape of alleged Chinese government intervention—the Financial Times posits that local regulators have privately advised that Tencent, Alibaba and ByteDance not purchase NVIDIA H20 chips.

Read full story

NVIDIA Announces Blackwell Ultra Platform for Next-Gen AI

Press Release by

Mar 18th, 2025 14:08 Discuss (0 Comments)

NVIDIA today announced the next evolution of the NVIDIA Blackwell AI factory platform, NVIDIA Blackwell Ultra—paving the way for the age of AI reasoning. NVIDIA Blackwell Ultra boosts training and test-time scaling inference—the art of applying more compute during inference to improve accuracy—to enable organizations everywhere to accelerate applications such as AI reasoning, agentic AI and physical AI.

Built on the groundbreaking Blackwell architecture introduced a year ago, Blackwell Ultra includes the NVIDIA GB300 NVL72 rack-scale solution and the NVIDIA HGX B300 NVL16 system. The GB300 NVL72 delivers 1.5x more AI performance than the NVIDIA GB200 NVL72, as well as increases Blackwell's revenue opportunity by 50x for AI factories, compared with those built with NVIDIA Hopper.

Read full story

US Investigates Possible "Singapore" Loophole in China's Access to NVIDIA GPUs

by

Jan 31st, 2025 12:50 Discuss (36 Comments)

Today, Bloomberg reported that the US government under Trump administration is probing whether Chinese AI company DeepSeek circumvented export restrictions to acquire advanced NVIDIA GPUs through Singaporean intermediaries. The investigation follows concerns that DeepSeek's AI model, R1—reportedly rivaling leading systems from OpenAI and Google—may have been trained using restricted hardware that is blocked from exporting to China. Singapore's role in NVIDIA's global sales has surged, with the nation accounting for 22% of the chipmaker's revenue in Q3 FY2025, up from 9% in Q3 FY2023. This spike coincides with tightened US export controls on AI chips to China, prompting speculation that Singapore serves as a pipe for Chinese firms to access high-end GPUs like the H100, which cannot be sold directly to China.

DeepSeek has not disclosed hardware details for R1 but revealed its earlier V3 model was trained using 2,048 H800 GPUs (2.8 million GPU hours), achieving efficiency surpassing Meta's Llama 3, which required 30.8 million GPU hours. Analysts suggest R1's performance implies even more powerful infrastructure, potentially involving restricted chips. US authorities, including the White House and FBI, are examining whether third parties in Singapore facilitated the transfer of controlled GPUs to DeepSeek. A well-known semiconductor analyst firm, SemiAnalysis, believes that DeepSeek acquired around 50,000 NVIDIA Hopper GPUs, which includes a mix of H100, H800, and H20. NVIDIA clarified that its reported Singapore revenue reflects "bill to" customer locations, not final destinations, stating most products are routed to the US or Western markets.

Read full story

DeepSeek-R1 Goes Live on NVIDIA NIM

Press Release by

Jan 31st, 2025 09:59 Discuss (9 Comments)

DeepSeek-R1 is an open model with state-of-the-art reasoning capabilities. Instead of offering direct responses, reasoning models like DeepSeek-R1 perform multiple inference passes over a query, conducting chain-of-thought, consensus and search methods to generate the best answer. Performing this sequence of inference passes—using reason to arrive at the best answer—is known as test-time scaling. DeepSeek-R1 is a perfect example of this scaling law, demonstrating why accelerated computing is critical for the demands of agentic AI inference.

As models are allowed to iteratively "think" through the problem, they create more output tokens and longer generation cycles, so model quality continues to scale. Significant test-time compute is critical to enable both real-time inference and higher-quality responses from reasoning models like DeepSeek-R1, requiring larger inference deployments. R1 delivers leading accuracy for tasks demanding logical inference, reasoning, math, coding and language understanding while also delivering high inference efficiency.

Read full story

NVIDIA Outlines Cost Benefits of Inference Platform

Press Release by

Jan 27th, 2025 07:54 Discuss (8 Comments)

Businesses across every industry are rolling out AI services this year. For Microsoft, Oracle, Perplexity, Snap and hundreds of other leading companies, using the NVIDIA AI inference platform—a full stack comprising world-class silicon, systems and software—is the key to delivering high-throughput and low-latency inference and enabling great user experiences while lowering cost. NVIDIA's advancements in inference software optimization and the NVIDIA Hopper platform are helping industries serve the latest generative AI models, delivering excellent user experiences while optimizing total cost of ownership. The Hopper platform also helps deliver up to 15x more energy efficiency for inference workloads compared to previous generations.

AI inference is notoriously difficult, as it requires many steps to strike the right balance between throughput and user experience. But the underlying goal is simple: generate more tokens at a lower cost. Tokens represent words in a large language model (LLM) system—and with AI inference services typically charging for every million tokens generated, this goal offers the most visible return on AI investments and energy used per task. Full-stack software optimization offers the key to improving AI inference performance and achieving this goal.

Read full story

MediaTek Adopts AI-Driven Cadence Virtuoso Studio and Spectre Simulation on NVIDIA Accelerated Computing Platform for 2nm Designs

Press Release by

Jan 22nd, 2025 10:34 Discuss (0 Comments)

Cadence today announced that MediaTek has adopted the AI-driven Cadence Virtuoso Studio and Spectre X Simulator on the NVIDIA accelerated computing platform for its 2 nm development. As design size and complexity continue to escalate, advanced-node technology development has become increasingly challenging for SoC providers. To meet the aggressive performance and turnaround time (TAT) requirements for its 2 nm high-speed analog IP, MediaTek is leveraging Cadence's proven custom/analog design solutions, enhanced by AI, to achieve a 30% productivity gain.

"As MediaTek continues to push technology boundaries for 2 nm development, we need a trusted design solution with strong AI-powered tools to achieve our goals," said Ching San Wu, corporate vice president at MediaTek. "Closely collaborating with Cadence, we have adopted the Cadence Virtuoso Studio and Spectre X Simulator, which deliver the performance and accuracy necessary to achieve our tight design turnaround time requirements. Cadence's comprehensive automation features enhance our throughput and efficiency, enabling our designers to be 30% more productive."

Read full story

NVIDIA's GB200 "Blackwell" Racks Face Overheating Issues

by

Jan 14th, 2025 11:23 Discuss (9 Comments)

NVIDIA's new GB200 "Blackwell" racks are running into trouble (again). Big cloud companies like Microsoft, Amazon, Google, and Meta Platforms are cutting back their orders because of heat problems, Reuters reports, quoting The Information. The first shipments of racks with Blackwell chips are getting too hot and have connection issues between chips, the report says. These tech hiccups have made some customers who ordered $10 billion or more worth of racks think twice about buying.

Some are putting off their orders until NVIDIA has better versions of the racks. Others are looking at buying older NVIDIA AI chips instead. For example, Microsoft planned to set up GB200 racks with no less than 50,000 Blackwell chips at one of its Phoenix sites. However, The Information reports that OpenAI has asked Microsoft to provide NVIDIA's older "Hopper" chips instead pointing to delays linked to the Blackwell racks. NVIDIA's problems with its Blackwell GPUs housed in high-density racks are not something new; in November 2024, Reuters, also referencing The Information, uncovered overheating issues in servers that housed 72 processors. NVIDIA has made several changes to its server rack designs to tackle these problems, however, it seems that the problem was not entirely solved.

Microsoft Acquired Nearly 500,000 NVIDIA "Hopper" GPUs This Year

by

Dec 19th, 2024 04:22 Discuss (27 Comments)

Microsoft is heavily investing in enabling its company and cloud infrastructure to support the massive AI expansion. The Redmond giant has acquired nearly half a million of the NVIDIA "Hopper" family of GPUs to support this effort. According to market research company Omdia, Microsoft was the biggest hyperscaler, with data center CapEx and GPU expenditure reaching a record high. The company acquired precisely 485,000 NVIDIA "Hopper" GPUs, including H100, H200, and H20, resulting in more than $30 billion spent on servers alone. To put things into perspective, this is about double that of the next-biggest GPU purchaser, Chinese ByteDance, who acquired about 230,000 sanction-abiding H800 GPUs and regular H100s sources from third parties.

Regarding US-based companies, the only ones that have come close to the GPU acquisition rate are Meta, Tesla/xAI, Amazon, and Google. They have acquired around 200,000 GPUs on average while significantly boosting their in-house chip design efforts. "NVIDIA GPUs claimed a tremendously high share of the server capex," Vlad Galabov, director of cloud and data center research at Omdia, noted, adding, "We're close to the peak." Hyperscalers like Amazon, Google, and Meta have been working on their custom solutions for AI training and inference. For example, Google has its TPU, Amazon has its Trainium and Inferentia chips, and Meta has its MTIA. Hyperscalers are eager to develop their in-house solutions, but NVIDIA's grip on the software stack paired with timely product updates seems hard to break. The latest "Blackwell" chips are projected to get even bigger orders, so only the sky (and the local power plant) is the limit.

NVIDIA's Next-Gen "Rubin" AI GPU Development 6 Months Ahead of Schedule: Report

by

Dec 5th, 2024 22:49 Discuss (3 Comments)

The "Rubin" architecture succeeds NVIDIA's current "Blackwell," which powers the company's AI GPUs, as well as the upcoming GeForce RTX 50-series gaming GPUs. NVIDIA will likely not build gaming GPUs with "Rubin," just like it didn't with "Hopper," and for the most part, "Volta." NVIDIA's AI GPU product roadmap put out at SC'24 puts "Blackwell" firmly in charge of the company's AI GPU product stack throughout 2025, with "Rubin" only succeeding it in the following year, for a two-year run in the market, being capped off with a "Rubin Ultra" larger GPU slated for 2027. A new report by United Daily News (UDN), a Taiwan-based publication, says that the development of "Rubin" is running 6 months ahead of schedule.

Being 6 months ahead of schedule doesn't necessarily mean that the product will launch sooner. It would give NVIDIA headroom to get "Rubin" better evaluated in the industry, and make last-minute changes to the product if needed; or even advance the launch if it wants to. The first AI GPU powered by "Rubin" will feature 8-high HBM4 memory stacks. The company will also introduce the "Vera" CPU, the long-awaited successor to "Grace." It will also introduce the X1600 InfiniBand/Ethernet network processor. According to the SC'24 roadmap by NVIDIA, these three would've seen a 2026 launch. Then in 2027, the company would follow up with an even larger AI GPU based on the same "Rubin" architecture, codenamed "Rubin Ultra." This features 12-high HBM4 stacks. NVIDIA's current GB200 "Blackwell" is a tile-based GPU, with two dies that have full cache-coherence. "Rubin" is rumored to feature four tiles.

NVIDIA Announces Hopper H200 NVL PCIe GPU Availability at SC24, Promising 1.3x HPC Performance Over H100 NVL

Press Release by

Nov 18th, 2024 14:33 Discuss (1 Comment)

Since its introduction, the NVIDIA Hopper architecture has transformed the AI and high-performance computing (HPC) landscape, helping enterprises, researchers and developers tackle the world's most complex challenges with higher performance and greater energy efficiency. During the Supercomputing 2024 conference, NVIDIA announced the availability of the NVIDIA H200 NVL PCIe GPU - the latest addition to the Hopper family. H200 NVL is ideal for organizations with data centers looking for lower-power, air-cooled enterprise rack designs with flexible configurations to deliver acceleration for every AI and HPC workload, regardless of size.

According to a recent survey, roughly 70% of enterprise racks are 20kW and below and use air cooling. This makes PCIe GPUs essential, as they provide granularity of node deployment, whether using one, two, four or eight GPUs - enabling data centers to pack more computing power into smaller spaces. Companies can then use their existing racks and select the number of GPUs that best suits their needs. Enterprises can use H200 NVL to accelerate AI and HPC applications, while also improving energy efficiency through reduced power consumption. With a 1.5x memory increase and 1.2x bandwidth increase over NVIDIA H100 NVL, companies can use H200 NVL to fine-tune LLMs within a few hours and deliver up to 1.7x faster inference performance. For HPC workloads, performance is boosted up to 1.3x over H100 NVL and 2.5x over the NVIDIA Ampere architecture generation.

Read full story

NVIDIA B200 "Blackwell" Records 2.2x Performance Improvement Over its "Hopper" Predecessor

by

Nov 14th, 2024 01:32 Discuss (18 Comments)

We know that NVIDIA's latest "Blackwell" GPUs are fast, but how much faster are they over the previous generation "Hopper"? Thanks to the latest MLPerf Training v4.1 results, NVIDIA's HGX B200 Blackwell platform has demonstrated massive performance gains, measuring up to 2.2x improvement per GPU compared to its HGX H200 Hopper. The latest results, verified by MLCommons, reveal impressive achievements in large language model (LLM) training. The Blackwell architecture, featuring HBM3e high-bandwidth memory and fifth-generation NVLink interconnect technology, achieved double the performance per GPU for GPT-3 pre-training and a 2.2x boost for Llama 2 70B fine-tuning compared to the previous Hopper generation. Each benchmark system incorporated eight Blackwell GPUs operating at a 1,000 W TDP, connected via NVLink Switch for scale-up.

The network infrastructure utilized NVIDIA ConnectX-7 SuperNICs and Quantum-2 InfiniBand switches, enabling high-speed node-to-node communication for distributed training workloads. While previous Hopper-based systems required 256 GPUs to optimize performance for the GPT-3 175B benchmark, Blackwell accomplished the same task with just 64 GPUs, leveraging its larger HBM3e memory capacity and bandwidth. One thing to look out for is the upcoming GB200 NVL72 system, which promises even more significant gains past the 2.2x. It features expanded NVLink domains, higher memory bandwidth, and tight integration with NVIDIA Grace CPUs, complemented by ConnectX-8 SuperNIC and Quantum-X800 switch technologies. With faster switching and better data movement with Grace-Blackwell integration, we could see even more software optimization from NVIDIA to push the performance envelope.

NVIDIA Ethernet Networking Accelerates World's Largest AI Supercomputer, Built by xAI

Press Release by

Oct 29th, 2024 02:40 Discuss (3 Comments)

NVIDIA today announced that xAI's Colossus supercomputer cluster comprising 100,000 NVIDIA Hopper GPUs in Memphis, Tennessee, achieved this massive scale by using the NVIDIA Spectrum-X Ethernet networking platform, which is designed to deliver superior performance to multi-tenant, hyperscale AI factories using standards-based Ethernet, for its Remote Direct Memory Access (RDMA) network.

Colossus, the world's largest AI supercomputer, is being used to train xAI's Grok family of large language models, with chatbots offered as a feature for X Premium subscribers. xAI is in the process of doubling the size of Colossus to a combined total of 200,000 NVIDIA Hopper GPUs.

Read full story

NVIDIA "Blackwell" GPUs are Sold Out for 12 Months, Customers Ordering in 100K GPU Quantities

by

Oct 11th, 2024 08:08 Discuss (29 Comments)

NVIDIA's "Blackwell" series of GPUs, including B100, B200, and GB200, are reportedly sold out for 12 months or an entire year. This directly means that if a new customer is willing to order a new Blackwell GPU now, there is a 12-month waitlist to get that GPU. Analyst from Morgan Stanley Joe Moore confirmed that in a meeting with NVIDIA and its investors, NVIDIA executives confirmed that the demand for "Blackwell" is so great that there is a 12-month backlog to fulfill first before shipping to anyone else. We expect that this includes customers like Amazon, META, Microsoft, Google, Oracle, and others, who are ordering GPUs in insane quantities to keep up with the demand from their customers.

The previous generation of "Hopper" GPUs was ordered in 10s of thousands of GPUs, while this "Blackwell" generation was ordered in 100s of thousands of GPUs simultaneously. For NVIDIA, that is excellent news, as that demand is expected to continue. The only one standing in the way of customers is TSMC, which manufactures these GPUs as fast as possible to meet demand. NVIDIA is one of TSMC's largest customers, so wafer allocation at TSMC's facilities is only expected to grow. We are now officially in the era of the million-GPU data centers, and we can only question at what point this massive growth stops or if it will stop at all in the near future.

NVIDIA's Jensen Huang to Lead CES 2025 Keynote

by

Oct 8th, 2024 02:07 Discuss (19 Comments)

NVIDIA CEO Jensen Huang will be leading the keynote address at the coveted 2025 International CES in Las Vegas, which opens on January 7. The keynote address is slated for January 6, 6:30 am PT. There is of course no word from NVIDIA on what to expect, but we have some fairly easy guesswork. NVIDIA's refresh of the GeForce RTX product stack is due, and the company is expected to either debut or expand its next-generation GeForce RTX 50-series "Blackwell" gaming GPU stack, bringing in generational improvements in performance and performance-per-Watt, besides new technology.

The company could also make more announcements related to its "Blackwell" AI GPU lineup, which is expected to ramp through 2025, succeeding the current "Hopper" H100 and H200 series. The company could also tease "Rubin," which it referenced recently at GTC in May, "Rubin" succeeds "Blackwell," and will debut as an AI GPU toward the end of 2025, with a 2026 ramp toward customers. It's unclear if NVIDIA will make gaming GPUs on "Rubin," since GeForce RTX generations tend to have a 2-year cadence, and there was no gaming GPU based on "Hopper."

AMD MI300X Accelerators are Competitive with NVIDIA H100, Crunch MLPerf Inference v4.1

by

Aug 29th, 2024 02:24 Discuss (22 Comments)

The MLCommons consortium on Wednesday posted MLPerf Inference v4.1 benchmark results for popular AI inferencing accelerators available in the market, across brands that include NVIDIA, AMD, and Intel. AMD's Instinct MI300X accelerators emerged competitive to NVIDIA's "Hopper" H100 series AI GPUs. AMD also used the opportunity to showcase the kind of AI inferencing performance uplifts customers can expect from its next-generation EPYC "Turin" server processors powering these MI300X machines. "Turin" features "Zen 5" CPU cores, sporting a 512-bit FPU datapath, and improved performance in AI-relevant 512-bit SIMD instruction-sets, such as AVX-512, and VNNI. The MI300X, on the other hand, banks on the strengths of its memory sub-system, FP8 data format support, and efficient KV cache management.

The MLPerf Inference v4.1 benchmark focused on the 70 billion-parameter LLaMA2-70B model. AMD's submissions included machines featuring the Instinct MI300X, powered by the current EPYC "Genoa" (Zen 4), and next-gen EPYC "Turin" (Zen 5). The GPUs are backed by AMD's ROCm open-source software stack. The benchmark evaluated inference performance using 24,576 Q&A samples from the OpenORCA dataset, with each sample containing up to 1024 input and output tokens. Two scenarios were assessed: the offline scenario, focusing on batch processing to maximize throughput in tokens per second, and the server scenario, which simulates real-time queries with strict latency limits (TTFT ≤ 2 seconds, TPOT ≤ 200 ms). This lets you see the chip's mettle in both high-throughput and low-latency queries.

Read full story

NVIDIA Blackwell Sets New Standard for Generative AI in MLPerf Inference Benchmark

Press Release by

Aug 28th, 2024 14:28 Discuss (3 Comments)

As enterprises race to adopt generative AI and bring new services to market, the demands on data center infrastructure have never been greater. Training large language models is one challenge, but delivering LLM-powered real-time services is another. In the latest round of MLPerf industry benchmarks, Inference v4.1, NVIDIA platforms delivered leading performance across all data center tests. The first-ever submission of the upcoming NVIDIA Blackwell platform revealed up to 4x more performance than the NVIDIA H100 Tensor Core GPU on MLPerf's biggest LLM workload, Llama 2 70B, thanks to its use of a second-generation Transformer Engine and FP4 Tensor Cores.

The NVIDIA H200 Tensor Core GPU delivered outstanding results on every benchmark in the data center category - including the latest addition to the benchmark, the Mixtral 8x7B mixture of experts (MoE) LLM, which features a total of 46.7 billion parameters, with 12.9 billion parameters active per token. MoE models have gained popularity as a way to bring more versatility to LLM deployments, as they're capable of answering a wide variety of questions and performing more diverse tasks in a single deployment. They're also more efficient since they only activate a few experts per inference - meaning they deliver results much faster than dense models of a similar size.

Read full story

NVIDIA's New B200A Targets OEM Customers; High-End GPU Shipments Expected to Grow 55% in 2025

Press Release by

Aug 7th, 2024 05:57 Discuss (1 Comment)

Despite recent rumors speculating on NVIDIA's supposed cancellation of the B100 in favor of the B200A, TrendForce reports that NVIDIA is still on track to launch both the B100 and B200 in the 2H24 as it aims to target CSP customers. Additionally, a scaled-down B200A is planned for other enterprise clients, focusing on edge AI applications.

TrendForce reports that NVIDIA will prioritize the B100 and B200 for CSP customers with higher demand due to the tight production capacity of CoWoS-L. Shipments are expected to commence after 3Q24. In light of yield and mass production challenges with CoWoS-L, NVIDIA is also planning the B200A for other enterprise clients, utilizing CoWoS-S packaging technology.

Read full story

NVIDIA Hit with DOJ Antitrust Probe over AI GPUs, Unfair Sales Tactics and Pricing Alleged

by

Aug 2nd, 2024 05:36 Discuss (50 Comments)

NVIDIA has reportedly been hit with a US Department of Justice (DOJ) antitrust probe over the tactics the company allegedly employs to sell or lease its AI GPUs and data-center networking equipment, "The Information" reported. Shares of the NVIDIA stock fell 3.6% in the pre-market trading on Friday (08/02). The main complainants behind the probe appear to be a special interest group among the customers of AI GPUs, and not NVIDIA's competitors in the AI GPU industry per se. US Senator Elizabeth Warren and US progressives have been most vocal about calling upon the DOJ to investigate antitrust allegations against NVIDIA.

Meanwhile, US officials are reportedly reaching out to NVIDIA's competitors, including AMD and Intel, to gather information about the complaints. NVIDIA holds 80% of the AI GPU market, while AMD, and to a much lesser extent, Intel, have received spillover demand for AI GPUs. "The Information" report says that the complaint alleges NVIDIA pressured cloud customers to buy "multiple products". We don't know what this means, one theory holds that NVIDIA is getting them to commit to buying multiple generations of products (eg: Ampere, Hopper, and over to Blackwell); while another holds that it's getting them to buy multiple kinds of products, which include not just the AI GPUs, but also NVIDIA's first-party server systems and networking equipment. Yet another theory holds that it is bundle first-party software and services to go with the hardware, far beyond the basic software needed to get the hardware to work.

Read full story

NVIDIA Blackwell's High Power Consumption Drives Cooling Demands; Liquid Cooling Penetration Expected to Reach 10% by Late 2024

Press Release by

Jul 30th, 2024 05:51 Discuss (41 Comments)

With the growing demand for high-speed computing, more effective cooling solutions for AI servers are gaining significant attention. TrendForce's latest report on AI servers reveals that NVIDIA is set to launch its next-generation Blackwell platform by the end of 2024. Major CSPs are expected to start building AI server data centers based on this new platform, potentially driving the penetration rate of liquid cooling solutions to 10%.

Air and liquid cooling systems to meet higher cooling demands
TrendForce reports that the NVIDIA Blackwell platform will officially launch in 2025, replacing the current Hopper platform and becoming the dominant solution for NVIDIA's high-end GPUs, accounting for nearly 83% of all high-end products. High-performance AI server models like the B200 and GB200 are designed for maximum efficiency, with individual GPUs consuming over 1,000 W. HGX models will house 8 GPUs each, while NVL models will support 36 or 72 GPUs per rack, significantly boosting the growth of the liquid cooling supply chain for AI servers.

Read full story

Global AI Server Demand Surge Expected to Drive 2024 Market Value to US$187 Billion; Represents 65% of Server Market

Press Release by

Jul 17th, 2024 05:45 Discuss (7 Comments)

TrendForce's latest industry report on AI servers reveals that high demand for advanced AI servers from major CSPs and brand clients is expected to continue in 2024. Meanwhile, TSMC, SK hynix, Samsung, and Micron's gradual production expansion has significantly eased shortages in 2Q24. Consequently, the lead time for NVIDIA's flagship H100 solution has decreased from the previous 40-50 weeks to less than 16 weeks.

TrendForce estimates that AI server shipments in the second quarter will increase by nearly 20% QoQ, and has revised the annual shipment forecast up to 1.67 million units—marking a 41.5% YoY growth.

Read full story

AI Startup Etched Unveils Transformer ASIC Claiming 20x Speed-up Over NVIDIA H100

by

Jun 25th, 2024 12:30 Discuss (37 Comments)

A new startup emerged out of stealth mode today to power the next generation of generative AI. Etched is a company that makes an application-specific integrated circuit (ASIC) to process "Transformers." The transformer is an architecture for designing deep learning models developed by Google and is now the powerhouse behind models like OpenAI's GPT-4o in ChatGPT, Anthropic Claude, Google Gemini, and Meta's Llama family. Etched wanted to create an ASIC for processing only the transformer models, making a chip called Sohu. The claim is Sohu outperforms NVIDIA's latest and greatest by an entire order of magnitude. Where a server configuration with eight NVIDIA H100 GPU clusters pushes Llama-3 70B models at 25,000 tokens per second, and the latest eight B200 "Blackwell" GPU cluster pushes 43,000 tokens/s, the eight Sohu clusters manage to output 500,000 tokens per second.

Why is this important? Not only does the ASIC outperform Hopper by 20x and Blackwell by 10x, but it also serves so many tokens per second that it enables an entirely new fleet of AI applications requiring real-time output. The Sohu architecture is so efficient that 90% of the FLOPS can be used, while traditional GPUs boast a 30-40% FLOP utilization rate. This translates into inefficiency and waste of power, which Etched hopes to solve by building an accelerator dedicated to power transformers (the "T" in GPT) at massive scales. Given that the frontier model development costs more than one billion US dollars, and hardware costs are measured in tens of billions of US Dollars, having an accelerator dedicated to powering a specific application can help advance AI faster. AI researchers often say that "scale is all you need" (resembling the legendary "attention is all you need" paper), and Etched wants to build on that.

Read full story

Noctua Shows Ampere Altra and NVIDIA GH200 CPU Coolers at Computex 2024

Computex by

Jun 5th, 2024 07:58 Discuss (4 Comments)

Noctua unveiled its new Ampere Altra family of CPU coolers for Ampere Altra and Altra Max Arm processors at the Computex 2024 show, as well as the upcoming NVIDIA GH200 Grace Hopper superchip cooler. In addition, it also showcased its new cooperation with Seasonic with PRIME TX-1600 Noctua Edition power supply and a rather unique Kaelo wine cooler.

In addition to the new and upcoming standard CPU coolers and fans, Noctua also unveiled the new Ampere Altra family of CPU coolers at the Computex 2024 show, aimed to be used with recently launched Ampere Altra and Altra Max Arm processors with up to 128 cores. The new Noctua Ampere Altra CPU coolers are based on the proven models for Intel Xeon and AMD Threadripper or EPYC platforms. The Noctua Ampere Altra family of CPU coolers use Noctua's SecuFirm2 mounting system for LGA4926 socket and come with pre-applied NT-H2 thermal paste. According to Noctua, these provide exceptional performance and whisper-quiet operation which are ideal for Arm based workstations in noise-sensitive environments. The Ampere Altra lineup should be already available over at Newegg. In addition, Nocuta has unveiled its new prototype of NVIDIA GH200 Grace Hopper superchip cooler, which integrates two custom NH-U12A heatsinks in order to cool both the Grace CPU and Hopper GPU. It supports up to 1,000 W of heat emissions, and aimed at noise-sensitive environments like local HPC applications and self-hosted open source LLMs. The NVIDIA GH200 cooler is expected in Q4 this year and offered to clients on pre-order basis.

Read full story

TOP500: Frontier Keeps Top Spot, Aurora Officially Becomes the Second Exascale Machine

Press Release by

May 13th, 2024 12:46 Discuss (5 Comments)

The 63rd edition of the TOP500 reveals that Frontier has once again claimed the top spot, despite no longer being the only exascale machine on the list. Additionally, a new system has found its way into the Top 10.

The Frontier system at Oak Ridge National Laboratory in Tennessee, USA remains the most powerful system on the list with an HPL score of 1.206 EFlop/s. The system has a total of 8,699,904 combined CPU and GPU cores, an HPE Cray EX architecture that combines 3rd Gen AMD EPYC CPUs optimized for HPC and AI with AMD Instinct MI250X accelerators, and it relies on Cray's Slingshot 11 network for data transfer. On top of that, this machine has an impressive power efficiency rating of 52.93 GFlops/Watt - putting Frontier at the No. 13 spot on the GREEN500.

Read full story

Return to Keyword Browsing

Jun 30th, 2025 20:40 CDT change timezone

Latest GPU Drivers

New Forum Posts

20:33 by Logic
Laptop overclocking adventures (1238)
19:49 by iameatingjam
[INTEL]-How To Update Your Microcode for Intel HX 13/14th Gen. CPUs Laptops/Mobile Easily. (172)
19:40 by Tyler-98-W68
Will you buy a RTX 5090? (584)
19:24 by AusWolf
The TPU UK Clubhouse (26530)
19:24 by Logic
Optane and "enable write caching " (27)
18:51 by A Computer Guy
Question about Intel Optane SSDs (87)
18:45 by lexluthermiester
Do you use Linux? (664)
18:45 by Gold Leader
Remember Fermi? Well here's my EVGA GTX 480 that I picked up for just 19 Euros! (9)
17:56 by HermannSW
Vega owners club (587)
17:38 by mclaren85
Can you guess Which game it is? (194)

Popular Reviews

Jun 30th, 2025 ASUS ROG Crosshair X870E Extreme Review
Jun 20th, 2025 Sapphire Radeon RX 9060 XT Pulse OC 16 GB Review - Samsung Memory Tested
Jun 27th, 2025 AVerMedia CamStream 4K Review
Jun 26th, 2025 Lexar NQ780 4 TB Review
Nov 6th, 2024 AMD Ryzen 7 9800X3D Review - The Best Gaming Processor
May 13th, 2025 Upcoming Hardware Launches 2025 (Updated May 2025)
Mar 5th, 2025 Sapphire Radeon RX 9070 XT Nitro+ Review - Beating NVIDIA
Mar 11th, 2025 AMD Ryzen 9 9950X3D Review - Great for Gaming and Productivity
Jun 25th, 2025 ASRock Phantom Gaming Z890 Riptide Wi-Fi Review
May 27th, 2025 NVIDIA GeForce RTX 5060 8 GB Review

TPU on YouTube

Controversial News Posts