News Posts matching #Blackwell


Dell Shows Compute-Dense AI Servers at SC24

Dell Technologies (NYSE: DELL) continues to make enterprise AI adoption easier with the Dell AI Factory, expanding the world's broadest AI solutions portfolio. Powerful new infrastructure, solutions and services accelerate, simplify and streamline AI workloads and data management.

"Getting AI up and running across a company can be a real challenge," said Arthur Lewis, president, Infrastructure Solutions Group, Dell Technologies. "We're making it easier for our customers with new AI infrastructure, solutions and services that simplify AI deployments, paving the way for smarter, faster ways to work and a more adaptable future."

CoolIT Announces the World's Highest Density Liquid-to-Liquid Coolant Distribution Unit

CoolIT Systems (CoolIT), the world leader in liquid cooling systems for AI and high-performance computing, introduces the CHx1000, the world's highest-density liquid-to-liquid coolant distribution unit (CDU). Designed for mission-critical applications, the CHx1000 is purpose-built to cool the NVIDIA Blackwell platform and other demanding AI workloads where liquid cooling is now necessary.

"CoolIT created the CHx1000 to provide the high capacity and pressure delivery required to direct liquid cool NVIDIA Blackwell and future generations of high-performance AI accelerators," said Patrick McGinn, CoolIT's COO. "Besides exceptional performance, serviceability and reliability are central to the CHx1000's design. The single rack-sized unit is fully front and back serviceable with hot-swappable critical components. Precision coolant controls and multiple levels of redundancy provide for steady, uninterrupted operation."

NVIDIA "Blackwell" NVL72 Servers Reportedly Require Redesign Amid Overheating Problems

According to The Information, NVIDIA's latest "Blackwell" processors are reportedly encountering significant thermal management issues in high-density server configurations, potentially affecting deployment timelines for major tech companies. The challenges emerge specifically in GB200 NVL72 racks, which house 72 GB200 processors, consume up to 120 kilowatts of power per rack, and weigh a "mere" 3,000 pounds (about 1.5 tons). These thermal concerns have prompted NVIDIA to revise its server rack designs multiple times to prevent performance degradation and potential hardware damage. Hyperscalers like Google, Meta, and Microsoft, which rely heavily on NVIDIA GPUs for training their advanced language models, have allegedly expressed concerns about possible delays to their data center deployment schedules.
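As a rough sanity check on those figures, dividing the reported rack power across the GPUs gives an average power budget per GPU slot. This is our own back-of-the-envelope sketch, not a figure from the report, and it overstates the GPU share since the 120 kW also covers Grace CPUs, NVLink switches, and networking:

```python
# Back-of-the-envelope check on the reported GB200 NVL72 rack figures.
# Note: the 120 kW covers the whole rack (GPUs, Grace CPUs, NVLink
# switches), so this is an upper-bound average, not a GPU TDP.
RACK_POWER_W = 120_000
GPUS_PER_RACK = 72

watts_per_gpu_slot = RACK_POWER_W / GPUS_PER_RACK
print(f"~{watts_per_gpu_slot:.0f} W per GPU slot")  # ~1667 W
```

Even as an average, that is well beyond what air cooling handles comfortably, which is why liquid cooling dominates the rest of these stories.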

The thermal management issues follow earlier setbacks related to a design flaw in the Blackwell production process. The problem stemmed from the complex CoWoS-L packaging technology, which connects the dual compute dies using an RDL interposer and LSI bridges. Thermal expansion mismatches between the various components led to warping, requiring modifications to the GPU's metal layers and bump structures. A company spokesperson characterized these modifications as part of the standard development process, noting that a new photomask resolved the issue. The Information states that mass production of the revised Blackwell GPUs began in late October, with shipments expected to commence in late January. However, NVIDIA has not confirmed these timelines, and some server makers like Dell say that GB200 NVL72 liquid-cooled systems are shipping now, not in January, with GPU cloud provider CoreWeave as a customer. The original report could be based on older information, as Dell is one of NVIDIA's most significant partners and among the first in the supply chain to gain access to new GPU batches.

NVIDIA RTX 40-series Stocks Begin Drying Up as Decks are Cleared for RTX 50-series Blackwell

Chinese tech site Board Channels keeps tabs on the way computer hardware moves at the very beginning of the supply chain. It has some fascinating insights into the NVIDIA GeForce RTX 50-series "Blackwell" graphics cards. Apparently, NVIDIA has planned the transition between the current RTX 40-series "Ada" and the next-generation RTX 50-series such that there's minimal spillover inventory of older-generation graphics cards in the channel, so it doesn't end up in a situation similar to the transition between the RTX 30-series "Ampere" and its successor. Back in 2021-22, the cryptocurrency mining boom, which waned toward the end of 2022, caused an overproduction of RTX 30-series cards that lingered in the channel even as the RTX 40-series launched.

According to the Board Channels report, translated by Gazlog and VideoCardz, the China-specific RTX 4090D has vanished from the channel; none of NVIDIA's AIC partners has any boards left to sell. For the RTX 4080 SUPER, most AIC partners are shipping their final batches, which should clear out in November 2024. The RTX 4070 Ti SUPER isn't as prominent a SKU as the RTX 4080 SUPER, and is being phased out at the same pace as its larger AD103-based sibling, with last orders shipping this month. The RTX 4070 SUPER and RTX 4070 remain the most popular high-end SKUs of this generation, and NVIDIA will supply them through December. Given that the RTX 5070 series doesn't come out until February (with wide availability in March), this makes sense. The RTX 4060 series will phase out much more slowly than the other SKUs, given its popularity and the fact that the RTX 5060 series won't ramp until Q2 2025.

NVIDIA B200 "Blackwell" Records 2.2x Performance Improvement Over its "Hopper" Predecessor

We know that NVIDIA's latest "Blackwell" GPUs are fast, but how much faster are they than the previous generation "Hopper"? Thanks to the latest MLPerf Training v4.1 results, NVIDIA's HGX B200 Blackwell platform has demonstrated massive performance gains, measuring up to a 2.2x improvement per GPU over the Hopper-based HGX H200. The latest results, verified by MLCommons, reveal impressive achievements in large language model (LLM) training. The Blackwell architecture, featuring HBM3e high-bandwidth memory and fifth-generation NVLink interconnect technology, achieved double the per-GPU performance for GPT-3 pre-training and a 2.2x boost for Llama 2 70B fine-tuning compared to the previous Hopper generation. Each benchmark system incorporated eight Blackwell GPUs operating at a 1,000 W TDP, connected via NVLink Switch for scale-up.

The network infrastructure utilized NVIDIA ConnectX-7 SuperNICs and Quantum-2 InfiniBand switches, enabling high-speed node-to-node communication for distributed training workloads. While previous Hopper-based systems required 256 GPUs to optimize performance for the GPT-3 175B benchmark, Blackwell accomplished the same task with just 64 GPUs, leveraging its larger HBM3e memory capacity and bandwidth. One thing to look out for is the upcoming GB200 NVL72 system, which promises even more significant gains beyond the 2.2x mark. It features expanded NVLink domains, higher memory bandwidth, and tight integration with NVIDIA Grace CPUs, complemented by ConnectX-8 SuperNIC and Quantum-X800 switch technologies. With faster switching and better data movement from Grace-Blackwell integration, we could see even more software optimization from NVIDIA to push the performance envelope.
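The headline ratios above follow from simple arithmetic; this sketch merely restates the reported MLPerf numbers (our own illustration, not MLCommons code):

```python
# Restating the reported MLPerf Training v4.1 comparisons as arithmetic.
hopper_gpus_gpt3 = 256     # GPUs a Hopper system needed for GPT-3 175B
blackwell_gpus_gpt3 = 64   # GPUs Blackwell needed for the same benchmark

scale_reduction = hopper_gpus_gpt3 / blackwell_gpus_gpt3
print(f"{scale_reduction:.0f}x fewer GPUs for GPT-3 175B")  # 4x

per_gpu_speedup_llama2 = 2.2  # reported Llama 2 70B fine-tuning gain vs. H200
print(f"{per_gpu_speedup_llama2}x per-GPU speedup")
```

The 4x reduction in GPU count is a separate effect from the 2.2x per-GPU speedup: the former comes largely from fitting the model across fewer, larger-memory GPUs.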

NVIDIA Switches Production Capacity to RTX 50-series "Blackwell"

Q1-2025 promises to be an action-packed quarter for graphics cards, with NVIDIA introducing the bulk of its next-generation GeForce RTX 50-series "Blackwell" GPUs. The company is expected to start things off with the two enthusiast-segment SKUs, the RTX 5090 and RTX 5080, in January, followed by the RTX 5070 series in February, and rounded off nicely with the RTX 5060 series in March. This would mean hundreds of individual new graphics card SKUs from NVIDIA's board partners, which are reportedly busy winding up the final inventory deliveries of their RTX 40-series "Ada" products and transferring that production capacity to the RTX 50-series. So, when the RTX 50-series GPU models do come out across the quarter, there will be plenty of inventory to go around. Board Channels reports that on NVIDIA's end, production of nearly every "Ada" GPU die has ended, except the AD107, which will continue to supply entry-mainstream GeForce RTX 40-series SKUs. The AD106 production line has stopped, as have the AD103, AD104, and AD102.

Possible NVIDIA GeForce RTX 5080 Laptop GPU Pictured

Could this be the first picture of an NVIDIA GeForce RTX 5080 Laptop GPU? This picture, coupled with a specs sheet from notebook OEM Clevo, seems to suggest so, thanks to a new video by Moore's Law is Dead. The chip is noticeably more rectangular than the Ada "AD104," and is labelled N22W-ES-A1, marking it as an engineering sample. Cross-referencing "N22W" with the Clevo specs sheet for a next-generation laptop mainboard points to the possibility that the chip is indeed based on next-generation NVIDIA silicon. The board design has to undergo a significant change, due to the major change in the pin map of the fiberglass substrate brought about by the switch to the new GDDR7 memory type.

The GeForce "Blackwell" generation comes in several GPU silicon sizes, and the RTX 5080 Laptop GPU is expected to be based on the "GB203" chip, which is also expected to power the desktop RTX 5080 and possibly even some SKUs in the RTX 5070 series, such as the "RTX 5070 Ti." It is rumored to feature as many as 8,192 CUDA cores and a 256-bit wide GDDR7 memory interface. NVIDIA is expected to unveil the GeForce "Blackwell" generation at CES 2025.

NVIDIA GeForce RTX 5090 "Blackwell" GPU Appears During Factory Boot-Up

We officially have the first look at an NVIDIA GeForce RTX 5090 "Blackwell" add-in board, from what appears to be a ZOTAC manufacturing facility. The leaked video shows a newly opened factory in Indonesia, reportedly set up to circumvent US export regulations. Published on the Chiphell forum, the video shows an NVIDIA GeForce RTX 5090 AIB design powering up, followed by cheering from factory workers. This signals that the NVIDIA GeForce RTX 50 series, allegedly scheduled for CES, is indeed near, and that AIB designs will also be available around that timeframe.

To confirm that the video is indeed showing a GeForce RTX 5090, the video description, translated from Chinese, reads: "Due to the US's chip export control on China, graphics card chips with performance equal to or higher than the 4090 are prohibited from being exported to mainland China. In order to avoid the impact of this move on the launch of the RTX 5090, Bo Neng urgently built a factory in Batam, Indonesia. The video shows the debugging of the factory production line. The graphics card that lights up the monitor in the video is the NVIDIA RTX 5090 graphics card that will be launched soon." Since the video is quite blurry, we will have to wait for the official launch or more leaks to see the GPU in its full glory.

Meta Shows Open-Architecture NVIDIA "Blackwell" GB200 System for Data Center

During the Open Compute Project (OCP) Summit 2024, Meta, one of the prime members of the OCP project, showed its NVIDIA "Blackwell" GB200 systems for its massive data centers. We previously covered Microsoft's Azure server rack with GB200 GPUs featuring one-third of the rack space for computing and two-thirds for cooling. A few days later, Google showed off its smaller GB200 system, and today, Meta is showing off its GB200 system—the smallest of the bunch. To train a dense transformer large language model with 405B parameters and a context window of up to 128k tokens, like the Llama 3.1 405B, Meta must redesign its data center infrastructure to run a distributed training job on two 24,000 GPU clusters. That is 48,000 GPUs used for training a single AI model.

Called "Catalina," it is built on the NVIDIA Blackwell platform, emphasizing modularity and adaptability while incorporating the latest NVIDIA GB200 Grace Blackwell Superchip. To address the escalating power requirements of GPUs, Catalina introduces the Orv3, a high-power rack capable of delivering up to 140 kW. The comprehensive liquid-cooled setup encompasses a power shelf supporting various components, including a compute tray, switch tray, the Orv3 HPR, a Wedge 400 fabric switch with 12.8 Tbps switching capacity, a management switch, battery backup, and a rack management controller. Interestingly, Meta also upgraded its "Grand Teton" system for internal usage, such as deep learning recommendation models (DLRMs) and content understanding, with the AMD Instinct MI300X. Those accelerators are used for inference on internal models, and the MI300X appears to provide the best performance per dollar for inference. According to Meta, the computational demand stemming from AI will continue to increase exponentially, so more NVIDIA and AMD GPUs are needed, and we can't wait to see what the company builds.

MSI Unveils AI Servers Powered by NVIDIA MGX at OCP 2024

MSI, a leading global provider of high-performance server solutions, proudly announced it is showcasing new AI servers powered by the NVIDIA MGX platform—designed to address the increasing demand for scalable, energy-efficient AI workloads in modern data centers—at the OCP Global Summit 2024, booth A6. This collaboration highlights MSI's continued commitment to advancing server solutions, focusing on cutting-edge AI acceleration and high-performance computing (HPC).

The NVIDIA MGX platform offers a flexible architecture that enables MSI to deliver purpose-built solutions optimized for AI, HPC, and LLMs. By leveraging this platform, MSI's AI server solutions provide exceptional scalability, efficiency, and enhanced GPU density—key factors in meeting the growing computational demands of AI workloads. Tapping into MSI's engineering expertise and NVIDIA's advanced AI technologies, these AI servers based on the MGX architecture deliver unparalleled compute power, positioning data centers to maximize performance and power efficiency while paving the way for the future of AI-driven infrastructure.

Lenovo Announces New Liquid Cooled Servers for Intel Xeon and NVIDIA Blackwell Platforms

At Lenovo Tech World 2024, we announced new Supercomputing servers for HPC and AI workloads. These new water-cooled servers use the latest processor and accelerator technology from Intel and NVIDIA.

ThinkSystem SC750 V4
Engineered for large-scale cloud infrastructures and High Performance Computing (HPC), the Lenovo ThinkSystem SC750 V4 Neptune excels in intensive simulations and complex modeling. It's designed to handle technical computing, grid deployments, and analytics workloads in various fields such as research, life sciences, energy, engineering, and financial simulation.

NVIDIA to Release the Bulk of its RTX 50-series in Q1-2025

The first quarter of 2025 (January through March) will see back-to-back launches of next-generation GeForce RTX 50-series "Blackwell" graphics cards, according to the latest rumors. NVIDIA CEO Jensen Huang is confirmed to take center stage for the 2025 International CES keynote address, where he is widely expected to kick off the GeForce "Blackwell" gaming GPU generation. CES is expected to see NVIDIA launch its flagship GeForce RTX 5090 (the RTX 4090 successor), and its next-best part, the GeForce RTX 5080 (the RTX 4080 successor).

February 2025 is expected to see the company debut the RTX 5070, and possibly the RTX 5070 Ti, if there is such a SKU. The RTX 5070 succeeds a long line of extremely successful SKUs that tended to sell in large volumes. Perhaps the most important launches of the generation will come in March 2025, when the company is expected to debut the RTX 5060 and RTX 5060 Ti, which succeed the current RTX 4060 and RTX 4060 Ti, respectively. The xx60 tier tends to be the bestselling class of gaming GPUs in any generation. In all, it's expected that NVIDIA will release six new SKUs within Q1, and you can expect over a hundred graphics card reviews from TechPowerUp in Q1.

NVIDIA Contributes Blackwell Platform Design to Open Hardware Ecosystem, Accelerating AI Infrastructure Innovation

To drive the development of open, efficient and scalable data center technologies, NVIDIA today announced that it has contributed foundational elements of its NVIDIA Blackwell accelerated computing platform design to the Open Compute Project (OCP) and broadened NVIDIA Spectrum-X support for OCP standards.

At this year's OCP Global Summit, NVIDIA will be sharing key portions of the NVIDIA GB200 NVL72 system electro-mechanical design with the OCP community — including the rack architecture, compute and switch tray mechanicals, liquid-cooling and thermal environment specifications, and NVIDIA NVLink cable cartridge volumetrics — to support higher compute density and networking bandwidth.

NVIDIA "Blackwell" GPUs are Sold Out for 12 Months, Customers Ordering in 100K GPU Quantities

NVIDIA's "Blackwell" series of GPUs, including the B100, B200, and GB200, is reportedly sold out for the next 12 months. This means that a new customer ordering a Blackwell GPU today faces a 12-month wait to receive it. Morgan Stanley analyst Joe Moore confirmed that, in a meeting with NVIDIA and its investors, NVIDIA executives said demand for "Blackwell" is so great that there is a 12-month backlog to fulfill before shipping to anyone else. We expect this includes customers like Amazon, Meta, Microsoft, Google, Oracle, and others, who are ordering GPUs in enormous quantities to keep up with demand from their own customers.

The previous generation of "Hopper" GPUs was ordered in tens of thousands of units, while this "Blackwell" generation is being ordered in hundreds of thousands of units at a time. For NVIDIA, that is excellent news, as the demand is expected to continue. The only thing standing between customers and their GPUs is TSMC, which is manufacturing them as fast as possible to meet demand. NVIDIA is one of TSMC's largest customers, so its wafer allocation at TSMC's facilities is only expected to grow. We are now officially in the era of the million-GPU data center, and we can only wonder at what point this massive growth stops, or whether it will stop at all in the near future.

NVIDIA Tunes GeForce RTX 5080 GDDR7 Memory to 32 Gbps, RTX 5070 Launches at CES

NVIDIA is gearing up for an exciting showcase at CES 2025, where its CEO, Jensen Huang, will take the stage and, hopefully, talk about future "Blackwell" products. According to Wccftech's sources, the anticipated GeForce RTX 5090, RTX 5080, and RTX 5070 graphics cards should arrive at CES 2025 in January. The flagship RTX 5090 is rumored to come equipped with 32 GB of GDDR7 memory running at 28 Gbps. Meanwhile, the RTX 5080 looks very interesting, with reports of 16 GB of GDDR7 memory running at an impressive 32 Gbps. This comes after earlier reports that the RTX 5080 would feature 28 Gbps GDDR7 memory. The newest rumors suggest we are in for a surprise: the massive gap in compute cores between the RTX 5090 and RTX 5080 will be partly bridged by faster memory.
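Peak memory bandwidth follows directly from the per-pin data rate and the bus width, so the rumored speed bump is easy to quantify. A minimal sketch, assuming the rumored 256-bit (RTX 5080) and 512-bit (RTX 5090) bus widths:

```python
def peak_bandwidth_gb_s(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s: per-pin rate (Gbps) times bus width,
    divided by 8 to convert bits to bytes."""
    return data_rate_gbps * bus_width_bits / 8

# Rumored RTX 5080: 32 Gbps GDDR7 on a 256-bit bus
print(peak_bandwidth_gb_s(32, 256))   # 1024.0 GB/s
# The previously rumored 28 Gbps on the same bus
print(peak_bandwidth_gb_s(28, 256))   # 896.0 GB/s
# Rumored RTX 5090: 28 Gbps GDDR7 on a 512-bit bus
print(peak_bandwidth_gb_s(28, 512))   # 1792.0 GB/s
```

So the move from 28 to 32 Gbps would lift the RTX 5080 from 896 GB/s to 1 TB/s, while the RTX 5090's wider bus would still give it a large bandwidth lead.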

The more budget-friendly RTX 5070 is also set for a CES debut, featuring 12 GB of memory. This card aims to deliver solid performance for gamers who want high-quality graphics without breaking the bank, targeting the mid-range segment. We are very curious about the pricing of these models and how they will fit into the current market. As anticipation builds for CES 2025, we are eager to see how these innovations will impact gaming experiences and creative workflows in the coming year. Stay tuned for more updates as the event approaches!

NVIDIA "Blackwell" GB200 Server Dedicates Two-Thirds of Space to Cooling at Microsoft Azure

Late Tuesday, Microsoft Azure shared an interesting picture on its social media platform X, showcasing the pinnacle of GPU-accelerated servers: NVIDIA "Blackwell" GB200-powered AI systems. Microsoft is one of NVIDIA's largest customers, and the company often receives products first to integrate into its cloud and company infrastructure. NVIDIA even takes feedback from companies like Microsoft when designing future products, especially ones like the now-canceled NVL36x2 system. The picture below shows a massive cluster that dedicates roughly one-third of the rack space to compute, with the remaining two-thirds devoted to closed-loop liquid cooling.

The entire system is connected using InfiniBand networking, a standard for GPU-accelerated systems due to its lower packet-transfer latency. While details of the system are scarce, we can see that the integrated closed-loop liquid cooling allows the GPU racks to use a 1U form factor for increased density. Given that these systems will go into the wider Microsoft Azure data centers, they need to be easy to maintain and cool. There are limits to the power and heat output that Microsoft's data centers can handle, so these systems are built to fit internal specifications that Microsoft designs. There are more compute-dense systems, of course, like NVIDIA's NVL72, but hyperscalers usually opt for custom solutions that fit their data center specifications. Finally, Microsoft noted that we can expect more details about its GB200-powered AI systems at the upcoming Microsoft Ignite conference in November.

NVIDIA's Jensen Huang to Lead CES 2025 Keynote

NVIDIA CEO Jensen Huang will be leading the keynote address at the coveted 2025 International CES in Las Vegas, which opens on January 7. The keynote address is slated for January 6, 6:30 am PT. There is of course no word from NVIDIA on what to expect, but some guesses are fairly easy. NVIDIA's refresh of the GeForce RTX product stack is due, and the company is expected to either debut or expand its next-generation GeForce RTX 50-series "Blackwell" gaming GPU stack, bringing generational improvements in performance and performance-per-Watt, besides new technology.

The company could also make more announcements related to its "Blackwell" AI GPU lineup, which is expected to ramp through 2025, succeeding the current "Hopper" H100 and H200 series. The company could also tease "Rubin," which it referenced recently at GTC in May. "Rubin" succeeds "Blackwell," and will debut as an AI GPU toward the end of 2025, with a ramp toward customers in 2026. It's unclear if NVIDIA will make gaming GPUs on "Rubin," since GeForce RTX generations tend to have a two-year cadence, and there was no gaming GPU based on "Hopper."

Foxconn to Build Taiwan's Fastest AI Supercomputer With NVIDIA Blackwell

NVIDIA and Foxconn are building Taiwan's largest supercomputer, marking a milestone in the island's AI advancement. The project, the Hon Hai Kaohsiung Super Computing Center, revealed Tuesday at Hon Hai Tech Day, will be built around NVIDIA's groundbreaking Blackwell architecture and feature the GB200 NVL72 platform, with a total of 64 racks and 4,608 Tensor Core GPUs. With expected AI performance of over 90 exaflops, the machine would easily be considered the fastest in Taiwan.
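The GPU count follows directly from the rack configuration; a quick check of the reported figures:

```python
# Reported Hon Hai Kaohsiung Super Computing Center configuration.
racks = 64
gpus_per_nvl72_rack = 72          # Tensor Core GPUs per GB200 NVL72 rack

total_gpus = racks * gpus_per_nvl72_rack
print(total_gpus)                 # 4608, matching the reported total
```
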

Foxconn plans to use the supercomputer, once operational, to power breakthroughs in cancer research, large language model development and smart city innovations, positioning Taiwan as a global leader in AI-driven industries. Foxconn's "three-platform strategy" focuses on smart manufacturing, smart cities and electric vehicles. The new supercomputer will play a pivotal role in supporting Foxconn's ongoing efforts in digital twins, robotic automation and smart urban infrastructure, bringing AI-assisted services to urban areas like Kaohsiung.

NVIDIA Cancels Dual-Rack NVL36x2 in Favor of Single-Rack NVL72 Compute Monster

NVIDIA has reportedly discontinued its dual-rack GB200 NVL36x2 GPU model, opting to focus on the single-rack GB200 NVL72 and NVL36 models. This shift, revealed by industry analyst Ming-Chi Kuo, aims to simplify NVIDIA's offerings in the AI and HPC markets. The decision was influenced by major clients like Microsoft, who prefer the NVL72's improved space efficiency and potential for enhanced inference performance. While both models perform similarly in AI large language model (LLM) training, the NVL72 is expected to excel at non-parallelizable inference tasks. As a reminder, the NVL72 features 36 Grace CPUs delivering 2,592 Arm Neoverse V2 cores, with 17 TB of LPDDR5X memory at 18.4 TB/s aggregate bandwidth. Additionally, it includes 72 Blackwell GB200 SXM GPUs with a massive 13.5 TB of combined HBM3e running at 576 TB/s aggregate bandwidth.
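Those aggregate figures decompose cleanly per chip. A small sketch checking the reported numbers (assuming the 13.5 TB memory figure uses binary units, i.e. 1 TB = 1024 GB):

```python
# Per-chip breakdown of the reported GB200 NVL72 aggregates.
grace_cpus = 36
cores_per_grace = 72                 # Arm Neoverse V2 cores per Grace CPU
print(grace_cpus * cores_per_grace)  # 2592 cores, matching the report

gpus = 72
total_hbm3e_tb = 13.5
per_gpu_hbm_gb = total_hbm3e_tb * 1024 / gpus
print(per_gpu_hbm_gb)                # 192.0 GB of HBM3e per GPU
```
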

However, this shift presents significant challenges. The NVL72's power consumption of around 120 kW far exceeds typical data center capabilities, potentially limiting its immediate widespread adoption. The discontinuation of the NVL36x2 has also sparked concerns about NVIDIA's execution capabilities and may disrupt the supply chain for assembly and cooling solutions. Despite these hurdles, industry experts view this as a pragmatic approach to product planning in the dynamic AI landscape. While some customers may be disappointed by the dual-rack model's cancellation, NVIDIA's long-term outlook in the AI technology market remains strong. The company continues to work with clients and listen to their needs, positioning itself as a leader in high-performance computing solutions.

NVIDIA RTX 5090 "Blackwell" Could Feature Two 16-pin Power Connectors

NVIDIA CEO Jensen Huang never misses an opportunity to remind us that Moore's Law is cooked, and that future generations of logic hardware will only get larger and hotter, or hungrier for power. NVIDIA's next generation "Blackwell" graphics architecture promises to bring certain architecture-level performance/Watt improvements, coupled with the node-level performance/Watt improvements from the switch to the TSMC 4NP (4 nm-class) node. Even so, the GeForce RTX 5090, or the part that succeeds the current RTX 4090, will be a power hungry GPU, with rumors suggesting the need for two 16-pin power inputs.

TweakTown reports that the RTX 5090 could come with two 16-pin power connectors, which should give the card the theoretical ability to pull 1200 W (continuous). This doesn't mean that the GPU's total graphics power (TGP) is 1200 W, but a number close to or greater than 600 W, which calls for two of these connectors. Even if the TGP is exactly 600 W, NVIDIA would want to deploy two inputs, to spread the load among two connectors, and improve physical resilience of the connector. It's likely that both connectors will have 600 W input capability, so end-users don't mix up connectors should one of them be 600 W and the other keyed to 150 W or 300 W.
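The 1200 W figure in the report is simply the sum of the two connector ratings. Sketching it out (the 75 W slot contribution is our addition, from the PCIe card electromechanical spec, and is not part of the rumor):

```python
# Theoretical board power input with two 16-pin connectors.
CONNECTOR_RATING_W = 600   # 12V-2x6 (16-pin) rated continuous power
connectors = 2

cable_input_w = connectors * CONNECTOR_RATING_W
print(cable_input_w)       # 1200 W deliverable via cables alone

# A PCIe x16 slot can supply up to 75 W on top of the cable inputs.
print(cable_input_w + 75)  # 1275 W theoretical maximum board input
```

As the article notes, this is a ceiling, not a TGP: splitting a ~600 W load across two connectors halves the current through each, easing thermal stress on the much-discussed 16-pin design.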

Oracle Offers First Zettascale Cloud Computing Cluster

Oracle today announced the first zettascale cloud computing clusters accelerated by the NVIDIA Blackwell platform. Oracle Cloud Infrastructure (OCI) is now taking orders for the largest AI supercomputer in the cloud—available with up to 131,072 NVIDIA Blackwell GPUs.

"We have one of the broadest AI infrastructure offerings and are supporting customers that are running some of the most demanding AI workloads in the cloud," said Mahesh Thiagarajan, executive vice president, Oracle Cloud Infrastructure. "With Oracle's distributed cloud, customers have the flexibility to deploy cloud and AI services wherever they choose while preserving the highest levels of data and AI sovereignty."

NVIDIA GeForce RTX 5090 and RTX 5080 Reach Final Stages This Month, Chinese "D" Variant Arrives for Both SKUs

NVIDIA is on the brink of finalizing its next-generation "Blackwell" graphics cards, the GeForce RTX 5090 and RTX 5080. Sources close to BenchLife indicate that NVIDIA is targeting September for the official design specification finalization of both models. This timeline hints at a possible unveiling at CES 2025, with a market release shortly after. The RTX 5090 is rumored to boast a staggering 550 W TGP, a significant 22% increase from its predecessor, the RTX 4090. Meanwhile, the RTX 5080 is expected to draw 350 W, a more modest 9.3% bump from the current RTX 4080. Interestingly, NVIDIA appears to be developing "D" variants for both cards, which are likely tailored for the Chinese market to comply with export regulations.

Regarding raw power, the RTX 5090 is speculated to feature 24,576 CUDA cores paired with 512-bit GDDR7 memory. The RTX 5080, while less mighty, is still expected to pack a punch with 10,752 CUDA cores and 256-bit GDDR7 memory. As NVIDIA prepares to launch these powerhouses, rumors suggest the RTX 4090D may be discontinued by December 2024, paving the way for its successor. We are curious to see how these cards manage the higher power envelope and whether cooling is handled efficiently. Some rumors indicate that the RTX 5090 could reach 600 watts at its peak, while the RTX 5080 reaches 400 watts. However, that is just a rumor for now. As always, until NVIDIA makes an official announcement, these details should be taken with a grain of salt.

NVIDIA Resolves "Blackwell" Yield Issues with New Photomask

During its Q2 2024 earnings call, NVIDIA confirmed that its upcoming Blackwell-based products are facing low-yield challenges. However, the company announced that it has implemented design changes to improve the production yields of its B100 and B200 processors. Despite these setbacks, NVIDIA remains optimistic about its production timeline. The tech giant plans to commence the production ramp of Blackwell GPUs in Q4 2024, with expected shipments worth several billion dollars by the end of the year. In an official statement, NVIDIA explained, "We executed a change to the Blackwell GPU mask to improve production yield." The company also reaffirmed that it had successfully sampled Blackwell GPUs with customers in the second quarter.

However, NVIDIA acknowledged that meeting demand required producing "low-yielding Blackwell material," which impacted its gross margins. During the earnings call, NVIDIA CEO Jensen Huang assured investors that B100 and B200 GPUs will be in supply, expressing confidence in the company's ability to mass-produce these chips starting in the fourth quarter. The Blackwell B100 and B200 GPUs use TSMC's CoWoS-L packaging technology and a complex design, which prompted rumors that the company was facing yield issues. Reports suggest that the initial challenges arose from mismatched thermal expansion coefficients among various components, leading to warping and system failures. The company now says the fix was a new GPU photomask, which brought yields back to normal levels.

NVIDIA Blackwell Sets New Standard for Generative AI in MLPerf Inference Benchmark

As enterprises race to adopt generative AI and bring new services to market, the demands on data center infrastructure have never been greater. Training large language models is one challenge, but delivering LLM-powered real-time services is another. In the latest round of MLPerf industry benchmarks, Inference v4.1, NVIDIA platforms delivered leading performance across all data center tests. The first-ever submission of the upcoming NVIDIA Blackwell platform revealed up to 4x more performance than the NVIDIA H100 Tensor Core GPU on MLPerf's biggest LLM workload, Llama 2 70B, thanks to its use of a second-generation Transformer Engine and FP4 Tensor Cores.

The NVIDIA H200 Tensor Core GPU delivered outstanding results on every benchmark in the data center category - including the latest addition to the benchmark, the Mixtral 8x7B mixture of experts (MoE) LLM, which features a total of 46.7 billion parameters, with 12.9 billion parameters active per token. MoE models have gained popularity as a way to bring more versatility to LLM deployments, as they're capable of answering a wide variety of questions and performing more diverse tasks in a single deployment. They're also more efficient since they only activate a few experts per inference - meaning they deliver results much faster than dense models of a similar size.

ASUS Presents Comprehensive AI Server Lineup

ASUS today announced its ambitious All in AI initiative, marking a significant leap into the server market with a complete AI infrastructure solution designed to meet the evolving demands of AI-driven applications, from edge inference and generative AI to the new wave of AI supercomputing. ASUS has proven expertise in striking the balance between hardware and software, including infrastructure and cluster-architecture design, server installation, testing, onboarding, remote management and cloud services, positioning the ASUS brand and its AI server solutions to lead the way in driving innovation and enabling the widespread adoption of AI across industries.

Meeting diverse AI needs
In partnership with NVIDIA, Intel and AMD, ASUS offers comprehensive AI-infrastructure solutions with robust software platforms and services, from entry-level AI servers and machine-learning solutions to full racks and data centers for large-scale supercomputing. At the forefront is the ESC AI POD with NVIDIA GB200 NVL72, a cutting-edge rack designed to accelerate trillion-token LLM training and real-time inference operations. Complemented by the latest NVIDIA Blackwell GPUs, NVIDIA Grace CPUs and fifth-generation NVIDIA NVLink technology, ASUS servers ensure unparalleled computing power and efficiency.