News Posts matching #Blackwell

NVIDIA GeForce RTX 5080 to Stand Out with 30 Gbps GDDR7 Memory, Other SKUs Remain on 28 Gbps

NVIDIA is preparing to unveil its "Blackwell" GeForce RTX 5080 graphics card, featuring cutting-edge GDDR7 memory technology. The RTX 5080 is expected to be equipped with 16 GB of GDDR7 memory running at an impressive 30 Gbps. Combined with a 256-bit memory bus, this configuration would deliver approximately 960 GB/s of bandwidth, a 34% improvement over its predecessor, the RTX 4080, which operates at 716.8 GB/s. The RTX 5080 will reportedly stand as the sole card in the lineup featuring 30 Gbps memory modules, while other models in the RTX 50 series will incorporate slightly slower 28 Gbps variants. This strategic differentiation is possibly a way to compensate for the massive CUDA core gap between the rumored RTX 5080 and RTX 5090.

The flagship RTX 5090 is set to push boundaries even further, implementing a wider 512-bit memory bus that could achieve bandwidth exceeding 1.7 TB/s. NVIDIA appears to be reserving memory configurations larger than 16 GB exclusively for this top-tier model, at least until higher-capacity GDDR7 modules become available in the market. Despite these impressive specifications, the RTX 5080's bandwidth still falls approximately 5% short of the current RTX 4090, which benefits from a physically wider bus. This gap between the RTX 5080 and the anticipated RTX 5090 suggests NVIDIA is maintaining a clear hierarchy within its product stack; we will have to wait for the launch to learn the what, how, and why of the Blackwell gaming GPUs.
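The bandwidth figures in these rumors all follow from the standard memory-bandwidth formula: per-pin data rate multiplied by bus width, divided by eight bits per byte. A quick sketch, using the rumored speeds and bus widths above:

```python
def memory_bandwidth_gbps(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s: per-pin rate (Gbps) x bus width (bits) / 8 bits per byte."""
    return data_rate_gbps * bus_width_bits / 8

# Rumored RTX 5080: 30 Gbps GDDR7 on a 256-bit bus -> 960 GB/s
rtx_5080 = memory_bandwidth_gbps(30, 256)
# RTX 4080: 22.4 Gbps GDDR6X on a 256-bit bus -> 716.8 GB/s
rtx_4080 = memory_bandwidth_gbps(22.4, 256)
# Rumored RTX 5090: 28 Gbps GDDR7 on a 512-bit bus -> 1,792 GB/s, i.e. ~1.8 TB/s
rtx_5090 = memory_bandwidth_gbps(28, 512)

print(rtx_5080, rtx_4080, rtx_5090)
print(f"RTX 5080 uplift over RTX 4080: {rtx_5080 / rtx_4080 - 1:.0%}")  # the ~34% quoted above
```

The same formula also reproduces the RTX 4090's 1,008 GB/s (21 Gbps on a 384-bit bus), which is where the roughly 5% shortfall of the RTX 5080 comes from.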

NVIDIA's Next-Gen "Rubin" AI GPU Development 6 Months Ahead of Schedule: Report

The "Rubin" architecture succeeds NVIDIA's current "Blackwell," which powers the company's AI GPUs as well as the upcoming GeForce RTX 50-series gaming GPUs. NVIDIA will likely not build gaming GPUs on "Rubin," just as it didn't with "Hopper" and, for the most part, "Volta." NVIDIA's AI GPU roadmap presented at SC'24 puts "Blackwell" firmly in charge of the company's AI GPU product stack throughout 2025, with "Rubin" succeeding it the following year for a two-year run in the market, capped off by a larger "Rubin Ultra" GPU slated for 2027. A new report by United Daily News (UDN), a Taiwan-based publication, says that development of "Rubin" is running 6 months ahead of schedule.

Being 6 months ahead of schedule doesn't necessarily mean the product will launch sooner. The headroom would let NVIDIA have "Rubin" more thoroughly evaluated in the industry and make last-minute changes to the product if needed, or even pull in the launch if it wants to. The first AI GPU powered by "Rubin" will feature 8-high HBM4 memory stacks. The company will also introduce the "Vera" CPU, the long-awaited successor to "Grace," along with the X1600 InfiniBand/Ethernet network processor. According to NVIDIA's SC'24 roadmap, these three would see a 2026 launch. Then in 2027, the company would follow up with an even larger AI GPU based on the same "Rubin" architecture, codenamed "Rubin Ultra," featuring 12-high HBM4 stacks. NVIDIA's current GB200 "Blackwell" is a tile-based GPU with two dies that have full cache coherence; "Rubin" is rumored to feature four tiles.

NVIDIA GeForce RTX 5070 Ti Specs Leak: Same Die as RTX 5080, 300 W TDP

Recent leaks have unveiled specifications for NVIDIA's upcoming RTX 5070 Ti graphics card, suggesting an increase in power consumption. According to industry leaker Kopite7kimi, the RTX 5070 Ti will feature 8,960 CUDA cores and operate at a 300 W TDP. In a departure from previous generations, the RTX 5070 Ti will reportedly share the same GB203 die with its higher-tier sibling, the RTX 5080. This architectural decision differs from the RTX 40-series lineup, where the 4070 Ti and 4080 utilized different dies (AD104 and AD103, respectively). This shared die approach could potentially keep NVIDIA's manufacturing costs lower. Performance-wise, the RTX 5070 Ti shows promising improvements over its predecessor. The leaked specifications indicate a 16% increase in CUDA cores compared to the RTX 4070 Ti, though this advantage shrinks to 6% when measured against the RTX 4070 Ti Super.

Power consumption sees a modest 5% increase to 300 W, suggesting improved efficiency despite the enhanced capabilities. Memory configurations remain unconfirmed, but speculation suggests the card could feature 16 GB of memory on a 256-bit interface, distinguishing it from the RTX 5080's rumored 24 GB configuration. The RTX 5070 Ti's positioning within the 50-series stack appears carefully calculated, with its 8,960 CUDA cores sitting approximately 17% below the RTX 5080's 10,752 cores. This larger gap between tiers contrasts with the previous generation's approach, potentially indicating a more defined product hierarchy in the Blackwell lineup. NVIDIA is expected to unveil its Blackwell gaming graphics cards at CES 2025, with the RTX 5090, 5080, and 5070 series leading the announcement.
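The core-count deltas quoted above reduce to simple percentage arithmetic over the leaked figures (which remain rumors at this point). A quick check:

```python
def pct_change(new: int, base: int) -> float:
    """Percentage change of `new` relative to `base`."""
    return (new - base) / base * 100

# CUDA core counts; the 50-series figures are leaked, not confirmed
cores = {
    "RTX 5080": 10752,
    "RTX 5070 Ti": 8960,
    "RTX 4070 Ti SUPER": 8448,
    "RTX 4070 Ti": 7680,
}

print(round(pct_change(cores["RTX 5070 Ti"], cores["RTX 4070 Ti"]), 1))        # 16.7 -> the ~16% uplift
print(round(pct_change(cores["RTX 5070 Ti"], cores["RTX 4070 Ti SUPER"]), 1))  # 6.1 -> the ~6% uplift
print(round(pct_change(cores["RTX 5070 Ti"], cores["RTX 5080"]), 1))           # -16.7 -> the gap below the RTX 5080
```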

NVIDIA Prepares GB200 NVL4: Four "Blackwell" GPUs and Two "Grace" CPUs in a 5,400 W Server

At SC24, NVIDIA announced its latest compute-dense AI accelerator in the form of the GB200 NVL4, a single-server solution that expands the company's "Blackwell" series portfolio. The new platform features an impressive combination of four "Blackwell" GPUs and two "Grace" CPUs on a single board. The GB200 NVL4 boasts remarkable specifications for a single-server system, including 768 GB of HBM3E memory across its four Blackwell GPUs, delivering a combined memory bandwidth of 32 TB/s. The system's two Grace CPUs carry 960 GB of LPDDR5X memory, making it a powerhouse for demanding AI workloads. A key feature of the NVL4 design is its NVLink interconnect, which enables communication between all processors on the board. This integration is important for maintaining optimal performance across the system's multiple processing units, especially during large training runs or when running inference on a multi-trillion-parameter model.

Performance comparisons with previous generations show significant improvements, with NVIDIA claiming the GB200 GPUs deliver 2.2x faster overall performance and 1.8x quicker training compared to their GH200 NVL4 predecessor. The system's power consumption reaches 5,400 watts, effectively doubling the 2,700-watt requirement of the GB200 NVL2 model, its smaller sibling that features two GPUs instead of four. NVIDIA is working closely with OEM partners to bring various Blackwell solutions to market, including the DGX B200, GB200 Grace Blackwell Superchip, GB200 Grace Blackwell NVL2, GB200 Grace Blackwell NVL4, and GB200 Grace Blackwell NVL72. Fitting 5,400 W of TDP into a single server will require liquid cooling for optimal performance, and the GB200 NVL4 is expected to go inside server racks for hyperscaler customers, which usually have custom liquid-cooling systems in their data centers.

GIGABYTE Showcases a Leading AI and Enterprise Portfolio at Supercomputing 2024

Giga Computing, a subsidiary of GIGABYTE and an industry leader in generative AI servers and advanced cooling technologies, shows off at SC24 how the GIGABYTE enterprise portfolio provides solutions for all applications, from cloud computing to AI to enterprise IT, including energy-efficient liquid-cooling technologies. This portfolio is made more complete by long-term collaborations with leading technology companies and emerging industry leaders, which will be showcased at GIGABYTE booth #3123 at SC24 (Nov. 19-21) in Atlanta. The booth is sectioned to put the spotlight on strategic technology collaborations, as well as direct liquid cooling partners.

The GIGABYTE booth will showcase an array of NVIDIA platforms built to keep up with the diversity of workloads and degrees of demand in AI and HPC applications. For a rack-scale AI solution using the NVIDIA GB200 NVL72 design, GIGABYTE shows how seventy-two GPUs can fit in one rack, with eighteen GIGABYTE servers each housing two NVIDIA Grace CPUs and four NVIDIA Blackwell GPUs. Another platform at the GIGABYTE booth is the NVIDIA HGX H200. GIGABYTE exhibits both its liquid-cooled G4L3-SD1 server and an air-cooled version, the G593-SD1.

ASUS Presents Next-Gen Infrastructure Solutions With Advanced Cooling Portfolio at SC24

ASUS today announced its next-generation infrastructure solutions at SC24, unveiling an extensive server lineup and advanced cooling solutions, all designed to propel the future of AI. The product showcase will reveal how ASUS is working with NVIDIA and Ubitus/Ubilink to prove the immense computational power of supercomputers, using AI-powered avatar and robot demonstrations that leverage the newly-inaugurated data center. It is Taiwan's largest supercomputing facility, constructed by ASUS, and is also notable for offering flexible green-energy options to customers that desire them. As a total solution provider with a proven track record in pioneering AI supercomputing, ASUS continuously drives maximized value for customers.

To fuel digital transformation in enterprise through high-performance computing (HPC) and AI-driven architecture, ASUS provides a full line-up of server systems—ready for every scenario. ASUS AI POD, a complete rack solution equipped with NVIDIA GB200 NVL72 platform, integrates GPUs, CPUs and switches in seamless, high-speed direct communication, enhancing the training of trillion-parameter LLMs and enabling real-time inference. It features the NVIDIA GB200 Grace Blackwell Superchip and fifth-generation NVIDIA NVLink technology, while offering both liquid-to-air and liquid-to-liquid cooling options to maximize AI computing performance.

Dell Shows Compute-Dense AI Servers at SC24

Dell Technologies (NYSE: DELL) continues to make enterprise AI adoption easier with the Dell AI Factory, expanding the world's broadest AI solutions portfolio. Powerful new infrastructure, solutions and services accelerate, simplify and streamline AI workloads and data management.

"Getting AI up and running across a company can be a real challenge," said Arthur Lewis, president, Infrastructure Solutions Group, Dell Technologies. "We're making it easier for our customers with new AI infrastructure, solutions and services that simplify AI deployments, paving the way for smarter, faster ways to work and a more adaptable future."

CoolIT Announces the World's Highest Density Liquid-to-Liquid Coolant Distribution Unit

CoolIT Systems (CoolIT), the world leader in liquid cooling systems for AI and high-performance computing, introduces the CHx1000, the world's highest-density liquid-to-liquid coolant distribution unit (CDU). Designed for mission-critical applications, the CHx1000 is purpose-built to cool the NVIDIA Blackwell platform and other demanding AI workloads where liquid cooling is now necessary.

"CoolIT created the CHx1000 to provide the high capacity and pressure delivery required to direct liquid cool NVIDIA Blackwell and future generations of high-performance AI accelerators," said Patrick McGinn, CoolIT's COO. "Besides exceptional performance, serviceability and reliability are central to the CHx1000's design. The single rack-sized unit is fully front and back serviceable with hot-swappable critical components. Precision coolant controls and multiple levels of redundancy provide for steady, uninterrupted operation."

NVIDIA "Blackwell" NVL72 Servers Reportedly Require Redesign Amid Overheating Problems

According to The Information, NVIDIA's latest "Blackwell" processors are reportedly encountering significant thermal management issues in high-density server configurations, potentially affecting deployment timelines for major tech companies. The challenges emerge specifically in NVL72 GB200 racks housing 72 GB200 processors, which can consume up to 120 kilowatts of power per rack while weighing a "mere" 3,000 pounds (about 1.5 tons). These thermal concerns have prompted NVIDIA to revisit and modify its server rack designs multiple times to prevent performance degradation and potential hardware damage. Hyperscalers like Google, Meta, and Microsoft, who rely heavily on NVIDIA GPUs for training their advanced language models, have allegedly expressed concerns about possible delays in their data center deployment schedules.

The thermal management issues follow earlier setbacks related to a design flaw in the Blackwell production process. The problem stemmed from the complex CoWoS-L packaging technology, which connects dual chiplets using an RDL interposer and LSI bridges. Thermal expansion mismatches between various components led to warping issues, requiring modifications to the GPU's metal layers and bump structures. A company spokesperson characterized these modifications as part of the standard development process, noting that a new photomask resolved the issue. The Information states that mass production of the revised Blackwell GPUs began in late October, with shipments expected to commence in late January. However, these timelines are unconfirmed by NVIDIA, and some server makers, like Dell, have confirmed that GB200 NVL72 liquid-cooled systems are shipping now, not in January, with GPU cloud provider CoreWeave as a customer. The original report could be based on older information, as Dell is one of NVIDIA's most significant partners and among the first in the supply chain to gain access to new GPU batches.

NVIDIA RTX 40-series Stocks Begin Drying Up as Decks are Cleared for RTX 50-series Blackwell

Chinese tech site Board Channels keeps tabs on the way computer hardware is moving at the very beginning of the supply chain. It has some fascinating insights into the NVIDIA GeForce RTX 50-series "Blackwell" graphics cards. Apparently, NVIDIA has planned the transition between the current RTX 40-series "Ada" and the next-generation RTX 50-series such that there's minimal spillover inventory of older-generation graphics cards in the channel, so it doesn't end up in a situation similar to the one between the RTX 30-series "Ampere" and its successor. Back in 2021-22, the cryptocurrency mining boom, which waned toward the end of 2022, had caused an overproduction of RTX 30-series cards that lingered in the channel even as the RTX 40-series launched.

According to the report by Board Channels, translated by Gazlog and VideoCardz, the China-specific RTX 4090D has vanished from the channel; none of NVIDIA's AIC partners has any boards left to sell. Most AIC partners are shipping their final batches of the RTX 4080 SUPER, which should clear out in November 2024. The RTX 4070 Ti SUPER isn't as prominent a SKU as the RTX 4080 SUPER, and is being phased out at the same pace as its bigger AD103-based sibling, with last orders shipping this month. The RTX 4070 SUPER and RTX 4070 remain the most popular high-end SKUs of this generation, and NVIDIA will supply them throughout December. Given that the RTX 5070 series doesn't come out until February (with wide availability in March), this makes sense. The RTX 4060 series will phase out much more slowly than the other SKUs, given its popularity and the fact that the RTX 5060 series won't ramp until Q2 2025.

NVIDIA B200 "Blackwell" Records 2.2x Performance Improvement Over its "Hopper" Predecessor

We know that NVIDIA's latest "Blackwell" GPUs are fast, but how much faster are they than the previous generation "Hopper"? Thanks to the latest MLPerf Training v4.1 results, NVIDIA's HGX B200 Blackwell platform has demonstrated massive performance gains, measuring up to a 2.2x improvement per GPU compared to the HGX H200 Hopper platform. The latest results, verified by MLCommons, reveal impressive achievements in large language model (LLM) training. The Blackwell architecture, featuring HBM3e high-bandwidth memory and fifth-generation NVLink interconnect technology, achieved double the performance per GPU for GPT-3 pre-training and a 2.2x boost for Llama 2 70B fine-tuning compared to the previous Hopper generation. Each benchmark system incorporated eight Blackwell GPUs operating at a 1,000 W TDP, connected via NVLink Switch for scale-up.

The network infrastructure utilized NVIDIA ConnectX-7 SuperNICs and Quantum-2 InfiniBand switches, enabling high-speed node-to-node communication for distributed training workloads. While previous Hopper-based systems required 256 GPUs to optimize performance for the GPT-3 175B benchmark, Blackwell accomplished the same task with just 64 GPUs, leveraging its larger HBM3e memory capacity and bandwidth. One thing to look out for is the upcoming GB200 NVL72 system, which promises even more significant gains beyond the 2.2x. It features expanded NVLink domains, higher memory bandwidth, and tight integration with NVIDIA Grace CPUs, complemented by ConnectX-8 SuperNIC and Quantum-X800 switch technologies. With faster switching and better data movement through Grace-Blackwell integration, we could see even more software optimization from NVIDIA to push the performance envelope.

NVIDIA Switches Production Capacity to RTX 50-series "Blackwell"

Q1-2025 promises to be an action-packed quarter for graphics cards, with NVIDIA introducing the bulk of its next-generation GeForce RTX 50-series "Blackwell" GPUs. The company is expected to start things off with the two enthusiast-segment SKUs, the RTX 5090 and RTX 5080, in January, followed by the RTX 5070 series in February, and rounded off with the RTX 5060 series in March. This would mean hundreds of individual new graphics card SKUs from NVIDIA's board partners, which are reportedly busy winding up the final inventory deliveries of their RTX 40-series "Ada" products and transferring that production capacity to the RTX 50-series, so that when the RTX 50-series models do come out across the quarter, there is plenty of inventory to go around. Board Channels reports that on NVIDIA's end, production of nearly every "Ada" silicon has ended, except the AD107, which will continue selling in entry-mainstream GeForce RTX 40-series SKUs. The AD106 production line has stopped, as have the AD103, AD104, and AD102.

Possible NVIDIA GeForce RTX 5080 Laptop GPU Pictured

Could this be the first picture of an NVIDIA GeForce RTX 5080 Laptop GPU? This picture, coupled with a specs sheet from notebook OEM Clevo, seems to suggest so, thanks to a new video by Moore's Law is Dead. The chip is noticeably more rectangular than the Ada "AD104," and is labelled N22W-ES-A1, marking it as an engineering sample. Cross-referencing "N22W" with the Clevo specs sheet for a next-generation laptop mainboard points to the possibility that the chip is indeed based on next-gen NVIDIA silicon. The board design has to undergo a significant change due to the new pin-map of the fiberglass substrate brought about by the switch to GDDR7 memory.

The GeForce "Blackwell" generation comes in several GPU silicon sizes, and the RTX 5080 Laptop GPU is expected to be based on the "GB203" chip, which is expected to power the desktop RTX 5080 and possibly some SKUs in even the RTX 5070 series, such as the "RTX 5070 Ti." It is rumored to feature as many as 8,192 CUDA cores, and a 256-bit wide GDDR7 memory interface. NVIDIA is expected to unveil the GeForce "Blackwell" generation at CES 2025.

NVIDIA GeForce RTX 5090 "Blackwell" GPU Appears During Factory Boot-Up

We officially have the first look at an NVIDIA GeForce RTX 5090 "Blackwell" add-in board, from what appears to be a ZOTAC manufacturing facility. The leaked video shows a newly opened factory in Indonesia, reportedly set up to circumvent US export regulations. Published on the Chiphell platform, the video shows an NVIDIA GeForce RTX 5090 AIB design powering up, followed by cheering from factory workers. This signals that the NVIDIA GeForce RTX 50 series, allegedly scheduled for CES, is indeed near, and that AIB designs will also be available around that timeframe.

To confirm that the video is indeed showing the GeForce RTX 5090, the video description, translated from Chinese, reads as follows: "Due to the US's chip export control on China, graphics card chips with performance equal to or higher than 4090 are prohibited from being exported to mainland China. In order to avoid the impact of this move on the launch of RTX 5090, Bo Neng urgently built a factory in Batam, Indonesia. The video shows the debugging of the factory production line. The graphics card that lights up the monitor in the video is the NVIDIA RTX 5090 graphics card that will be launched soon." The video is quite blurry, so we will have to wait for the official launch or more leaks to see the GPU in its full glory.

Meta Shows Open-Architecture NVIDIA "Blackwell" GB200 System for Data Center

During the Open Compute Project (OCP) Summit 2024, Meta, one of the prime members of the OCP project, showed its NVIDIA "Blackwell" GB200 systems for its massive data centers. We previously covered Microsoft's Azure server rack with GB200 GPUs featuring one-third of the rack space for computing and two-thirds for cooling. A few days later, Google showed off its smaller GB200 system, and today, Meta is showing off its GB200 system—the smallest of the bunch. To train a dense transformer large language model with 405B parameters and a context window of up to 128k tokens, like the Llama 3.1 405B, Meta must redesign its data center infrastructure to run a distributed training job on two 24,000 GPU clusters. That is 48,000 GPUs used for training a single AI model.

Called "Catalina," it is built on the NVIDIA Blackwell platform, emphasizing modularity and adaptability while incorporating the latest NVIDIA GB200 Grace Blackwell Superchip. To address the escalating power requirements of GPUs, Catalina introduces the Orv3, a high-power rack capable of delivering up to 140 kW. The comprehensive liquid-cooled setup encompasses a power shelf supporting various components, including a compute tray, switch tray, the Orv3 HPR, a Wedge 400 fabric switch with 12.8 Tbps switching capacity, a management switch, battery backup, and a rack management controller. Interestingly, Meta also upgraded its "Grand Teton" system for internal usage, such as deep learning recommendation models (DLRMs) and content understanding, with the AMD Instinct MI300X. Those are used for inference on internal models, where the MI300X appears to provide the best performance per dollar. According to Meta, the computational demand stemming from AI will continue to increase exponentially, so more NVIDIA and AMD GPUs are needed, and we can't wait to see what the company builds.

MSI Unveils AI Servers Powered by NVIDIA MGX at OCP 2024

MSI, a leading global provider of high-performance server solutions, proudly announced it is showcasing new AI servers powered by the NVIDIA MGX platform—designed to address the increasing demand for scalable, energy-efficient AI workloads in modern data centers—at the OCP Global Summit 2024, booth A6. This collaboration highlights MSI's continued commitment to advancing server solutions, focusing on cutting-edge AI acceleration and high-performance computing (HPC).

The NVIDIA MGX platform offers a flexible architecture that enables MSI to deliver purpose-built solutions optimized for AI, HPC, and LLMs. By leveraging this platform, MSI's AI server solutions provide exceptional scalability, efficiency, and enhanced GPU density—key factors in meeting the growing computational demands of AI workloads. Tapping into MSI's engineering expertise and NVIDIA's advanced AI technologies, these AI servers based on the MGX architecture deliver unparalleled compute power, positioning data centers to maximize performance and power efficiency while paving the way for the future of AI-driven infrastructure.

Lenovo Announces New Liquid Cooled Servers for Intel Xeon and NVIDIA Blackwell Platforms

At Lenovo Tech World 2024, we announced new Supercomputing servers for HPC and AI workloads. These new water-cooled servers use the latest processor and accelerator technology from Intel and NVIDIA.

ThinkSystem SC750 V4
Engineered for large-scale cloud infrastructures and High Performance Computing (HPC), the Lenovo ThinkSystem SC750 V4 Neptune excels in intensive simulations and complex modeling. It's designed to handle technical computing, grid deployments, and analytics workloads in various fields such as research, life sciences, energy, engineering, and financial simulation.

NVIDIA to Release the Bulk of its RTX 50-series in Q1-2025

The first quarter of 2025 (January through March) will see back-to-back launches of next-generation GeForce RTX 50-series "Blackwell" graphics cards, according to the latest rumors. NVIDIA CEO Jensen Huang is confirmed to take center stage for the 2025 International CES keynote address, where he is widely expected to kick off the GeForce "Blackwell" gaming GPU generation. CES is expected to see NVIDIA launch its flagship GeForce RTX 5090 (the RTX 4090 successor), and its next-best part, the GeForce RTX 5080 (the RTX 4080 successor).

February 2025 is expected to see the company debut the RTX 5070, and possibly the RTX 5070 Ti, if there is such a SKU. The RTX 5070 succeeds a long line of extremely successful SKUs that tended to sell in large volumes. Perhaps the most important launches of the generation will come in March 2025, when the company is expected to debut the RTX 5060 and RTX 5060 Ti, which succeed the current RTX 4060 and RTX 4060 Ti, respectively. The xx60 tier tends to be the bestselling class of gaming GPUs in any generation. In all, NVIDIA is expected to release six new SKUs within Q1, and you can expect over a hundred graphics card reviews from TechPowerUp during the quarter.

NVIDIA Contributes Blackwell Platform Design to Open Hardware Ecosystem, Accelerating AI Infrastructure Innovation

To drive the development of open, efficient and scalable data center technologies, NVIDIA today announced that it has contributed foundational elements of its NVIDIA Blackwell accelerated computing platform design to the Open Compute Project (OCP) and broadened NVIDIA Spectrum-X support for OCP standards.

At this year's OCP Global Summit, NVIDIA will be sharing key portions of the NVIDIA GB200 NVL72 system electro-mechanical design with the OCP community — including the rack architecture, compute and switch tray mechanicals, liquid-cooling and thermal environment specifications, and NVIDIA NVLink cable cartridge volumetrics — to support higher compute density and networking bandwidth.

NVIDIA "Blackwell" GPUs are Sold Out for 12 Months, Customers Ordering in 100K GPU Quantities

NVIDIA's "Blackwell" series of GPUs, including the B100, B200, and GB200, is reportedly sold out for the next 12 months. This means that a new customer willing to order a Blackwell GPU now faces a 12-month waitlist. Morgan Stanley analyst Joe Moore relayed that in a meeting with NVIDIA and its investors, NVIDIA executives confirmed that demand for "Blackwell" is so great that there is a 12-month backlog to fulfill before shipping to anyone else. We expect this includes customers like Amazon, Meta, Microsoft, Google, Oracle, and others, who are ordering GPUs in enormous quantities to keep up with demand from their own customers.

The previous generation of "Hopper" GPUs was ordered in tens of thousands of units, while this "Blackwell" generation is being ordered in hundreds of thousands of units at a time. For NVIDIA, that is excellent news, as that demand is expected to continue. The only thing standing in customers' way is TSMC, which is manufacturing these GPUs as fast as possible to meet demand. NVIDIA is one of TSMC's largest customers, so its wafer allocation at TSMC's facilities is only expected to grow. We are now officially in the era of the million-GPU data center, and we can only wonder at what point this massive growth stops, or whether it will stop at all in the near future.

NVIDIA Tunes GeForce RTX 5080 GDDR7 Memory to 32 Gbps, RTX 5070 Launches at CES

NVIDIA is gearing up for an exciting showcase at CES 2025, where its CEO, Jensen Huang, will take the stage and, hopefully, talk about future "Blackwell" products. According to Wccftech's sources, the anticipated GeForce RTX 5090, RTX 5080, and RTX 5070 graphics cards should arrive at CES 2025 in January. The flagship RTX 5090 is rumored to come equipped with 32 GB of GDDR7 memory running at 28 Gbps. Meanwhile, the RTX 5080 looks very interesting, with reports of an impressive 16 GB of GDDR7 memory running at 32 Gbps. This comes after we previously believed the RTX 5080 would feature 28 Gbps GDDR7 memory. The newest rumors suggest we are in for a surprise, as the massive compute-core gap between the RTX 5090 and RTX 5080 will be partially offset by faster memory.

The more budget-friendly RTX 5070 is also set for a CES debut, featuring 12 GB of memory. This card aims to deliver solid performance for gamers who want high-quality graphics without breaking the bank, targeting the mid-range segment. We are very curious about the pricing of these models and how they will fit into the current market. As anticipation builds for CES 2025, we are eager to see how these products will impact gaming experiences and creative workflows in the coming year. Stay tuned for more updates as the event approaches!

NVIDIA "Blackwell" GB200 Server Dedicates Two-Thirds of Space to Cooling at Microsoft Azure

Late Tuesday, Microsoft Azure shared an interesting picture on its social media platform X, showcasing the pinnacle of GPU-accelerated servers: NVIDIA "Blackwell" GB200-powered AI systems. Microsoft is one of NVIDIA's largest customers, and the company often receives products first to integrate into its cloud and company infrastructure. NVIDIA, in turn, listens to feedback from companies like Microsoft when designing future products, such as the now-canceled NVL36x2 system. The picture shows a massive cluster that dedicates roughly one-third of its space to compute, with the remaining two-thirds dedicated to closed-loop liquid cooling.

The entire system is connected using InfiniBand networking, a standard for GPU-accelerated systems due to its low packet-transfer latency. While details of the system are scarce, we can see that the integrated closed-loop liquid cooling allows the GPU racks to take a 1U form for increased density. Given that these systems will go into the wider Microsoft Azure data centers, a system needs to be easily maintained and cooled. There are limits to the power and heat output that Microsoft's data centers can handle, so these systems often fit inside internal specifications that Microsoft designs. There are more compute-dense systems, of course, like NVIDIA's NVL72, but hyperscalers usually opt for custom solutions that fit their data center specifications. Finally, Microsoft noted that we can expect more details at the upcoming Microsoft Ignite conference in November, where we will learn more about its GB200-powered AI systems.

NVIDIA's Jensen Huang to Lead CES 2025 Keynote

NVIDIA CEO Jensen Huang will be leading the keynote address at the coveted 2025 International CES in Las Vegas, which opens on January 7. The keynote address is slated for January 6, 6:30 am PT. There is of course no word from NVIDIA on what to expect, but some guesswork is fairly easy. NVIDIA's refresh of the GeForce RTX product stack is due, and the company is expected to either debut or expand its next-generation GeForce RTX 50-series "Blackwell" gaming GPU stack, bringing generational improvements in performance and performance-per-Watt, besides new technology.

The company could also make more announcements related to its "Blackwell" AI GPU lineup, which is expected to ramp through 2025, succeeding the current "Hopper" H100 and H200 series. The company could also tease "Rubin," which it referenced recently at GTC in May. "Rubin" succeeds "Blackwell," and will debut as an AI GPU toward the end of 2025, with a 2026 ramp toward customers. It's unclear if NVIDIA will make gaming GPUs on "Rubin," since GeForce RTX generations tend to have a 2-year cadence, and there was no gaming GPU based on "Hopper."

Foxconn to Build Taiwan's Fastest AI Supercomputer With NVIDIA Blackwell

NVIDIA and Foxconn are building Taiwan's largest supercomputer, marking a milestone in the island's AI advancement. The project, the Hon Hai Kaohsiung Super Computing Center, revealed Tuesday at Hon Hai Tech Day, will be built around NVIDIA's groundbreaking Blackwell architecture and feature the GB200 NVL72 platform, with a total of 64 racks and 4,608 Tensor Core GPUs. With expected AI performance of over 90 exaflops, the machine would easily be considered the fastest in Taiwan.
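The headline figures can be sanity-checked from the rack count: 64 NVL72 racks of 72 GPUs each give the quoted 4,608 GPUs, and spreading 90 exaflops across them implies roughly 19.5 petaflops of low-precision AI compute per GPU (a derived figure, not an announced spec). A quick check:

```python
racks = 64
gpus_per_rack = 72          # NVIDIA GB200 NVL72: 72 GPUs per rack
total_ai_exaflops = 90      # quoted aggregate AI performance

total_gpus = racks * gpus_per_rack
per_gpu_petaflops = total_ai_exaflops * 1000 / total_gpus

print(total_gpus)                   # 4608, matching the announcement
print(round(per_gpu_petaflops, 1))  # 19.5 petaflops of AI compute per GPU
```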

Foxconn plans to use the supercomputer, once operational, to power breakthroughs in cancer research, large language model development and smart city innovations, positioning Taiwan as a global leader in AI-driven industries. Foxconn's "three-platform strategy" focuses on smart manufacturing, smart cities and electric vehicles. The new supercomputer will play a pivotal role in supporting Foxconn's ongoing efforts in digital twins, robotic automation and smart urban infrastructure, bringing AI-assisted services to urban areas like Kaohsiung.

NVIDIA Cancels Dual-Rack NVL36x2 in Favor of Single-Rack NVL72 Compute Monster

NVIDIA has reportedly discontinued its dual-rack GB200 NVL36x2 GPU model, opting to focus on the single-rack GB200 NVL72 and NVL36 models. This shift, revealed by industry analyst Ming-Chi Kuo, aims to simplify NVIDIA's offerings in the AI and HPC markets. The decision was influenced by major clients like Microsoft, who prefer the NVL72's improved space efficiency and potential for enhanced inference performance. While both models perform similarly in AI large language model (LLM) training, the NVL72 is expected to excel in non-parallelizable inference tasks. As a reminder, the NVL72 features 36 Grace CPUs, delivering 2,592 Arm Neoverse V2 cores with 17 TB of LPDDR5X memory and 18.4 TB/s of aggregate bandwidth. Additionally, it includes 72 Blackwell GB200 SXM GPUs with a massive 13.5 TB of HBM3e combined, running at 576 TB/s of aggregate bandwidth.
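The NVL72 aggregates quoted above divide cleanly by the device counts, which makes for a useful sanity check on the spec (the per-GPU and per-CPU figures below are derived from the stated totals, not separately confirmed):

```python
gpus, cpus = 72, 36
total_hbm_tb = 13.5        # combined HBM3e across the 72 GPUs, as stated
total_hbm_bw_tbps = 576    # aggregate HBM3e bandwidth, as stated
total_cpu_cores = 2592     # Arm Neoverse V2 cores across the 36 Grace CPUs

print(total_hbm_tb * 1024 / gpus)   # 192.0 -> GiB of HBM3e per GPU
print(total_hbm_bw_tbps / gpus)     # 8.0 -> TB/s of memory bandwidth per GPU
print(total_cpu_cores / cpus)       # 72.0 -> cores per Grace CPU
```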

However, this shift presents significant challenges. The NVL72's power consumption of around 120 kW far exceeds typical data center capabilities, potentially limiting its immediate widespread adoption. The discontinuation of the NVL36x2 has also sparked concerns about NVIDIA's execution and may disrupt the supply chain for assembly and cooling solutions. Despite these hurdles, industry experts view this as a pragmatic approach to product planning in the dynamic AI landscape. While some customers may be disappointed by the dual-rack model's cancellation, NVIDIA's long-term outlook in the AI market remains strong. The company continues to work with clients and listen to their needs, positioning itself as a leader in high-performance computing solutions.