News Posts matching #B200

NVIDIA B200 "Blackwell" Records 2.2x Performance Improvement Over its "Hopper" Predecessor

We know that NVIDIA's latest "Blackwell" GPUs are fast, but how much faster are they than the previous-generation "Hopper"? According to the latest MLPerf Training v4.1 results, NVIDIA's HGX B200 Blackwell platform measured up to a 2.2x improvement per GPU over the HGX H200 Hopper platform. The results, verified by MLCommons, cover large language model (LLM) training. The Blackwell architecture, featuring HBM3e high-bandwidth memory and fifth-generation NVLink interconnect technology, achieved double the performance per GPU for GPT-3 pre-training and a 2.2x boost for Llama 2 70B fine-tuning compared to the previous Hopper generation. Each benchmark system incorporated eight Blackwell GPUs operating at a 1,000 W TDP, connected via NVLink Switch for scale-up.

The network infrastructure utilized NVIDIA ConnectX-7 SuperNICs and Quantum-2 InfiniBand switches, enabling high-speed node-to-node communication for distributed training workloads. While previous Hopper-based systems required 256 GPUs to optimize performance for the GPT-3 175B benchmark, Blackwell accomplished the same task with just 64 GPUs, leveraging its larger HBM3e memory capacity and bandwidth. One thing to look out for is the upcoming GB200 NVL72 system, which promises even more significant gains past the 2.2x. It features expanded NVLink domains, higher memory bandwidth, and tight integration with NVIDIA Grace CPUs, complemented by ConnectX-8 SuperNIC and Quantum-X800 switch technologies. With faster switching and better data movement with Grace-Blackwell integration, we could see even more software optimization from NVIDIA to push the performance envelope.

Google Shows Production NVIDIA "Blackwell" GB200 NVL System for Cloud

Last week, we got a preview of Microsoft Azure's production-ready NVIDIA "Blackwell" GB200 system, showing that only a third of the rack that goes into the data center actually holds the compute elements, with the other two-thirds holding the cooling compartment that removes the immense heat output of dozens of GB200 GPUs. Today, Google is showing off a part of its own infrastructure ahead of the Google Cloud App Dev & Infrastructure Summit, taking place digitally on October 30. Shown below are two racks standing side by side, connecting NVIDIA "Blackwell" GB200 NVL cards with the rest of the Google infrastructure. Unlike Microsoft Azure, Google Cloud uses a different data center design in its facilities.

There is one rack with power distribution units, networking switches, and cooling distribution units, all connected to the compute rack, which houses power supplies, GPUs, and CPU servers. Networking equipment is present, and it connects to Google's "global" data center network, which is Google's own data center fabric. We are not sure which fabric connects these racks; for optimal performance, NVIDIA recommends InfiniBand (which it gained through the Mellanox acquisition). However, given that Google's infrastructure is set up differently, there may be Ethernet switches present. Interestingly, Google's design of GB200 racks differs from Azure's, as it uses additional rack space to distribute the coolant to its local heat exchangers, i.e., coolers. We are curious to see if Google releases more information on its infrastructure, as it has been known as the infrastructure king because of its ability to scale and keep everything organized.

NVIDIA Contributes Blackwell Platform Design to Open Hardware Ecosystem, Accelerating AI Infrastructure Innovation

To drive the development of open, efficient and scalable data center technologies, NVIDIA today announced that it has contributed foundational elements of its NVIDIA Blackwell accelerated computing platform design to the Open Compute Project (OCP) and broadened NVIDIA Spectrum-X support for OCP standards.

At this year's OCP Global Summit, NVIDIA will be sharing key portions of the NVIDIA GB200 NVL72 system electro-mechanical design with the OCP community — including the rack architecture, compute and switch tray mechanicals, liquid-cooling and thermal environment specifications, and NVIDIA NVLink cable cartridge volumetrics — to support higher compute density and networking bandwidth.

NVIDIA "Blackwell" GPUs are Sold Out for 12 Months, Customers Ordering in 100K GPU Quantities

NVIDIA's "Blackwell" series of GPUs, including B100, B200, and GB200, are reportedly sold out for the next 12 months. This means that a new customer ordering a Blackwell GPU today faces a 12-month wait to receive it. Morgan Stanley analyst Joe Moore reported that, in a meeting with NVIDIA and its investors, NVIDIA executives confirmed that demand for "Blackwell" is so great that there is a 12-month backlog to fulfill before shipping to anyone else. We expect that this includes customers like Amazon, Meta, Microsoft, Google, Oracle, and others, who are ordering GPUs in enormous quantities to keep up with the demand from their own customers.

The previous generation of "Hopper" GPUs was ordered in tens of thousands of units, while this "Blackwell" generation is being ordered in hundreds of thousands of units at a time. For NVIDIA, that is excellent news, as that demand is expected to continue. The only bottleneck for customers is TSMC, which is manufacturing these GPUs as fast as possible to meet demand. NVIDIA is one of TSMC's largest customers, so its wafer allocation at TSMC's facilities is only expected to grow. We are now officially in the era of the million-GPU data centers, and we can only wonder at what point this massive growth stops, or if it will stop at all in the near future.

NVIDIA Might Consider Major Design Shift for Future 300 GPU Series

NVIDIA is reportedly considering a significant design change for its GPU products, shifting from the current on-board solution to an independent GPU socket design following the GB200 shipment in Q4, according to reports from MoneyDJ and the Economic Daily News quoted by TrendForce. This move is not new in the industry: AMD already introduced a socketed design in 2023 with its MI300A series in dedicated Supermicro servers. The B300 series, expected to become NVIDIA's mainstream product in the second half of 2025, is rumored to be the main beneficiary of this design change, which could improve yield rates, though it may come with some performance trade-offs.

According to the Economic Daily News, the socket design will simplify after-sales service and server board maintenance, allowing users to replace or upgrade the GPUs quickly. The report also pointed out that based on the slot design, boards will contain up to four NVIDIA GPUs and a CPU, with each GPU having its dedicated slot. This will bring benefits for Taiwanese manufacturers like Foxconn and LOTES, who will supply different components and connectors. The move seems logical since with the current on-board design, once a GPU becomes faulty, the entire motherboard needs to be replaced, leading to significant downtime and high operational and maintenance costs.

NVIDIA "Blackwell" GB200 Server Dedicates Two-Thirds of Space to Cooling at Microsoft Azure

Late Tuesday, Microsoft Azure shared an interesting picture on the social media platform X, showcasing the pinnacle of GPU-accelerated servers: NVIDIA "Blackwell" GB200-powered AI systems. Microsoft is one of NVIDIA's largest customers, and the company often receives products first to integrate into its cloud and company infrastructure. NVIDIA even listens to feedback from companies like Microsoft when designing future products, as it did with the now-canceled NVL36x2 system. The picture below shows a massive cluster in which the compute area takes up roughly one-third of the entire system, with a gigantic two-thirds of the system dedicated to closed-loop liquid cooling.

The entire system is connected using InfiniBand networking, a standard for GPU-accelerated systems due to its lower latency in packet transfer. While details of the system are scarce, we can see that the integrated closed-loop liquid cooling allows the GPU servers to use a 1U form factor for increased density. Given that these systems will go into the wider Microsoft Azure data centers, they need to be easily maintained and cooled. There are limits to the power and heat output that Microsoft's data centers can handle, so these types of systems have to fit within internal specifications that Microsoft designs. There are more compute-dense systems, of course, like NVIDIA's NVL72, but hyperscalers usually opt for custom solutions that fit their data center specifications. Finally, Microsoft noted that we can expect to see more details at the upcoming Microsoft Ignite conference in November and learn more about its GB200-powered AI systems.

NVIDIA Cancels Dual-Rack NVL36x2 in Favor of Single-Rack NVL72 Compute Monster

NVIDIA has reportedly discontinued its dual-rack GB200 NVL36x2 GPU model, opting to focus on the single-rack GB200 NVL72 and NVL36 models. This shift, revealed by industry analyst Ming-Chi Kuo, aims to simplify NVIDIA's offerings in the AI and HPC markets. The decision was influenced by major clients like Microsoft, who prefer the NVL72's improved space efficiency and potential for enhanced inference performance. While both models perform similarly in AI large language model (LLM) training, the NVL72 is expected to excel in non-parallelizable inference tasks. As a reminder, the NVL72 features 36 Grace CPUs, delivering 2,592 Arm Neoverse V2 cores with 17 TB LPDDR5X memory with 18.4 TB/s aggregate bandwidth. Additionally, it includes 72 Blackwell GB200 SXM GPUs that have a massive 13.5 TB of HBM3e combined, running at 576 TB/s aggregate bandwidth.
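The rack-level totals quoted for the NVL72 can be reproduced from per-part figures. The following is a minimal sketch, not an official specification: the 72 cores per Grace CPU is inferred from 2,592 / 36, the 192 GB and 8 TB/s per GPU are assumed from the B200 announcement figures, and the 13.5 TB total is assumed to use 1,024 GB per TB.

```python
# Rack-level arithmetic for the GB200 NVL72.
# Per-part figures below are assumptions chosen to match the quoted totals.
GRACE_CPUS = 36
CORES_PER_GRACE = 72        # assumed: inferred from 2,592 cores / 36 CPUs
GPUS = 72
HBM_PER_GPU_GB = 192        # assumed per-GPU HBM3e capacity
HBM_BW_PER_GPU_TBS = 8      # assumed per-GPU HBM3e bandwidth (TB/s)

total_cores = GRACE_CPUS * CORES_PER_GRACE
total_hbm_gb = GPUS * HBM_PER_GPU_GB
total_hbm_tb = total_hbm_gb / 1024      # binary convention (1 TB = 1,024 GB)
total_hbm_bw_tbs = GPUS * HBM_BW_PER_GPU_TBS

print(f"Arm cores:      {total_cores:,}")
print(f"HBM3e capacity: {total_hbm_tb} TB ({total_hbm_gb:,} GB)")
print(f"HBM3e bandwidth: {total_hbm_bw_tbs} TB/s")
```

Under these assumptions the script arrives at the same 2,592 cores, 13.5 TB, and 576 TB/s cited in the paragraph.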

However, this shift presents significant challenges. The NVL72's power consumption of around 120kW far exceeds typical data center capabilities, potentially limiting its immediate widespread adoption. The discontinuation of the NVL36x2 has also sparked concerns about NVIDIA's execution capabilities and may disrupt the supply chain for assembly and cooling solutions. Despite these hurdles, industry experts view this as a pragmatic approach to product planning in the dynamic AI landscape. While some customers may be disappointed by the dual-rack model's cancellation, NVIDIA's long-term outlook in the AI technology market remains strong. The company continues to work with clients and listen to their needs, to position itself as a leader in high-performance computing solutions.

NVIDIA Resolves "Blackwell" Yield Issues with New Photomask

During its Q2 2024 earnings call, NVIDIA confirmed that its upcoming Blackwell-based products are facing low-yield challenges. However, the company announced that it has implemented design changes to improve the production yields of its B100 and B200 processors. Despite these setbacks, NVIDIA remains optimistic about its production timeline. The tech giant plans to commence the production ramp of Blackwell GPUs in Q4 2024, with expected shipments worth several billion dollars by the end of the year. In an official statement, NVIDIA explained, "We executed a change to the Blackwell GPU mask to improve production yield." The company also reaffirmed that it had successfully sampled Blackwell GPUs with customers in the second quarter.

However, NVIDIA acknowledged that meeting demand required producing "low-yielding Blackwell material," which impacted its gross margins. During an earnings call, NVIDIA's CEO Jensen Huang assured investors that the supply of B100 and B200 GPUs will be there. He expressed confidence in the company's ability to mass-produce these chips starting in the fourth quarter. The Blackwell B100 and B200 GPUs use TSMC's CoWoS-L packaging technology and a complex design, which prompted rumors about the company facing yield issues with its designs. Reports suggest that initial challenges arose from mismatched thermal expansion coefficients among various components, leading to warping and system failures. However, now the company claims that the fix that solved these problems was a new GPU photomask, which bumped yields back to normal levels.

NVIDIA's New B200A Targets OEM Customers; High-End GPU Shipments Expected to Grow 55% in 2025

Despite recent rumors speculating on NVIDIA's supposed cancellation of the B100 in favor of the B200A, TrendForce reports that NVIDIA is still on track to launch both the B100 and B200 in 2H24 as it aims to target CSP customers. Additionally, a scaled-down B200A is planned for other enterprise clients, focusing on edge AI applications.

TrendForce reports that NVIDIA will prioritize the B100 and B200 for CSP customers with higher demand due to the tight production capacity of CoWoS-L. Shipments are expected to commence after 3Q24. In light of yield and mass production challenges with CoWoS-L, NVIDIA is also planning the B200A for other enterprise clients, utilizing CoWoS-S packaging technology.

NVIDIA Blackwell's High Power Consumption Drives Cooling Demands; Liquid Cooling Penetration Expected to Reach 10% by Late 2024

With the growing demand for high-speed computing, more effective cooling solutions for AI servers are gaining significant attention. TrendForce's latest report on AI servers reveals that NVIDIA is set to launch its next-generation Blackwell platform by the end of 2024. Major CSPs are expected to start building AI server data centers based on this new platform, potentially driving the penetration rate of liquid cooling solutions to 10%.

Air and liquid cooling systems to meet higher cooling demands
TrendForce reports that the NVIDIA Blackwell platform will officially launch in 2025, replacing the current Hopper platform and becoming the dominant solution for NVIDIA's high-end GPUs, accounting for nearly 83% of all high-end products. High-performance AI server models like the B200 and GB200 are designed for maximum efficiency, with individual GPUs consuming over 1,000 W. HGX models will house 8 GPUs each, while NVL models will support 36 or 72 GPUs per rack, significantly boosting the growth of the liquid cooling supply chain for AI servers.

Global AI Server Demand Surge Expected to Drive 2024 Market Value to US$187 Billion; Represents 65% of Server Market

TrendForce's latest industry report on AI servers reveals that high demand for advanced AI servers from major CSPs and brand clients is expected to continue in 2024. Meanwhile, TSMC, SK hynix, Samsung, and Micron's gradual production expansion has significantly eased shortages in 2Q24. Consequently, the lead time for NVIDIA's flagship H100 solution has decreased from the previous 40-50 weeks to less than 16 weeks.

TrendForce estimates that AI server shipments in the second quarter will increase by nearly 20% QoQ, and has revised the annual shipment forecast up to 1.67 million units—marking a 41.5% YoY growth.

AI Startup Etched Unveils Transformer ASIC Claiming 20x Speed-up Over NVIDIA H100

A new startup emerged out of stealth mode today to power the next generation of generative AI. Etched is a company that makes an application-specific integrated circuit (ASIC) to process "Transformers." The transformer is a deep learning model architecture developed by Google and is now the powerhouse behind models like OpenAI's GPT-4o in ChatGPT, Anthropic Claude, Google Gemini, and Meta's Llama family. Etched wanted to create an ASIC that processes only transformer models, resulting in a chip called Sohu. The claim is that Sohu outperforms NVIDIA's latest and greatest by an entire order of magnitude. Where a server with eight NVIDIA H100 GPUs pushes Llama-3 70B at 25,000 tokens per second, and a server with eight of the latest B200 "Blackwell" GPUs pushes 43,000 tokens per second, a server with eight Sohu chips manages to output 500,000 tokens per second.

Why is this important? Not only does the ASIC reportedly outperform Hopper by 20x and Blackwell by more than 10x, but it also serves so many tokens per second that it could enable an entirely new class of AI applications requiring real-time output. Etched claims the Sohu architecture is so efficient that 90% of the FLOPS can be used, while traditional GPUs achieve a 30-40% FLOP utilization rate. That low utilization translates into inefficiency and wasted power, which Etched hopes to solve by building an accelerator dedicated to powering transformers (the "T" in GPT) at massive scales. Given that frontier model development costs more than one billion US dollars, and hardware costs are measured in tens of billions of US dollars, having an accelerator dedicated to powering a specific application can help advance AI faster. AI researchers often say that "scale is all you need" (resembling the legendary "attention is all you need" paper), and Etched wants to build on that.
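The speed-up figures follow directly from the throughput numbers Etched quotes. A minimal sketch of the arithmetic, using only the claimed tokens-per-second values for eight-accelerator servers (the dictionary and function names are illustrative, and the inputs are vendor claims rather than independent measurements):

```python
# Claimed Llama-3 70B throughput for eight-accelerator servers (tokens/s).
# These are Etched's published claims, not independently verified results.
tokens_per_second = {"H100": 25_000, "B200": 43_000, "Sohu": 500_000}

def speedup(baseline: str, candidate: str = "Sohu") -> float:
    """Ratio of candidate throughput to baseline throughput."""
    return tokens_per_second[candidate] / tokens_per_second[baseline]

print(f"Sohu vs H100: {speedup('H100'):.1f}x")
print(f"Sohu vs B200: {speedup('B200'):.1f}x")
```

The ratio against the H100 server works out to exactly 20x, while the ratio against the B200 server is closer to 11.6x than to a flat 10x.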

Blackwell Shipments Imminent, Total CoWoS Capacity Expected to Surge by Over 70% in 2025

TrendForce reports that NVIDIA's Hopper H100 began to see a reduction in shortages in 1Q24. The new H200 from the same platform is expected to gradually ramp in Q2, with the Blackwell platform entering the market in Q3 and expanding to data center customers in Q4. However, this year will still primarily focus on the Hopper platform, which includes the H100 and H200 product lines. The Blackwell platform—based on how far supply chain integration has progressed—is expected to start ramping up in Q4, accounting for less than 10% of the total high-end GPU market.

The die size of Blackwell platform chips like the B100 is twice that of the H100. As Blackwell becomes mainstream in 2025, the total capacity of TSMC's CoWoS is projected to grow by 150% in 2024 and by over 70% in 2025, with NVIDIA's demand occupying nearly half of this capacity. For HBM, the NVIDIA GPU platform's evolution sees the H100 primarily using 80 GB of HBM3, while the 2025 B200 will feature 288 GB of HBM3e—a 3-4 fold increase in capacity per chip. The three major manufacturers' expansion plans indicate that HBM production volume will likely double by 2025.
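To put the growth figures in this paragraph side by side, here is a small sketch of the arithmetic. It assumes the two CoWoS growth rates compound year over year from a 2023 baseline (which is an interpretation of the report, not something it states), and it treats the "over 70%" figure as exactly 70%, so the compound result is a lower bound.

```python
# HBM capacity per GPU, as quoted in the report.
H100_HBM_GB = 80
B200_HBM_GB = 288
hbm_ratio = B200_HBM_GB / H100_HBM_GB

# CoWoS capacity growth, assumed to compound from a 2023 baseline of 1.0.
GROWTH_2024 = 1.50   # +150% in 2024
GROWTH_2025 = 0.70   # "over 70%" in 2025, taken here as a lower bound
capacity_2024 = 1.0 * (1 + GROWTH_2024)
capacity_2025 = capacity_2024 * (1 + GROWTH_2025)

print(f"HBM capacity per GPU: {hbm_ratio:.1f}x")
print(f"CoWoS capacity vs 2023: {capacity_2024:.2f}x in 2024, "
      f"at least {capacity_2025:.2f}x in 2025")
```

The 288 GB versus 80 GB comparison gives 3.6x, which sits inside the "3-4 fold" range quoted above.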

Intel's Next-Gen Falcon Shores GPU to Consume 1500 W, No Air-Cooled Variant Planned

Intel's upcoming Falcon Shores GPU is shaping up to be a powerhouse for AI and high-performance computing (HPC) workloads, but it will also be an extreme power hog. The processor, combining Gaudi and Ponte Vecchio successors into a single GPU, is expected to consume an astonishing 1500 W of power, more than even Nvidia's beefy B200 accelerator, which draws 1000 W. This immense power consumption will require advanced cooling solutions to ensure the Falcon Shores GPU operates efficiently and safely. Intel's partners may turn to liquid cooling or even full immersion liquid cooling, a technology Intel has been promoting for power-hungry data center hardware. The high power draw is the cost of the Falcon Shores GPU's formidable performance promises. Intel claims it will deliver 5x higher performance per watt and 5x more memory capacity and bandwidth compared to its Ponte Vecchio products.

Intel may need to develop proprietary hardware modules or a new Open Accelerator Module (OAM) spec to support such extreme power levels, as the current OAM 2.0 tops out around 1000 W. Slated for release in 2025, the Falcon Shores GPU will be Intel's GPU IP based on its next-gen Xe graphics architecture. It aims to be a major player in the AI accelerator market, backed by Intel's robust oneAPI software development ecosystem. While the 1500 W power consumption is sure to raise eyebrows, Intel is betting that the Falcon Shores GPU's supposedly impressive performance will make it an enticing option for AI and HPC customers willing to invest in robust cooling infrastructure. The ultra-high-end accelerator market is heating up, and the HPC accelerator market needs a Ponte Vecchio successor.

Demand for NVIDIA's Blackwell Platform Expected to Boost TSMC's CoWoS Total Capacity by Over 150% in 2024

NVIDIA's next-gen Blackwell platform, which includes B-series GPUs and integrates NVIDIA's own Grace Arm CPU in models such as the GB200, represents a significant development. TrendForce points out that the GB200 and its predecessor, the GH200, both feature a combined CPU+GPU solution, primarily equipped with the NVIDIA Grace CPU and H200 GPU. However, the GH200 accounted for only approximately 5% of NVIDIA's high-end GPU shipments. The supply chain has high expectations for the GB200, with projections suggesting that its shipments could exceed millions of units by 2025, potentially making up nearly 40 to 50% of NVIDIA's high-end GPU market.

Although NVIDIA plans to launch products such as the GB200 and B100 in the second half of this year, upstream wafer packaging will need to adopt more complex and high-precision CoWoS-L technology, making the validation and testing process time-consuming. Additionally, more time will be required to optimize the B-series for AI server systems in aspects such as network communication and cooling performance. It is anticipated that the GB200 and B100 products will not see significant production volumes until 4Q24 or 1Q25.

U.S. Updates Advanced Semiconductor Ban, Actual Impact on the Industry Will Be Insignificant

On March 29th, the United States announced another round of updates to its export controls, targeting advanced computing, supercomputers, semiconductor end-uses, and semiconductor manufacturing products. These new regulations, which took effect on April 4th, are designed to prevent certain countries and businesses from circumventing U.S. restrictions to access sensitive chip technologies and equipment. Despite these tighter controls, TrendForce believes the practical impact on the industry will be minimal.

The latest updates aim to refine the language and parameters of previous regulations, tightening the criteria for exports to Macau and D:5 countries (China, North Korea, Russia, Iran, etc.). They require a detailed examination of all technology products' Total Processing Performance (TPP) and Performance Density (PD). If a product exceeds certain computing power thresholds, it must undergo a case-by-case review. Nevertheless, a new provision, Advanced Computing Authorized (ACA), allows for specific exports and re-exports among selected countries, including the transshipment of particular products between Macau and D:5 countries.

US Government Wants Nuclear Plants to Offload AI Data Center Expansion

The expansion of AI technology affects not only the production and demand for graphics cards but also the electricity grid that powers them. Data centers hosting thousands of GPUs are becoming more common, and the industry has been building new facilities for GPU-enhanced servers to serve the need for more AI. However, these powerful GPUs often consume over 500 W per card, and NVIDIA's latest Blackwell B200 GPU has a TGP of 1,000 W, a full kilowatt. These kilowatt GPUs will be present in data centers with tens of thousands of cards, resulting in multi-megawatt facilities. To reduce the load on the national electricity grid, US President Joe Biden's administration has been discussing with big tech companies how to re-evaluate their power sources, possibly using smaller nuclear plants. In an Axios interview, Energy Secretary Jennifer Granholm noted that "AI itself isn't a problem because AI could help to solve the problem." The problem, rather, is the capacity of the national electricity grid, which can't sustain the rapid expansion of AI data centers.
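A rough sketch of how kilowatt-class GPUs add up to multi-megawatt facilities. The GPU counts are illustrative, and the calculation covers GPU power only; CPUs, networking, storage, and cooling overhead are deliberately left out, so real facility loads would be higher.

```python
GPU_POWER_W = 1_000  # B200 TGP as quoted in the article

def gpu_load_megawatts(gpu_count: int, watts_per_gpu: int = GPU_POWER_W) -> float:
    """GPU-only electrical load in megawatts (excludes CPUs, network, cooling)."""
    return gpu_count * watts_per_gpu / 1_000_000

# Illustrative fleet sizes in the "tens of thousands" range.
for count in (10_000, 50_000, 100_000):
    print(f"{count:>7,} GPUs -> {gpu_load_megawatts(count):.0f} MW (GPUs alone)")
```

Even the smallest of these hypothetical fleets draws 10 MW for the GPUs alone.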

The Department of Energy (DOE) has reportedly been talking with firms, most notably hyperscalers like Microsoft, Google, and Amazon, about considering nuclear fusion and fission power plants to satisfy the needs of AI expansion. We have already discussed Microsoft's plan to place a nuclear reactor near its data center facility to help manage the load of thousands of GPUs running AI training and inference. However, this time, it is not just Microsoft. Other tech giants are reportedly thinking about nuclear as well. They all need to offload their AI expansion from the US national power grid and develop a nuclear solution. Nuclear power supplies only about 20% of US electricity, and the DOE is currently financing the Holtec Palisades 800 MW nuclear generating station with $1.52 billion in funds for restoration and resumption of service. Microsoft is investing in an energy strategy built around small modular reactors (SMRs) and microreactors, which could be an example for other big tech companies to follow.

Nvidia CEO Reiterates Solid Partnership with TSMC

One key takeaway from the ongoing GTC is that Nvidia's AI empire has taken shape through strong partnerships with TSMC and other Taiwanese makers, such as the major server ODMs.

According to the news report from the technology-focused media DIGITIMES Asia, during his keynote at GTC on March 18, Huang underscored his company's partnerships with TSMC, as well as the supply chain in Taiwan. Speaking to the press later, Huang said Nvidia will have a very strong demand for CoWoS, the advanced packaging services TSMC offers.

Samsung Prepares Mach-1 Chip to Rival NVIDIA in AI Inference

During its 55th annual shareholders' meeting, Samsung Electronics announced its entry into the AI processor market with the upcoming launch of its Mach-1 AI accelerator chips in early 2025. The South Korean tech giant revealed its plans to compete with established players like NVIDIA in the rapidly growing AI hardware sector. The Mach-1 generation of chips is an application-specific integrated circuit (ASIC) design equipped with LPDDR memory that is envisioned to excel in edge computing applications. While Samsung does not aim to directly rival NVIDIA's ultra-high-end AI solutions like the H100, B100, or B200, the company's strategy focuses on carving out a niche in the market by offering unique features and performance enhancements at the edge, where low power and efficient computing is what matters the most.

According to SeDaily, the Mach-1 chips boast a groundbreaking feature that significantly reduces memory bandwidth requirements for inference to approximately 0.125x compared to existing designs, which is an 87.5% reduction. This innovation could give Samsung a competitive edge in terms of efficiency and cost-effectiveness. As the demand for AI-powered devices and services continues to soar, Samsung's foray into the AI chip market is expected to intensify competition and drive innovation in the industry. While NVIDIA currently holds a dominant position, Samsung's cutting-edge technology and access to advanced semiconductor manufacturing nodes could make it a formidable contender. The Mach-1 has been field-verified on an FPGA, while the final design is currently going through a physical design for SoC, which includes placement, routing, and other layout optimizations.

Dell Expands Generative AI Solutions Portfolio, Selects NVIDIA Blackwell GPUs

Dell Technologies is strengthening its collaboration with NVIDIA to help enterprises adopt AI technologies. By expanding the Dell Generative AI Solutions portfolio, including with the new Dell AI Factory with NVIDIA, organizations can accelerate integration of their data, AI tools and on-premises infrastructure to maximize their generative AI (GenAI) investments. "Our enterprise customers are looking for an easy way to implement AI solutions—that is exactly what Dell Technologies and NVIDIA are delivering," said Michael Dell, founder and CEO, Dell Technologies. "Through our combined efforts, organizations can seamlessly integrate data with their own use cases and streamline the development of customized GenAI models."

"AI factories are central to creating intelligence on an industrial scale," said Jensen Huang, founder and CEO, NVIDIA. "Together, NVIDIA and Dell are helping enterprises create AI factories to turn their proprietary data into powerful insights."

Unwrapping the NVIDIA B200 and GB200 AI GPU Announcements

NVIDIA on Monday, at the 2024 GTC conference, unveiled the "Blackwell" B200 and GB200 AI GPUs. These are designed to offer an incredible 5x the AI inferencing performance of the current-gen "Hopper" H100, and come with four times the on-package memory. The B200 "Blackwell" is the largest chip physically possible using existing foundry tech, according to its makers. The chip packs an astonishing 208 billion transistors and is made up of two chiplets, each of which is itself as large as a single chip can currently be made.

Each chiplet is built on the TSMC N4P foundry node, the Taiwanese foundry's most advanced 4 nm-class node, and has 104 billion transistors. The two chiplets are tightly coupled through a 10 TB/s custom interconnect, which provides enough bandwidth at low enough latency for the two to maintain cache coherency (i.e. address each other's memory as if it were their own). Each of the two "Blackwell" chiplets has a 4096-bit memory bus wired to 96 GB of HBM3E spread across four 24 GB stacks, for a total of 192 GB on the B200 package. The GPU has a staggering 8 TB/s of memory bandwidth on tap. The B200 package features a 1.8 TB/s NVLink interface for host connectivity and for connectivity to another B200 chip.
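The package-level figures can be derived from the per-chiplet numbers given here. In this minimal sketch, the per-stack bandwidth and per-pin data rate are derived values that assume the 8 TB/s is spread evenly across all stacks and bus pins and that 1 TB/s equals 1,000 GB/s; they are not quoted specifications.

```python
CHIPLETS = 2
TRANSISTORS_PER_CHIPLET_B = 104   # billions
BUS_BITS_PER_CHIPLET = 4096
STACKS_PER_CHIPLET = 4
GB_PER_STACK = 24
TOTAL_BANDWIDTH_TBS = 8           # package memory bandwidth (TB/s)

transistors_b = CHIPLETS * TRANSISTORS_PER_CHIPLET_B
stacks = CHIPLETS * STACKS_PER_CHIPLET
capacity_gb = stacks * GB_PER_STACK
bus_bits = CHIPLETS * BUS_BITS_PER_CHIPLET

# Derived values, assuming bandwidth is spread evenly (not quoted specs).
bandwidth_per_stack_tbs = TOTAL_BANDWIDTH_TBS / stacks
per_pin_gbps = TOTAL_BANDWIDTH_TBS * 1000 * 8 / bus_bits  # GB/s -> Gb/s per pin

print(f"{transistors_b} billion transistors, {capacity_gb} GB HBM3E, "
      f"{bus_bits}-bit bus")
print(f"{bandwidth_per_stack_tbs} TB/s per stack, "
      f"{per_pin_gbps} Gb/s per pin (derived)")
```

The totals match the 208 billion transistors and 192 GB stated for the package, across a combined 8192-bit memory bus.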

ASUS Presents MGX-Powered Data-Center Solutions

ASUS today announced its participation at the NVIDIA GTC global AI conference, where it will showcase its solutions at booth #730. On show will be the apex of ASUS GPU server innovation, ESC NM1-E1 and ESC NM2-E1, powered by the NVIDIA MGX modular reference architecture, accelerating AI supercomputing to new heights. To help meet the increasing demands for generative AI, ASUS uses the latest technologies from NVIDIA, including the B200 Tensor Core GPU, the GB200 Grace Blackwell Superchip, and H200 NVL, to help deliver optimized AI server solutions to boost AI adoption across a wide range of industries.

To better support enterprises in establishing their own generative AI environments, ASUS offers an extensive lineup of servers, from entry-level to high-end GPU server solutions, plus a comprehensive range of liquid-cooled rack solutions, to meet diverse workloads. Additionally, by leveraging its MLPerf expertise, the ASUS team is pursuing excellence by optimizing hardware and software for large-language-model (LLM) training and inferencing and seamlessly integrating total AI solutions to meet the demanding landscape of AI supercomputing.

NVIDIA Blackwell Platform Arrives to Power a New Era of Computing

Powering a new era of computing, NVIDIA today announced that the NVIDIA Blackwell platform has arrived—enabling organizations everywhere to build and run real-time generative AI on trillion-parameter large language models at up to 25x less cost and energy consumption than its predecessor.

The Blackwell GPU architecture features six transformative technologies for accelerated computing, which will help unlock breakthroughs in data processing, engineering simulation, electronic design automation, computer-aided drug design, quantum computing and generative AI—all emerging industry opportunities for NVIDIA.

NVIDIA B100 "Blackwell" AI GPU Technical Details Leak Out

Jensen Huang's opening GTC 2024 keynote is scheduled for tomorrow afternoon (13:00 Pacific time), and many industry experts believe that the NVIDIA boss will take the stage and formally introduce his company's B100 "Blackwell" GPU architecture. An enlightened few have been treated to preview (AI and HPC) units, including Dell's COO, Jeff Clarke, but pre-introduction leaks have not flowed out. Team Green is likely enforcing strict conditions upon a fortunate selection of trusted evaluators, within a pool of ecosystem partners and customers.

Today, a brave soul has broken that silence—tech tipster, AGF/XpeaGPU, fears repercussions from the leather-jacketed one. They revealed a handful of technical details, a day prior to Team Green's highly anticipated unveiling: "I don't want to spoil NVIDIA B100 launch tomorrow, but this thing is a monster. 2 dies on (TSMC) CoWoS-L, 8x8-Hi HBM3E stacks for 192 GB of memory." They also crystal balled an inevitable follow-up card: "one year later, B200 goes with 12-Hi stacks and will offer a beefy 288 GB. And the performance! It's... oh no Jensen is there... me run away!" Reuters has also joined in on the fun, with some predictions and insider information: "NVIDIA is unlikely to give specific pricing, but the B100 is likely to cost more than its predecessor, which sells for upwards of $20,000." Enterprise products are expected to arrive first—possibly later this year—followed by gaming variants, maybe months later.

Dell Exec Confirms NVIDIA "Blackwell" B100 Doesn't Need Liquid Cooling

NVIDIA's next-generation AI GPU, the B100 "Blackwell," is now in the hands of the company's biggest ecosystem partners and customers for evaluation, and one of them is Dell. Jeff Clarke, the OEM giant's chief operating officer, speaking to industry analysts in an investor teleconference, said that he is excited about the upcoming B100 and B200 chips from NVIDIA. B100 is the codename for the AI GPU NVIDIA designs for PCIe add-on cards and the SXM socket, meant for systems powered by x86 CPUs such as the AMD EPYC or Intel Xeon Scalable. The B200 is its variant meant for machines powered by NVIDIA's in-house Arm-based processors, such as the successor to its Grace CPU, and for the successor to Grace Hopper (GH200), the company's combined CPU and AI GPU product.

Perhaps the most interesting remark by Clarke about the B100 is that he doesn't think it needs liquid cooling, and can make do with high-airflow cooling like the H100. "We're excited about what happens at the B100 and the B200, and we think that's where there's actually another opportunity to distinguish engineering confidence. Our characterization in the thermal side, you really don't need to direct-liquid cooling to get to the energy density of 1000 W per GPU. That happens next year with the B200," he said. NVIDIA is planning a 2024 debut for "Blackwell" in the AI GPU space with the B100, with B200 slated for 2025, possibly alongside a new CPU.
Nov 17th, 2024 07:23 EST