News Posts matching #NVLink

Advantech Introduces Its GPU Server SKY-602E3 With NVIDIA H200 NVL

Advantech, a leading global provider of industrial edge AI solutions, is excited to introduce its SKY-602E3 GPU server equipped with the NVIDIA H200 NVL platform. This powerful combination is set to accelerate offline LLM deployment for manufacturing, providing unprecedented levels of performance and efficiency. The NVIDIA H200 NVL, a 600 W passively cooled card, is fully supported by the compact and efficient SKY-602E3, making it an ideal solution for demanding edge AI applications.

Core of Factory LLM Deployment: AI Vision
The SKY-602E3 GPU server excels at supporting large language models (LLMs) for AI inference and training. It features four PCIe 5.0 x16 slots, delivering high bandwidth for intensive tasks, and four PCIe 5.0 x8 slots, providing enhanced flexibility for GPU and frame-grabber-card expansion. The half-width design of the SKY-602E3 makes it an excellent choice for workstation environments. Additionally, the server can be equipped with the NVIDIA H200 NVL platform, which offers 1.7x more performance than the NVIDIA H100 NVL; because fewer cards are needed for a given workload, this frees up additional PCIe slots for other expansion needs.

Aetina Debuts at SC24 With NVIDIA MGX Server for Enterprise Edge AI

Aetina, a subsidiary of the Innodisk Group and an expert in edge AI solutions, is pleased to announce its debut at Supercomputing (SC24) in Atlanta, Georgia, showcasing the innovative SuperEdge NVIDIA MGX short-depth edge AI server, the AEX-2UA1. By integrating an enterprise-class on-premises large language model (LLM) with the advanced retrieval-augmented generation (RAG) technique, the Aetina NVIDIA MGX short-depth server demonstrates exceptional enterprise edge AI performance, setting a new benchmark in edge AI innovation. The server is powered by the latest Intel Xeon 6 processor and dual high-end double-width NVIDIA GPUs, delivering ultimate AI computing power in a compact 2U form factor and accelerating generative AI at the edge.

The SuperEdge NVIDIA MGX server expands Aetina's product portfolio from specialized edge devices to comprehensive AI server solutions, propelling a key milestone in Innodisk Group's AI roadmap, from sensors and storage to AI software, computing platforms, and now AI edge servers.

NVIDIA Prepares GB200 NVL4: Four "Blackwell" GPUs and Two "Grace" CPUs in a 5,400 W Server

At SC24, NVIDIA announced its latest compute-dense AI accelerator in the form of the GB200 NVL4, a single-server solution that expands the company's "Blackwell" series portfolio. The new platform features an impressive combination of four "Blackwell" GPUs and two "Grace" CPUs on a single board. The GB200 NVL4 boasts remarkable specifications for a single-server system, including 768 GB of HBM3E memory across its four Blackwell GPUs, delivering a combined memory bandwidth of 32 TB/s. The system's two Grace CPUs carry 960 GB of LPDDR5X memory, making it a powerhouse for demanding AI workloads. A key feature of the NVL4 design is its NVLink interconnect, which enables communication between all processors on the board. This integration is important for maintaining optimal performance across the system's multiple processing units, especially during large training runs or inference on multi-trillion-parameter models.
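For readers who want to sanity-check those figures, here is a minimal back-of-the-envelope sketch, assuming (as NVIDIA's per-GPU Blackwell specifications suggest) that capacity and bandwidth are split evenly across the four GPUs:

```python
# Back-of-the-envelope check on the GB200 NVL4 memory figures above.
# Assumption: HBM3E capacity and bandwidth are split evenly across the GPUs.
NUM_GPUS = 4
TOTAL_HBM3E_GB = 768       # total HBM3E across four Blackwell GPUs
TOTAL_BANDWIDTH_TBS = 32   # combined memory bandwidth

hbm_per_gpu_gb = TOTAL_HBM3E_GB / NUM_GPUS        # 192 GB per GPU
bw_per_gpu_tbs = TOTAL_BANDWIDTH_TBS / NUM_GPUS   # 8 TB/s per GPU

print(f"{hbm_per_gpu_gb:.0f} GB and {bw_per_gpu_tbs:.0f} TB/s per GPU")
```

The per-GPU result, 192 GB at 8 TB/s, lines up with the dual-chiplet B200 package detailed later on this page.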

Performance comparisons with previous generations show significant improvements, with NVIDIA claiming the GB200 GPUs deliver 2.2x faster overall performance and 1.8x quicker training compared to their GH200 NVL4 predecessor. The system's power consumption reaches 5,400 W, exactly double the 2,700 W requirement of the GB200 NVL2, its smaller sibling that features two GPUs instead of four. NVIDIA is working closely with OEM partners to bring various Blackwell solutions to market, including the DGX B200, GB200 Grace Blackwell Superchip, GB200 Grace Blackwell NVL2, GB200 Grace Blackwell NVL4, and GB200 Grace Blackwell NVL72. Fitting 5,400 W of TDP into a single server will require liquid cooling for optimal performance, and the GB200 NVL4 is expected to go into server racks for hyperscaler customers, which usually have custom liquid-cooling systems in their data centers.

ASRock Rack Brings End-to-End AI and HPC Server Portfolio to SC24

ASRock Rack Inc., a leading innovative server company, today announces its presence at SC24, held at the Georgia World Congress Center in Atlanta from November 18-21. At booth #3609, ASRock Rack will showcase a comprehensive high-performance portfolio of server boards, systems, and rack solutions with NVIDIA accelerated computing platforms, helping address the needs of enterprises, organizations, and data centers.

Artificial intelligence (AI) and high-performance computing (HPC) continue to reshape technology. ASRock Rack is presenting a complete suite of solutions spanning edge, on-premises, and cloud environments, engineered to meet the demands of AI and HPC. The 2U short-depth MECAI, incorporating the NVIDIA GH200 Grace Hopper Superchip, is developed to supercharge accelerated computing and generative AI in space-constrained environments. The 4U10G-TURIN2 and 4UXGM-GNR2, supporting ten and eight NVIDIA H200 NVL PCIe GPUs respectively, aim to help enterprises and researchers tackle every AI and HPC challenge with enhanced performance and greater energy efficiency. The NVIDIA H200 NVL is ideal for lower-power, air-cooled enterprise rack designs that require flexible configurations, delivering acceleration for AI and HPC workloads of any size.

MSI Brings NVIDIA MGX AI Server to SC24

MSI, a leading global provider of high-performance server solutions, unveiled its AI server based on the NVIDIA MGX architecture at Supercomputing 2024 (SC24) from November 19-21 at booth 3655. Purpose-built to maximize compute density, energy efficiency, and modular flexibility, MSI's MGX-based AI server is designed to handle the intensive demands of AI, HPC, and data-heavy applications, offering the scalable performance and resilience that data centers need to stay ahead in today's high-performance computing landscape.

According to Danny Hsu, General Manager of MSI's Enterprise Platform Solutions, "MSI's latest innovations mark a significant leap in computational power and efficiency, enabling organizations to maximize performance, adapt seamlessly to evolving needs, and drive efficiency, building a robust foundation for future growth in high-performance computing."

ASUS Presents Next-Gen Infrastructure Solutions With Advanced Cooling Portfolio at SC24

ASUS today announced its next-generation infrastructure solutions at SC24, unveiling an extensive server lineup and advanced cooling solutions, all designed to propel the future of AI. The product showcase will reveal how ASUS is working with NVIDIA and Ubitus/Ubilink to prove the immense computational power of supercomputers, using AI-powered avatar and robot demonstrations that leverage the newly inaugurated data center. The facility, constructed by ASUS, is Taiwan's largest supercomputing installation and is also notable for offering flexible green-energy options to customers that want them. As a total solution provider with a proven track record in pioneering AI supercomputing, ASUS continuously drives maximized value for customers.

To fuel digital transformation in the enterprise through high-performance computing (HPC) and AI-driven architecture, ASUS provides a full lineup of server systems, ready for every scenario. ASUS AI POD, a complete rack solution equipped with the NVIDIA GB200 NVL72 platform, integrates GPUs, CPUs, and switches in seamless, high-speed direct communication, enhancing the training of trillion-parameter LLMs and enabling real-time inference. It features the NVIDIA GB200 Grace Blackwell Superchip and fifth-generation NVIDIA NVLink technology, while offering both liquid-to-air and liquid-to-liquid cooling options to maximize AI computing performance.

HPE Expands Direct Liquid-Cooled Supercomputing Solutions With Two AI Systems for Service Providers and Large Enterprises

Today, Hewlett Packard Enterprise announced its new high-performance computing (HPC) and artificial intelligence (AI) infrastructure portfolio, which includes leadership-class HPE Cray Supercomputing EX solutions and two systems optimized for large language model (LLM) training, natural language processing (NLP), and multi-modal model training. The new supercomputing solutions are designed to help global customers fast-track scientific research and invention.

"Service providers and nations investing in sovereign AI initiatives are increasingly turning to high-performance computing as the critical backbone enabling large-scale AI training that accelerates discovery and innovation," said Trish Damkroger, senior vice president and general manager, HPC & AI Infrastructure Solutions at HPE. "Our customers turn to us to fast-track their AI system deployment to realize value faster and more efficiently by leveraging our world-leading HPC solutions and decades of experience in delivering, deploying and servicing fully-integrated systems."

Supermicro's Liquid-Cooled SuperClusters for AI Data Centers Powered by NVIDIA GB200 NVL72 and NVIDIA HGX B200 Systems

Supermicro, Inc., a Total IT Solution Provider for AI, Cloud, Storage, and 5G/Edge, is accelerating the industry's transition to liquid-cooled data centers with the NVIDIA Blackwell platform, delivering a new paradigm of energy efficiency for the rapidly growing power demands of new AI infrastructure. Supermicro's industry-leading end-to-end liquid-cooling solutions are powered by the NVIDIA GB200 NVL72 platform for exascale computing in a single rack; they have started sampling to select customers, with full-scale production slated for late Q4. In addition, the recently announced Supermicro X14 and H14 4U liquid-cooled systems and 10U air-cooled systems are production-ready for the NVIDIA HGX B200 8-GPU system.

"We're driving the future of sustainable AI computing, and our liquid-cooled AI solutions are rapidly being adopted by some of the most ambitious AI Infrastructure projects in the world with over 2000 liquid-cooled racks shipped since June 2024," said Charles Liang, president and CEO of Supermicro. "Supermicro's end-to-end liquid-cooling solution, with the NVIDIA Blackwell platform, unlocks the computational power, cost-effectiveness, and energy-efficiency of the next generation of GPUs, such as those that are part of the NVIDIA GB200 NVL72, an exascale computer contained in a single rack. Supermicro's extensive experience in deploying liquid-cooled AI infrastructure, along with comprehensive on-site services, management software, and global manufacturing capacity, provides customers a distinct advantage in transforming data centers with the most powerful and sustainable AI solutions."

NVIDIA Contributes Blackwell Platform Design to Open Hardware Ecosystem, Accelerating AI Infrastructure Innovation

To drive the development of open, efficient and scalable data center technologies, NVIDIA today announced that it has contributed foundational elements of its NVIDIA Blackwell accelerated computing platform design to the Open Compute Project (OCP) and broadened NVIDIA Spectrum-X support for OCP standards.

At this year's OCP Global Summit, NVIDIA will be sharing key portions of the NVIDIA GB200 NVL72 system electro-mechanical design with the OCP community — including the rack architecture, compute and switch tray mechanicals, liquid-cooling and thermal environment specifications, and NVIDIA NVLink cable cartridge volumetrics — to support higher compute density and networking bandwidth.

Oracle Offers First Zettascale Cloud Computing Cluster

Oracle today announced the first zettascale cloud computing clusters accelerated by the NVIDIA Blackwell platform. Oracle Cloud Infrastructure (OCI) is now taking orders for the largest AI supercomputer in the cloud—available with up to 131,072 NVIDIA Blackwell GPUs.

"We have one of the broadest AI infrastructure offerings and are supporting customers that are running some of the most demanding AI workloads in the cloud," said Mahesh Thiagarajan, executive vice president, Oracle Cloud Infrastructure. "With Oracle's distributed cloud, customers have the flexibility to deploy cloud and AI services wherever they choose while preserving the highest levels of data and AI sovereignty."

GIGABYTE Announces New Liquid Cooled Solutions for NVIDIA HGX H200

Giga Computing, a subsidiary of GIGABYTE and an industry leader in generative AI servers and advanced cooling technologies, today announced new flagship GIGABYTE G593 series servers supporting direct liquid cooling (DLC) technology to advance green data centers built on the NVIDIA HGX H200 GPU. As DLC technology becomes a necessity for many data centers, GIGABYTE continues to expand its product portfolio with new DLC solutions for GPU and CPU technologies; for these new G593 servers, the cold plates are made by CoolIT Systems.

G593 Series - Tailored Cooling
The GPU-centric G593 series is custom-engineered to house an 8-GPU baseboard and was designed from the outset for both air and liquid cooling. The compact 5U chassis leads the industry in its readily scalable nature, fitting up to sixty-four GPUs in a single rack and supporting 100 kW of IT hardware. This helps consolidate the IT hardware and, in turn, shrink the data-center footprint. The G593 series servers for DLC respond to rising customer demand for greater energy efficiency. Liquids have a higher thermal conductivity than air, so they can rapidly and effectively remove heat from hot components to maintain lower operating temperatures. And because the design relies on water loops and heat exchangers rather than air cooling alone, the overall energy consumption of the data center is reduced.
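The rack-density claim decomposes neatly; a quick sketch under the stated assumptions (the standard ~42U rack height is our assumption, not part of GIGABYTE's announcement):

```python
# Rack-density arithmetic for the G593 series described above.
# Assumption: a standard ~42U rack; the 64-GPU and 100 kW figures are GIGABYTE's.
CHASSIS_HEIGHT_U = 5
GPUS_PER_CHASSIS = 8       # one 8-GPU HGX baseboard per 5U server
GPUS_PER_RACK = 64
RACK_POWER_KW = 100

servers_per_rack = GPUS_PER_RACK // GPUS_PER_CHASSIS    # 8 servers
rack_units_used = servers_per_rack * CHASSIS_HEIGHT_U   # 40U of a ~42U rack
power_per_gpu_kw = RACK_POWER_KW / GPUS_PER_RACK        # ~1.56 kW per GPU slot

print(servers_per_rack, rack_units_used, round(power_per_gpu_kw, 2))
```

Eight 5U servers fill 40U, which is why sixty-four GPUs is the practical ceiling for a standard rack.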

ASUS Announces ESC N8-E11 AI Server with NVIDIA HGX H200

ASUS today announced the latest marvel in its groundbreaking lineup of AI servers, the ESC N8-E11, featuring the immensely powerful NVIDIA HGX H200 platform. With this AI titan, ASUS has secured its first industry deal, showcasing the exceptional performance, reliability, and desirability of the ESC N8-E11 with HGX H200, as well as the ability of ASUS to move first and fast in creating strong, beneficial partnerships with forward-thinking organizations seeking the world's most powerful AI solutions.

Shipments of the ESC N8-E11 with NVIDIA HGX H200 are scheduled to begin in early Q4 2024, marking a new milestone in the ongoing ASUS commitment to excellence. ASUS has been actively supporting clients by assisting in the development of cooling solutions to optimize overall PUE, guaranteeing that every ESC N8-E11 unit delivers top-tier efficiency and performance - ready to power the new era of AI.

NVIDIA Blackwell Sets New Standard for Generative AI in MLPerf Inference Benchmark

As enterprises race to adopt generative AI and bring new services to market, the demands on data center infrastructure have never been greater. Training large language models is one challenge, but delivering LLM-powered real-time services is another. In the latest round of MLPerf industry benchmarks, Inference v4.1, NVIDIA platforms delivered leading performance across all data center tests. The first-ever submission of the upcoming NVIDIA Blackwell platform revealed up to 4x more performance than the NVIDIA H100 Tensor Core GPU on MLPerf's biggest LLM workload, Llama 2 70B, thanks to its use of a second-generation Transformer Engine and FP4 Tensor Cores.

The NVIDIA H200 Tensor Core GPU delivered outstanding results on every benchmark in the data center category - including the latest addition to the benchmark, the Mixtral 8x7B mixture of experts (MoE) LLM, which features a total of 46.7 billion parameters, with 12.9 billion parameters active per token. MoE models have gained popularity as a way to bring more versatility to LLM deployments, as they're capable of answering a wide variety of questions and performing more diverse tasks in a single deployment. They're also more efficient since they only activate a few experts per inference - meaning they deliver results much faster than dense models of a similar size.
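The efficiency argument is easy to quantify from the parameter counts in the release; the dense-model comparison below is an illustrative rule of thumb, not an MLPerf result:

```python
# Why MoE inference is cheaper: only a fraction of the parameters are
# active for any given token.
total_params_b = 46.7    # Mixtral 8x7B total parameters, in billions
active_params_b = 12.9   # parameters active per token, in billions

active_fraction = active_params_b / total_params_b
print(f"Active per token: {active_fraction:.1%}")   # ~27.6%

# To a first approximation, per-token compute scales with active parameters,
# so Mixtral does roughly the work of a ~13B dense model per token while
# retaining the capacity of a ~47B-parameter one.
```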

ASUS Presents Comprehensive AI Server Lineup

ASUS today announced its ambitious All in AI initiative, marking a significant leap into the server market with a complete AI infrastructure solution designed to meet the evolving demands of AI-driven applications, from edge inference and generative AI to the new, unparalleled wave of AI supercomputing. ASUS has proven that its expertise lies in striking the perfect balance between hardware and software, including infrastructure and cluster-architecture design, server installation, testing, onboarding, remote management, and cloud services, positioning the ASUS brand and its AI server solutions to lead the way in driving innovation and enabling the widespread adoption of AI across industries.

Meeting diverse AI needs
In partnership with NVIDIA, Intel, and AMD, ASUS offers comprehensive AI-infrastructure solutions with robust software platforms and services, from entry-level AI servers and machine-learning solutions to full racks and data centers for large-scale supercomputing. At the forefront is the ESC AI POD with NVIDIA GB200 NVL72, a cutting-edge rack designed to accelerate trillion-token LLM training and real-time inference operations. Complemented by the latest NVIDIA Blackwell GPUs, NVIDIA Grace CPUs, and fifth-generation NVIDIA NVLink technology, ASUS servers ensure unparalleled computing power and efficiency.

ASRock Rack Unveils GPU Servers, Offers AI GPU Choices from All Three Brands

ASRock Rack sells the entire stack of servers a data center could possibly want, and at Computex 2024, the company showed us its servers meant for AI GPUs. The 6U8M-GENOA2, as its name suggests, is a 6U server based on 2P AMD EPYC 9004 series "Genoa" processors in the SP5 package. You can configure it even with the variants of "Genoa" that come with 3D V-Cache, for superior compute performance from the large cache. Each of the two SP5 sockets is wired to 12 DDR5 RDIMM slots, for a total of 24 memory channels. The server supports eight AMD Instinct MI300X or MI325X AI GPUs, which it wires out individually using Infinity Fabric links and PCIe Gen 5 x16. A 3 kW 80 Plus Titanium PSU keeps the server fed. There are vacant Gen 5 x16 slots left even after connecting the GPUs, so you could give it a DPU-based 40 GbE NIC.

The 6U8X-EGS2 B100 is a 6U AI GPU server modeled along the lines of the 6U8M-GENOA2, with a couple of big changes. To begin with, the EPYC "Genoa" chips make way for a 2P Intel Socket E (LGA4677) setup running 5th Gen Xeon Scalable "Emerald Rapids" processors. Each socket is wired to 16 DDR5 DIMM slots (the processor itself has 8-channel DDR5, but this is a 2-DIMM-per-channel setup). The server integrates an NVIDIA NVSwitch that wires out NVLinks to eight NVIDIA B100 "Blackwell" AI GPUs. The server features eight HHHL PCIe Gen 5 x16 and five FHHL PCIe Gen 5 x16 slots. There are vacant x16 slots for your DPU/NIC; you can even use an AIC NVIDIA BlueField card. The same 3 kW PSU as in the "Genoa" system is also featured here.

AMD, Broadcom, Cisco, Google, HPE, Intel, Meta and Microsoft Form Ultra Accelerator Link (UALink) Promoter Group to Combat NVIDIA NVLink

AMD, Broadcom, Cisco, Google, Hewlett Packard Enterprise (HPE), Intel, Meta and Microsoft today announced they have aligned to develop a new industry standard dedicated to advancing high-speed, low-latency communication for scale-up AI systems in data centers.

Called the Ultra Accelerator Link (UALink), this initial group will define and establish an open industry standard that will enable AI accelerators to communicate more effectively. By creating an interconnect based upon open standards, UALink will enable system OEMs, IT professionals and system integrators to create a pathway for easier integration, greater flexibility and scalability of their AI-connected data centers.

Blackwell Shipments Imminent, Total CoWoS Capacity Expected to Surge by Over 70% in 2025

TrendForce reports that NVIDIA's Hopper H100 began to see a reduction in shortages in 1Q24. The new H200 from the same platform is expected to gradually ramp in Q2, with the Blackwell platform entering the market in Q3 and expanding to data center customers in Q4. However, this year will still primarily focus on the Hopper platform, which includes the H100 and H200 product lines. The Blackwell platform—based on how far supply chain integration has progressed—is expected to start ramping up in Q4, accounting for less than 10% of the total high-end GPU market.

The die size of Blackwell platform chips like the B100 is twice that of the H100. As Blackwell becomes mainstream in 2025, the total capacity of TSMC's CoWoS is projected to grow by 150% in 2024 and by over 70% in 2025, with NVIDIA's demand occupying nearly half of this capacity. For HBM, the NVIDIA GPU platform's evolution sees the H100 primarily using 80 GB of HBM3, while the 2025 B200 will feature 288 GB of HBM3e—a 3-4 fold increase in capacity per chip. The three major manufacturers' expansion plans indicate that HBM production volume will likely double by 2025.
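The per-chip HBM claim checks out; a one-line verification of TrendForce's numbers:

```python
# TrendForce's per-chip HBM capacity comparison, verified.
h100_hbm_gb = 80     # H100: 80 GB of HBM3
b200_hbm_gb = 288    # 2025 B200, per TrendForce: 288 GB of HBM3E

print(f"{b200_hbm_gb / h100_hbm_gb:.1f}x")   # 3.6x, i.e. the quoted "3-4 fold"
```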

NVIDIA Grace Hopper Ignites New Era of AI Supercomputing

Driving a fundamental shift in the high-performance computing industry toward AI-powered systems, NVIDIA today announced nine new supercomputers worldwide are using NVIDIA Grace Hopper Superchips to speed scientific research and discovery. Combined, the systems deliver 200 exaflops, or 200 quintillion calculations per second, of energy-efficient AI processing power.

New Grace Hopper-based supercomputers coming online include EXA1-HE, in France, from CEA and Eviden; Helios at Academic Computer Centre Cyfronet, in Poland, from Hewlett Packard Enterprise (HPE); Alps at the Swiss National Supercomputing Centre, from HPE; JUPITER at the Jülich Supercomputing Centre, in Germany; DeltaAI at the National Center for Supercomputing Applications at the University of Illinois Urbana-Champaign; and Miyabi at Japan's Joint Center for Advanced High Performance Computing - established between the Center for Computational Sciences at the University of Tsukuba and the Information Technology Center at the University of Tokyo.

Unwrapping the NVIDIA B200 and GB200 AI GPU Announcements

NVIDIA on Monday, at the 2024 GTC conference, unveiled the "Blackwell" B200 and GB200 AI GPUs. These are designed to offer an incredible 5x gain in AI inference performance over the current-gen "Hopper" H100, and come with four times the on-package memory. The B200 "Blackwell" is the largest chip physically possible using existing foundry tech, according to its makers. The chip packs an astonishing 208 billion transistors and is made up of two chiplets, which are themselves the largest chips that can be manufactured.

Each chiplet is built on the TSMC N4P foundry node, the most advanced 4 nm-class node from the Taiwanese foundry, and holds 104 billion transistors. The two chiplets have a high degree of connectivity with each other, thanks to a 10 TB/s custom interconnect, with enough bandwidth and low enough latency for the two to maintain cache coherency (i.e., to address each other's memory as if it were their own). Each of the two "Blackwell" chiplets has a 4096-bit memory bus and is wired to 96 GB of HBM3E spread across four 24 GB stacks, for a total of 192 GB on the B200 package. The GPU has a staggering 8 TB/s of memory bandwidth on tap. The B200 package features a 1.8 TB/s NVLink interface for host connectivity and for connectivity to another B200 chip.
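Those memory numbers are internally consistent, as a short derivation shows; note that the ~8 Gbps HBM3E per-pin data rate used below is inferred from the totals, not a figure NVIDIA quoted:

```python
# Deriving B200 memory capacity and bandwidth from the figures above.
# Assumption: each HBM3E pin runs at roughly 8 Gbps (inferred, not official).
CHIPLETS = 2
BUS_WIDTH_BITS = 4096      # memory bus per chiplet
PIN_SPEED_GBPS = 8         # assumed HBM3E data rate per pin
STACKS_PER_CHIPLET = 4     # HBM3E stacks per chiplet
STACK_CAPACITY_GB = 24     # capacity per stack

# bits/s -> bytes/s (divide by 8), then GB/s -> TB/s (divide by 1000)
bw_per_chiplet_tbs = BUS_WIDTH_BITS * PIN_SPEED_GBPS / 8 / 1000        # ~4.1 TB/s
total_bw_tbs = CHIPLETS * bw_per_chiplet_tbs                           # ~8.2 TB/s
total_capacity_gb = CHIPLETS * STACKS_PER_CHIPLET * STACK_CAPACITY_GB  # 192 GB

print(round(total_bw_tbs, 1), total_capacity_gb)
```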

ASUS Presents MGX-Powered Data-Center Solutions

ASUS today announced its participation at the NVIDIA GTC global AI conference, where it will showcase its solutions at booth #730. On show will be the apex of ASUS GPU server innovation, ESC NM1-E1 and ESC NM2-E1, powered by the NVIDIA MGX modular reference architecture, accelerating AI supercomputing to new heights. To help meet the increasing demands for generative AI, ASUS uses the latest technologies from NVIDIA, including the B200 Tensor Core GPU, the GB200 Grace Blackwell Superchip, and H200 NVL, to help deliver optimized AI server solutions to boost AI adoption across a wide range of industries.

To better support enterprises in establishing their own generative AI environments, ASUS offers an extensive lineup of servers, from entry-level to high-end GPU server solutions, plus a comprehensive range of liquid-cooled rack solutions, to meet diverse workloads. Additionally, by leveraging its MLPerf expertise, the ASUS team is pursuing excellence by optimizing hardware and software for large-language-model (LLM) training and inferencing and seamlessly integrating total AI solutions to meet the demanding landscape of AI supercomputing.

Supermicro Launches Three NVIDIA-Based, Full-Stack, Ready-to-Deploy Generative AI SuperClusters

Supermicro, Inc., a Total IT Solution Provider for AI, Cloud, Storage, and 5G/Edge, is announcing its latest portfolio to accelerate the deployment of generative AI. The Supermicro SuperCluster solutions provide foundational building blocks for the present and future of large language model (LLM) infrastructure. The three powerful Supermicro SuperCluster solutions are now available for generative AI workloads. The 4U liquid-cooled and 8U air-cooled systems are purpose-built for powerful LLM training performance, as well as large-batch, high-volume LLM inference. A third SuperCluster, with 1U air-cooled Supermicro NVIDIA MGX systems, is optimized for cloud-scale inference.

"In the era of AI, the unit of compute is now measured by clusters, not just the number of servers, and with our expanded global manufacturing capacity of 5,000 racks/month, we can deliver complete generative AI clusters to our customers faster than ever before," said Charles Liang, president and CEO of Supermicro. "A 64-node cluster enables 512 NVIDIA HGX H200 GPUs with 72 TB of HBM3e through a couple of our scalable cluster building blocks with 400 Gb/s NVIDIA Quantum-2 InfiniBand and Spectrum-X Ethernet networking. Supermicro's SuperCluster solutions combined with NVIDIA AI Enterprise software are ideal for enterprise and cloud infrastructures to train today's LLMs with up to trillions of parameters. The interconnected GPUs, CPUs, memory, storage, and networking, when deployed across multiple nodes in racks, construct the foundation of today's AI. Supermicro's SuperCluster solutions provide foundational building blocks for rapidly evolving generative AI and LLMs."

AWS and NVIDIA Extend Collaboration to Advance Generative AI Innovation

Amazon Web Services (AWS), an Amazon.com company, and NVIDIA today announced that the new NVIDIA Blackwell GPU platform - unveiled by NVIDIA at GTC 2024 - is coming to AWS. AWS will offer the NVIDIA GB200 Grace Blackwell Superchip and B100 Tensor Core GPUs, extending the companies' long-standing strategic collaboration to deliver the most secure and advanced infrastructure, software, and services to help customers unlock new generative artificial intelligence (AI) capabilities.

NVIDIA and AWS continue to bring together the best of their technologies, including NVIDIA's newest multi-node systems featuring the next-generation NVIDIA Blackwell platform and AI software, AWS's Nitro System and AWS Key Management Service (AWS KMS) advanced security, Elastic Fabric Adapter (EFA) petabit scale networking, and Amazon Elastic Compute Cloud (Amazon EC2) UltraCluster hyper-scale clustering. Together, they deliver the infrastructure and tools that enable customers to build and run real-time inference on multi-trillion parameter large language models (LLMs) faster, at massive scale, and at a lower cost than previous-generation NVIDIA GPUs on Amazon EC2.

NVIDIA Launches Blackwell-Powered DGX SuperPOD for Generative AI Supercomputing at Trillion-Parameter Scale

NVIDIA today announced its next-generation AI supercomputer—the NVIDIA DGX SuperPOD powered by NVIDIA GB200 Grace Blackwell Superchips—for processing trillion-parameter models with constant uptime for superscale generative AI training and inference workloads.

Featuring a new, highly efficient, liquid-cooled rack-scale architecture, the new DGX SuperPOD is built with NVIDIA DGX GB200 systems and provides 11.5 exaflops of AI supercomputing at FP4 precision and 240 terabytes of fast memory—scaling to more with additional racks.

NVIDIA Blackwell Platform Arrives to Power a New Era of Computing

Powering a new era of computing, NVIDIA today announced that the NVIDIA Blackwell platform has arrived—enabling organizations everywhere to build and run real-time generative AI on trillion-parameter large language models at up to 25x less cost and energy consumption than its predecessor.

The Blackwell GPU architecture features six transformative technologies for accelerated computing, which will help unlock breakthroughs in data processing, engineering simulation, electronic design automation, computer-aided drug design, quantum computing and generative AI—all emerging industry opportunities for NVIDIA.

Gigabyte Unveils Comprehensive and Powerful AI Platforms at NVIDIA GTC

GIGABYTE Technology and Giga Computing, a subsidiary of GIGABYTE and an industry leader in enterprise solutions, will showcase their solutions at GIGABYTE booth #1224 at NVIDIA GTC, a global AI developer conference running through March 21. This event will offer GIGABYTE the chance to connect with its valued partners and customers, and together explore what the future of computing holds.

The GIGABYTE booth will focus on GIGABYTE's enterprise products that demonstrate AI training and inference delivered by versatile computing platforms based on NVIDIA solutions, as well as direct liquid cooling (DLC) for improved compute density and energy efficiency. Also not to be missed at the NVIDIA booth is the MGX Pavilion, which features a rack of GIGABYTE servers for the NVIDIA GH200 Grace Hopper Superchip architecture.