Thursday, December 7th 2023
Supermicro Extends AI and GPU Rack Scale Solutions with Support for AMD Instinct MI300 Series Accelerators
Supermicro, Inc., a Total IT Solution Manufacturer for AI, Cloud, Storage, and 5G/Edge, is announcing three new additions to its AMD-based H13 generation of GPU Servers, optimized to deliver leading-edge performance and efficiency, powered by the new AMD Instinct MI300 Series accelerators. Supermicro's powerful rack scale solutions with 8-GPU servers with the AMD Instinct MI300X OAM configuration are ideal for large model training.
The new 2U liquid-cooled and 4U air-cooled servers with the AMD Instinct MI300A Accelerated Processing Units (APUs) accelerators are available and improve data center efficiencies and power the fast-growing complex demands in AI, LLM, and HPC. The new systems contain quad APUs for scalable applications. Supermicro can deliver complete liquid-cooled racks for large-scale environments with up to 1,728 TFlops of FP64 performance per rack. Supermicro worldwide manufacturing facilities streamline the delivery of these new servers for AI and HPC convergence."We are very excited to expand our rack scale Total IT Solutions for AI training with the latest generation of AMD Instinct accelerators, with up to 3.4X the performance improvement compared to previous generations," said Charles Liang, president and CEO of Supermicro. "With our ability to deliver 4,000 liquid-cooled racks per month from our worldwide manufacturing facilities, we can deliver the newest H13 GPU solutions with either the AMD's Instinct MI300X accelerator or the AMD Instinct MI300A APU. Our proven architecture allows 1:1 400G networking dedicated for each GPU designed for large-scale AI and supercomputing clusters capable of fully integrated liquid cooling solutions, giving customers a competitive advantage for performance and superior efficiency with ease of deployment."
The LLM optimized AS -8125GS-TNMR2 system is built on Supermicro's building block architecture, a proven design for high-performance AI systems with air and liquid cooled rack scale designs. The balanced system design associates a GPU with a 1:1 networking to provide a large pool of high bandwidth memory across nodes and racks to fit today's largest language models with up to trillions of parameters, maximizing parallel computing and minimizing the training time and inference latency. The 8U system with the MI300X OAM accelerator offers the raw acceleration power of 8-GPU with the AMD Infinity Fabric Links, enabling up to 896 GB/s of peak theoretical P2P I/O bandwidth on the open standard platform with industry-leading 1.5 TB HBM3 GPU memory in a single system, as well as native sparse matrix support, designed to save power, lower compute cycles and reduce memory use for AI workloads. Each server features dual socket AMD EPYC 9004 series processors with up to 256 cores. At rack scale, over 1000 CPU cores, 24 TB of DDR5 memory, 6.144 TB of HBM3 memory, and 9728 Compute Units are available for the most challenging AI environments. Using the OCP Accelerator Module (OAM), with which Supermicro has significant experience in 8U configurations, brings a fully configured server to market faster than a custom design, reducing costs and time to delivery.
Supermicro is also introducing a density optimized 2U liquid-cooled server, the AS -2145GH-TNMR, and a 4U air-cooled server, the AS -4145GH-TNMR, each with 4 AMD Instinct MI300A accelerators. The new servers are designed for HPC and AI applications, requiring extremely fast CPU to GPU communication. The APU eliminates redundant memory copies by combining the highest-performing AMD CPU, GPU, and HBM3 memory on a single chip. Each server contains leadership x86 "Zen 4" CPU cores for application scale-up. Also, each server includes 512 GB of HBM3 memory. In a full rack (48U) solution consisting of 21 2U systems, over 10T B of HBM3 memory is available, as well as 19,152 Compute Units. The HBM3 to CPU memory bandwidth is 5.3 TB/second.
Both systems feature dual AIOMs with 400G Ethernet support and expanded networking options designed to improve space, scalability, and efficiency for high-performance computing. The 2U direct-to-chip liquid-cooled system delivers excellent TCO with over a 35% energy consumption savings based on 21 2U system rack solutions that produce 61,780 watts per rack over 95,256 watts air-cooled rack, as well as a 70% reduction in the number of fans compared to an air cooled system.
"AMD Instinct MI300 Series accelerators deliver leadership performance, both for longstanding accelerated high performance computing applications and for the rapidly growing demand for generative AI," said Forrest Norrod, executive vice president and general manager, Data Center Solutions Business Group, AMD. "We continue to work closely with Supermicro to bring to market leading-edge AI and HPC total solutions based on MI300 Series accelerators and leveraging Supermicro's expertise in system and data center design."
Source:
Supermicro
The new 2U liquid-cooled and 4U air-cooled servers with the AMD Instinct MI300A Accelerated Processing Units (APUs) accelerators are available and improve data center efficiencies and power the fast-growing complex demands in AI, LLM, and HPC. The new systems contain quad APUs for scalable applications. Supermicro can deliver complete liquid-cooled racks for large-scale environments with up to 1,728 TFlops of FP64 performance per rack. Supermicro worldwide manufacturing facilities streamline the delivery of these new servers for AI and HPC convergence."We are very excited to expand our rack scale Total IT Solutions for AI training with the latest generation of AMD Instinct accelerators, with up to 3.4X the performance improvement compared to previous generations," said Charles Liang, president and CEO of Supermicro. "With our ability to deliver 4,000 liquid-cooled racks per month from our worldwide manufacturing facilities, we can deliver the newest H13 GPU solutions with either the AMD's Instinct MI300X accelerator or the AMD Instinct MI300A APU. Our proven architecture allows 1:1 400G networking dedicated for each GPU designed for large-scale AI and supercomputing clusters capable of fully integrated liquid cooling solutions, giving customers a competitive advantage for performance and superior efficiency with ease of deployment."
The LLM optimized AS -8125GS-TNMR2 system is built on Supermicro's building block architecture, a proven design for high-performance AI systems with air and liquid cooled rack scale designs. The balanced system design associates a GPU with a 1:1 networking to provide a large pool of high bandwidth memory across nodes and racks to fit today's largest language models with up to trillions of parameters, maximizing parallel computing and minimizing the training time and inference latency. The 8U system with the MI300X OAM accelerator offers the raw acceleration power of 8-GPU with the AMD Infinity Fabric Links, enabling up to 896 GB/s of peak theoretical P2P I/O bandwidth on the open standard platform with industry-leading 1.5 TB HBM3 GPU memory in a single system, as well as native sparse matrix support, designed to save power, lower compute cycles and reduce memory use for AI workloads. Each server features dual socket AMD EPYC 9004 series processors with up to 256 cores. At rack scale, over 1000 CPU cores, 24 TB of DDR5 memory, 6.144 TB of HBM3 memory, and 9728 Compute Units are available for the most challenging AI environments. Using the OCP Accelerator Module (OAM), with which Supermicro has significant experience in 8U configurations, brings a fully configured server to market faster than a custom design, reducing costs and time to delivery.
Supermicro is also introducing a density optimized 2U liquid-cooled server, the AS -2145GH-TNMR, and a 4U air-cooled server, the AS -4145GH-TNMR, each with 4 AMD Instinct MI300A accelerators. The new servers are designed for HPC and AI applications, requiring extremely fast CPU to GPU communication. The APU eliminates redundant memory copies by combining the highest-performing AMD CPU, GPU, and HBM3 memory on a single chip. Each server contains leadership x86 "Zen 4" CPU cores for application scale-up. Also, each server includes 512 GB of HBM3 memory. In a full rack (48U) solution consisting of 21 2U systems, over 10T B of HBM3 memory is available, as well as 19,152 Compute Units. The HBM3 to CPU memory bandwidth is 5.3 TB/second.
Both systems feature dual AIOMs with 400G Ethernet support and expanded networking options designed to improve space, scalability, and efficiency for high-performance computing. The 2U direct-to-chip liquid-cooled system delivers excellent TCO with over a 35% energy consumption savings based on 21 2U system rack solutions that produce 61,780 watts per rack over 95,256 watts air-cooled rack, as well as a 70% reduction in the number of fans compared to an air cooled system.
"AMD Instinct MI300 Series accelerators deliver leadership performance, both for longstanding accelerated high performance computing applications and for the rapidly growing demand for generative AI," said Forrest Norrod, executive vice president and general manager, Data Center Solutions Business Group, AMD. "We continue to work closely with Supermicro to bring to market leading-edge AI and HPC total solutions based on MI300 Series accelerators and leveraging Supermicro's expertise in system and data center design."
Comments on Supermicro Extends AI and GPU Rack Scale Solutions with Support for AMD Instinct MI300 Series Accelerators
There are no comments yet.