Tuesday, August 24th 2021
Samsung Brings In-memory Processing Power to Wider Range of Applications
Samsung Electronics, the world leader in advanced memory technology, today showcased its latest advancements in processing-in-memory (PIM) technology at Hot Chips 33, a leading semiconductor conference where the most notable microprocessor and IC innovations are unveiled each year. Samsung's announcements include the first successful integration of its PIM-enabled High Bandwidth Memory (HBM-PIM) into a commercialized accelerator system, as well as broadened PIM applications embracing DRAM modules and mobile memory, accelerating the move toward the convergence of memory and logic.
In February, Samsung introduced the industry's first HBM-PIM (Aquabolt-XL), which incorporates the AI processing function into Samsung's HBM2 Aquabolt to enhance high-speed data processing in supercomputers and AI applications. The HBM-PIM has since been tested in the Xilinx Virtex UltraScale+ (Alveo) AI accelerator, where it delivered an almost 2.5X system performance gain as well as more than a 60% cut in energy consumption. "HBM-PIM is the industry's first AI-tailored memory solution being tested in customer AI-accelerator systems, demonstrating tremendous commercial potential," said Nam Sung Kim, senior vice president of DRAM Product & Technology at Samsung Electronics. "Through standardization of the technology, applications will become numerous, expanding into HBM3 for next-generation supercomputers and AI applications, and even into mobile memory for on-device AI as well as for memory modules used in data centers."
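Taken together, the two reported figures imply a large efficiency gain. A quick back-of-the-envelope check, assuming the 2.5X speedup and the 60% energy cut were measured on the same workload (the release does not state this explicitly):

```python
# Back-of-the-envelope performance-per-energy estimate from the reported
# HBM-PIM figures. Illustrative only: assumes the 2.5x speedup and 60%
# energy reduction apply to the same workload on the Alveo test system.

baseline_perf = 1.0                         # normalized baseline throughput
baseline_energy = 1.0                       # normalized baseline energy

pim_perf = baseline_perf * 2.5              # "almost 2.5X system performance gain"
pim_energy = baseline_energy * (1 - 0.60)   # "more than a 60% cut in energy consumption"

# Performance per unit of energy, relative to the baseline
efficiency_gain = (pim_perf / pim_energy) / (baseline_perf / baseline_energy)
print(f"Implied perf-per-energy gain: {efficiency_gain:.2f}x")  # ~6.25x
```

Under those assumptions, 2.5X the work for 40% of the energy works out to roughly a 6X improvement in performance per unit of energy.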
"Xilinx has been collaborating with Samsung Electronics to enable high-performance solutions for data center, networking and real-time signal processing applications starting with the Virtex UltraScale+ HBM family, and recently introduced our new and exciting Versal HBM series products," said Arun Varadarajan Rajagopal, senior director, Product Planning at Xilinx, Inc. "We are delighted to continue this collaboration with Samsung as we help to evaluate HBM-PIM systems for their potential to achieve major performance and energy-efficiency gains in AI applications."
DRAM modules powered by PIM
The Acceleration DIMM (AXDIMM) brings processing to the DRAM module itself, minimizing large data movement between the CPU and DRAM to boost the energy efficiency of AI accelerator systems. With an AI engine built inside the buffer chip, the AXDIMM can perform parallel processing of multiple memory ranks (sets of DRAM chips) instead of accessing just one rank at a time, greatly enhancing system performance and efficiency. Since the module can retain its traditional DIMM form factor, the AXDIMM facilitates drop-in replacement without requiring system modifications. Currently being tested on customer servers, the AXDIMM can offer approximately twice the performance in AI-based recommendation applications and a 40% decrease in system-wide energy usage.
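The rank-parallel idea can be sketched in software. The toy model below is purely illustrative (the rank count, data sizes, and sum-reduction workload are assumptions for the sketch, not Samsung's specification); it contrasts the conventional path, which moves every rank's raw data across the memory bus to the CPU, with an AXDIMM-style path, which reduces each rank inside the buffer chip and moves only the results:

```python
# Toy model of AXDIMM-style near-memory processing (illustrative only).
# Assumption: a module with several ranks, each holding a vector of values,
# and a workload that needs only a per-rank reduction -- e.g. embedding-sum
# style lookups common in recommendation models.

RANKS = 4
VALUES_PER_RANK = 1024
ranks = [[i % 7 for i in range(VALUES_PER_RANK)] for _ in range(RANKS)]

def host_side(ranks):
    """Conventional path: ship all raw data over the memory bus to the CPU."""
    moved = sum(len(r) for r in ranks)   # words transferred across the bus
    result = [sum(r) for r in ranks]     # reduction performed on the host
    return result, moved

def near_memory(ranks):
    """AXDIMM-style path: reduce rank-parallel in the buffer chip."""
    result = [sum(r) for r in ranks]     # computed inside the module
    moved = len(result)                  # only one result word per rank moves
    return result, moved

res_host, moved_host = host_side(ranks)
res_pim, moved_pim = near_memory(ranks)
assert res_host == res_pim               # same answer either way
print(f"Words moved: host={moved_host}, near-memory={moved_pim}")
```

The answer is identical on both paths; what changes is bus traffic, which is where the energy and bandwidth savings of near-memory processing come from.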
"SAP has been continuously collaborating with Samsung on their new and emerging memory technologies to deliver optimal performance on SAP HANA and help database acceleration," said Oliver Rebholz, head of HANA core research & innovation at SAP. "Based on performance projections and potential integration scenarios, we expect significant performance improvements for in-memory database management system (IMDBMS) and higher energy efficiency via disaggregated computing on AXDIMM. SAP is looking to continue its collaboration with Samsung in this area."
Mobile memory that brings AI from data center to device
Samsung's LPDDR5-PIM mobile memory technology can provide independent AI capabilities without data center connectivity. Simulation tests have shown that the LPDDR5-PIM can more than double performance while reducing energy usage by over 60% in applications such as voice recognition, translation and chatbots.
Energizing the ecosystem
Samsung plans to expand its AI memory portfolio by working with other industry leaders to complete standardization of the PIM platform in the first half of 2022. The company will also continue to foster a robust PIM ecosystem to ensure wide applicability across the memory market.
9 Comments on Samsung Brings In-memory Processing Power to Wider Range of Applications
www.extremetech.com/extreme/216235-scientists-are-still-figuring-out-how-their-own-memristors-work
en.wikipedia.org/wiki/3D_XPoint
It has much lower latency than flash, but it is still worse than DRAM in access time and cell endurance. Its density also falls between flash and DRAM.
But if 3D XPoint or Memristor is the future, you're going to have to keep separate cache RAM (Optane performance is too slow).
Coherency is a relatively tame beast, by comparison.