Wednesday, April 24th 2024

SMART Modular Technologies Introduces New Family of CXL Add-in Cards for Memory Expansion

SMART Modular Technologies, Inc. ("SMART"), a division of SGH (Nasdaq: SGH) and a global leader in memory solutions, solid-state drives, and advanced memory, announces its new family of Add-In Cards (AICs), which implement the Compute Express Link (CXL) standard and support industry-standard DDR5 DIMMs. These are the first high-density DIMM AICs in their class to adopt the CXL protocol. The SMART 4-DIMM and 8-DIMM products enable server and data center architects to add up to 4 TB of memory in a familiar, easy-to-deploy form factor.

"The market for CXL memory components for data center applications is expected to grow rapidly. Initial production shipments are expected in late 2024 and will surpass the $2 billion mark by 2026. Ultimately, CXL attach rates in the server market will reach 30% including both expansion and pooling use cases," stated Mike Howard, vice president of DRAM and memory markets at TechInsights, an intelligence source to semiconductor innovation and related markets.
"The CXL protocol is an important step toward achieving industry standard memory disaggregation and sharing which will significantly improve the way memory is deployed in the coming years," said Andy Mills, senior director of advanced product development at SMART Modular, reinforcing Howard's market analysis and SMART's rationale for developing this family of CXL-related products.

SMART's 4-DIMM and 8-DIMM AICs are built using advanced CXL controllers that eliminate memory bandwidth bottlenecks and capacity constraints for compute-intensive workloads encountered in Artificial Intelligence (AI), high performance computing (HPC), and Machine Learning (ML). These emerging applications require larger amounts of high-speed memory than current servers can accommodate. Attempts to add more memory via the traditional DIMM-based parallel bus interface are becoming problematic due to pin limitations on CPUs, so the industry is turning to CXL-based solutions, which are more pin efficient.

Technical Specifications
About SMART's 4-DIMM and 8-DIMM DDR5 AICs
  • Available as a CXL Type 3 device in a PCIe Gen 5 Full Height, Half Length (FHHL) form factor.
  • The 4-DIMM AIC (CXA-4F1W) accommodates four DDR5 RDIMMs with a maximum of 2 TB of memory capacity when using 512 GB RDIMMs, and the 8-DIMM AIC (CXA-8F2W) accommodates eight DDR5 RDIMMs with a maximum of 4 TB of memory capacity.
  • The 4-DIMM AIC uses a single CXL controller implementing one x16 CXL port while the 8-DIMM AIC uses two CXL controllers to implement two x8 ports, both resulting in a total bandwidth of 64 GB/s (see the back-of-envelope sketch after this list).
  • The CXL controllers support Reliability, Availability, and Serviceability (RAS) features and advanced analytics.
  • Both offer enhanced security features with in-band or side-band (SMBus) monitoring capability.
  • To accelerate memory processing, these add-in cards are compatible with SMART's Zefr ZDIMMs.
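As a rough cross-check of the bandwidth and capacity figures above, here is a back-of-envelope sketch using the generic PCIe Gen 5 per-lane rate (32 GT/s with 128b/130b encoding); these are standard PCIe numbers, not figures supplied by SMART:

```python
# Back-of-envelope check of the headline figures, using the standard
# PCIe Gen 5 rate of 32 GT/s per lane with 128b/130b line encoding.
GT_PER_LANE = 32.0        # PCIe Gen 5 transfer rate, GT/s
ENCODING = 128 / 130      # 128b/130b line-code efficiency

def link_bandwidth_gb_s(lanes: int) -> float:
    """Approximate per-direction bandwidth in GB/s of a PCIe Gen 5 link."""
    return lanes * GT_PER_LANE * ENCODING / 8

print(f"one x16 port: {link_bandwidth_gb_s(16):.1f} GB/s")      # ~63, quoted as 64
print(f"two x8 ports: {2 * link_bandwidth_gb_s(8):.1f} GB/s")   # same total

# Capacity follows directly from DIMM count times module density.
print(f"4-DIMM AIC: {4 * 512} GB")  # 2048 GB = 2 TB with 512 GB RDIMMs
print(f"8-DIMM AIC: {8 * 512} GB")  # 4096 GB = 4 TB with 512 GB RDIMMs
```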
CXL also enables lower-cost scaling of memory capacity. Using SMART's AICs, servers can reach up to 1 TB of memory per CPU with cost-effective 64 GB RDIMMs. The cards also offer supply chain optionality: replacing high-density RDIMMs with a greater number of lower-density modules can lower system memory costs, depending on market conditions.
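A minimal sketch of that per-CPU arithmetic, assuming a hypothetical layout of eight CPU-attached DIMM slots plus one fully populated 8-DIMM AIC (the split is an assumption; the release does not specify a configuration):

```python
# Hypothetical configuration: sixteen 64 GB RDIMMs per CPU, e.g. eight
# CPU-attached DIMMs plus one fully populated 8-DIMM CXL AIC.
RDIMM_GB = 64
cpu_slots = 8   # assumed native DIMM slots per CPU
aic_slots = 8   # one 8-DIMM CXL AIC

total_gb = (cpu_slots + aic_slots) * RDIMM_GB
print(f"{total_gb} GB = {total_gb // 1024} TB per CPU")  # 1024 GB -> 1 TB
```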

Visit SMART's 4-DIMM product page and 8-DIMM AIC product page for further information, and the CMM/CXL family page for information on SMART's other products using the CXL standard. SMART will provide samples to OEMs upon request. These new CXL-based AIC products join SMART's ZDIMM line of DRAM as ideal solutions for demanding memory design-in applications.

10 Comments on SMART Modular Technologies Introduces New Family of CXL Add-in Cards for Memory Expansion

#1
Chaitanya
Installing DIMMs into that 4-slot card is going to be a pain compared to the 8-slot version where the DIMMs are perpendicular to the card (though that comes at the cost of the number of slots used).
#2
TumbleGeorge
btarunr: both resulting in a total bandwidth of 64 GB/s.
63.015 GB/s? 64 GT/s.
#3
persondb
I think this would be interesting if it came to client/consumer platforms. Imagine if you could use a PCIe x8 one with one or two DIMMs to expand your memory pool with 32 GB/s or so?

Could be useable for a lot of applications, I think. Like a huge buffer space for games.
#4
Calenhad
persondb: I think this would be interesting if it came to client/consumer platforms. Imagine if you could use a PCIe x8 one with one or two DIMMs to expand your Memory pool with 32GB/s or so?

Could be useable for a lot of applications, I think. Like a huge buffer space for games.
You're still limited by the PCIe bus speed, which in your example of PCIe x8 would be twice the transfer speed of an NVMe drive. So why bother? Just stick with NVMe drives. A lot more bang for the buck and a less complicated storage hierarchy.

I mean, there are use cases for this, certain types of servers being one of them. But for something like a huge buffer space for games? Those already exist. They are called NVMe SSDs.
#5
Wirko
TumbleGeorge: 63.015 GB/s? 64 GT/s.
That's the best case for large transfers. Small transfers have a large percentage of overhead: a PCIe packet header is ~16 bytes, and some commands and addresses have to be transmitted too. In contrast, DDR has a command/address bus that's separate from the data bus. Small transfers can be really small - 64 bytes in DDR, which equals one cache line in current CPUs.

On the other hand, in PCIe's favour, that 64 GB/s is in each direction.
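A toy efficiency calculation along those lines (illustrative only; CXL.mem traffic uses its own flit framing, so the real overhead differs):

```python
# Illustrative overhead model: assume ~16 bytes of header/framing per
# transaction-layer packet, as mentioned above. Not a real protocol model.
HEADER_BYTES = 16

def payload_efficiency(payload_bytes: int) -> float:
    """Fraction of transmitted bytes that are payload for one packet."""
    return payload_bytes / (payload_bytes + HEADER_BYTES)

for payload in (64, 256, 4096):
    print(f"{payload:4d}-byte transfer: {payload_efficiency(payload):.1%}")
# 64-byte cache-line transfers land around 80% efficiency,
# while 4 KiB bulk transfers are close to 99.6%.
```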
#6
TumbleGeorge
Wirko: That's the best case for large transfers. Small transfers have a large percentage of overhead: a PCIe packet header is ~16 bytes and some commands and addresses have to be transmitted too. In contrast, DDR has a command/address bus that's separate from data bus. Small transfers can be really small - 64 bytes in DDR, which equals one cache line in current CPUs.

On the other hand, in PCIe's favour, that 64 GB/s is in each direction.

@Wikipedia.
#7
Scrizz
TumbleGeorge: @Wikipedia.
What is that supposed to be?
Needs more context.


Also, this would have much higher latency than say an Optane DIMM, right?
#8
Wirko
TumbleGeorge: @Wikipedia.
Here's the note below that table:


I'm not being very exact here; I don't make a distinction between 64 and 63.015. But I am making a distinction between 64 and 64+64.
Scrizz: Also, this would have much higher latency than say an Optane DIMM, right?
No, it will still be far faster. I've read somewhere (Tom's) that the additional latency is supposed to be similar to when a processor is accessing RAM of the next closest NUMA node (i.e. another processor's RAM in a 2 or 4-processor system). That's more than 100 ns. Of course, DRAM also has ~50 ns on its own.
"The remote memory, or in this case, a hybrid RAM/flash memory device, is accessible over the PCIe bus, which comes at the cost of ~170-250 ns of latency, or roughly the cost of a NUMA hop."
By the way, what's the latency of dGPU memory when read/written by the CPU? That's a situation somewhat similar to CXL memory because data moves in packets over PCIe.
#9
persondb
Calenhad: You're still limited by the PCIe bus speed, which in your example of PCIe x8 would be twice the transfer speed of a nvme drive. So why bother, just stick with nvme drives. Alot more bang for the buck and a less complicated storage hierarchy.

I mean there are use cases for this. Certain types of servers being one of them. But for something like a huge buffer space for games? Those already exist. They are called nvme ssds.
Yes, you would be limited to PCIe x8 speeds, but since those are Gen 5, it would still be 32 GB/s. NVMe drives are struggling to even reach 12 GB/s with those super expensive Gen 5 drives.

NVMe drives have a lot of issues; yes, the sequential speed is fast, but what about random 4K? Progress there hasn't been that significant. That was kind of the whole point of Optane, actually. Such a solution would give orders of magnitude more performance in quite a few areas and could be used in more ways, thanks to its latency, IOPS, etc.
#10
Scrizz
Wirko: No, it will still be far faster. I've read somewhere (Tom's) that the additional latency is supposed to be similar to when a processor is accessing RAM of the next closest NUMA node (i.e. another processor's RAM in a 2 or 4-processor system). That's more than 100 ns. Of course, DRAM also has ~50 ns on its own.
I'm talking about Optane DIMMs, not SSDs.
Here's an interesting paper that talks about persistent memory/Optane and CXL www3.cs.stonybrook.edu/~anshul/dimes23-pmem.pdf