
Corsair by d-Matrix Enables GPU-Free AI Inference

Nomad76

News Editor
Staff member
d-Matrix today unveiled Corsair, an entirely new computing platform designed from the ground up for the next era of AI inference in modern datacenters. Corsair leverages d-Matrix's innovative Digital In-Memory Compute (DIMC) architecture, an industry first, to accelerate AI inference workloads with industry-leading real-time performance, energy efficiency, and cost savings compared to GPUs and other alternatives.

The emergence of reasoning agents and interactive video generation represents the next level of AI capabilities. These leverage more inference computing power to enable models to "think" more and produce higher quality outputs. Corsair is the ideal inference compute solution with which enterprises can unlock new levels of automation and intelligence without compromising on performance, cost or power.



"We saw transformers and generative AI coming, and founded d-Matrix to address inference challenges around the largest computing opportunity of our time," said Sid Sheth, cofounder and CEO of d-Matrix. "The first-of-its-kind Corsair compute platform brings blazing fast token generation for high interactivity applications with multiple users, making Gen AI commercially viable."

Analyst firm Gartner predicts a 160% increase in data center energy consumption over the next two years, driven by AI and GenAI. As a result, Gartner estimates that 40% of existing AI data centers will be operationally constrained by power availability by 2027. At that rate, deploying AI models at scale could quickly become cost-prohibitive.

d-Matrix Industry Firsts and Breakthroughs
d-Matrix combines several world-first innovations in silicon, software, chiplet packaging and interconnect fabrics to accelerate AI inference.

Generative inference is inherently memory bound. d-Matrix breaks through this memory bandwidth barrier with a novel DIMC architecture that tightly integrates memory and compute. Scaling is achieved using DMX Link for high-speed energy-efficient die-to-die connectivity across chiplets in a package, and DMX Bridge for connecting packages across two cards. d-Matrix is among the first in the industry to natively support block floating point numerical formats, now an OCP standard called Micro-scaling (MX), for greater inference efficiency. These industry-first innovations are seamlessly integrated under the hood by d-Matrix's Aviator software stack that gives AI developers a familiar user experience and tooling.
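The block floating point idea behind the MX standard can be illustrated with a toy sketch: a block of values shares one power-of-two scale, and each element stores only a small integer mantissa, cutting memory traffic per value. This is a simplified illustration of the shared-scale concept, not d-Matrix's or the OCP MX specification's actual encoding; the function name and mantissa width are made up for the example.

```python
import numpy as np

def mx_quantize(block, mantissa_bits=8):
    """Toy block floating point: one shared power-of-two scale per block,
    low-precision signed-integer mantissas per element (illustrative only)."""
    max_abs = np.max(np.abs(block))
    if max_abs == 0:
        return np.zeros_like(block)
    # Shared exponent chosen so the largest value fits in the mantissa range
    shared_exp = np.ceil(np.log2(max_abs))
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    mantissas = np.clip(np.round(block / scale),
                        -(2 ** (mantissa_bits - 1)),
                        2 ** (mantissa_bits - 1) - 1)
    return mantissas * scale  # dequantized approximation

vals = np.array([0.11, -0.52, 0.93, 0.004])
approx = mx_quantize(vals)
# Error stays small relative to the block's largest magnitude
assert np.max(np.abs(approx - vals)) < np.max(np.abs(vals)) / 64
```

The payoff is that only one scale is stored per block instead of a full exponent per element, which is why such formats improve inference efficiency at a given bandwidth.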

Corsair comes in an industry standard PCIe Gen 5 full height full length card form factor, with pairs of cards connected via DMX Bridge cards. Each Corsair card is powered by DIMC compute cores with 2400 TFLOPs of 8-bit peak compute, 2 GB of integrated Performance Memory, and up to 256 GB of off-chip Capacity Memory. The DIMC architecture delivers ultra-high memory bandwidth of 150 TB/s, significantly higher than HBM. Corsair delivers up to 10x faster interactive speed, 3x better performance per total cost of ownership (TCO), and 3x greater energy efficiency.
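A rough roofline check on the quoted figures shows why the bandwidth number is the headline: token generation is dominated by matrix-vector products, which perform only about two operations per weight byte read. This sketch assumes the peak numbers above are per-card and directly comparable; it is a back-of-envelope estimate, not a vendor benchmark.

```python
# Back-of-envelope roofline check using the per-card figures quoted above.
peak_ops = 2400e12   # 8-bit peak compute, ops/s
peak_bw = 150e12     # stated on-chip memory bandwidth, bytes/s

# Arithmetic intensity (ops per byte) above which the card is compute-bound
crossover = peak_ops / peak_bw
print(f"compute-bound above {crossover:.0f} ops/byte")  # 16 ops/byte

# A matrix-vector product reads each 8-bit weight once for one
# multiply-accumulate (~2 ops/byte), well below the crossover --
# so token generation speed is set by bandwidth, not peak TFLOPs.
gemv_intensity = 2.0
regime = "bandwidth" if gemv_intensity < crossover else "compute"
print(regime)  # bandwidth
```

Under these assumptions the 150 TB/s figure, rather than the 2400 TFLOPs figure, is what governs interactive token rates.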

"d-Matrix is at the forefront of a monumental shift in Gen AI as the first company to fully address the pain points of AI in the enterprise," said Michael Stewart, managing partner of M12, Microsoft's Venture Fund. "Built by a world-class team and introducing category-defining breakthroughs, d-Matrix's compute platform radically changes how enterprises access infrastructure for AI operations and enables them to incrementally scale out without the energy constraints and latency concerns that have held AI back from enterprise adoption. d-Matrix is democratizing access to the hardware needed to power AI in a standard form factor, making Gen AI finally attainable for everyone."


Availability of d-Matrix Corsair inference solutions
Corsair is sampling to early-access customers and will be broadly available in Q2 2025. d-Matrix is proud to be collaborating with OEMs and system integrators to bring Corsair-based solutions to market.

"We are excited to collaborate with d-Matrix on their Corsair ultra-high bandwidth in-memory compute solution, which is purpose-built for generative AI, and accelerate the adoption of sustainable AI computing," said Vik Malyala, Senior Vice President for Technology and AI, Supermicro. "Our high-performance end-to-end liquid- and air-cooled systems incorporating Corsair are ideal for next-level AI compute."

"Combining d-Matrix's Corsair PCIe card with GigaIO SuperNODE's industry-leading scale-up architecture creates a transformative solution for enterprises deploying next-generation AI inference at scale," said Alan Benjamin, CEO at GigaIO. "Our single-node server supports 64 or more Corsairs, delivering massive processing power and low-latency communication between cards. The Corsair SuperNODE eliminates complex multi-node configurations and simplifies deployment, enabling enterprises to quickly adapt to evolving AI workloads while significantly improving their TCO and operational efficiency."

"By integrating d-Matrix Corsair, Liqid enables unmatched capability, flexibility, and efficiency, overcoming traditional limitations to deliver exceptional inference performance. In the rapidly advancing AI landscape, we enable customers to meet stringent inference demands with Corsair's ultra-low latency solution," said Sumit Puri, Co-Founder at Liqid.

View at TechPowerUp Main Site | Source
 
Ok I understand that it runs over PCIe. I don't understand what this actually is though. A PCIe controller? I do understand that it is some kind of logic board.
 
Ok I understand that it runs over PCIe. I don't understand what this actually is though. A PCIe controller? I do understand that it is some kind of logic board.
All those press releases about memory w/ compute integrated -that's what this is.

(from what I gather) It's an MCM ASIC w/ die+package-integral massive cache, basically.
 
Ok I understand that it runs over PCIe. I don't understand what this actually is though. A PCIe controller? I do understand that it is some kind of logic board.
It's a huge memory-like block with inputs and outputs, where the outputs are stored in the memory itself. That means you don't need to move data back and forth between your memory and your compute die (as we currently do with GPUs/CPUs); you just feed an input into the memory block, and it performs the calculation where the data lives and keeps the result there, instead of having to iterate over all the data.
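That "compute where the data lives" idea can be sketched in a few lines. This is purely a conceptual model (the class and method names are invented for illustration), not how d-Matrix's silicon actually works: the point is that the weights stay resident and only the small input/output vectors cross the interface.

```python
import numpy as np

class InMemoryComputeBlock:
    """Conceptual model of digital in-memory compute (illustrative only):
    weights live inside the memory array, so a matrix-vector product
    happens where the data is stored instead of shuttling weights
    out to a separate compute die."""

    def __init__(self, weights):
        self.weights = weights  # stays resident; never moves off-chip

    def apply(self, x):
        # Only the small input and output vectors cross the interface;
        # the large weight matrix is read in place.
        return self.weights @ x

w = np.arange(6).reshape(2, 3)   # 2x3 "stored" weight matrix
block = InMemoryComputeBlock(w)
print(block.apply(np.ones(3)))   # [ 3. 12.]
```

On a GPU, by contrast, `w` would be streamed from HBM into the compute units on every pass, which is exactly the memory-bandwidth bottleneck the article describes.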
Here's a nice presentation on that topic:
 