Tuesday, November 19th 2024

Corsair by d-Matrix Enables GPU-Free AI Inference

d-Matrix today unveiled Corsair, an entirely new computing paradigm designed from the ground up for the next era of AI inference in modern datacenters. Corsair leverages d-Matrix's innovative Digital In-Memory Compute (DIMC) architecture, an industry first, to accelerate AI inference workloads with industry-leading real-time performance, energy efficiency, and cost savings compared with GPUs and other alternatives.

The emergence of reasoning agents and interactive video generation represents the next level of AI capability. These workloads apply more inference compute to let models "think" longer and produce higher-quality outputs. Corsair is the ideal inference compute solution with which enterprises can unlock new levels of automation and intelligence without compromising on performance, cost, or power.

"We saw transformers and generative AI coming, and founded d-Matrix to address inference challenges around the largest computing opportunity of our time," said Sid Sheth, cofounder and CEO of d-Matrix. "The first-of-its-kind Corsair compute platform brings blazing-fast token generation for high-interactivity applications with multiple users, making Gen AI commercially viable."

Analyst firm Gartner predicts a 160% increase in data center energy consumption over the next two years, driven by AI and GenAI. As a result, Gartner estimates that 40% of existing AI data centers will be operationally constrained by power availability by 2027. At that rate, deploying AI models at scale could quickly become cost-prohibitive.

d-Matrix Industry Firsts and Breakthroughs
d-Matrix combines several world-first innovations in silicon, software, chiplet packaging, and interconnect fabrics to accelerate AI inference.

Generative inference is inherently memory bound. d-Matrix breaks through this memory-bandwidth barrier with a novel DIMC architecture that tightly integrates memory and compute. Scaling is achieved using DMX Link for high-speed, energy-efficient die-to-die connectivity across chiplets in a package, and DMX Bridge for connecting packages across two cards. d-Matrix is among the first in the industry to natively support block floating point numerical formats, now an OCP standard called Microscaling (MX), for greater inference efficiency. These industry-first innovations are integrated under the hood by d-Matrix's Aviator software stack, which gives AI developers a familiar user experience and tooling.
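Block floating point formats such as MX trade per-element exponents for a single exponent shared across a small block of values, cutting bits per value while preserving dynamic range. The following Python sketch shows the core idea only; the function names and parameters are hypothetical and this is not d-Matrix's or the OCP MX specification's exact encoding:

```python
import math

def bfp_quantize(block, mantissa_bits=8):
    """Quantize a block of floats to a shared-exponent block floating point
    representation: one exponent per block, small signed-integer mantissas
    per element (the core idea behind OCP Microscaling/MX formats)."""
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 0, [0] * len(block)
    # Pick the shared exponent so the largest value fits the mantissa range.
    shared_exp = math.floor(math.log2(max_abs)) - (mantissa_bits - 2)
    scale = 2.0 ** shared_exp
    limit = 2 ** (mantissa_bits - 1) - 1  # max signed mantissa magnitude
    mantissas = [max(-limit, min(limit, round(x / scale))) for x in block]
    return shared_exp, mantissas

def bfp_dequantize(shared_exp, mantissas):
    """Reconstruct approximate floats: mantissa * 2^shared_exp."""
    scale = 2.0 ** shared_exp
    return [m * scale for m in mantissas]

# Values that are exact powers of two round-trip losslessly here.
exp_, mants = bfp_quantize([0.5, -0.25, 0.125, 0.0625])
approx = bfp_dequantize(exp_, mants)
```

Because the exponent is amortized over the whole block, each element needs only a narrow integer mantissa, which is what makes such formats attractive for inference hardware.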

Corsair comes in an industry-standard PCIe Gen 5 full-height, full-length card form factor, with pairs of cards connected via DMX Bridge cards. Each Corsair card is powered by DIMC compute cores with 2400 TFLOPs of 8-bit peak compute, 2 GB of integrated Performance Memory, and up to 256 GB of off-chip Capacity Memory. The DIMC architecture delivers ultra-high memory bandwidth of 150 TB/s, significantly higher than HBM. Compared with GPUs, Corsair delivers up to 10x faster interactive speed, 3x better performance per total cost of ownership (TCO), and 3x greater energy efficiency.
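A rough sanity check on these figures: autoregressive token generation is typically memory-bandwidth bound, so the per-user decode rate is capped by bandwidth divided by the bytes streamed per token. A back-of-envelope sketch using the quoted 150 TB/s figure (the 70B-parameter, 8-bit model is an illustrative assumption, not a d-Matrix benchmark):

```python
def max_tokens_per_second(bandwidth_bytes_per_s, model_params, bytes_per_param):
    """Upper bound on per-user autoregressive decode rate when every
    generated token requires streaming all weights once from memory
    (the memory-bound regime the article describes)."""
    bytes_per_token = model_params * bytes_per_param
    return bandwidth_bytes_per_s / bytes_per_token

# Illustrative, not vendor-validated: 150 TB/s bandwidth, 70B params at 8 bits.
rate = max_tokens_per_second(150e12, 70e9, 1)  # ceiling of roughly 2,143 tok/s
```

Real-world rates are lower (attention KV-cache traffic, batching, and compute limits all intervene), but the exercise shows why bandwidth, not peak TFLOPs, dominates interactive speed.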

"d-Matrix is at the forefront of a monumental shift in Gen AI as the first company to fully address the pain points of AI in the enterprise," said Michael Stewart, managing partner of M12, Microsoft's Venture Fund. "Built by a world-class team and introducing category-defining breakthroughs, d-Matrix's compute platform radically changes enterprises' ability to access infrastructure for AI operations and enables them to incrementally scale out without the energy constraints and latency concerns that have held AI back from enterprise adoption. d-Matrix is democratizing access to the hardware needed to power AI in a standard form factor, making Gen AI finally attainable for everyone."


Availability of d-Matrix Corsair inference solutions
Corsair is sampling to early-access customers and will be broadly available in Q2 2025. d-Matrix is proud to be collaborating with OEMs and system integrators to bring Corsair-based solutions to market.

"We are excited to collaborate with d-Matrix on their Corsair ultra-high bandwidth in-memory compute solution, which is purpose-built for generative AI, and accelerate the adoption of sustainable AI computing," said Vik Malyala, Senior Vice President for Technology and AI, Supermicro. "Our high-performance, end-to-end liquid- and air-cooled systems incorporating Corsair are ideal for next-level AI compute."

"Combining d-Matrix's Corsair PCIe card with GigaIO SuperNODE's industry-leading scale-up architecture creates a transformative solution for enterprises deploying next-generation AI inference at scale," said Alan Benjamin, CEO at GigaIO. "Our single-node server supports 64 or more Corsairs, delivering massive processing power and low-latency communication between cards. The Corsair SuperNODE eliminates complex multi-node configurations and simplifies deployment, enabling enterprises to quickly adapt to evolving AI workloads while significantly improving their TCO and operational efficiency."

"By integrating d-Matrix Corsair, Liqid enables unmatched capability, flexibility, and efficiency, overcoming traditional limitations to deliver exceptional inference performance. In the rapidly advancing AI landscape, we enable customers to meet stringent inference demands with Corsair's ultra-low latency solution," said Sumit Puri, Co-Founder at Liqid.
Source: d-Matrix

6 Comments on Corsair by d-Matrix Enables GPU-Free AI Inference

#1
kapone32
Ok I understand that it runs over PCIe. I don't understand what this actually is though. A PCIe controller? I do understand that it is some kind of logic board.
Posted on Reply
#2
LabRat 891
kapone32 said:
"Ok I understand that it runs over PCIe. I don't understand what this actually is though. A PCIe controller? I do understand that it is some kind of logic board."
All those press releases about memory w/ compute integrated: that's what this is.

(from what I gather) It's an MCM ASIC w/ die+package-integral massive cache, basically.
Posted on Reply
#3
igormp
kapone32 said:
"Ok I understand that it runs over PCIe. I don't understand what this actually is though. A PCIe controller? I do understand that it is some kind of logic board."
It's a huge memory-like block that has inputs and outputs, with the outputs stored in the memory itself. That way you don't need to move data back and forth between memory and a compute die (like we currently do with GPUs/CPUs); you just feed an input into the memory block, and it performs the calculation and holds the result in roughly constant time, instead of having to iterate over all the data externally.
Here's a nice presentation on that topic:
cdn.opptylab.com/hf/assets/eelftheriou-esscirc-2022.pdf
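That dataflow can be sketched as a toy software model (purely illustrative; real DIMC hardware does this at the circuit level inside the memory arrays, and `InMemoryComputeBank` is a made-up name):

```python
class InMemoryComputeBank:
    """Toy model of digital in-memory compute: the weight matrix stays
    resident inside the 'memory' object and multiply-accumulates happen
    where the weights live, so only small activation vectors cross the
    interface instead of the full weight matrix."""
    def __init__(self, weights):
        self._weights = weights  # never shipped out to a separate compute die

    def matvec(self, activations):
        # The MAC is performed "inside" the bank, row by row.
        return [sum(w * a for w, a in zip(row, activations))
                for row in self._weights]

bank = InMemoryComputeBank([[1, 2], [3, 4]])
out = bank.matvec([10, 1])  # only the 2-element activation vector moves
```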
Posted on Reply
#4
Redwoodz
Technology sounds great. The name is a weird choice.
Posted on Reply
#5
Caring1
"We saw transformers"
I think most people have seen that movie
What's that got to do with this add in card, does Megatron have one?
Posted on Reply
#6
igormp
Caring1 said:
""We saw transformers"
I think most people have seen that movie
What's that got to do with this add-in card, does Megatron have one?"
"Transformers" refers to the underlying neural-network architecture used by LLMs (think of ChatGPT), developed by Google. Here's the paper about it in case you want to give it a go:
arxiv.org/pdf/1706.03762

This card helps perform the operations needed by this architecture in a much faster manner (or so they claim).
Posted on Reply