
Corsair by d-Matrix Enables GPU-Free AI Inference

Nomad76

News Editor
Staff member
d-Matrix today unveiled Corsair, an entirely new computing platform designed from the ground up for the next era of AI inference in modern datacenters. Corsair leverages d-Matrix's innovative Digital In-Memory Compute (DIMC) architecture, an industry first, to accelerate AI inference workloads with industry-leading real-time performance, energy efficiency, and cost savings compared to GPUs and other alternatives.

The emergence of reasoning agents and interactive video generation represents the next level of AI capabilities. These leverage more inference computing power to enable models to "think" more and produce higher quality outputs. Corsair is the ideal inference compute solution with which enterprises can unlock new levels of automation and intelligence without compromising on performance, cost or power.



"We saw transformers and generative AI coming, and founded d-Matrix to address inference challenges around the largest computing opportunity of our time," said Sid Sheth, cofounder and CEO of d-Matrix. "The first-of-its-kind Corsair compute platform brings blazing fast token generation for high interactivity applications with multiple users, making Gen AI commercially viable."

Analyst firm Gartner predicts a 160% increase in data center energy consumption over the next two years, driven by AI and GenAI. As a result, Gartner estimates that 40% of existing AI data centers will be operationally constrained by power availability by 2027. At that rate, deploying AI models at scale could quickly become cost-prohibitive.

d-Matrix Industry Firsts and Breakthroughs
d-Matrix combines several world-first innovations in silicon, software, chiplet packaging and interconnect fabrics to accelerate AI inference.

Generative inference is inherently memory bound. d-Matrix breaks through this memory bandwidth barrier with a novel DIMC architecture that tightly integrates memory and compute. Scaling is achieved using DMX Link for high-speed energy-efficient die-to-die connectivity across chiplets in a package, and DMX Bridge for connecting packages across two cards. d-Matrix is among the first in the industry to natively support block floating point numerical formats, now an OCP standard called Micro-scaling (MX), for greater inference efficiency. These industry-first innovations are seamlessly integrated under the hood by d-Matrix's Aviator software stack that gives AI developers a familiar user experience and tooling.
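The block floating point idea behind the MX standard can be illustrated with a toy sketch: a block of values shares one power-of-two scale, and each element stores only a small integer mantissa, cutting memory traffic per value. This is a simplified illustration of the shared-scale concept, not d-Matrix's or the OCP MX specification's actual encoding; the function name and mantissa width are made up for the example.

```python
import numpy as np

def mx_quantize(block, mantissa_bits=8):
    """Toy block floating point: one shared power-of-two scale per block,
    low-precision signed-integer mantissas per element (illustrative only)."""
    max_abs = np.max(np.abs(block))
    if max_abs == 0:
        return np.zeros_like(block)
    # Shared exponent chosen so the largest value fits in the mantissa range
    shared_exp = np.ceil(np.log2(max_abs))
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    mantissas = np.clip(np.round(block / scale),
                        -(2 ** (mantissa_bits - 1)),
                        2 ** (mantissa_bits - 1) - 1)
    return mantissas * scale  # dequantized approximation

vals = np.array([0.11, -0.52, 0.93, 0.004])
approx = mx_quantize(vals)
# Error stays small relative to the block's largest magnitude
assert np.max(np.abs(approx - vals)) < np.max(np.abs(vals)) / 64
```

The payoff is that only one scale is stored per block instead of a full exponent per element, which is why such formats improve inference efficiency at a given bandwidth.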

Corsair comes in an industry standard PCIe Gen 5 full height full length card form factor, with pairs of cards connected via DMX Bridge cards. Each Corsair card is powered by DIMC compute cores with 2400 TFLOPs of 8-bit peak compute, 2 GB of integrated Performance Memory, and up to 256 GB of off-chip Capacity Memory. The DIMC architecture delivers ultra-high memory bandwidth of 150 TB/s, significantly higher than HBM. Corsair delivers up to 10x faster interactive speed, 3x better performance per total cost of ownership (TCO), and 3x greater energy efficiency.
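A rough roofline check on the quoted figures shows why the bandwidth number is the headline: token generation is dominated by matrix-vector products, which perform only about two operations per weight byte read. This sketch assumes the peak numbers above are per-card and directly comparable; it is a back-of-envelope estimate, not a vendor benchmark.

```python
# Back-of-envelope roofline check using the per-card figures quoted above.
peak_ops = 2400e12   # 8-bit peak compute, ops/s
peak_bw = 150e12     # stated on-chip memory bandwidth, bytes/s

# Arithmetic intensity (ops per byte) above which the card is compute-bound
crossover = peak_ops / peak_bw
print(f"compute-bound above {crossover:.0f} ops/byte")  # 16 ops/byte

# A matrix-vector product reads each 8-bit weight once for one
# multiply-accumulate (~2 ops/byte), well below the crossover --
# so token generation speed is set by bandwidth, not peak TFLOPs.
gemv_intensity = 2.0
regime = "bandwidth" if gemv_intensity < crossover else "compute"
print(regime)  # bandwidth
```

Under these assumptions the 150 TB/s figure, rather than the 2400 TFLOPs figure, is what governs interactive token rates.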

"d-Matrix is at the forefront of a monumental shift in Gen AI as the first company to fully address the pain points of AI in the enterprise," said Michael Stewart, managing partner of M12, Microsoft's Venture Fund. "Built by a world-class team and introducing category-defining breakthroughs, d-Matrix's compute platform radically changes how enterprises access infrastructure for AI operations and enables them to incrementally scale out without the energy constraints and latency concerns that have held AI back from enterprise adoption. d-Matrix is democratizing access to the hardware needed to power AI in a standard form factor, making Gen AI finally attainable for everyone."


Availability of d-Matrix Corsair inference solutions
Corsair is sampling to early-access customers and will be broadly available in Q2 2025. d-Matrix is proud to be collaborating with OEMs and system integrators to bring Corsair-based solutions to market.

"We are excited to collaborate with d-Matrix on their Corsair ultra-high bandwidth in-memory compute solution, which is purpose-built for generative AI, and accelerate the adoption of sustainable AI computing," said Vik Malyala, Senior Vice President for Technology and AI, Supermicro. "Our high-performance end-to-end liquid- and air-cooled systems incorporating Corsair are ideal for next-level AI compute."

"Combining d-Matrix's Corsair PCIe card with GigaIO SuperNODE's industry-leading scale-up architecture creates a transformative solution for enterprises deploying next-generation AI inference at scale," said Alan Benjamin, CEO at GigaIO. "Our single-node server supports 64 or more Corsairs, delivering massive processing power and low-latency communication between cards. The Corsair SuperNODE eliminates complex multi-node configurations and simplifies deployment, enabling enterprises to quickly adapt to evolving AI workloads while significantly improving their TCO and operational efficiency."

"By integrating d-Matrix Corsair, Liqid enables unmatched capability, flexibility, and efficiency, overcoming traditional limitations to deliver exceptional inference performance. In the rapidly advancing AI landscape, we enable customers to meet stringent inference demands with Corsair's ultra-low latency solution," said Sumit Puri, Co-Founder at Liqid.

View at TechPowerUp Main Site | Source
 
Ok I understand that it runs over PCIe. I don't understand what this actually is though. A PCIe controller? I do understand that it is some kind of logic board.
 
Ok I understand that it runs over PCIe. I don't understand what this actually is though. A PCIe controller? I do understand that it is some kind of logic board.
All those press releases about memory w/ compute integrated -that's what this is.

(from what I gather) It's an MCM ASIC w/ die+package-integral massive cache, basically.
 
Ok I understand that it runs over PCIe. I don't understand what this actually is though. A PCIe controller? I do understand that it is some kind of logic board.
It's a huge memory-like block with inputs and outputs, where the outputs are stored in the memory itself. That means you don't need to move data back and forth between your memory and your compute die (as we currently do with GPUs/CPUs); you just feed an input into the memory block, and it performs the calculation where the data lives and keeps the result there, instead of having to iterate over all the data.
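That "compute where the data lives" idea can be sketched in a few lines. This is purely a conceptual model (the class and method names are invented for illustration), not how d-Matrix's silicon actually works: the point is that the weights stay resident and only the small input/output vectors cross the interface.

```python
import numpy as np

class InMemoryComputeBlock:
    """Conceptual model of digital in-memory compute (illustrative only):
    weights live inside the memory array, so a matrix-vector product
    happens where the data is stored instead of shuttling weights
    out to a separate compute die."""

    def __init__(self, weights):
        self.weights = weights  # stays resident; never moves off-chip

    def apply(self, x):
        # Only the small input and output vectors cross the interface;
        # the large weight matrix is read in place.
        return self.weights @ x

w = np.arange(6).reshape(2, 3)   # 2x3 "stored" weight matrix
block = InMemoryComputeBlock(w)
print(block.apply(np.ones(3)))   # [ 3. 12.]
```

On a GPU, by contrast, `w` would be streamed from HBM into the compute units on every pass, which is exactly the memory-bandwidth bottleneck the article describes.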
Here's a nice presentation on that topic:
 