Tuesday, August 27th 2024

FuriosaAI Unveils RNGD Power-Efficient AI Processor at Hot Chips 2024

Today at Hot Chips 2024, FuriosaAI is pulling back the curtain on RNGD (pronounced "Renegade"), our new AI accelerator designed for high-performance, highly efficient large language model (LLM) and multimodal model inference in data centers. As part of his Hot Chips presentation, Furiosa co-founder and CEO June Paik is sharing technical details and providing the first hands-on look at the fully functioning RNGD card.

With a TDP of 150 watts, a novel chip architecture, and advanced memory technology like HBM3, RNGD is optimized for inference with demanding LLMs and multimodal models. It's built to deliver high performance, power efficiency, and programmability all in a single product - a trifecta that the industry has struggled to achieve in GPUs and other AI chips.

A key milestone for RNGD
As industry experts will know, the process of squeezing every drop of performance from a chip takes many steps. Furiosa achieved a full bring-up of RNGD just weeks after obtaining the first silicon samples - an exceptionally rapid timeline in the chip industry. TSMC delivered the first RNGD chips in May, we booted the hardware less than a week later, and we were running industry-standard Llama 3.1 models in early June.

We started delivering the first RNGD silicon to early access customers in July and showed our first private demo last week. There's much more work to do before RNGD is running in data centers around the world, but we've reached an exciting milestone and we're pleased to be able to share these updates on our progress.

With more updates to come
Our priority now is refining our software stack as we ramp up RNGD production. This roadmap follows our successful track record with Furiosa's first-generation chip, introduced in 2021.

With our first-gen product, which targeted computer vision applications in data centers and edge server deployments, Furiosa submitted our first MLPerf benchmark results three weeks after receiving first silicon. We then used compiler enhancements to achieve a 113% performance increase in the next MLPerf submission six months later.

This is a typical path for new silicon. For example, six months after launching their powerful H100 chip and submitting it to MLPerf, NVIDIA announced 2.4x performance improvements achieved entirely through software improvements.

The process will be similar with RNGD. Right now, a single RNGD is generating about 12 queries per second when running the GPT-J 6B model, but we expect that number to increase as we refine our software stack over the coming weeks and months. We're also sharing RNGD target performance numbers on several LLMs:

Furiosa has deliberately kept a low profile until now, because we know the industry doesn't need more hype and bold promises about things that don't yet exist. (Also, Furiosa is 95% engineers, so marketing hasn't exactly been top of mind.)

Stay informed on the latest RNGD news
But Hot Chips is an exciting turning point for Furiosa and RNGD. If you come by our Hot Chips booth this week, you'll see we've brought a large engineering team to talk with anyone who is interested in our work. We're eager to hear what the AI community thinks of RNGD, what questions you have, and what you want to hear from us as we work to make the chip widely available in early 2025. We'll also showcase the first live demo of RNGD.

Stay tuned for more benchmark results, availability details, and other updates in the coming weeks and months.
Source: FuriosaAI

7 Comments on FuriosaAI Unveils RNGD Power-Efficient AI Processor at Hot Chips 2024

#1
azrael
<insert random Mad Max quip here> :p
#2
Daven
Regardless of AI, I'm happy to see a diversification of processor and co-processor architectures. Intel peaked at 478 systems in the top 500 supercomputer list in June 2019. A whopping 96%! Now we are seeing computing coming from all over the place instead of just one provider.
#3
JasBC
Ugly-ass Cybertruck typeface
#5
Tropick
That's some crazy ass power delivery circuitry for a 150W TDP :wtf:
#6
Jism
Tropick said: That's some crazy ass power delivery circuitry for a 150W TDP :wtf:
At least 17 phases, with an additional 4 for its HBM. I guess they want a peak perfect transient response - nothing should be left to its own in regards to compute.
#7
Wirko
Tropick said: That's some crazy ass power delivery circuitry for a 150W TDP :wtf:
There's also a crazy big processor die in there - unless they seriously blew it up in the render (it's either a photo or a superb render). Calculated from the known length of the PCIe connector (89 mm), the die is about 27 x 28 mm = 756 mm². Modern chips with the best cooling can reach up to about 1 W/mm².
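
As a rough sanity check on the numbers in this comment, here is a minimal Python sketch that divides the 150 W TDP quoted in the article by the die area estimated above. The 27 x 28 mm die size is the commenter's estimate from the card image, not an official figure, and the 1 W/mm² ceiling is likewise the commenter's rule of thumb. The constant and function names are made up for illustration.

```python
# Back-of-the-envelope power density from figures quoted in this thread.
# ASSUMPTIONS: the die dimensions are a commenter's estimate taken from the
# card image, and the ceiling is the commenter's rule of thumb, not a spec.

TDP_WATTS = 150.0          # card TDP stated in the article
DIE_WIDTH_MM = 27.0        # estimated, not official
DIE_HEIGHT_MM = 28.0       # estimated, not official
CEILING_W_PER_MM2 = 1.0    # "about 1 W/mm²" with the best cooling


def power_density(tdp_watts: float, width_mm: float, height_mm: float) -> float:
    """Return average power density in W/mm² over a rectangular die."""
    return tdp_watts / (width_mm * height_mm)


die_area_mm2 = DIE_WIDTH_MM * DIE_HEIGHT_MM
density = power_density(TDP_WATTS, DIE_WIDTH_MM, DIE_HEIGHT_MM)

print(f"Estimated die area: {die_area_mm2:.0f} mm²")
print(f"Average power density: {density:.2f} W/mm²")
print(f"Fraction of ceiling: {density / CEILING_W_PER_MM2:.0%}")
```

Under these assumptions the average works out to roughly 0.2 W/mm², well below the quoted ceiling. The TDP presumably covers the whole card, including the HBM, so the die's own share would be somewhat lower still.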