Wednesday, February 26th 2025

AheadComputing Introduces Breakthrough CPU Architecture for General-Purpose Computing, With Jim Keller on Board

AheadComputing today announced it has secured $21.5M in seed funding to rapidly develop and commercialize its breakthrough microprocessor architecture designed to meet the new, unique computing demands across AI, cloud, and edge devices. The funding was led by Eclipse, with participation from Maverick Capital, Fundomo, EPIQ Capital Group, LLC, and legendary CPU architect and current Tenstorrent CEO Jim Keller, who developed cutting-edge semiconductors for Apple, AMD, Tesla, and Intel.

Today, general-purpose computing faces unprecedented challenges due to the rapid expansion of AI and machine learning workloads. A recent report found that 82% of organizations experienced performance issues with their AI workloads over the past year, primarily due to bandwidth shortages and data-processing limitations. While specialized accelerators dominate headlines, they rely heavily on general-purpose processors for critical tasks before, after, and in between AI operations. Existing architectures have struggled to keep pace with these emerging workloads, creating a compute-performance bottleneck that affects industries from cloud to edge computing. AheadComputing addresses this critical gap by providing innovative solutions designed to transform how general-purpose computing meets modern demands.
AheadComputing was founded in 2024 by semiconductor industry veterans and former Intel CPU architects Debbie Marr, Jonathan Pearce, Mark Dechene, and Srikanth Srinivasan, who collectively have a century of experience identifying CPU bottlenecks, devising innovative ways to resolve them, and shipping those solutions in real products. They saw an opportunity to develop 64-bit RISC-V application processors that deliver breakthrough per-core performance.

The company aims to address the growing demand for general-purpose computing performance amid the rapid rise of AI applications. That demand is fueled in part by everyday AI applications with limited or low parallelism, which will run on general-purpose computing platforms. On top of that, programmers are increasingly relying on AI code generators for productivity, and the programs they generate are typically single-threaded or low-parallelism workloads that will also run on general-purpose platforms. AheadComputing's microarchitecture innovations will drive breakthrough performance improvements while optimizing power efficiency, making them a game-changer for server, client, mobile, and edge applications and justifying the move away from legacy computer architectures to RISC-V.

"The compute landscape is evolving rapidly, and AheadComputing is positioned to lead this transformation by delivering unprecedented performance in general-purpose processors," said Debbie Marr, CEO of AheadComputing. "With this funding, we will expand our world-class engineering team and accelerate the development of our core IP, enabling customers to meet their most demanding computing needs."

AheadComputing's approach focuses on overcoming the limitations of current architectures, addressing challenges such as per-core performance, thermal density constraints and multiprocessor scalability. The company's first products will provide industry-leading single-thread and multi-core performance, setting new benchmarks in the computing landscape.

"As esteemed former senior Intel CPU architects, AheadComputing's leadership team is uniquely equipped to solve the complex challenges facing today's computing industry," said Greg Reichow, Partner at Eclipse. "Their commitment to delivering the highest performance cores—while ensuring energy efficiency—will significantly impact multiple industries like mobile, industrial and networking."

The seed funding will be used to continue expanding its world-class team, advance microarchitecture development, and demonstrate the company's leadership in CPU core performance. The company has grown from four to 40 employees in just five months, underscoring the demand for its cutting-edge solutions and strong market traction.

AheadComputing is actively seeking strategic partners to bring its core technology to microprocessors and specialized logic chips across diverse sectors, including cloud computing, AI, and mobile devices.

For more information, visit aheadcomputing.com

23 Comments on AheadComputing Introduces Breakthrough CPU Architecture for General-Purpose Computing, With Jim Keller on Board

#1
ncrs
Tenstorrent's stuff isn't even out yet and Jim Keller's getting involved in yet another RISC-V revolutionary startup? I'm tempted to use the "2 more weeks" meme here.
Bandwidth has been the limiting factor in CPU performance for basically all of computing. The core microarchitecture isn't really that important in this aspect, it's everything around the core that is, from caches through interconnects to memory controllers. If anything RISC-V is the easiest part of the puzzle here. I'm eagerly awaiting details of what this new venture wants to do in memory subsystem design.
Posted on Reply
#2
hsew
ncrs: Tenstorrent's stuff isn't even out yet and Jim Keller's getting involved in yet another RISC-V revolutionary startup? I'm tempted to use the "2 more weeks" meme here.
Bandwidth has been the limiting factor in CPU performance for basically all of computing. The core microarchitecture isn't really that important in this aspect, it's everything around the core that is, from caches through interconnects to memory controllers. If anything RISC-V is the easiest part of the puzzle here. I'm eagerly awaiting details of what this new venture wants to do in memory subsystem design.
Both Intel’s Broadwell-C and AMD’s X3D, as well as HEDT in general, are proof that memory bandwidth isn’t everything. Some tasks are accelerated, sure, but at the end of the day, app speed is always going to come down to good coding, and by a wide margin.

The problem with balancing the memory subsystem is always going to be cost. More (or exotic) memory channels = more pins = more complexity = more cost. They have to be able to sell these chips, and most users are just browsing the web or watching Netflix.
Posted on Reply
#3
ncrs
hsew: Both Intel’s Broadwell-C and AMD’s X3D, as well as HEDT in general, are proof that memory bandwidth isn’t everything. Some tasks are accelerated, sure, but at the end of the day, app speed is always going to come down to good coding, and by a wide margin.
If bandwidth weren't everything for AI, which is mentioned nine times in this press release, then an individual B200 wouldn't be paired with a ~8 TB/s HBM stack.
"Good coding" is expensive and time-consuming, unfortunately pushing more hardware at a problem is often the solution of choice. The best would be a combination of both, but that rarely happens.
hsew: The problem with balancing the memory subsystem is always going to be cost. More (or exotic) memory channels = more pins = more complexity = more cost. They have to be able to sell these chips, and most users are just browsing the web or watching Netflix.
Of course, but AheadComputing is claiming to be overcoming typical limitations. Without more details it's impossible to judge what they mean.
Posted on Reply
#4
hsew
ncrs: If it wasn't everything for AI, mentioned 9 times in this press release, then individual B200 wouldn't be running on a ~8 TB/s HBM stack.
Oh, you know that’s just to hype up the investors :laugh:
Posted on Reply
#5
igormp
ncrs: Tenstorrent's stuff isn't even out yet
Wdym? They have products available for sale for quite some time now, and even cloud offerings of those.
Otherwise, yeah, RISC-V is just a minor detail, the memory subsystem is way more important.
hsew: Both Intel’s Broadwell-C and AMD’s X3D, as well as HEDT in general, are proof that memory bandwidth isn’t everything. Some tasks are accelerated, sure, but at the end of the day, app speed is always going to come down to good coding, and by a wide margin.

The problem with balancing the memory subsystem is always going to be cost. More (or exotic) memory channels = more pins = more complexity = more cost. They have to be able to sell these chips, and most users are just browsing the web or watching Netflix.
As said above, this is mostly meant for AI stuff, which is heavily bottlenecked by memory bandwidth.
Posted on Reply
#6
ncrs
igormp: Wdym? They have products available for sale for quite some time now, and even cloud offerings of those.
You're right, but what they offer is accelerators hosted by Xeons/EPYCs. They can't run general purpose code as far as I'm aware. TT-Ascalon is what I was referring to - a general purpose RISC-V CPU.
Posted on Reply
#7
igormp
ncrs: You're right, but what they offer is accelerators hosted by Xeons/EPYCs. They can't run general purpose code as far as I'm aware. TT-Ascalon is what I was referring to - a general purpose RISC-V CPU.
I don't think Tenstorrent's core product is meant to be that CPU; accelerators are what they're mostly focused on.
That RISC-V core will likely be used as a glorified controller within those accelerators, similar to Nvidia's Falcon.
Posted on Reply
#8
ncrs
igormp: I don't think Tenstorrent's core product is meant to be that CPU; accelerators are what they're mostly focused on.
That RISC-V core will likely be used as a glorified controller within those accelerators, similar to Nvidia's Falcon.
They do have a whole range of Ascalon variants on their site targeting every segment, just like AheadComputing claims their design will do.
Posted on Reply
#9
tnbw
ncrs: I'm eagerly awaiting details of what this new venture wants to do in memory subsystem design.
Their work on CPU caches while they were at Intel is well documented in patents. Much of it was innovations to increase the number of loads/stores per cycle from data cache to the core.
Posted on Reply
#10
ncrs
tnbw: Their work on CPU caches while they were at Intel is well documented in patents. Much of it was innovations to increase the number of loads/stores per cycle from data cache to the core.
Who owns the IP described in those patents? The employees, or Intel itself? In the latter case I doubt Intel will cheaply license them to a potential competitor.
Posted on Reply
#11
tnbw
ncrs: Who owns the IP described in those patents? Those employees or Intel itself? In the latter case I doubt Intel will cheaply license them to a potential competitor.
Intel owns those patents, but AheadComputing obviously won't do the exact same design as they worked on while at Intel. Considering that those architects are some of the top in their field, I'm sure that they have a plan with regards to navigating basic IP hurdles (plus every other similar startup has to deal with this issue too).
Posted on Reply
#12
ncrs
tnbw: Intel owns those patents, but AheadComputing obviously won't do the exact same design as they worked on while at Intel. Considering that those architects are some of the top in their field, I'm sure that they have a plan with regards to navigating basic IP hurdles (plus every other similar startup has to deal with this issue too).
So that brings us back to my original point of waiting for more details, since they can't do exactly what they've already done at Intel.
Posted on Reply
#13
tnbw
ncrs: So that brings us back to my original point of waiting on more details since they can't do exactly what they already done when at Intel.
Earlier you said that "AheadComputing is claiming to be overcoming typical limitations. Without more details it's impossible to judge what they mean." Going off of their work at Intel, it's pretty clear how they're likely planning to get around these "typical limitations": cache clustering. They patented one version of cache clustering while at Intel, but they can just modify it a bit and avoid patent infringement. The overall innovative ideas (and major benefits) remain the same.
Posted on Reply
#14
ncrs
tnbw: Earlier you said that "AheadComputing is claiming to be overcoming typical limitations. Without more details it's impossible to judge what they mean." Going off of their work at Intel, it's pretty clear how they're likely planning to get around these "typical limitations": cache clustering. They patented one version of cache clustering while at Intel, but they can just modify it a bit and avoid patent infringement. The overall innovative ideas (and major benefits) remain the same.
Replicating what they've already done at Intel in a product potentially 5+ years in the future (compare Tenstorrent, which started in 2020 and hasn't released its RISC-V CPUs yet) is not the "game-changer" claimed in this press release. What's more, cache is just one piece of the puzzle for a successful high-performance SoC design.
Don't get me wrong, I want them to succeed as much as I want to buy a great Tenstorrent RISC-V CPU, but making grandiose claims this early isn't a good look.
Posted on Reply
#15
ScaLibBDP
@ncrs
>>...AheadComputing is claiming to be overcoming typical limitations...

Let me remind everybody that Tachyum also claimed a lot, and where is it now?

All major data center and AI providers continue to buy NVIDIA hardware.

>>...develop 64-bit RISC-V application processors that deliver breakthrough per-core performance...

PS: Also, I have not seen any HPC Linpack benchmark results for RISC-V processors.
Posted on Reply
#16
tnbw
ncrs: Replicating what they've already done at Intel in a product potentially 5+ years in the future (comparison to Tenstorrent which started in 2020 and haven't released their RISC-V CPUs yet) is not a "game-changer" as they claimed in this press release. What is more cache is just a piece of the puzzle for a successful high performance SoC design.
Don't get me wrong, I want them to succeed as much as I want to buy a great Tenstorrent RISC-V CPU, but making grandiose claims so early isn't a good look.
Their work at Intel was never released. Have you heard of "Royal Core" before? AheadComputing's CEO, Debbie Marr, was its chief architect. They are sitting on tons of innovations that have never been seen before outside of academic papers. And it's not like they're going to stop coming up with new ideas just because they left Intel. Their competitive advantage will remain until someone designs a core similar to what they're designing (not happening anytime soon, I'd guess, especially in the RISC-V space).
Posted on Reply
#17
ncrs
tnbw: Their work at Intel was never released. Have you heard of "Royal Core" before? AheadComputing's CEO (Debbie Marr) was its chief architect. They are sitting on tons innovations that have never been seen before outside of academic papers. And it's not like they're going to stop coming up with new stuff just because they left Intel. Their competitive advantage will remain until someone designs a core that's similar to what they're designing (not happening anytime soon I'd guess, especially in the RISC-V space).
That makes me even less confident, especially given the rumors that Royal Core was ultimately cancelled for being physically too big (3-4 times the size of Zen 5 on Intel 20A).
ScaLibBDP: @ncrs
>>...AheadComputing is claiming to be overcoming typical limitations...

Let me remind to everybody that Tachyum also claimed a lot and where is it now?

All major Data Center and AI providers continue to buy NVIDIA hardware.

>>...develop 64-bit RISC-V application processors that deliver breakthrough per-core performance...

PS: Also, I have Not seen any HPC Linpack benchmark results for RISC-V processors.
While not specifically Linpack- or HPC-focused, there are three recent articles about RISC-V cores on Chips and Cheese.
Posted on Reply
#18
tnbw
ncrs: That makes me even less confident, especially with the rumors of Royal Core being cancelled in the end due to being physically too big (3-4 times the size of Zen5 at Intel 20A).
It wasn't exactly cancelled because of that, since the plan from the start was for it to be that big. If anything, it was cancelled due to its focus on single-threaded performance over PPA, at a time when Intel cared most about PPA. From AheadComputing's blog posts, it's clear that they are keeping their ST-perf-above-all attitude. Whether that's a good bet in the first place is a different question, but Jim Keller investing in them should add significant credibility to their vision.
Posted on Reply
#19
ScaLibBDP
ncrs: That makes me even less confident, especially with the rumors of Royal Core being cancelled in the end due to being physically too big (3-4 times the size of Zen5 at Intel 20A).


While not specifically Linpack or HPC focused there are 3 recent articles about RISC-V cores on Chips and Cheese:
In the 2nd article there is a chart with SPEC CPU2017 results, but unfortunately, in the HPC world the Linpack benchmark is the most widely used.

In the HPC world, computer scientists and software engineers always want to see values in FLOPS and don't care about numbers from a SPEC benchmark, because it is not a free benchmark.

RISC-V processor designers understand very well that Linpack tests would show most RISC-V processors failing to outperform Intel Pentium 4 processors. I don't even want to speak about legacy architectures like 3rd-gen Intel Ivy Bridge.

There is already an ARM-based supercomputer, Fugaku (which uses A64FX 48C 2.2 GHz processors), and I don't think a RISC-V supercomputer will be deployed any time soon.
Posted on Reply
#20
ncrs
ScaLibBDP: In the 2nd article there is a chart with SPEC CPU2017 results and, unfortunately, in HPC world the Linpack benchmark is the Most Widely used.

In HPC world Computer Scientists and Software Engineers always want to see values in FLOPs and don't care about numbers from a SPEC benchmark because it is Not a free benchmark.

RISC-V processor designers really understand that Linpack tests will show that most RISC-V processors fail to outperform Intel Pentium 4 processors. I even don't want to speak about legacy 3rd Gen Intel Ivy Bridge etc architectures.

There is already an ARM-based Fugaku supercomputer ( uses A64FX 48C 2.2GHz processors) and I don't think a RISC-V supercomputer will be deployed any time soon.
You're generalizing a bit about Linpack's importance in HPC. There are other metrics, but at the moment RISC-V is not excelling in any of them. There isn't a single RISC-V design that focuses on FPU operations, especially since the vector extensions aren't popular yet.
Fugaku is a special case, since the A64FX is the only ARM core implementing SVE at 512-bit width, and it has an unusually high number of vector units too. It was basically designed with FP in mind, at a time when GPUs weren't that critical for computation. There are other ARM-based supercomputers, like the 7th-ranked Alps built on NVIDIA Grace Hopper, but their performance comes chiefly from GPUs.
tnbw: It wasn't exactly cancelled because of that, since the plan from the start was for it to be that big. If anything, it was cancelled due to its focus on ST performance over PPA, at a time where Intel cared most about PPA. From AheadComputing's blog posts, its clear that they are still keeping their ST-perf-above-all attitude. Whether that's a good bet in the first place is a different question, but Jim Keller investing in them should add significant credibility to their vision.
We'll see in 5 years, at least, if they get enough funding in the first place. I wish them all the best since competition is great for consumers.
Posted on Reply
#21
ScaLibBDP
ncrs: You're generalizing a bit with Linpack's importance in HPC. There's other metrics, but at the moment RISC-V is not excelling in any of them. There isn't a single RISC-V design that focuses on FPU operations, especially since vector extensions aren't popular.
Fugaku is a special case since A64FX is the only ARM implementing SVE with 512-bit width, and it has unusually high number of them too. It was basically designed with FP in mind in times when GPUs weren't that critical for computations. There are other ARM-based supercomputers like the 7th Alps built on NVIDIA Grace Hopper, but their performance is derived chiefly from GPUs.


We'll see in 5 years, at least, if they get enough funding in the first place. I wish them all the best since competition is great for consumers.
>>...You're generalizing a bit with Linpack's importance in HPC...

No, I did not. Every tech segment has its own set of benchmarks, and when the most important one is not known, that is not good.

Look at gaming: the metric is FPS (frames per second). In HPC, the metric is FLOPS (floating-point operations per second). When it comes to a car, we don't measure its performance in kilograms per 100 km!

What I wanted to say is that a number from a SPEC benchmark, for example 16384, is actually useless for gaming fans and HPC professionals.
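For context, the Linpack (HPL) rating being contrasted with SPEC here is derived from a fixed operation count for the dense solve, (2/3)n³ + 2n² floating-point operations, divided by wall-clock time. A minimal sketch of that calculation (the function name and the example numbers are illustrative, not from any real run):

```python
# Sketch: how a Linpack/HPL run converts a timed dense solve into a FLOPS figure.
# Solving an n x n dense linear system takes the standard (2/3)n^3 + 2n^2 operations.

def hpl_gflops(n: int, seconds: float) -> float:
    """Return the GFLOPS rating for an n-equation HPL solve that took `seconds`."""
    flops = (2.0 / 3.0) * n**3 + 2.0 * n**2
    return flops / seconds / 1e9

# A hypothetical 10,000-equation solve finishing in 20 s:
print(round(hpl_gflops(10_000, 20.0), 1))  # → 33.3
```

Because the operation count is fixed by n, the rating depends only on how long the solve takes, which is what makes it such an easy apples-to-apples comparison across systems.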
Posted on Reply
#22
ncrs
ScaLibBDP: >>...You're generalizing a bit with Linpack's importance in HPC...

No, no. I did not. Every Tech-segment has its own set of benchmarks and when the most important is Not known it is Not good.

Look in Gaming, the metric is FPs ( frames per second ), in HPC the metric is FLOPs ( floating point operations per second ). When it comes to a car we don't measure its performance in Kgs-per-100kms!

What I wanted to say is that a number from a SPEC benchmark, for example 16384, is actually useless for Gaming fans and HPC professionals.
Not every HPC workload is focused on FLOPS, and believe me, I know. You can have a billion FP execution units in a CPU, but without the cache and memory subsystem to feed them, they will sit idle most of the time.
SPEC isn't useless either; it indicates general core performance, which matters as well.
Posted on Reply
#23
ScaLibBDP
ncrs: Not every HPC workload is focused on FLOPS, and believe me I know. You can have a billion FP execution units in a CPU but without the cache and memory subsystem to feed them they will be idle most of the time.
SPEC isn't useless either, as it indicates general core performance which matters as well.
>>...Not every HPC workload is focused on FLOPS, and believe me I know...

I'm talking about just one widely used benchmark, Linpack, because it makes it easy to compare different systems based on different CPUs. As a C/C++ software engineer I've been involved in RISC-V software engineering for almost three years (the work started in March 2022), and my primary complaint is that there is no consumer-ready system I would rate as good to buy, or compare to my primary development machine, a Dell Precision mobile workstation with an Intel Extreme Edition processor used for HPC-related projects.

Over the last three years I have watched all the YouTube videos uploaded to the RISC-V International channel, and I saw that most RISC-V companies do not proceed with tape-outs of their RISC-V designs, do 99.99% of their functional tests in QEMU only (!), and expect that somebody will buy their IP for a couple of million dollars.

Even Google complained about the lack of RISC-V hardware with vectorization support during the RISC-V 2024 Summit North America. You'll be smiling, but Google uses toy-like RISC-V SBCs from AliExpress.

There are no good RISC-V notebooks for the consumer market today. The DC ROMA RISC-V notebook from DeepComputing cannot be used for serious work. Their latest DIY version of the DC ROMA notebook is a step back, and I would describe it as a complete disaster.

In a situation like that, the best approach is to use as powerful an x86-based system as possible, with 96 GB or 128 GB of memory, for all functional tests in QEMU with 32, 48, or even 64 emulated RISC-V cores. This is exactly what I do, but unfortunately, when I execute the Linpack benchmark in that environment, the FLOPS number is "fake": it is directly related to the processing power of the Intel processor used for the emulation!

That's the biggest problem.

Are we going to buy any RISC-V SBCs for the project? No, it does not make sense at all!

Three Linux-for-RISC-V setups are used in QEMU: Ubuntu, Debian, and Fedora. It is easy to switch between setups in a matter of seconds. Everything works on x86-based systems like the Dell Precision mobile workstation, and that gets the job done.
Posted on Reply