Thursday, August 10th 2023
AMD "Strix Point" Company's First Hybrid Processor, 4P+8E ES Surfaces
Going beyond previous reports that AMD is increasing the CPU core count of its mobile monolithic processors from the present 8-core/16-thread to 12-core/24-thread, we are learning that the company's next-gen processor, codenamed "Strix Point," will in fact be its first hybrid processor. The chip is expected to feature two kinds of CPU cores, with "Zen 5" being the microarchitecture behind the performance cores, and "Zen 5c" behind the efficiency cores. An engineering sample featuring 4 P-cores and 8 E-cores has surfaced on the web, thanks to Performancedatabases. A HWiNFO screenshot reveals the engineering sample's core configuration of 4x P-cores and 8x E-cores, with identical L1 cache sizes. Things get a little fuzzy with the detection of the L2 and L3 cache sizes.
We know from the current "Zen 4c" core design that it is essentially a compacted version of "Zen 4" designed for higher-density chiplets that have 16 cores, and that it has both the same ISA and IPC as "Zen 4." The only differences are that "Zen 4c" cores have less shared L3 cache at their disposal, are generally configured with lower clock speeds, and are more energy-efficient than "Zen 4." "Zen 4c" cores are also 35% smaller in die-area than "Zen 4." The company could develop "Zen 5c" CPU cores with similar design goals.
The "Strix Point" silicon could hence have two CCXs (CPU core complexes): one with the larger "Zen 5" P-cores and a certain amount of L3 cache, and another with the smaller "Zen 5c" cores and their own L3 cache. This would essentially be similar to "Renoir," which has two 4-core CCXs of "Zen 2" cores. The L1 cache sizes for both kinds of cores are identical: 48 KB L1D and 32 KB L1I, and it's likely that both core types have 1 MB of dedicated L2 cache per core. The L3 cache sizes could vary between the two CCXs, with the P-core CCX having 16 MB (4 MB per core), and the E-core CCX 8 MB (1 MB per core).
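The per-core L3 figures follow from dividing each CCX's speculated L3 size by its core count. A minimal sketch of that arithmetic, assuming the rumored 4P+8E layout described above (none of these numbers are confirmed by AMD, and the names `CCX_LAYOUT` and `l3_per_core_mb` are made up for illustration):

```python
# Speculated "Strix Point" CCX layout as described in the article.
# These figures are rumors from an engineering sample, not AMD specifications.
CCX_LAYOUT = {
    "P-core CCX (Zen 5)": {"cores": 4, "l3_mb": 16},
    "E-core CCX (Zen 5c)": {"cores": 8, "l3_mb": 8},
}


def l3_per_core_mb(cores, l3_mb):
    """Return the share of a CCX's L3 cache per core, in MB."""
    return l3_mb / cores


def totals(layout):
    """Return (total core count, total L3 in MB) across all CCXs."""
    total_cores = sum(ccx["cores"] for ccx in layout.values())
    total_l3 = sum(ccx["l3_mb"] for ccx in layout.values())
    return total_cores, total_l3


if __name__ == "__main__":
    for name, ccx in CCX_LAYOUT.items():
        share = l3_per_core_mb(ccx["cores"], ccx["l3_mb"])
        print(f"{name}: {ccx['l3_mb']} MB L3, {share:g} MB per core")
    cores, l3 = totals(CCX_LAYOUT)
    print(f"Total: {cores} cores, {l3} MB L3")
```

Under these assumptions, each P-core gets four times the L3 share of an E-core, which is the kind of gap that already exists between "Zen 4" and "Zen 4c."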
It will be interesting to see how AMD handles the hybrid architecture from a software standpoint. Intel uses Thread Director, a hardware-based solution that's designed to send the right kind of compute workload to the right kind of CPU core. AMD could either try to develop its own version of Thread Director, or use a less sophisticated OS-based solution such as what it's doing with its multi-CCD client processors.
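To make the idea of a software-side policy concrete, here is a minimal sketch of steering work to one core type using CPU affinity on Linux. This is not how Thread Director or AMD's existing multi-CCD handling works. It is only a generic illustration. The logical-CPU numbering (`P_CORE_CPUS`, `E_CORE_CPUS`) is a hypothetical mapping for a 4P+8E part with SMT, and a real system would have to discover its topology at runtime.

```python
import os

# Hypothetical logical-CPU numbering for a 4P+8E chip with SMT (24 threads).
# Assumption for illustration only; real numbering is platform-specific.
P_CORE_CPUS = set(range(0, 8))    # 4 P-cores x 2 threads each
E_CORE_CPUS = set(range(8, 24))   # 8 E-cores x 2 threads each


def preferred_cpus(latency_sensitive, available):
    """Pick the CPU set for a task; fall back to all available CPUs."""
    wanted = P_CORE_CPUS if latency_sensitive else E_CORE_CPUS
    chosen = wanted & set(available)
    # If none of the wanted CPUs exist on this machine, do not restrict.
    return chosen or set(available)


def pin_current_process(latency_sensitive):
    """Apply the preference to the calling process (Linux only)."""
    if not hasattr(os, "sched_setaffinity"):
        return None  # affinity API not available on this platform
    available = os.sched_getaffinity(0)
    chosen = preferred_cpus(latency_sensitive, available)
    os.sched_setaffinity(0, chosen)
    return chosen
```

A background job could call `pin_current_process(latency_sensitive=False)` to stay on the smaller cores. A static policy like this knows nothing about what a thread is actually doing at any moment, which is the gap a hardware-assisted scheduler is meant to close.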
Sources:
Performancedatabases, IThome, VideoCardz
86 Comments on AMD "Strix Point" Company's First Hybrid Processor, 4P+8E ES Surfaces
The only difference is cache. Density is more manufacturing than core differences. Is a Celeron a different core than a Pentium due to cache differences? No, Intel gives them the same ‘cove’ codename. Cache differences have existed for a long time on the same cores for the sake of product differentiation.
An Intel p-core uses a ‘cove’ architecture. An Intel e-core uses a ‘mont’ architecture. Same goes for ARM SoCs. Different architectures for different on-chip cores. AMD c and non-c cores are the same. But that’s not a BIG.little hybrid design. That’s just little. We are arguing whether an AMD c and non-c mixture is really a hybrid design, as the cores are essentially the same. I’m arguing that AMD’s design is not BIG.little (two different core architectures) as Intel and ARM have defined it, while others say cache and clock differences classify as hybrid and therefore still require sophisticated thread managers.
Those are very common.
Yes, ARM meant to pair different architectures (but with the same features implemented, just at different performance targets) for big.LITTLE, but a lot of manufacturers used the same core implemented in different ways, so exactly like what AMD is doing here.
Regarding the clock, we have a Bergamo reaching 3.1 GHz even though it has 128c (it could be more, if not for TDP).
C-cores will definitely enable more flexible and diverse SKUs across mobility line-ups.
You probably misunderstood what I wrote.
Oh well, I learned something new.
So I guess you misunderstood my post. It's fine.
This has a result that's basically the same as big.little which is... different performance cores. It doesn't really matter if the core has the same IPC or not really.
- firstly, cheaper to produce since they have smaller surface area, but since it will have more cores, the die area will remain the same, so not really cheaper to begin with
- secondly, more expensive since you need to stack cache
- so such design will be more expensive than one with regular Zen 5 cores with no cache stacking
Which means that it could turn out to be faster than pure Zen 5 cores (due to more cores) in some workloads, while in others obviously not (due to lower clocks due to stacked cache), all the while being more expensive. Well, if your use case is one that benefits from such a configuration, you could still go for this kind of design even though it is more expensive; it could turn out to be cheaper per core.
But that's a BIG IF. And exactly a reason to have hybrids in the first place.
Besides being more expensive, another point is that the cache is stacked over the L3, effectively doubling it. But APUs only have half of the L3, so using 3D cache would only reach the same amount as desktop processors. Add it all up and you will see that such a product would make no sense.
To quote that source "(the 7440U) offers 4 cores (quad core) based on the Zend 4/Zen 4c architecture that supports hyperthreading (8 threads). The cores clock from 3 (base) up to 4.5 GHz (single core boost). The processor includes 4 MB L2 cache and 8 MB L3 cache. The chip is based on the smaller Phoenix2 series with two bigger Zen 4 cores and two smaller Zen 4c cores (with less cache)...".
You are trying to maximise power and cooling requirements of each chassis and hopefully either a decent hardware scheduler or improvements to OS based scheduling will get the most benefits from this.
If this comes to desktop I can see it being viable in areas like business CPUs/low-end CPUs with integrated GPUs. Lower power consumption with decent core counts. There's also the fact that currently the 3D cache uses vias in the L3 cache to connect between the substrate and the 3D V-Cache, so there is a requirement for the actual physical space the L3 takes up.
If I were to guess, I bet the C cores don’t boost as high, and might not exceed the “all core” rated speed.