Monday, September 11th 2023
Die-shot Suggests "Phoenix 2" is AMD's First Hybrid Processor
The 4 nm "Phoenix 2" monolithic APU silicon powering the lower end of AMD's Ryzen 7040-series mobile processors could very well be the company's first hybrid-core processor, even though the company doesn't advertise it as such. We first caught a whiff of "Phoenix 2" back in July, when it was described as a physically smaller chip than the regular "Phoenix." It was known to have just 6 CPU cores and a smaller iGPU with 4 RDNA3 compute units, compared to the 8 CPU cores and 12 compute units of the "Phoenix" silicon. At the time, the lack of 2 CPU cores and 8 CUs was known to be behind the significant reduction in die size from 178 mm² to 137 mm², but it turns out that there's a lot more to "Phoenix 2."
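For a sense of scale, the die-size figures quoted above can be turned into a percentage. This is only arithmetic on the two numbers in the article; the variable names are ours.

```python
# Die sizes as quoted in the article (mm²).
phoenix_mm2 = 178
phoenix2_mm2 = 137

# Absolute and relative reduction going from "Phoenix" to "Phoenix 2".
saved_mm2 = phoenix_mm2 - phoenix2_mm2
reduction_pct = saved_mm2 / phoenix_mm2 * 100

print(f"{saved_mm2} mm² smaller, a {reduction_pct:.1f}% reduction")
```

In other words, "Phoenix 2" is a little under a quarter smaller than the full "Phoenix" die.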
A die shot of "Phoenix 2" has emerged on the Chinese social media platform QQ, revealing two distinct kinds of CPU cores. There are six cores in all, but two of them appear larger than the other four. The obvious inference here is that the larger cores are "Zen 4," and the smaller ones are the compacted "Zen 4c." The "Zen 4c" core has the same core machinery as "Zen 4," but rearranged to take up less area on the die. The trade-off is that the "Zen 4c" core operates at lower voltages and lower clock speeds than the regular "Zen 4" cores. At the same clock speeds, both kinds of cores have an identical IPC. The two also have an identical ISA, so software threads migrating between the cores will not encounter runtime errors. Unlike Intel, which relies on Thread Director, AMD can use a less sophisticated software-based solution to ensure that the right kind of workload is allocated to the right kind of cores, and to prevent undesirable migration between the two kinds of cores. Unlike the hardware-based Thread Director, AMD's solution can be continually updated.
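To illustrate what a software-side core-allocation policy can look like in general, here is a minimal sketch using Linux CPU affinity from Python. It is not AMD's mechanism, which the article does not detail. The `FAST_CPUS` set, the `latency_sensitive` flag, and both function names are hypothetical, and the mapping of logical CPU IDs to "Zen 4" versus "Zen 4c" cores is an assumption that would have to be discovered on a real system.

```python
import os

# Hypothetical layout: logical CPU IDs assumed to belong to the two larger
# "Zen 4" cores (2 cores x 2 threads). Real IDs vary and must be discovered.
FAST_CPUS = {0, 1, 2, 3}


def choose_cpus(latency_sensitive: bool, available: set[int], fast: set[int]) -> set[int]:
    """Prefer fast cores for latency-sensitive work, compact cores otherwise."""
    preferred = (available & fast) if latency_sensitive else (available - fast)
    # Fall back to every available CPU rather than pinning to an empty set.
    return preferred or set(available)


def pin_current_process(latency_sensitive: bool) -> set[int]:
    """Pin the calling process to the chosen CPUs (Linux only) and return them."""
    available = set(os.sched_getaffinity(0))
    target = choose_cpus(latency_sensitive, available, FAST_CPUS)
    os.sched_setaffinity(0, target)
    return target
```

Because both core types share one ISA, a policy like this only affects performance, not correctness: a thread that lands on the "wrong" core still runs, just at a different clock speed.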
Sources:
HXL (Twitter), VideoCardz
62 Comments on Die-shot Suggests "Phoenix 2" is AMD's First Hybrid Processor
Shared L3 on 6 physical cores. That's a first.
The die area they spent on shader processors is really tiny.
Had they gone with 4 WGPs instead of just 2, I wonder if this chip would have been much better at 7-15W gaming than the full Phoenix, seeing as the latter seems to constantly struggle with power and memory bandwidth bottlenecks.
However for ultrabooks using office apps this chip might be a winner.
The 7540U has been available in the new ThinkPad P14s Gen 4 AMD for a couple of weeks now, and has long been rumored to be PHX-2. If this is true, and the 7540U can use either die, then there's a non-negligible practical difference depending on what you get, and no promises as to what you get? 1 WGP is 2 CUs. 4 CUs will forever be cheeks, this isn't the gaming monster you're looking for lol. 6 CU RDNA2 barely matched Vega. The 8 CU Ryzen 5 is so far looking pretty exciting, but given that Phoenix-2 exists now, who knows what sort of supply there actually is for the full 8 CU 7640U, or when and if it'll actually make an appearance.
Phoenix only runs into "power" issues if you are attempting to extract the maximum possible performance out of 12 CUs, which, as with all past APU families, is best done on desktop. Like the 680M, it still games more than admirably, with most of its performance available at 15-20W. The CPU cores' power budget is also just fine, so it's not like Intel, where getting down to only 2 P-cores in -U is legitimately mandatory to get into thinner ultrabooks. Which is why PHX-2 still seems like a solution to supply problems rather than any real need on the power/thermal side.
Since Vega it has never been very viable to make up for core config with clocks - a bigger iGPU is always better, even at lower clocks, unless there are major power issues (which isn't the case).
7640U might have a better chance of hitting that 10W target of yours. You could clock 4CU to the moon and it's still not going anywhere fast.
down with ecores!!! DOWN WITH THEM!!!!
:cry::cry::cry::cry:
I'm looking forward to seeing this in an affordable 15W Thin & Light.
Time will tell though. My guess is you shouldn't worry.
Zen4C is just a normal Zen4 core squashed down and unable to clock very high. For servers and laptops where the power envelope is the limiting factor, there's no point in having lots of full-fat Zen4 cores because when all of them are loaded they won't be able to run anywhere near their boost clocks anyway.
Zen4C is just acceptance that for these products there's no point wasting die area on having every single core capable of max turbo boost. Think of this Phoenix2 chip as a regular 6C/12T Ryzen5, but only 2C/4T are capable of high turbo clockspeeds. For a 15-28W laptop CPU, that's mostly already true, because the PPT limit means that multi-threaded workloads rarely get much above base clocks.
www.phoronix.com/review/amd-epyc-9754-bergamo
Are there any statements claiming c-cores or a hybrid design is coming for desktop?
There is no difference between P-cores and E-cores in Intel's client implementation, ISA-wise. The issue has been, and always will be, that of different performance targets for different cores, and this will be the exact same scenario, due to clock and cache differences.
Yes, technically the P-cores support AVX-512, but it is disabled/fused off, so it doesn't make a difference in terms of ISA.
The difference between AMD's and Intel's hybrid approaches is that Intel is using different architectures (Golden Cove/Gracemont) but AMD is not (all cores are Zen 4). It would be a different story if, say, AMD were combining Zen 2 and Zen 4 cores, but they are not.
You are comparing apples to oranges. Golden Cove and Gracemont can run the exact same code; there is no difference between them. Aside, ofc, from the fused-off/disabled extensions of Golden Cove like AVX-512, which is done to achieve the same thing.
You aren't making a point at all about why it makes a difference and how it causes issues, which is why you are drawing the wrong conclusions. To start with, what is the problem with E-cores vs P-cores? Why does the architecture make a difference in Gracemont vs Golden Cove?
It does because it ends up with a different performance target. If you were to simplify and say that Gracemont can execute something at 3 IPC @ 3 GHz and Golden Cove at 4 IPC @ 5 GHz, the end result really is that one is much faster than the other.
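That simplified comparison can be worked through as a quick calculation. This is a minimal sketch, assuming throughput is just IPC times clock speed and using the illustrative numbers from the comment above, which are not measured figures. The function name is ours.

```python
def instructions_per_second(ipc: float, clock_ghz: float) -> float:
    """Idealized throughput: instructions per cycle times cycles per second."""
    return ipc * clock_ghz * 1e9


# Illustrative numbers from the comment, not benchmarks.
gracemont = instructions_per_second(ipc=3, clock_ghz=3)
golden_cove = instructions_per_second(ipc=4, clock_ghz=5)

ratio = golden_cove / gracemont
print(f"Golden Cove / Gracemont: {ratio:.2f}x")
```

Under the same model, two cores with identical IPC (as the article describes for "Zen 4" and "Zen 4c") differ only by the ratio of their clock speeds.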
The problem then arises with scheduling tasks/threads onto the correct core. This is a big issue especially for games, which often want their frames to be done as soon as possible. If you push a high-priority thread like that onto an E-core, you will have a much more sluggish experience in the game than you would have with just the P-cores, and those might even be under-utilized.
Let me give a more relevant example. If AMD mixed Jaguar and Zen architectures they would be doing the same thing as Intel. Such a hybrid architecture would suck because while Jaguar cores are small, they are super slow by today’s standards.
Also, for your second point, the issue is just the performance then? One is fast, the other is slow.
How isn't that true for Zen 4 vs Zen 4c?
If the 2 Zen 4 cores are loaded and the program needs more cores, then the Zen 4c cores will be the bottleneck, just like Gracemont is for Golden Cove. Or there might be cases where the OS schedules tasks onto Zen 4c cores (which will boost to their highest clock) instead of the Zen 4 cores.
The physical implementation is different too, so who knows the effect of stuff like the different memory cell that they are using for Zen 4c.
www.semianalysis.com/p/zen-4c-amds-response-to-hyperscale