Monday, September 11th 2023

Die-shot Suggests "Phoenix 2" is AMD's First Hybrid Processor

The 4 nm "Phoenix 2" monolithic APU silicon powering the lower end of AMD's Ryzen 7040-series mobile processors could very well be the company's first hybrid-core processor, even though the company doesn't advertise it as such. We first caught a whiff of "Phoenix 2" back in July, when it was described as a physically smaller chip than the regular "Phoenix." It was known to have just 6 CPU cores and a smaller iGPU with 4 RDNA3 compute units, compared to the 8 CPU cores and 12 compute units of the "Phoenix" silicon. At the time, the loss of 2 CPU cores and 8 CUs was thought to account for the significant reduction in die size from 178 mm² to 137 mm², but it turns out that there's a lot more to "Phoenix 2."

A die shot of "Phoenix 2" emerged on Chinese social media platform QQ, revealing two distinct kinds of CPU cores. There are six cores in all, but two of them appear larger than the other four. The obvious inference is that the larger cores are "Zen 4," and the smaller ones are the compacted "Zen 4c." The "Zen 4c" core has the same core machinery as "Zen 4," but re-arranged to take up less area on the die. The trade-off is that the "Zen 4c" core operates at lower voltages and lower clock speeds than the regular "Zen 4" cores. At the same clock speed, both kinds of cores have identical IPC. The two also have an identical ISA, so software threads migrating between the cores will not encounter runtime errors from unsupported instructions. Unlike Intel, which relies on the hardware-based Thread Director, AMD can use a less sophisticated software-based solution to ensure that the right kind of workload is allocated to the right kind of cores, and to prevent undesirable migration between the two kinds of cores. Being software, AMD's solution can also be continually updated.
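To make the idea of a software-based allocation policy concrete, here is a minimal Python sketch. It is not AMD's or any operating system's actual scheduler: the logical CPU numbering in `TOPOLOGY`, the `preferred_cpus` policy, and the `apply_affinity` helper are all assumptions made up for illustration. The only real API used is Linux's `os.sched_getaffinity` / `os.sched_setaffinity`.

```python
import os

# Hypothetical layout for a 2x Zen 4 + 4x Zen 4c part with SMT enabled.
# Real logical CPU numbering depends on the OS and firmware.
TOPOLOGY = {
    "zen4": [0, 1, 2, 3],                  # 2 cores x 2 SMT threads
    "zen4c": [4, 5, 6, 7, 8, 9, 10, 11],   # 4 cores x 2 SMT threads
}


def preferred_cpus(latency_sensitive: bool, topology=TOPOLOGY) -> set[int]:
    """Assumed policy: latency-sensitive work goes to the high-clocking
    Zen 4 cores, background work goes to the compact Zen 4c cores."""
    kind = "zen4" if latency_sensitive else "zen4c"
    return set(topology[kind])


def apply_affinity(cpus: set[int]) -> bool:
    """Pin the current process to `cpus` where the platform allows it.

    Returns False (and changes nothing) on platforms without
    sched_setaffinity, or when none of the requested CPUs are usable.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    usable = cpus & os.sched_getaffinity(0)
    if not usable:
        return False
    os.sched_setaffinity(0, usable)
    return True
```

Because both core types run the same instructions, a policy like this only affects performance, never correctness, which is why it can live in software and be revised later.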
Sources: HXL (Twitter), VideoCardz

62 Comments on Die-shot Suggests "Phoenix 2" is AMD's First Hybrid Processor

#1
ToTTenTranz
Here are the labelled zones:



Shared L3 on 6 physical cores. That's a first.
The die area they spent on shader processors is really tiny.
Had they gone with 4 WGPs instead of just 2, I wonder if this chip would have been much better at 7-15W gaming than the full Phoenix, seeing as the latter seems to constantly struggle with power and memory bandwidth bottlenecks.

However for ultrabooks using office apps this chip might be a winner.
Posted on Reply
#2
tabascosauz
No word on what product this die shot is actually out of?

7540U has been available in the new Thinkpad P14s Gen 4 AMD for a couple of weeks now, and has long been rumored to be PHX-2. If this is true, and the 7540U can use either die, then there's a non-negligible practical difference depending on which one you get, with no promises as to which that is?
ToTTenTranzShared L3 on 6 physical cores. That's a first.
The die area they spent on shader processors is really tiny.
Had they gone with 4 WGPs instead of just 2, I wonder if this chip would have been much better at 7-15W gaming than the full Phoenix, seeing as the latter seems to constantly struggle with power and memory bandwidth bottlenecks.

However for ultrabooks using office apps this chip might be a winner.
1 WGP is 2CUs. 4CU will forever be cheeks, this isn't the gaming monster you're looking for lol. 6CU RDNA2 barely matched Vega. 8CU Ryzen 5 is so far looking to be pretty exciting, but given Phoenix-2 exists now, who knows what sort of supply there actually is for the full 8CU 7640U, when and if it'll actually make an appearance.

Phoenix is only running into "power" issues if you are attempting to extract max possible performance out of 12CU, which like all past APU families is best done when it comes to desktop. Like 680M it still games more than admirably with most of its performance at 15-20W. CPU cores power budget is also just fine, so it's not like Intel where getting down to only 2 P-cores in -U is legitimately mandatory to get into thinner ultrabooks. Which is why PHX-2 still seems like a solution to supply problems rather than any real need on the power/thermal side.

Since Vega it has never been very viable to make up for core config with clocks - a bigger iGPU is always better even at lower clocks, barring major power issues (which isn't the case here).
Posted on Reply
#3
konga
tabascosauzPhoenix is only running into "power" issues if you are attempting to extract max possible performance out of 12CU, which like all past APU families is best done when it comes to desktop. Like 680M it still games more than admirably with most of its performance at 15-20W. CPU cores power budget is also just fine, so it's not like Intel where getting down to only 2 P-cores in -U is legitimately mandatory to get into thinner ultrabooks. Which is why PHX-2 still seems like a solution to supply problems rather than any real need on the power/thermal side.
Phoenix has proven to be very inefficient below 15W. The Steam Deck performs better than the ROG Ally and other 7840U devices at 5 - 10W for instance. Efficiency scales in both directions, and there's always a sweet spot for each processor. The 7840U's is at around 20W. We're still waiting for a new APU whose sweet spot is around 10W.
Posted on Reply
#4
tabascosauz
kongaPhoenix has proven to be very inefficient below 15W. The Steam Deck performs better than the ROG Ally and other 7840U devices at 5 - 10W for instance. Efficiency scales in both directions, and there's always a sweet spot for each processor. The 7840U's is at around 20W. We're still waiting for a new APU whose sweet spot is around 10W.
15-20W is what even the thinnest Ryzen ultrabooks are capable of. I don't see how this is anything but a Z1 Extreme and ROG Ally problem, and that's not a device where 7840U, 7640U or Phoenix-2 is. ROG Ally has what appears to be plenty of optimization issues, and handheld form factor inherently limits the thermal and power envelope.

7640U might have a better chance of hitting that 10W target of yours. You could clock 4CU to the moon and it's still not going anywhere fast.
Posted on Reply
#6
P4-630
Space Lynx@P4-630 I'm fucked either way now mate... :(
This one is a mobile CPU but yeah they are coming for desktop too, it's the future.. :D
Posted on Reply
#7
Space Lynx
Astronaut
@lexluthermiester for context, I hate ecores with a passion. I saw total war warhammer utilize 81% in ecore and like 10% in pcore randomly at times, causing the game to have low fps dips once in awhile. 99% of games work fine, but I want ALL games to work fine.

down with ecores!!! DOWN WITH THEM!!!!


:cry::cry::cry::cry:
Posted on Reply
#8
Chrispy_
This is going to be an absolutely fantastic laptop chip. 2 "P cores" but 12 homogenous threads and a GPU that should be enough for everything outside of this generation's AAA gaming.

I'm looking forward to seeing this in an affordable 15W Thin & Light.
Posted on Reply
#9
lexluthermiester
Space Lynx@lexluthermiester for context, I hate ecores with a passion. I saw total war warhammer utilize 81% in ecore and like 10% in pcore randomly at times, causing the game to have low fps dips once in awhile. 99% of games work fine, but I want ALL games to work fine.

down with ecores!!! DOWN WITH THEM!!!!


:cry::cry::cry::cry:
You might be worrying a bit much. Intel's "E-core" situation is unlikely to translate over to AMD's side of things. The Ryzen ISA is different from Intel's offerings. While it's all still X86/X64, there are serious differences between them, which means that problems encountered with Intel's Big/Little designs will not automagically happen on AMD's version of Big/Little.

Time will tell though. My guess is you shouldn't worry.
Posted on Reply
#10
Chrispy_
lexluthermiesterYou might be worrying a bit much. Intel's "E-core" situation is unlikely to translate over to AMD's side of things. The Ryzen ISA is different from Intel's offerings. While it's all still X86/X64, there are serious differences between them, which means that problems encountered with Intel's Big/Little designs will not automagically happen on AMD's version of Big/Little.

Time will tell though. My guess is you shouldn't worry.
Yeah, these aren't E-cores like Intel's with different architecture.

Zen4C is just a normal Zen4 core squashed down and unable to clock very high. For servers and laptops where the power envelope is the limiting factor, there's no point in having lots of full-fat Zen4 cores because when all of them are loaded they won't be able to run anywhere near their boost clocks anyway.

Zen4C is just acceptance that for these products there's no point wasting die area on having every single core capable of max turbo boost. Think of this Phoenix2 chip as a regular 6C/12T Ryzen5, but only 2C/4T are capable of high turbo clockspeeds. For a 15-28W laptop CPU, that's mostly already true, because the PPT limit means that multi-threaded workloads rarely get much above base clocks.
Posted on Reply
#11
Daven
Space Lynx@lexluthermiester for context, I hate ecores with a passion. I saw total war warhammer utilize 81% in ecore and like 10% in pcore randomly at times, causing the game to have low fps dips once in awhile. 99% of games work fine, but I want ALL games to work fine.

down with ecores!!! DOWN WITH THEM!!!!


:cry::cry::cry::cry:
AMD doesn’t use e-cores so you are fine. Just avoid Intel processors.
Chrispy_Yeah, these aren't E-cores like Intel's with different architecture.

Zen4C is just a normal Zen4 core squashed down and unable to clock very high. For servers and laptops where the power envelope is the limiting factor, there's no point in having lots of full-fat Zen4 cores because when all of them are loaded they won't be able to run anywhere near their boost clocks anyway.

Zen4C is just acceptance that for these products there's no point wasting die area on having every single core capable of max turbo boost. Think of this Phoenix2 chip as a regular 6C/12T Ryzen5, but only 2C/4T are capable of high turbo clockspeeds. For a 15-28W laptop CPU, that's mostly already true, because the PPT limit means that multi-threaded workloads rarely get much above base clocks.
Very nice explanation. AMD c-cores is gonna be another one of those internet myths. Space Lynx even knows they aren’t e-cores but reflexively responds to these articles as if they are.
Posted on Reply
#12
Unregistered
Space Lynx@lexluthermiester for context, I hate ecores with a passion. I saw total war warhammer utilize 81% in ecore and like 10% in pcore randomly at times, causing the game to have low fps dips once in awhile. 99% of games work fine, but I want ALL games to work fine.

down with ecores!!! DOWN WITH THEM!!!!


:cry::cry::cry::cry:
c-cores are not like intel e-cores

www.phoronix.com/review/amd-epyc-9754-bergamo
P4-630This one is a mobile CPU but yeah they are coming for desktop too, it's the future.. :D
are there any statements claiming c-cores or hybrid design is coming for desktop?
#13
napata
lexluthermiesterYou might be worrying a bit much. Intel's "E-core" situation is unlikely to translate over to AMD's side of things. The Ryzen ISA is different from Intel's offerings. While it's all still X86/X64, there are serious differences between them, which means that problems encountered with Intel's Big/Little designs will not automagically happen on AMD's version of Big/Little.

Time will tell though. My guess is you shouldn't worry.
In practice Intel's "E-core" situation already exists with the 7900X3D & 7950X3D: faster and slower cores where performance can nosedive if the wrong scheduling happens.
Posted on Reply
#14
Daven
napataIn practice Intel's "E-core" situation already exists with the 7900X3D & 7950X3D: faster and slower cores where performance can nosedive if the wrong scheduling happens.
We’ve had situations like this ever since chiplets, favored cores, etc. CPU topologies are getting increasingly complicated since the days of a single, high clocked core are long gone due to process and physics limitations.
Posted on Reply
#15
JustBenching
lexluthermiesterYou might be worrying a bit much. Intel's "E-core" situation is unlikely to translate over to AMD's side of things. The Ryzen ISA is different from Intel's offerings. While it's all still X86/X64, there are serious differences between them, which means that problems encountered with Intel's Big/Little designs will not automagically happen on AMD's version of Big/Little.

Time will tell though. My guess is you shouldn't worry.
And how is this different in practice? If a workload decides to load the C core instead of the full fat core, the end result is the same
Posted on Reply
#16
persondb
lexluthermiesterYou might be worrying a bit much. Intel's "E-core" situation is unlikely to translate over to AMD's side of things. The Ryzen ISA is different from Intel's offerings. While it's all still X86/X64, there are serious differences between them, which means that problems encountered with Intel's Big/Little designs will not automagically happen on AMD's version of Big/Little.

Time will tell though. My guess is you shouldn't worry.
Differences such as...?

There is no difference between P-cores and E-cores in Intel's client implementation, ISA-wise. The issue has always been, and will always be, that different cores have different performance targets, and this will be the exact same scenario, due to clock and cache differences.

Yes, technically the P-cores support AVX-512, but it is disabled/fused off, so it doesn't make a difference in terms of ISA.
Posted on Reply
#17
Daven
persondbDifferences such as...?

There is no difference between P-cores and E-cores in Intel's client implementation, ISA-wise. The issue has always been, and will always be, that different cores have different performance targets, and this will be the exact same scenario, due to clock and cache differences.

Yes, technically the P-cores support AVX-512, but it is disabled/fused off, so it doesn't make a difference in terms of ISA.
The 8086 processor released in the 1970s has the same ISA as Intel and AMD processors today. However if Intel was using 8086 processors as e-cores you would be crying into your soup.

The difference between AMDs and Intels hybrid approach is that Intel is using different architectures (Golden Cove/Gracemont) but AMD is not (all cores Zen 4). It would be a different story if say AMD was combining Zen 2 and Zen 4 cores but they are not.
Posted on Reply
#18
JustBenching
DavenThe 8086 processor released in the 1970s has the same ISA as Intel and AMD processors today. However if Intel was using 8086 processors as e-cores you would be crying into your soup.

The difference between AMDs and Intels hybrid approach is that Intel is using different architectures (Golden Cove/Gracemont) but AMD is not (all cores Zen 4). It would be a different story if say AMD was combining Zen 2 and Zen 4 cores but they are not.
But what is the practical difference? If a process goes to the 4c core, you are screwed
Space Lynx@lexluthermiester for context, I hate ecores with a passion. I saw total war warhammer utilize 81% in ecore and like 10% in pcore randomly at times, causing the game to have low fps dips once in awhile. 99% of games work fine, but I want ALL games to work fine.

down with ecores!!! DOWN WITH THEM!!!!


:cry::cry::cry::cry:
That's not a problem with the ecores, the only way this happens is if the game decided for some reason to ignore the thread director and do its own thing.
Posted on Reply
#19
persondb
DavenThe 8086 processor released in the 1970s has the same ISA as Intel and AMD processors today. However if Intel was using 8086 processors as e-cores you would be crying into your soup.

The difference between AMDs and Intels hybrid approach is that Intel is using different architectures (Golden Cove/Gracemont) but AMD is not (all cores Zen 4). It would be a different story if say AMD was combining Zen 2 and Zen 4 cores but they are not.
Different architectures that implement the same ISA. Of course, the 8086 wouldn't work because it's not the same ISA...

You are comparing apples to oranges. Golden Cove and Gracemont can run the exact same code; there is no difference between them. Aside, ofc, from the fused-off/disabled extensions of Golden Cove like AVX-512, which are disabled to achieve exactly that.

You aren't making a point at all on why it makes a difference and how it causes issues, which is why you are drawing the wrong conclusions. To start with, what is the problem with E-cores vs P-cores? Why does the architecture make a difference in Gracemont vs Golden Cove?

It does because it ends up with a different performance target. If you were to simplify and say that Gracemont can execute something at 3 IPC @ 3 GHz and Golden Cove at 4 IPC @ 5 GHz, the end result really is that one is much faster than the other.

The problem then arises with scheduling tasks/threads onto the correct core. This is a big issue especially for games, which often want their frames done as soon as possible. If you push a high-priority thread like that onto an E-core, you will have a much more sluggish experience in the game than you would otherwise have with just the P-cores, and those might even be under-utilized.
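
As a rough worked example of the simplified numbers above, here is a short Python sketch. The IPC and clock figures are the illustrative ones from this post, not measurements, and the 100-million-instruction frame workload is a made-up figure chosen only to make the arithmetic visible.

```python
def instr_per_second(ipc: float, clock_ghz: float) -> float:
    """Peak instructions per second for a core with the given IPC and clock."""
    return ipc * clock_ghz * 1e9


# Simplified figures from the post, not real benchmark data.
gracemont = instr_per_second(ipc=3, clock_ghz=3)
golden_cove = instr_per_second(ipc=4, clock_ghz=5)
speedup = golden_cove / gracemont

# Hypothetical per-frame workload for one critical game thread.
FRAME_INSTRUCTIONS = 100e6


def frame_time_ms(rate: float) -> float:
    """Milliseconds needed to retire FRAME_INSTRUCTIONS at `rate` instr/s."""
    return FRAME_INSTRUCTIONS / rate * 1e3


print(f"Golden Cove is {speedup:.2f}x faster in this model")
print(f"P-core frame work: {frame_time_ms(golden_cove):.1f} ms")
print(f"E-core frame work: {frame_time_ms(gracemont):.1f} ms")
```

In this toy model the same thread takes roughly 5 ms on the faster core and roughly 11 ms on the slower one, so landing it on the wrong core more than doubles its share of the frame time.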
Posted on Reply
#20
Daven
persondbDifferent architectures that implement the same ISA. Of course, the 8086 wouldn't work because it's not the same ISA...

You are comparing apples to oranges. Golden Cove and Gracemont can run the exact same code; there is no difference between them. Aside, ofc, from the fused-off/disabled extensions of Golden Cove like AVX-512, which are disabled to achieve exactly that.

You aren't making a point at all on why it makes a difference and how it causes issues, which is why you are drawing the wrong conclusions. To start with, what is the problem with E-cores vs P-cores? Why does the architecture make a difference in Gracemont vs Golden Cove?

It does because it ends up with a different performance target. If you were to simplify and say that Gracemont can execute something at 3 IPC @ 3 GHz and Golden Cove at 4 IPC @ 5 GHz, the end result really is that one is much faster than the other.

The problem then arises with scheduling tasks/threads onto the correct core. This is a big issue especially for games, which often want their frames done as soon as possible. If you push a high-priority thread like that onto an E-core, you will have a much more sluggish experience in the game than you would otherwise have with just the P-cores, and those might even be under-utilized.
8086 uses the same ISA as today’s AMD and Intel processors. Its called x86. There have been different implementations of the x86 ISA over the years. We call those architectures.

Let me give a more relevant example. If AMD mixed Jaguar and Zen architectures they would be doing the same thing as Intel. Such a hybrid architecture would suck because while Jaguar cores are small, they are super slow by today’s standards.
Posted on Reply
#21
persondb
Daven8086 uses the same ISA as today’s AMD and Intel processors. Its called x86. There have been different implementations of the x86 ISA over the years. We call those architectures.

Let me give a more relevant example. If AMD mixed Jaguar and Zen architectures they would be doing the same thing as Intel. Such a hybrid architecture would suck because while Jaguar cores are small, they are super slow by today’s standards.
It doesn't use the same ISA at all. The current ISA is x86-64 (with its many, many extensions), which is a superset of the one implemented by the 8086. A modern CPU starts in Real Mode (the 16-bit mode akin to the 8086), but it usually gets configured out of it and into the modern modes. It's really not the same at all; one big difference is just how addressing is done (think of the weird memory segmentation stuff in the 286).

Also, for your second point, the issue is just the performance then? One is fast, the other is slow.
How isn't that true for Zen 4 vs Zen 4c?
Posted on Reply
#22
AnotherReader
Daven8086 uses the same ISA as today’s AMD and Intel processors. Its called x86. There have been different implementations of the x86 ISA over the years. We call those architectures.

Let me give a more relevant example. If AMD mixed Jaguar and Zen architectures they would be doing the same thing as Intel. Such a hybrid architecture would suck because while Jaguar cores are small, they are super slow by today’s standards.
Using smaller cores for low power scenarios would be good, but they would have to be much smaller and more power efficient than Gracemont. Apple's excellent cores have shown that in power constrained scenarios, being able to schedule background threads on wimpy cores is a win. Laptops are the right home for mixing these low power cores with a few big cores. Intel, on the other hand, uses E cores for increasing multi threaded performance.
Posted on Reply
#23
Squared
fevgatosAnd how is this different in practice? If a workload decides to load the C core instead of the full fat core, the end result is the same
What's different is that a Zen 4c core will behave exactly like a Zen 4 core when at the same clock speed (except maybe for cache, I'm not sure). So if the Zen 4 cores are loaded, a third Zen 4 core couldn't boost enough to outperform a Zen 4c core. So there shouldn't be any difference in performance. Now Windows does have to schedule work to the Zen 4 cores first, but that's trivial. Remember that because of simultaneous multi-threading, this processor presents 12 cores to Windows, and Windows has to choose just one thread on each core until it gets to 7 threads. I don't hear concern about how that's working.
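
The placement order described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Windows actually schedules: the `LogicalCpu` type, the core numbering, and the `placement_order` policy (one thread per physical core first, Zen 4 before Zen 4c, SMT siblings last) are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalCpu:
    core: int   # physical core index (hypothetical numbering)
    smt: int    # 0 = first hardware thread on the core, 1 = its SMT sibling
    kind: str   # "zen4" or "zen4c"


def build_topology() -> list[LogicalCpu]:
    """6 physical cores, 2 SMT threads each: cores 0-1 Zen 4, cores 2-5 Zen 4c."""
    cpus = []
    for core in range(6):
        kind = "zen4" if core < 2 else "zen4c"
        for smt in range(2):
            cpus.append(LogicalCpu(core, smt, kind))
    return cpus


def placement_order(cpus: list[LogicalCpu]) -> list[LogicalCpu]:
    """Order in which busy threads would be placed under the assumed policy:
    fill one hardware thread per physical core (Zen 4 cores first),
    and only then start doubling up on SMT siblings."""
    return sorted(cpus, key=lambda c: (c.smt, c.kind != "zen4", c.core))


order = placement_order(build_topology())
for n, cpu in enumerate(order, start=1):
    print(f"thread {n:2d} -> core {cpu.core} ({cpu.kind}), SMT slot {cpu.smt}")
```

Under this policy the first six threads each get a physical core to themselves, and only the seventh has to share a core with another thread.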
Posted on Reply
#24
persondb
SquaredWhat's different is that a Zen 4c core will behave exactly like a Zen 4 core when at the same clock speed (except maybe for cache, I'm not sure). So if the Zen 4 cores are loaded, a third Zen 4 core couldn't boost enough to outperform a Zen 4c core. So there shouldn't be any difference in performance. Now Windows does have to schedule work to the Zen 4 cores first, but that's trivial. Remember that because of simultaneous multi-threading, this processor presents 12 cores to Windows, and Windows has to choose just one thread on each core until it gets to 7 threads. I don't hear concern about how that's working.
Notice the same clock speed. This isn't going to have the same clock speed at all, if AMD could do a core that was half the size and had roughly the same clock speed, they would just do that...

If the 2 Zen 4 cores are loaded and the program needs more cores, then the Zen 4c cores will be the bottleneck, just like Gracemont is for Golden Cove. Or there might be cases where the OS schedules tasks to Zen 4c cores (which will boost to their highest clock) instead of the Zen 4 cores.

The physical implementation is different too, so who knows the effect of things like the different memory cell they are using for Zen 4c.

www.semianalysis.com/p/zen-4c-amds-response-to-hyperscale
Posted on Reply
#25
JustBenching
SquaredWhat's different is that a Zen 4c core will behave exactly like a Zen 4 core when at the same clock speed (except maybe for cache, I'm not sure). So if the Zen 4 cores are loaded, a third Zen 4 core couldn't boost enough to outperform a Zen 4c core. So there shouldn't be any difference in performance. Now Windows does have to schedule work to the Zen 4 cores first, but that's trivial. Remember that because of simultaneous multi-threading, this processor presents 12 cores to Windows, and Windows has to choose just one thread on each core until it gets to 7 threads. I don't hear concern about how that's working.
If the Zen 4c core would perform as well as the full fat core then there wouldn't be any full fat cores. Obviously that is not the case, zen 4c will be slower so it will have the same "issues" ecores do.
Posted on Reply