Thursday, May 30th 2024

Intel's "Skymont" E-core Posts a Double-digit IPC Gain Over "Crestmont": Leaked Presentation

Amid all the attention the next-generation "Lion Cove" P-cores powering the upcoming "Lunar Lake" and "Arrow Lake" microarchitectures get as they compete with AMD's "Zen 5," it's easy to lose sight of the next-generation "Skymont" E-cores that will feature in both upcoming Intel microarchitectures, and as standalone cores in the "Twin Lake" low-power processor. Pictures from an Intel presentation, possibly to PC OEMs, have leaked to the web. These are just thumbnails, so we can't see the whole slides, but the person who took the pictures captioned them in a now-deleted post on the Chinese microblogging platform Weibo.

And now, the big reveal: the "Skymont" E-core is said to offer a double-digit IPC gain over the "Crestmont" E-core powering the current "Meteor Lake" processor, which itself posted a roughly 4% IPC gain over the "Gracemont" E-cores found in the "Raptor Lake" and "Alder Lake" microarchitectures. The combined IPC gain over "Gracemont" should make the "Skymont" E-core match the IPC of the "Sunny Cove" or "Willow Cove" P-cores powering the "Ice Lake" and "Tiger Lake" microarchitectures, respectively, which were both within the 90th percentile of the AMD "Zen 3" core in IPC.
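
As a rough illustration of how these generational gains stack, the sketch below compounds the roughly 4% "Crestmont" gain with a few hypothetical double-digit values for "Skymont." The 10%, 15%, and 20% figures and the function name are assumptions chosen for illustration; the leak does not state an exact number.

```python
def compound_ipc_gain(*gains):
    """Combine successive generational IPC gains (given as fractions,
    e.g. 0.04 for 4%) into a single cumulative gain."""
    total = 1.0
    for gain in gains:
        total *= 1.0 + gain  # gains multiply, they do not simply add
    return total - 1.0


# From the article: "Crestmont" is roughly 4% ahead of "Gracemont".
crestmont_over_gracemont = 0.04

# Hypothetical values for the unspecified "double-digit" Skymont gain.
for skymont_over_crestmont in (0.10, 0.15, 0.20):
    cumulative = compound_ipc_gain(crestmont_over_gracemont, skymont_over_crestmont)
    print(f"{skymont_over_crestmont:.0%} over Crestmont -> "
          f"{cumulative:.1%} over Gracemont")
```

Because the gains multiply, the cumulative figure over "Gracemont" is slightly larger than the sum of the two percentages.
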
Intel is achieving this double-digit IPC gain over "Crestmont" through an improved branch prediction unit, a broader 9-wide decode unit (up from the 6-wide decode unit of "Crestmont"), an 8-wide integer ALU (up from 4 integer ALUs on its predecessor), a dependency optimization in the out-of-order engine, and deeper queuing across the engine. The E-cores might still be arranged in clusters that share an L2 cache among a certain number of cores.
Source: HXL (Twitter)

27 Comments on Intel's "Skymont" E-core Posts a Double-digit IPC Gain Over "Crestmont": Leaked Presentation

#1
Daven
Arrow Lake…

Improved iGPU…check
Improved AI…check
Improved E-cores…check
Improved P-cores…syntax error…
…recompiling…

Lower P core clocks…check
Less threads…check
Lower P-core IPC…………………………
#2
phanbuey
Daven: Arrow Lake…

Improved iGPU…check
Improved AI…check
Improved E-cores…check
Improved P-cores…syntax error…
…recompiling…

Lower P core clocks…check
Less threads…check
Lower P-core IPC…………………………
Lower P core IPC? I think you're thinking of Meteor Lake.
#3
Daven
phanbuey: Lower P core IPC? I think you're thinking of Meteor Lake.
I’m thinking Lion Cove = Redwood Cove

But I left the question open.
#4
Nanochip
Daven: Arrow Lake…

Improved iGPU…check
Improved AI…check
Improved E-cores…check
Improved P-cores…syntax error…
…recompiling…

Lower P core clocks…check
Less threads…check
Lower P-core IPC…………………………
Lion Cove is in Arrow Lake, vs. Redwood Cove in Meteor Lake.
#5
phanbuey
Daven: I’m thinking Lion Cove = Redwood Cove

But I left the question open.
I see....



That would be wild if true. Also a market share killer.
#6
Daven
Nanochip: Lion Cove is in Arrow Lake, vs. Redwood Cove in Meteor Lake.
Ask yourself this. Why doesn’t the leaked presentation sing about a high IPC increase for the P-cores from the rooftops? Answer: Because there is no IPC increase in the P-cores to sing about.
#7
Nanochip
Daven: Ask yourself this. Why doesn’t the leaked presentation sing about a high IPC increase for the P-cores from the rooftops? Answer: Because there is no IPC increase in the P-cores to sing about.
I find that claim (of no IPC increase) to be dubious. We will see.

Given that Lion Cove will be fabbed on a more advanced node (than Raptor Lake), I also expect much better power efficiency. Intel needs to get its power consumption in check.
#8
P4-630
Computex 2024 is next week; we'll probably know more then.
#9
Darmok N Jalad
Daven: Ask yourself this. Why doesn’t the leaked presentation sing about a high IPC increase for the P-cores from the rooftops? Answer: Because there is no IPC increase in the P-cores to sing about.
Perhaps there is an IPC increase on the P cores, but they can't clock them high enough to show an actual performance uplift over previous generations. Intel has kinda painted themselves into a corner by pushing Raptor Lake so hard, when it appears that Meteor Lake can't meet those types of targets, and maybe Arrow Lake can't either.
#10
usiname
Daven: I’m thinking Lion Cove = Redwood Cove

But I left the question open.
They won't be the same core, because Redwood Cove (Meteor Lake) has HT, while Lion Cove does not.
#11
DavidC1
This is exciting.

What I can make out:
-Flexible & Scalable: Shows Lunar Lake and Arrow Lake
-Increased IPC Gains: Left is likely Int and right is FP. FP gains are greater. Integer gains seem to be not 1.1n x but 1.2n x or 1.3n x.
-Two graphs of "something" Power and Performance compared to another core. Guessing one is for performance and the other is for power (Arrow Lake and Lunar Lake). The performance graph shows 2x performance with more power. At the same power it seems to show 1.5x or 1.7x. It also seems to say "1/3 power" at the same performance. The low-power graph shows massive gains (2.4x at the same power?), probably compared to Crestmont LP? It scales to 4x or 5x compared to Crestmont LP at higher power.
-Skymont uArch Goals
-"something" Decode: 9-wide(3x3), "Nanocache"?
-"something" Predict: 128 bytes, Faster
-"something" OoOE Engine: 8-wide and 16-wide, Dependency "streaming"?
-Deeper Queueing with More Resources

Predictions: 10% faster in Int and a little bit behind in FP compared to Golden Cove.
#12
TumbleGeorge
Nanochip: I also expect much better power efficiency.
Define "much" in Intel's case. If they had succeeded in achieving something substantial, it would already have been announced in big letters and numbers all over the internet.
#13
Darmok N Jalad
TumbleGeorge: Define "much" in Intel's case. If they had succeeded in achieving something substantial, it would already have been announced in big letters and numbers all over the internet.
Maybe only in a leak. Right now Intel is still trying to sell RLR and ML products. If they play their hand too soon, then they just hurt current sales. I could only see them tipping their hand sooner if Snapdragon X is actually a success.
#14
G777
Darmok N Jalad: Perhaps there is an IPC increase on the P cores, but they can't clock them high enough to show an actual performance uplift over previous generations. Intel has kinda painted themselves into a corner by pushing Raptor Lake so hard, when it appears that Meteor Lake can't meet those types of targets, and maybe Arrow Lake can't either.
It's this. There were leaks stating that there would be single-digit uplift in absolute single-thread performance from Raptor Lake to Arrow Lake, but that's comparing 6+ GHz Raptor Cove with ~5.5GHz Lion Cove, so there should be an IPC increase.

This should bode well for the efficiency of lower clocked parts.
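
To put rough numbers on that reasoning: if single-thread performance is approximated as IPC × clock (a simplification), the implied IPC change can be backed out from the performance and clock ratios. The 5% uplift below is a hypothetical stand-in for "single-digit," the clocks are the approximate 6.0 GHz and 5.5 GHz figures mentioned above, and the function name is made up for illustration.

```python
def implied_ipc_ratio(perf_ratio, old_clock_ghz, new_clock_ghz):
    """Back out the IPC ratio (new / old), assuming perf ~= IPC * clock.

    perf_ratio is new single-thread performance divided by old.
    """
    return perf_ratio * old_clock_ghz / new_clock_ghz


# Assumed inputs: ~6.0 GHz Raptor Cove vs ~5.5 GHz Lion Cove.
old_clock, new_clock = 6.0, 5.5

# Hypothetical 5% uplift in absolute single-thread performance.
ratio = implied_ipc_ratio(1.05, old_clock, new_clock)
print(f"Implied IPC gain at +5% performance: {ratio - 1:.1%}")

# Even with zero net performance uplift, the lower clock implies an IPC gain.
flat = implied_ipc_ratio(1.00, old_clock, new_clock)
print(f"Implied IPC gain at equal performance: {flat - 1:.1%}")
```

Real chips do not scale perfectly with clock speed, so this is only a back-of-the-envelope estimate.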
#15
JohH
9 wide decode?
Isn't Zen 4 only 4 wide?
#16
phanbuey
G777: It's this. There were leaks stating that there would be single-digit uplift in absolute single-thread performance from Raptor Lake to Arrow Lake, but that's comparing 6+ GHz Raptor Cove with ~5.5GHz Lion Cove, so there should be an IPC increase.

This should bode well for the efficiency of lower clocked parts.
If that's true then prepare for the $700 9800X3D
#17
watzupken
To be honest, the news doesn't excite me. The problem is that other companies are busy trying to deliver higher IPC and efficiency on their performance cores, while Intel is busy improving its efficiency cores, which, under some loads, may not be used. And I would prefer Intel rename their "efficient" cores to something like companion cores, because it is very clear that the main intent is to make up for multi-core numbers and less for efficiency. So much so that they had to further bifurcate the E-cores to include another LP E-core in their Meteor Lake chips.
#18
Darmok N Jalad
watzupken: To be honest, the news doesn't excite me. The problem is that other companies are busy trying to deliver higher IPC and efficiency on their performance cores, while Intel is busy improving its efficiency cores, which, under some loads, may not be used. And I would prefer Intel rename their "efficient" cores to something like companion cores, because it is very clear that the main intent is to make up for multi-core numbers and less for efficiency. So much so that they had to further bifurcate the E-cores to include another LP E-core in their Meteor Lake chips.
Yeah, I have seen this myself with my work laptop. Sometimes the E cores aren't doing much of anything when both(!) P cores are pegged at 100%. And this is running MS Office under Windows 11.
#19
DemonicRyzen666
JohH: 9 wide decode?
Isn't Zen 4 only 4 wide?
I believe it's 4 wide for macro code only.
No idea what micro is. I haven't looked at the diagrams for Zen 4, but I'm sure it's around 6 or 8 for micro, as it's usually double the macro.
#20
Noyand
watzupken: To be honest, the news doesn't excite me. The problem is that other companies are busy trying to deliver higher IPC and efficiency on their performance cores, while Intel is busy improving its efficiency cores, which, under some loads, may not be used. And I would prefer Intel rename their "efficient" cores to something like companion cores, because it is very clear that the main intent is to make up for multi-core numbers and less for efficiency. So much so that they had to further bifurcate the E-cores to include another LP E-core in their Meteor Lake chips.
Lunar Lake won't have LP cores, btw. Those leaks came from a presentation about LNL.
#21
stimpy88
Countdown to P-Core deletion...
#22
jpvalverde85
E-Cores are the ones on the I/O slice: just a couple of Crestmont cores without L3 and with very low clock speeds. They exist so the other clusters don't have to power up for basic tasks, but their performance and IPC are absolutely bad in every test done on them compared with the Crestmont cores on the CPU slice that have L3. So the IPC of these E-cores is probably the one getting the big treatment.
#23
Random_User
Daven: Ask yourself this. Why doesn’t the leaked presentation sing about a high IPC increase for the P-cores from the rooftops? Answer: Because there is no IPC increase in the P-cores to sing about.
Exactly. If they had P cores as efficient as their... rival's, they wouldn't need the E-cores in the first place. And they would produce a cr*p ton of boring presentations, flood all media with them, and shout about their "achievements" from every corner.
#24
phanbuey
Random_User: Exactly. If they had P cores as efficient as their... rival's, they wouldn't need the E-cores in the first place. And they would produce a cr*p ton of boring presentations, flood all media with them, and shout about their "achievements" from every corner.
Sort of. The P cores are made on a larger node, so they won't ever be as efficient as the rival's. The E cores should be on their own tile, and that tile should be replaced with HBM or some other cache tile for gaming-grade processors, and with E-cores (better ones) for consumer PCs. They should leave the E-core-free Xeons for workstations and clock them less aggressively.

Their biggest mistake with RPL was that, instead of increasing the cache amount, they decided to clock the chips well beyond their capability. The ST performance of the RPL core is amazing even at 5.3/5.2 GHz. If they just had more cache on that design, they could have kept up in gaming at much lower wattage and taken a very minor L in multithreading performance to the 7950X. Costs were probably the reason for this choice, but still, it would be nice to use the E core space for cache or HBM instead.
#25
InVasMani
Darmok N Jalad: Yeah, I have seen this myself with my work laptop. Sometimes the E cores aren't doing much of anything when both(!) P cores are pegged at 100%. And this is running MS Office under Windows 11.
What do you expect? In certain workload scenarios the E cores are for MT headroom; they aren't intended as primary cores, but as secondary cores. Peak ST performance will nearly always be on the P cores due to HT and clock speed differences, but peak MT favors E cores. The whole point is really to spread the work around for stuff that doesn't need peak ST performance beyond 6 to 8 P cores or whatever. Depending on the workload scenario, OC on P cores or E cores provides relative advantages and disadvantages.

Typically, just dropping E core clock speeds a bit and pushing P cores a bit harder, and/or ensuring they can boost longer w/o thermal problems, is fine in practice. Most workloads don't need peak ST performance on more than 8 cores anyway. I mean, hell, we used to live in a world where all we heard was that four cores is all you need.

That's still fairly true a good amount of the time, since most workloads aren't exactly pegging 8 P cores to death, or even more than 4 P cores in many cases. Workloads vary, of course, and you can point to whichever data you wish in order to illustrate or make most points in topic banter arguments.

I'm satisfied with my 14700K; for what I got it for, it was a good deal. Is it perfect? Not exactly. But does it matter to me? Not really. Do I even notice in daily operation? Not at all.