Wednesday, May 8th 2024

Core Configurations of Intel Core Ultra 200 "Arrow Lake-S" Desktop Processors Surface

Intel is giving its next-generation desktop processor lineup the Core Ultra 200 series model numbering, which we detailed in an earlier report. The Core Ultra 200 series would be the company's first desktop processors with AI capabilities, thanks to an integrated 50 TOPS-class NPU. At the heart of these processors is the "Arrow Lake" microarchitecture. Its development is the reason the company had to refresh "Raptor Lake" to cover its 2023-24 processor lineup. The company's "Meteor Lake" microarchitecture topped out at a CPU core count of 6P+8E, which would have been a generational regression in multithreaded application performance from "Raptor Lake." The new "Arrow Lake-S" desktop processor has a maximum CPU core configuration of 8P+16E, which means consumers can expect at least the same core counts at given price points to carry over.

According to a report by Chinese tech publication Benchlife.info, the introduction of "Arrow Lake" would see Intel's desktop processor model numbering align with that of its mobile processor numbering, and incorporate the Core Ultra brand to denote the latest microarchitecture for a given processor generation. Since "Arrow Lake" is a generation ahead of "Meteor Lake," processor models in the series get numbered under Core Ultra 200 series.
Intel will likely debut the lineup with overclocker-friendly K and KF SKUs. The lineup is led by the Core Ultra 9 285K (and possibly the 285KF), which comes with an 8P+16E core configuration, a processor base power value of 125 W, and a maximum P-core boost frequency of 5.50 GHz. This is followed by the Core Ultra 7 265K (and 265KF), with an 8P+12E core configuration; and the Core Ultra 5 245K, with a 6P+8E core-configuration.

There are also some 65 W non-K models in the middle, although these don't have similar processor model numbers to the K/KF parts. There's the Core Ultra 9 275 (8P+16E, 65 W); the Core Ultra 7 255 (8P+12E, 65 W); and the Core Ultra 5 240 (6P+4E, 65 W).
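
To make the reported configurations easier to compare, here is a minimal Python sketch that tabulates core and thread totals for the six models named above. The `CoreConfig` class, the `total_threads` helper, and the `threads_per_p_core` parameter are illustrative names, not anything from the report. The default of one thread per P-core is an assumption, since the report does not say whether "Arrow Lake" P-cores support Hyper-Threading; power values are filled in only where the report states them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CoreConfig:
    """One reported SKU: P-core and E-core counts, plus base power if stated."""
    name: str
    p_cores: int
    e_cores: int
    base_power_w: Optional[int] = None  # None where the report gives no value


def total_threads(cfg: CoreConfig, threads_per_p_core: int = 1) -> int:
    """Thread count, assuming E-cores always run one thread each.

    threads_per_p_core=1 is an assumption (no Hyper-Threading);
    pass 2 to model P-cores with two-way SMT.
    """
    return cfg.p_cores * threads_per_p_core + cfg.e_cores


# Core counts and power values exactly as given in the report above.
LINEUP = [
    CoreConfig("Core Ultra 9 285K", 8, 16, 125),
    CoreConfig("Core Ultra 7 265K", 8, 12),
    CoreConfig("Core Ultra 5 245K", 6, 8),
    CoreConfig("Core Ultra 9 275", 8, 16, 65),
    CoreConfig("Core Ultra 7 255", 8, 12, 65),
    CoreConfig("Core Ultra 5 240", 6, 4, 65),
]

for cfg in LINEUP:
    cores = cfg.p_cores + cfg.e_cores
    print(f"{cfg.name}: {cfg.p_cores}P+{cfg.e_cores}E, "
          f"{cores} cores, {total_threads(cfg)} threads (if no SMT)")
```

Calling `total_threads(cfg, threads_per_p_core=2)` on an 8P+16E part gives 32 threads instead of 24, which is the size of the gap at stake if P-core SMT is dropped.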

"Arrow Lake" is a chiplet-based processor, just like "Meteor Lake." Its compute tile, the piece of silicon with the CPU cores, packs up to 8 "Lion Cove" performance cores (P-cores), and up to 16 "Skymont" efficiency cores (E-cores). The processor is also expected to feature a 50 TOPS-class NPU for on-device AI acceleration, and a truncated version of the Xe-LPG iGPU the company is using with "Meteor Lake," which could be branded differently from the Arc Graphics branding Intel is using on the Core Ultra 100 series mobile chips. "Arrow Lake" is also expected to debut a new CPU socket on the desktop platform, the LGA1851, with more I/O capabilities than the LGA1700 and "Raptor Lake."
Sources: BenchLife, VideoCardz

101 Comments on Core Configurations of Intel Core Ultra 200 "Arrow Lake-S" Desktop Processors Surface

#51
Dr. Dro
Asni: We already know E-cores are pointless in a non-battery powered device.
This time AMD has the chance to prove that SMT/HT is more important than additional, fake, cores.
E-cores are not power efficient, but rather area-efficient. A Gracemont E-core cluster in Alder or Raptor Lake is comprised of four E-cores that take the area space of roughly 1 P-core.
Zyll Goliat: Hmmm....So let me get this straight no more Hyper Threading U9 will have "only" 8 performance cores but 16 efficient cores there are rumors around the net that we could expect better improvements in IPC from 5% to 15% with P-cores 'tho some people claim it will be much better improvements when it comes to the E-cores then again U9 285 have in total 24 Threads compared to the I9 14900k that have 32 Threads....hmmm is it going to be better in multithreads apps at all??? ....I mean most likely is going to be much more efficient when it comes to the power draw and probably great for gaming especially for games that don't use more than 8c.....Hmm I don't know it does not look so promising + I personally never liked when some company have need to put in product name something like "ultra" ...."extra"..."mega"..."superb"......
Cinebench might look a little prettier on Raptor Lake, but if I had to guess, that's about where it ends. Disabling HT is common practice on Raptor as is, great power and heat savings, same or better performance all-around. Doing away with SMT closes a side-channel that could be exploited and improves resource utilization of the processor. At this scale, E-cores and the new LPE-cores (if Arrow has those) will more than make up for the lack of P-core SMT.
john_: So, I guess it's 4.5GHz with Intel Baseline Profile for the top model? :p

I wonder if that 5.5 is final. And at what wattage. If Intel stays at 5.5 for the final top product, then they either can't go higher, don't need to go higher thanks to IPC gains, or knew that their silicon was having instability and degrading issues at 6.x GHz and decided this time to stay at safer frequencies. Of course it is a new chip, new architecture, new node, anything can be a reason.
Similar clocks to i9-13900K. If these are final, we're probably going to see a KS version of the 285 pushing 5.7-5.8 or so.
FoulOnWhite: If your current setup is still doing fine, is there any need to jump straight onto Zen 5 or Arrow Lake anyway, so the 6mths does not really matter. With Zen 5 I would rather wait 6mths anyway for them to get their AGESA shit together, unless they are on the ball with it this time.
But how am I gonna quench my FOMO? :roll:

Getting a 285K would be like adopting the i9-12900K. You get to experience the newest and the exciting, but you also deal with the whole truckload of bugs. I think if I'm purchasing CPUs out of necessity, I'm only going to be purchasing refined "tock" KS CPUs from here on out.

My brother's interested in upgrading as he's still rocking my old Zen 2 chip, it's a 3900XT but it's not really great anymore. Who knows, if the pricing on these Ultra chips is great and he wants my 13900KS, all he's gotta do is buy some DDR5 and it's his.
Posted on Reply
#52
Darmok N Jalad
Asni: We already know E-cores are pointless in a non-battery powered device.
This time AMD has the chance to prove that SMT/HT is more important than additional, fake, cores.
I have an Alder Lake laptop for work, where the "i7" grade is 2P+8E. Performance leaves much to be desired for an i7. Just today, I pasted into an Excel sheet that had a decent amount of calculations to do. It pegged the 2P cores while the 8E cores were underutilized. It was slow, too. Even 4P+0E would be faster, IMO. Running W11, Office, and I never notice the E cores getting fully utilized for parallel tasks. The P cores get bogged down and then it's time to wait.
Posted on Reply
#53
Outback Bronze
This Arch is all going to come down to IPC.

Let's see if they can release another Core 2.
Posted on Reply
#54
pressing on
Daven: I predict, based on rumors, that the Ryzen 9 9950X will be on average 30-40% faster for traditional computing tasks.
According to wccftech a claimed reliable source is now saying this will be a much lower figure, see wccftech story 'AMD Zen 5 CPUs Rumored To Feature Around 10% IPC Increase, Slightly More In Cinebench R23 Single-Thread Test'.

Also wccftech say that "...the 40% rumor never talked about IPC and only talked about a specific SPEC result".
Posted on Reply
#55
mkppo
dgianstefani: It's more that with 24 cores, you don't need HT, which is a security hole, makes cores more complex, and was introduced to help low core count CPUs.

In the age of massive core counts HT/SMT is not needed. Without it you can clock higher, use less voltage, and design more secure processors.
This is not true at all and has entirely to do with the architecture. Without diving deep into the rabbit hole, the architecture (and the workload) determines whether there's any benefit to HT or not. There can be massive benefits or none at all. I believe Ian Cutress had an article when he was at AnandTech a few years back, have a read.

As a TPU staff member, I'd expect you to check your facts better before putting out blanket statements like this.
Posted on Reply
#56
dgianstefani
TPU Proofreader
mkppo: This is not true at all and has entirely to do with the architecture. Without diving deep into the rabbit hole, the architecture (and the workload) determines whether there's any benefit to HT or not. There can be massive benefits or none at all. I believe Ian Cutress had an article when he was at AnandTech a few years back, have a read.

As a TPU staff member, I'd expect you to check your facts better before putting out blanket statements like this.
So what did I say that is factually incorrect? Be specific.
Posted on Reply
#57
GerKNG
2024 and Intel still sells 8 Cores as their top Product.
Posted on Reply
#58
Dr. Dro
GerKNG: 2024 and Intel still sells 8 Cores as their top Product.
This isn't their top tier product, just in the same way Ryzen 9s aren't top tier products for AMD. Both of them also have 8-cores as their maximum if you nitpick, because that's what each CCD on a Ryzen chip has. The Ryzen 9's have drawbacks all the same.
Posted on Reply
#59
Darmok N Jalad
mkppo: This is not true at all and has entirely to do with the architecture. Without diving deep into the rabbit hole, the architecture (and the workload) determines whether there's any benefit to HT or not. There can be massive benefits or none at all. I believe Ian Cutress had an article when he was at AnandTech a few years back, have a read.

As a TPU staff member, I'd expect you to check your facts better before putting out blanket statements like this.
Apple doesn't use SMT, and they use a really wide architecture.
Posted on Reply
#60
Daven
AMDK11:
Skylake - SunnyCove
micro-ops(decode + uop cache) from 11 to 11 +0%
Dispatch/Rename from 4 to 5 +25%
execution ports from 8 to 10 +25%
With 2xFP/ALU + 2xALU, 1xS/D + 3xAGU
for 3xFP/ALU + 1xALU, 2xS/D + 4xAGU
IPC average +18%

SunnyCove - GoldenCove
micro-ops(decode + uop cache) from 11 to 14 +27%
Dispatch/Rename from 5 to 6 +20%
execution ports from 10 to 12 +20%
With 3xFP/ALU + 1xALU, 2xS/D + 4xAGU
for 3xFP/ALU + 2xALU, 2xS/D + 5xAGU
FPU+ALU from 4 to 5 +25%
IPC average +19%

GoldenCove - LionCove
micro-ops(decode + uop cache) from 14 to 24 +71.4%
Dispatch/Rename from 6 to 8 +33.3%
execution ports from 12 to 18 +50%
With 3xFP/ALU + 2xALU, 2xS/D + 5xAGU
up to 4xFPU, 6xALU, 2xS/D + 6xAGU
FPU+ALU from 5 to 10 +100%
IPC average +??%
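
The percentage figures in these lists follow from the usual relative-increase formula, (new - old) / old. As a quick check of the GoldenCove-to-LionCove numbers, a small Python sketch (the `pct_increase` helper and the `LION_COVE_STEPS` table are illustrative names; the input values are the ones listed above):

```python
def pct_increase(old: float, new: float) -> float:
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100


# (old, new) pairs taken from the GoldenCove - LionCove list above.
LION_COVE_STEPS = {
    "micro-ops (decode + uop cache)": (14, 24),
    "Dispatch/Rename": (6, 8),
    "execution ports": (12, 18),
    "FPU+ALU": (5, 10),
}

for name, (old, new) in LION_COVE_STEPS.items():
    print(f"{name}: from {old} to {new} +{pct_increase(old, new):.1f}%")
```

These are increases in structure width, not in IPC; the "IPC average" line stays unknown because it cannot be derived from these counts.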

Two different diagrams of the LionCove core from LunarLake graphics:

LionCove introduces a larger scale redesign and expansion than previously SunnyCove to Skylake and GoldenCove to SunnyCove. I don't know how much of an increase in IPC this will give, but I have a feeling that it will be more than what the current leaks say.

ArrowLake is based on LionCove and Skymont cores.

Skymont has a 3x 3-way(9-Way) decoder, while Gracemont has a 2x 3-way(6-Way) decoder, which is an increase of 50%.


LionCove core:
Intel always represents the Predictor as one block in the diagram. In the case of LionCove it looks like 4 Tier or 4-Way.

LionCove has 24 uops from the decoder and uop cache. GoldenCove has 14 uops (6 from the decoder and 8 from the uop cache). LionCove has an 8-10-Way and 16-14 decoder with uop cache.
Very nice comparisons. It's good to know there are changes between Raptor Lake and Arrow Lake. Let's see if it translates into IPC improvements. Right now Raptor Lake has a 10% IPC advantage over Zen 4.
pressing on: According to wccftech a claimed reliable source is now saying this will be a much lower figure, see wccftech story 'AMD Zen 5 CPUs Rumored To Feature Around 10% IPC Increase, Slightly More In Cinebench R23 Single-Thread Test'.

Also wccftech say that "...the 40% rumor never talked about IPC and only talked about a specific SPEC result".
I'm also including clock speed improvements, but it will not be as fast as I said if IPC only goes up 10%.
Posted on Reply
#61
mkppo
dgianstefani: So what did I say that is factually incorrect? Be specific.
This is your quote: "In the age of massive core counts HT/SMT is not needed. Without it you can clock higher, use less voltage, and design more secure processors."

First of all, 'HT/SMT is not needed' is incorrect, regardless of core counts. It depends on the arch/workload.
Secondly, SMT will not automatically lower clocks/need higher voltages. Disabling SMT in a CPU that is designed with SMT in mind might allow higher clocks/lower voltages, but you lose performance and CPU utilization which in turn might allow those clocks. But if you design an arch without SMT, there are too many variables in play to determine whether clocks will actually increase or decrease. So no, not having SMT will not automatically increase clocks.

Security part is true, SMT does require added security measures. I guess given intel's track record it's probably a good thing they're not going to have HT.

Here's a link to one of his articles on SMT: www.anandtech.com/show/16261/investigating-performance-of-multithreading-on-zen-3-and-amd-ryzen-5000/5
Dr. Dro: This isn't their top tier product, just in the same way Ryzen 9s aren't top tier products for AMD. Both of them also have 8-cores as their maximum if you nitpick, because that's what each CCD on a Ryzen chip has. The Ryzen 9's have drawbacks all the same.
I mean, the disadvantage of having cores with different ISA's are on an entirely different level compared to having two different CCD's with the same cores. Even having Zen4c's on a different CCD is better than intel's approach for pretty much any server workload and sometimes causes issues on the client side as well.

"all the same" isn't really the same.
Posted on Reply
#62
Dr. Dro
mkppo: I mean, the disadvantage of having cores with different ISA's are on an entirely different level compared to having two different CCD's with the same cores. Even having Zen4c's on a different CCD is better than intel's approach for pretty much any server workload and sometimes causes issues on the client side as well.

"all the same" isn't really the same.
The only disadvantage is that they're not as powerful, but that doesn't mean they're slouches either. Gracemont E-cores have very similar performance to an i7-6700K's Skylake core. And we're having 16 additional cores, complete with hardware-based thread scheduling and an operating system capable of correctly addressing them for maximum efficiency.

I'd argue that the Ryzen 9 X3D's are the ones with an issue, considering the performance inconsistencies due to mismatched cache sizes across CCDs, cross-CCD latencies, eventual Infinity Fabric bottlenecks and of course, lacking the hardware thread scheduler entirely... and even that doesn't really matter to most workloads, these compromises are intentional, to prevent the CPU from being a little too good on the AMD side.
Posted on Reply
#63
Darmok N Jalad
Dr. Dro: The only disadvantage is that they're not as powerful, but that doesn't mean they're slouches either. Gracemont E-cores have very similar performance to an i7-6700K's Skylake core. And we're having 16 additional cores, complete with hardware-based thread scheduling and an operating system capable of correctly addressing them for maximum efficiency.

I'd argue that the Ryzen 9 X3D's are the ones with an issue, considering the performance inconsistencies due to mismatched cache sizes across CCDs, cross-CCD latencies, eventual Infinity Fabric bottlenecks and of course, lacking the hardware thread scheduler entirely... and even that doesn't really matter to most workloads, these compromises are intentional, to prevent the CPU from being a little too good on the AMD side.
Except I see in real life usage that the E cores don't get fully used when the poor P cores are maxed out. And this is in Excel, something that should handle parallel threads when doing the same calculations over and over. It just doesn't happen, and this is with W11.
Posted on Reply
#64
Minus Infinity
Dristun

They have to beat the X3D parts all around the board. I want to believe!
If only Intel weren't facing Zen 5 quite soon or Zen 5 X3D early next year.
Posted on Reply
#65
mkppo
Dr. Dro: The only disadvantage is that they're not as powerful, but that doesn't mean they're slouches either. Gracemont E-cores have very similar performance to an i7-6700K's Skylake core. And we're having 16 additional cores, complete with hardware-based thread scheduling and an operating system capable of correctly addressing them for maximum efficiency.

I'd argue that the Ryzen 9 X3D's are the ones with an issue, considering the performance inconsistencies due to mismatched cache sizes across CCDs, cross-CCD latencies, eventual Infinity Fabric bottlenecks and of course, lacking the hardware thread scheduler entirely... and even that doesn't really matter to most workloads, these compromises are intentional, to prevent the CPU from being a little too good on the AMD side.
Nope, the only disadvantage is not that they're not powerful. Note that I mentioned server workloads. In that scenario, the p/e-cores heterogeneous arch simply doesn't work. Look at their lineup, do you see any? Ignoring the fact that AMD destroys them in that space, but they can only go either full e-core or full p-core. Even AMD didn't do a Zen 4/Zen4c combo for servers, because their workloads don't like mixing and matching at all.

Secondly, intel's hardware scheduler simply gives pointers to windows to schedule it correctly, but it doesn't work all the time as Darmok pointed out and there are other cases too. AMD doesn't face the same issue on workloads other than games. Ian did a deep dive into it but I think it was a podcast, I can't seem to find it in a pinch. AMD tried to go all software route for X3D, simply because the only thing they need to schedule strictly to the 8 cores are games, the rest of the workloads will perform similarly with the same scheduler used for the 7950x. So yeah, the game bar is finicky with older games and a solution similar to intel would work better for sure, but that's only for games. For the rest of workloads, they simply do not need a hardware scheduling pointer.
Posted on Reply
#66
Dr. Dro
Darmok N Jalad: Except I see in real life usage that the E cores don't get fully used when the poor P cores are maxed out. And this is in Excel, something that should handle parallel threads when doing the same calculations over and over. It just doesn't happen, and this is with W11.
Windows' scheduling often conflicts with the thread director. It just sucks. It has so much legacy cruft that it'll never support state of the art hardware well.
mkppo: Nope, the only disadvantage is not that they're not powerful. Note that I mentioned server workloads. In that scenario, the p/e-cores heterogeneous arch simply doesn't work. Look at their lineup, do you see any? Ignoring the fact that AMD destroys them in that space, but they can only go either full e-core or full p-core. Even AMD didn't do a Zen 4/Zen4c combo for servers, because their workloads don't like mixing and matching at all.

Secondly, intel's hardware scheduler simply gives pointers to windows to schedule it correctly, but it doesn't work all the time as Darmok pointed out and there are other cases too. AMD doesn't face the same issue on workloads other than games. Ian did a deep dive into it but I think it was a podcast, I can't seem to find it in a pinch. AMD tried to go all software route for X3D, simply because the only thing they need to schedule strictly to the 8 cores are games, the rest of the workloads will perform similarly with the same scheduler used for the 7950x. So yeah, the game bar is finicky with older games and a solution similar to intel would work better for sure, but that's only for games. For the rest of workloads, they simply do not need a hardware scheduling pointer.
Xeons are monolithic, it's clear as day that it's just not gonna scale to the same level as a multi-chiplet MCM.
Posted on Reply
#67
phanbuey
I think people are going to be surprised... this is on a new node.
mkppo: Nope, the only disadvantage is not that they're not powerful. Note that I mentioned server workloads. In that scenario, the p/e-cores heterogeneous arch simply doesn't work. Look at their lineup, do you see any? Ignoring the fact that AMD destroys them in that space, but they can only go either full e-core or full p-core. Even AMD didn't do a Zen 4/Zen4c combo for servers, because their workloads don't like mixing and matching at all.

Secondly, intel's hardware scheduler simply gives pointers to windows to schedule it correctly, but it doesn't work all the time as Darmok pointed out and there are other cases too. AMD doesn't face the same issue on workloads other than games. Ian did a deep dive into it but I think it was a podcast, I can't seem to find it in a pinch. AMD tried to go all software route for X3D, simply because the only thing they need to schedule strictly to the 8 cores are games, the rest of the workloads will perform similarly with the same scheduler used for the 7950x. So yeah, the game bar is finicky with older games and a solution similar to intel would work better for sure, but that's only for games. For the rest of workloads, they simply do not need a hardware scheduling pointer.
That's not why "it doesn't work" -- servers aren't personal devices and they're not workstations - they're pieced out into 1000 VMs and docker containers to rent out compute, so yeah if that's your function that heterogeneous won't work for that purpose. 0% of gamers and a tiny fraction of consumers have that purpose. SMT and HT is old bloated tech -- that always had a single core performance hit -- a hit we were willing to take for more multithreaded performance but if you can meet that using another specialized core, then you will gain performance just from the design.

The intel 14th gen optimization tool was quite eye opening -- double % gains in some games with just software e-core optimizations; we are comparing early heterogeneous implementation to end stage HT/SMT...

All mobile devices, consumer devices... have been using heterogenous and AMD is moving there with Zen 6 too - because it actually works great. It's the only way intel was even able to stay competitive on an inferior node.
Posted on Reply
#68
apoklyps3
Why does Intel still exist?
There's no competition
Posted on Reply
#69
mkppo
phanbuey: I think people are going to be surprised... this is on a new node.

That's not why "it doesn't work" -- servers aren't personal devices and they're not workstations - they're pieced out into 1000 VMs and docker containers to rent out compute, so yeah if that's your function that heterogeneous won't work for that purpose. 0% of gamers and a tiny fraction of consumers have that purpose. SMT and HT is old bloated tech -- that always had a single core performance hit -- a hit we were willing to take for more multithreaded performance but if you can meet that using another specialized core, then you will gain performance just from the design.

The intel 14th gen optimization tool was quite eye opening -- double % gains in some games with just software e-core optimizations; we are comparing early heterogeneous implementation to end stage HT/SMT...

All mobile devices, consumer devices... have been using heterogenous and AMD is moving there with Zen 6 too - because it actually works great. It's the only way intel was even able to stay competitive on an inferior node.
Not sure what you mean when you say 'that's not why it doesn't work'. I didn't even state why it doesn't work, but stated that it doesn't work in the server space. For consumers, any heterogenous arch will work as long as the scheduler is up to scratch. I wasn't even talking about the consumer space but merely pointing out that intel's hardware scheduler doesn't always work and AMD doesn't face the same limitation with X3D's other than games.

Regarding SMT, yeah for sure having a core that's better utilized without SMT will work better than one that 'needs' it to take advantage of the arch especially in the consumer space. Again, the server space is a different matter.

In the consumer space heterogenous architectures are pretty much the way forward, especially in laptops. I didn't state otherwise..
Posted on Reply
#70
Pumper
GerKNG: 2024 and Intel still sells 8 Cores as their top Product.
Is it though? 2E cores are about equal to 1P core, so 8+16 configuration is pretty much the same as 16P core CPU in MT performance.
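
The estimate above is simple arithmetic, sketched here in Python. The `p_core_equivalents` helper and its `e_per_p` parameter are illustrative names, and the 2:1 ratio is only the rule of thumb from this post, not a measured figure.

```python
def p_core_equivalents(p_cores: int, e_cores: int, e_per_p: float = 2.0) -> float:
    """Rough multithreaded capacity expressed in P-core units.

    e_per_p is the assumed number of E-cores that match one P-core
    (2.0 is the rule of thumb from the post above, not a benchmark result).
    """
    return p_cores + e_cores / e_per_p


print(p_core_equivalents(8, 16))  # 8P+16E
print(p_core_equivalents(8, 12))  # 8P+12E
print(p_core_equivalents(6, 8))   # 6P+8E
```

Changing `e_per_p` shows how sensitive the comparison is to that ratio: at 3 E-cores per P-core, the same 8P+16E part comes out closer to 13 P-core equivalents than 16.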
Posted on Reply
#71
mkppo
Dr. Dro: Xeons are monolithic, it's clear as day that it's just not gonna scale to the same level as a multi-chiplet MCM.
That's not the point I was making. I was stating that they only have e or p cores in the server space because heterogenous architectures simply don't work in that space. Which is why intel either fields a bunch of P cores and gets slaughtered because of their inferior perf/w, or fields a bunch of e cores which are arguably more interesting but AMD countered those pretty hard with Zen4c and 5c is rumoured to increase performance of those little cores by a big amount.
Posted on Reply
#72
stimpy88
So, just more of the same then...
Posted on Reply
#73
arni-gx
I hope Intel will not make the same mistake as with the 13th/14th-gen i7 and i9 series and their motherboards, and will strictly enforce PL1/PL2 limits of 65-125 W/253 W for all 15th-gen i7 and i9 parts.
Posted on Reply
#74
sephiroth117
I don't see them rivaling the 7800X3D's gaming efficiency anytime soon, but I'm quite curious to finally see a new design for non-Y/U CPUs.
Posted on Reply
#75
Assimilator
Dr. Dro: multi-chiplet MCM
Sorry for doing this but MCM literally means "multi-chip module", so you just wrote "multi-chiplet multi-chip module" :p
Posted on Reply
Dec 3rd, 2024 14:26 EST change timezone
