Wednesday, May 8th 2024

Core Configurations of Intel Core Ultra 200 "Arrow Lake-S" Desktop Processors Surface

Intel is giving its next-generation desktop processor lineup the Core Ultra 200 series processor model numbering. We detailed the processor numbering in our older report. The Core Ultra 200 series would be the company's first desktop processors with AI capabilities thanks to an integrated 50 TOPS-class NPU. At the heart of these processors is the "Arrow Lake" microarchitecture. Its development is the reason the company had to refresh "Raptor Lake" to cover its 2023-24 processor lineup. The company's "Meteor Lake" microarchitecture topped out at CPU core counts of 6P+8E, which would have proven to be a generational regression in multithreaded application performance over "Raptor Lake." The new "Arrow Lake-S" desktop processor has a maximum CPU core configuration of 8P+16E, which means consumers can expect at least the same core-counts at given price-points to carry over.

According to a report by Chinese tech publication Benchlife.info, the introduction of "Arrow Lake" would see Intel's desktop processor model numbering align with that of its mobile processor numbering, and incorporate the Core Ultra brand to denote the latest microarchitecture for a given processor generation. Since "Arrow Lake" is a generation ahead of "Meteor Lake," processor models in the series get numbered under Core Ultra 200 series.
Intel will likely debut the lineup with overclocker-friendly K and KF SKUs. The lineup is led by the Core Ultra 9 285K (and possibly the 285KF), which comes with an 8P+16E core configuration, a processor base power value of 125 W, and a maximum P-core boost frequency of 5.50 GHz. This is followed by the Core Ultra 7 265K (and 265KF), with an 8P+12E core configuration; and the Core Ultra 5 245K, with a 6P+8E core-configuration.

There are also some 65 W non-K models in the middle, although these don't have similar processor model numbers to the K/KF parts. There's the Core Ultra 9 275 (8P+16E, 65 W); the Core Ultra 7 255 (8P+12E, 65 W); and the Core Ultra 5 240 (6P+4E, 65 W).
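For readers who want to compare the reported SKUs side by side, here is a minimal sketch that parses the "xP+yE" core-configuration strings quoted above and totals the cores. The model names and configurations are taken from the report; the dictionary name and the helper functions are illustrative assumptions, not anything Intel or BenchLife published.

```python
import re

# Core configurations as listed in the report (unconfirmed leak data).
REPORTED_CONFIGS = {
    "Core Ultra 9 285K": "8P+16E",
    "Core Ultra 7 265K": "8P+12E",
    "Core Ultra 5 245K": "6P+8E",
    "Core Ultra 9 275": "8P+16E",
    "Core Ultra 7 255": "8P+12E",
    "Core Ultra 5 240": "6P+4E",
}


def parse_core_config(config):
    """Split a string such as '8P+16E' into (p_cores, e_cores)."""
    match = re.fullmatch(r"(\d+)P\+(\d+)E", config)
    if match is None:
        raise ValueError(f"unrecognised core configuration: {config!r}")
    return int(match.group(1)), int(match.group(2))


def total_cores(config):
    """Return the combined P-core and E-core count for a configuration."""
    p_cores, e_cores = parse_core_config(config)
    return p_cores + e_cores


if __name__ == "__main__":
    for model, config in REPORTED_CONFIGS.items():
        p_cores, e_cores = parse_core_config(config)
        print(f"{model}: {p_cores} P + {e_cores} E = {total_cores(config)} cores")
```

Thread counts are deliberately left out, since the report does not state them.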

"Arrow Lake" is a chiplet-based processor, just like "Meteor Lake." Its compute tile, the piece of silicon with the CPU cores, packs up to 8 "Lion Cove" performance cores (P-cores), and up to 16 "Skymont" efficiency cores (E-cores). The processor is also expected to feature a 50 TOPS-class NPU for on-device AI acceleration, and a truncated version of the Xe-LPG iGPU the company is using with "Meteor Lake," which could be branded differently from the Arc Graphics branding Intel is using on the Core Ultra 100 series mobile chips. "Arrow Lake" is also expected to debut a new CPU socket on the desktop platform, the LGA1851, with more I/O capabilities than the LGA1700 and "Raptor Lake."
Sources: BenchLife, VideoCardz

101 Comments on Core Configurations of Intel Core Ultra 200 "Arrow Lake-S" Desktop Processors Surface

#26
persondb
Asni said: "We already know E-cores are pointless in a non-battery powered device.
This time AMD has the chance to prove that SMT/HT is more important than additional, fake, cores."
What additional fake core?

Also, SMT/HT has its own issues. There are a lot of ways to implement it, and they are not equal in terms of performance and transistor budget. Take a look at this article, which comments on how the structures can be replicated for SMT:

chipsandcheese.com/2024/03/13/loongson-3a6000-a-star-among-chinese-cpus/comment-page-1/

The tl;dr is that there are four approaches: Statically Partitioned (each thread gets a static portion of the structure), Duplicated, Watermark (each thread can have up to X of the resources), and Competitively Shared.

As an example, suppose you have a core with 96 physical registers and two threads. How are you going to handle it? If you assign 48 physical registers to each thread, you reduce the IPC of each thread in general: there will be moments where one thread could have used more registers but didn't have them, while the other thread is inactive. Watermark and Competitively Shared solve that, but they are much more complex and take more transistors.
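To make the 96-register example concrete, here is a toy model of the four policies. It is only a sketch under loud assumptions: the function name, the watermark value of 72, and the rule that threads are served in list order are all made up for illustration, and real hardware arbitration is far more involved.

```python
def allocate(policy, demands, total=96, watermark=72):
    """Return the number of entries granted to each thread.

    Toy model: threads are served in list order (an assumption; real
    cores arbitrate cycle by cycle). `watermark` is a hypothetical cap.
    """
    n = len(demands)
    if policy == "duplicated":
        # Every thread owns a private full-size copy, so nothing is
        # contended, but the hardware must hold n * total entries.
        return [min(demand, total) for demand in demands]
    if policy == "static":
        caps = [total // n] * n      # fixed, equal slices
    elif policy == "watermark":
        caps = [watermark] * n       # per-thread ceiling below the total
    elif policy == "competitive":
        caps = [total] * n           # no per-thread ceiling at all
    else:
        raise ValueError(f"unknown policy: {policy!r}")

    grants = []
    remaining = total
    for demand, cap in zip(demands, caps):
        grant = min(demand, cap, remaining)
        grants.append(grant)
        remaining -= grant
    return grants


if __name__ == "__main__":
    # Thread 0 wants 80 registers; thread 1 is idle, then wants 40.
    for demands in ([80, 0], [80, 40]):
        for policy in ("static", "duplicated", "watermark", "competitive"):
            print(policy, demands, "->", allocate(policy, demands))
```

With one thread idle, the static split leaves the busy thread stuck at 48 entries while the other 48 sit unused, which is the IPC loss described above. The shared policies recover that capacity, at the cost of one thread being able to squeeze the other when both are busy.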

Not only that, but they might also become harder to implement as the structure sizes (and other related things) increase. As an example, suppose you want to add another write port to the register file because you noticed that things were stalling due to a lack of those: you will likely need to add a lot more transistors so it can run the Competitively Shared logic.

This might mean that HT had reached such proportions for Intel that they decided its advantages weren't enough to outweigh the drawbacks. A lot of chip designers, like ARM and Apple, have largely avoided HT/SMT, and their designs are very competitive.

In some cases, HT/SMT works really well and is basically a 'free performance boost', like with Pentium 4s, where issues with the pipeline might have meant a lot of resources went unused. The same goes for situations where the maximum number of threads is the objective, as in cloud, where the client pays per vCPU, so more threads = more vCPUs to sell (hence why IBM does 4-way or 8-way SMT).
Posted on Reply
#27
R0H1T
Intel's SMT implementation wasn't very good anyway, IIRC AMD already exceeded their MT efficiency with first gen Zen.
Posted on Reply
#28
Prima.Vera
Is it me or the new naming convention sucks big time??
Posted on Reply
#29
phanbuey
Prima.Vera said: "Is it me or the new naming convention sucks big time??"
It's terrible.
Posted on Reply
#30
qcmadness
R0H1T said: "Intel's SMT implementation wasn't very good anyway, IIRC AMD already exceeded their MT efficiency with first gen Zen."
Not really a good sign.

That means the first thread cannot utilize all the resources available.
tabascosauz said: "So you basically went off the assumption that 2 generations of development and a new node will.......result in 0 efficiency gains and 0 IPC gains?

"The way you described Meteor Lake makes me think you don't really understand what IPC is. The worst it can do is stay the same due to no radical arch changes. It doesn't and didn't "drop" due to a clock deficit.

"That said, yes I agree, if Intel wants to compete with X3D they will have to leverage tiling to get significantly more cache. Trying to get a bit more traditional L3 or relying on Pcore arch alone isn't going to cut it."
Why can't it be?

Moving the memory controller out of the die, for example, will lower the IPC.
Posted on Reply
#31
tabascosauz
qcmadness said: "Not really a good sign.

"That means the first thread cannot utilize all the resources available.

"Why it can't be?

"Moving the memory controller out for the die, for example, will lower the IPC."
Point taken, but at that point the penalties from disaggregation are more of an external factor, which doesn't fit well with the usual understanding of IPC as not really extending into the uncore.

I'm also not sure I would extrapolate the current MTL interconnect behaviour to anything desktop in the future. There are significant differences in the way Ryzen mobile handles Fabric clock behaviour vs desktop - MTL seems to signal Intel heading in that direction as well.

MTL has outstanding idle package power in some designs, on par with monolithic -U and -P, so certainly the uncore behaviour is all round tailored to achieve that result.
Posted on Reply
#32
R0H1T
qcmadness said: "Not really a good sign."
It's slightly better than the alternative, besides no chip can be fully optimized for all sorts of workloads or programs for 100% utilization all the time.
Posted on Reply
#33
Tek-Check
Crackong said: "188W or 253w ?"
Intel says: 380W
Posted on Reply
#34
oxrufiioxo
The more I hear about this and the AI, AI, AI bs, the less I'm excited about it. I do hope it's at least good enough to push AMD, though. It's best for consumers when both CPU makers are making decent products; we don't need a Rocket Lake 2.0 situation.
Posted on Reply
#35
P4-630
Prima.Vera said: "Is it me or the new naming convention sucks big time??"
I don't care!

If it's at least a substantial (a leap was used previously..) overall improvement over last gen!
Posted on Reply
#36
Space Lynx
Astronaut
Competition is good for everyone, so I hope Arrow Lake does very well.
Posted on Reply
#37
Noyand
bug said: "You can tell that by a list of core configs and TDPs? Amazing!"
Daven seems to know for a fact that Arrow Lake is just a die shrink of Meteor Lake. He's been saying that for months.
Posted on Reply
#38
Daven
tabascosauz said: "So you basically went off the assumption that 2 generations of development and a new node will.......result in 0 efficiency gains and 0 IPC gains?

"The way you described Meteor Lake makes me think you don't really understand what IPC is. The worst it can do is stay the same due to no radical arch changes. It doesn't and didn't "drop" due to a clock deficit.

"That said, yes I agree, if Intel wants to compete with X3D they will have to leverage tiling to get significantly more cache. Trying to get a bit more traditional L3 or relying on Pcore arch alone isn't going to cut it."
www.techpowerup.com/forums/threads/intel-meteor-lake-p-cores-show-ipc-regression-over-raptor-lake.317317/

You can remove core functionality anytime resulting in app performance drops. These drops are factored into median app performance or IPC.
Posted on Reply
#39
tabascosauz
Daven said: "www.techpowerup.com/forums/threads/intel-meteor-lake-p-cores-show-ipc-regression-over-raptor-lake.317317/

"You can remove core functionality anytime resulting in app performance drops."
I acknowledged in an earlier reply that disaggregation has had some impact on MTL performance, but I'm not aware of there being any removal of core functionality.

I'm aware of the SPEC testing that all the outlets reported on, but I'm not too keen on the testing methodology there. Not much has been said about how the author verified actual clocks during testing of any of the parts (did he just divide by advertised boost clocks?), and I think it's rather clear from knowledge elsewhere and from his own results that SPEC 2017 is significantly more cache and memory intensive than he admits. Seeing as the memory subsystems are completely apples to oranges, that seems kinda important.

Just saying that "LPDDR5 vs. DDR5 shouldn't have much effect" is pretty bizarre. LPDDR performance is anything but great, it's just games that have recently put it in a favourable light since iGPUs like the bandwidth it provides while caring less about timings and latency. While I'm sure he made do with what he had (hard to have fair comparisons in laptops), it doesn't mean the results necessarily have any significance.
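To show why the clock question matters, here is a rough sketch of how a per-clock ("IPC-style") figure shifts depending on which frequency you divide by. Every number below is hypothetical and chosen only for illustration; none of it comes from the SPEC testing being discussed, and the function name is made up.

```python
def score_per_ghz(score, clock_ghz):
    """Per-clock performance proxy: benchmark score divided by clock speed."""
    if clock_ghz <= 0:
        raise ValueError("clock must be positive")
    return score / clock_ghz


if __name__ == "__main__":
    # Hypothetical values, NOT measurements from any real chip.
    score = 10.0            # benchmark score
    advertised_ghz = 5.0    # boost clock on the spec sheet
    actual_ghz = 4.5        # clock the chip really sustained during the run

    naive = score_per_ghz(score, advertised_ghz)
    measured = score_per_ghz(score, actual_ghz)

    print(f"dividing by advertised clock: {naive:.3f} per GHz")
    print(f"dividing by measured clock:   {measured:.3f} per GHz")
    print(f"naive figure understates per-clock performance by "
          f"{(measured / naive - 1) * 100:.1f}%")
```

If a laptop part never sustains its advertised boost, dividing by the spec-sheet number makes the core look worse per clock than it is, which can be enough to turn "flat" into an apparent regression.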
Posted on Reply
#40
AMDK11
Skylake - SunnyCove
micro-ops(decode + uop cache) from 11 to 11 +0%
Dispatch/Rename from 4 to 5 +25%
execution ports from 8 to 10 +25%
from 2xFP/ALU + 2xALU, 1xS/D + 3xAGU
to 3xFP/ALU + 1xALU, 2xS/D + 4xAGU
IPC average +18%

SunnyCove - GoldenCove
micro-ops(decode + uop cache) from 11 to 14 +27%
Dispatch/Rename from 5 to 6 +20%
execution ports from 10 to 12 +20%
from 3xFP/ALU + 1xALU, 2xS/D + 4xAGU
to 3xFP/ALU + 2xALU, 2xS/D + 5xAGU
FPU+ALU from 4 to 5 +25%
IPC average +19%

GoldenCove - LionCove
micro-ops(decode + uop cache) from 14 to 24 +71.4%
Dispatch/Rename from 6 to 8 +33.3%
execution ports from 12 to 18 +50%
from 3xFP/ALU + 2xALU, 2xS/D + 5xAGU
to as many as 4xFPU, 6xALU, 2xS/D + 6xAGU
FPU+ALU from 5 to 10 +100%
IPC average +??%
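The percentages and port totals in the lists above can be checked with a few lines of arithmetic. This is just a sanity-check sketch: the port counts are copied from this post (the Lion Cove figures are leak-based), and the dictionary layout and function name are my own.

```python
def pct_increase(old, new):
    """Percentage increase going from `old` to `new`."""
    return (new - old) / old * 100.0


# Execution-port breakdown per core, as listed in the post.
PORTS = {
    "Skylake": {"FP/ALU": 2, "ALU": 2, "S/D": 1, "AGU": 3},
    "SunnyCove": {"FP/ALU": 3, "ALU": 1, "S/D": 2, "AGU": 4},
    "GoldenCove": {"FP/ALU": 3, "ALU": 2, "S/D": 2, "AGU": 5},
    "LionCove": {"FPU": 4, "ALU": 6, "S/D": 2, "AGU": 6},
}

# Micro-ops per cycle (decode + uop cache) and dispatch/rename width.
FRONT_END = {
    "Skylake": (11, 4),
    "SunnyCove": (11, 5),
    "GoldenCove": (14, 6),
    "LionCove": (24, 8),
}

if __name__ == "__main__":
    names = list(PORTS)
    for old, new in zip(names, names[1:]):
        old_ports = sum(PORTS[old].values())
        new_ports = sum(PORTS[new].values())
        print(f"{old} -> {new}")
        print(f"  micro-ops: {pct_increase(FRONT_END[old][0], FRONT_END[new][0]):+.1f}%")
        print(f"  rename:    {pct_increase(FRONT_END[old][1], FRONT_END[new][1]):+.1f}%")
        print(f"  ports:     {old_ports} -> {new_ports} "
              f"({pct_increase(old_ports, new_ports):+.1f}%)")
```

The per-core port breakdowns add up to the stated totals of 8, 10, 12 and 18, so the figures are at least internally consistent.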

Two different diagrams of the LionCove core from LunarLake graphics:

LionCove introduces a larger-scale redesign and expansion than the previous Skylake-to-SunnyCove and SunnyCove-to-GoldenCove transitions did. I don't know how much of an increase in IPC this will give, but I have a feeling that it will be more than what the current leaks say.

ArrowLake is based on LionCove and Skymont cores.

Skymont has a 3x 3-way(9-Way) decoder, while Gracemont has a 2x 3-way(6-Way) decoder, which is an increase of 50%.


LionCove core:
Intel always represents the Predictor as one block in the diagram. In the case of LionCove it looks like 4 Tier or 4-Way.

LionCove has 24 uops from the decoder and uop cache. GoldenCove has 14 uops (6 from the decoder and 8 from the uop cache). LionCove has either an 8-way decoder with 16 from the uop cache, or a 10-way decoder with 14 from the uop cache.
Posted on Reply
#41
Denver
Considering the lower clock rate and loss of SMT, my assumption is that Intel must be giving up MT performance in favor of efficiency and gaming performance. It's going to be an interesting battle against Zen5.
Posted on Reply
#42
FoulOnWhite
certainly can't wait for tests of these, will be very interesting indeed. Gonna be an epic battle v Zen5
Posted on Reply
#43
P4-630
Also, it could run cooler without HT, and maybe at somewhat lower speeds.
Posted on Reply
#44
oxrufiioxo
FoulOnWhite said: "certainly can't wait for tests of these, will be very interesting indeed. Gonna be an epic battle v Zen5"
The thing is, it's sounding like Zen5 is gonna be out 6-8 months earlier, so is it really a competitor when it's half a year late?
Posted on Reply
#45
phanbuey
oxrufiioxo said: "The thing is it's sounding like Zen5 is gonna be out 6-8 months earlier so is it really a competitor when it's half a year late."
The real competitor will be the X3D, so the first 6 months of Zen 5 will be AMD milking its early adopters as hard as possible - they're not releasing X3D until early 2025.

Zen 5 non-x3d will probably be on par with 7800x3D so for gaming nothing really is going to change, on the MT side you're right tho that 9950X will mop up the floor with the 14900k.
Posted on Reply
#46
AMDK11
According to leaks, ArrowLake is to be available at the end of the third quarter. Presentation next month.
Posted on Reply
#47
Steevo
For $1,00.0 USD You get instability on not just 1, but 4 cores!!! What a great deal!!!
Posted on Reply
#48
oxrufiioxo
phanbuey said: "The real competitor will be the X3D, so the first 6 months of Zen 5 will be AMD milking it's early adopters as hard as possible - they're not releasing x3d until early 2025.

"Zen 5 non-x3d will probably be on par with 7800x3D so for gaming nothing really is going to change, on the MT side you're right tho that 9950X will mop up the floor with the 14900k."
I feel like both companies are doing this on purpose to be clear of the other one and be the new shiny thing for 3-6 months.
Posted on Reply
#49
FoulOnWhite
If your current setup is still doing fine, is there any need to jump straight onto Zen 5 or Arrow Lake anyway? So the 6 months does not really matter. With Zen 5 I would rather wait 6 months anyway for them to get their AGESA shit together, unless they are on the ball with it this time.
Posted on Reply
#50
rv8000
Setting my expectations VERY low. Given the disappointing performance of the mobile Core Ultra parts (mostly a massive reduction in MT), all they can really hope to do is rein in power consumption so it's not obscenely inefficient out of the box.
Posted on Reply