Monday, January 22nd 2024

Intel 15th-Generation Arrow Lake-S Could Abandon Hyper-Threading Technology

A leaked Intel document we reported on a few days ago covered the Arrow Lake-S platform and some implementation details. However, there was an interesting catch in the file. The leaked document indicates that the upcoming 15th-Generation Arrow Lake desktop CPUs could lack Hyper-Threading (HT) support. The technical memo lists Arrow Lake's expected eight performance cores without any additional threads enabled via SMT. This aligns with previous rumors of Hyper-Threading removal. Losing Hyper-Threading could significantly impact Arrow Lake's multi-threaded application performance versus its Raptor Lake predecessors. Estimates suggest HT provides a 10-15% speedup across heavily threaded workloads by exposing additional logical cores. However, for gaming, disabling HT has negligible impact and can even boost FPS in some titles. So Arrow Lake may still hit Intel's rumored 30% gaming performance target through architectural improvements alone.

However, a replacement for traditional HT is likely to come in the form of Rentable Units. This new approach is a response to the adoption of a hybrid core architecture, which has seen an increase in applications leveraging low-power E-cores for enhanced performance and efficiency. Rentable Units are described as a more efficient pseudo-multi-threaded solution that splits the first thread of incoming instructions into two partitions and assigns them to different cores based on complexity. Rentable Units would use timers and counters to measure P/E-core utilization and send parts of the thread to each core for processing. This inherently requires larger caches, which is why Arrow Lake is rumored to have 3 MB of L2 cache per core. Arrow Lake is also noted to support faster DDR5-6400 memory. But between higher clocks, more E-cores, and various core architecture updates, raw multi-threaded throughput may not change much even without Hyper-Threading.
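Since Rentable Units remain a rumor, the following is a purely conceptual sketch (in C) of the idea as described above: work is split into two partitions, and each partition is steered toward a P-core or an E-core based on a complexity estimate derived from timers and counters. Every structure, threshold, and name below is invented for illustration and does not reflect any real Intel interface or documented behavior.

/* Hypothetical illustration of the rumored "Rentable Units" concept:
 * a thread's work is split into two partitions and each partition is
 * assigned to a P-core or an E-core based on a complexity estimate
 * taken from (imaginary) timer and counter samples. */
#include <stdint.h>
#include <stdio.h>

enum core_type { P_CORE, E_CORE };

struct partition {
    uint64_t cycles_sampled;   /* hypothetical timer reading for this slice   */
    uint64_t uops_sampled;     /* hypothetical counter reading for this slice */
};

/* Invented heuristic: slices with high work-per-cycle go to a P-core,
 * lighter slices go to an E-core. The 1.5 threshold is arbitrary. */
static enum core_type assign_partition(const struct partition *p)
{
    double uops_per_cycle = (double)p->uops_sampled / (double)p->cycles_sampled;
    return uops_per_cycle > 1.5 ? P_CORE : E_CORE;
}

int main(void)
{
    struct partition halves[2] = {
        { .cycles_sampled = 1000, .uops_sampled = 2400 },  /* "heavy" slice */
        { .cycles_sampled = 1000, .uops_sampled =  600 },  /* "light" slice */
    };

    for (int i = 0; i < 2; i++)
        printf("partition %d -> %s\n", i,
               assign_partition(&halves[i]) == P_CORE ? "P-core" : "E-core");
    return 0;
}

In reality, any such partitioning would happen in hardware and microcode rather than in software; the sketch only captures the measure-then-steer logic the rumor describes.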
Source: 3DCenter.org

100 Comments on Intel 15th-Generation Arrow Lake-S Could Abandon Hyper-Threading Technology

#27
lemonadesoda
Thread priorities determined by the admin/OS, potentially hundreds if not thousands of them, would be a nightmare to then have to "code forward" to the CPU with all the thread telemetry that would need to be kept, and then add VMs on top of that. How would you manage priorities across VMs? Nasty, plus security issues.

Thread scheduling has become a nightmare with asymmetric CPU architectures. Mixing that between the OS and the CPU sounds like a recipe with too many cooks.
Posted on Reply
#28
stimpy88
I have a 16 core 32 thread CPU. Can somebody tell me what the point of SMT is for a CPU like this? Genuine question.
Posted on Reply
#29
Crackong
TumbleGeorge: Tiles, chiplets?
They have done the same thing in the Core Ultra mobile series CPUs and those have HT enabled no problem....
Posted on Reply
#30
Daven
stimpy88: I have a 16 core 32 thread CPU. Can somebody tell me what the point of SMT is for a CPU like this? Genuine question.
It's for applications and multitasking that would benefit from 32 threads being fully loaded. If you don't have that use case, a 16/32 CPU was not the right CPU for you.

If you want to know what applications can benefit from 32 threads, look at CPU reviews with benchmarks that show increased performance as thread count goes up.

For multitasking: streaming, playing a game, voice chat, and rendering all at the same time will definitely load all 32 threads. Someone recording a team-play game session live for Twitch is an example.

Lots of background tasks and editing a photo with complex filters while watching a 4K movie on another screen is another example that will load up 32 threads.
Posted on Reply
#31
FoulOnWhite
It will be interesting to see the first tests of these sans-HT CPUs.

Some people seem to mistake an HT thread for a real core, and it is not. There's nothing wrong with a 16-core CPU with no HT; HT does not add that much, really. A 16-core CPU is just that, not a 32-core one.
Posted on Reply
#32
Nekajo
I think Intel knows what they are doing here. Can't wait to see Arrow Lake vs Zen 5.

If Intel's new process node has lowered power usage, then Intel is fully back in business again.

HT/SMT was never that important to begin with.
Posted on Reply
#33
Daven
Nekajo: I think Intel knows what they are doing here. Can't wait to see Arrow Lake vs Zen 5.

If Intels new process node has lowered power usage then Intel is fully back in business again.

HT/SMT never were that important to begin with.
It’s also possible Intel is leaving the high-end desktop CPU business, much like it is rumored that AMD is leaving the high-end desktop GPU business. Isn’t wanton speculation driven by biased brand loyalty wonderful?
Posted on Reply
#34
Nekajo
Daven: It’s also possible Intel is leaving the high end CPU business much like it is rumored that AMD is leaving the high end GPU business. Isn’t wanton speculation driven by biased brand loyalty wonderful?
Intel has the best overall CPUs in the desktop segment, IMO. Performance is not Intel's problem; power usage is. With AMD, you have to choose depending on your needs.

I bought the 7800X3D because it's the best gaming CPU and uses little power, but for actual work it's not impressive, and the similarly priced 13700K/14700K will destroy my chip in most stuff outside of gaming while only being a few percent behind in gaming.

The 7950X3D was supposed to be the sweet spot, with good results regardless of workload; however, it loses to the 7800X3D in gaming and is beaten by Intel in content creation. The 7950X beats it as well.

I hope AMD will address this with Zen 5. Hopefully 3D models won't be too far off the initial release and hopefully all the cores will get 3D cache this time. 7900X3D was kinda meh and 7950X3D should have 16 cores with 3D cache for sure with that price tag. In a perfect world there would be no 3D parts. The CPUs should be good overall regardless of workload.

Intel's power draw is not much of a "problem" if you look at real-world watt usage instead of synthetic loads.

A friend of mine has a 13700K and it hovers around 100 watts in gaming, which is 40 watts lower than my 7800X3D; performance is very similar.

According to TechPowerUp's review of 14th gen, it also seems that Intel regains the lead in minimum FPS at higher resolutions.

The 14700K generally performs on par with the 7950X3D in gaming while performing similarly in content creation. Not too much difference, yet the i7 is much cheaper, and boards are also generally cheaper than AM5 boards, especially if you choose the B650E/X670E boards. Power draw is about 100 watts higher on the i7 when peaked, though.
Posted on Reply
#35
Aquinus
Resident Wat-man
Assimilator: That said, I'm not sure why this requires HT to go away
It's because, without any knowledge of data dependencies or of how a given thread performs on the core it's running on, there simply isn't enough data available to the CPU scheduler to effectively divvy out these tasks or move them around in efficient ways. This gets a lot harder when you start adding multiple levels, because there are different tradeoffs to switching to different threads. Not to mention the cost of getting it wrong is huge, because context switches are expensive from a cache-locality standpoint, so more often than not you want threads to have some level of "stickiness" to them. This poses a real problem when pairing cores with different performance characteristics, as well as threads via SMT, which has its own series of tradeoffs.

I'd also partially blame the NT kernel for probably not having any good mechanisms for handling this sort of topology. Linux tends to have a better handle on these sorts of things because a lot of servers have to handle CPUs being on different sockets, and the cost of context switching between cores on the same package versus on another package is very real. There are some really good reasons that most servers run Linux, and it's because of these sorts of things.
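To make the "stickiness" point concrete, here is a minimal sketch (assuming Linux and glibc, with an arbitrarily chosen core index) of pinning a worker thread to one logical CPU so the kernel keeps it there and its working set stays warm in that core's caches:

/* Pin a worker thread to a single logical CPU so the scheduler keeps it
 * "sticky" and cache locality is preserved. Linux/glibc only; the core
 * index (2) is arbitrary and purely illustrative. Build with -pthread. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

static void *worker(void *arg)
{
    (void)arg;
    /* cache-sensitive work would run here */
    return NULL;
}

int main(void)
{
    pthread_t t;
    cpu_set_t set;

    pthread_create(&t, NULL, worker, NULL);

    CPU_ZERO(&set);
    CPU_SET(2, &set);                        /* keep the thread on logical CPU 2 */
    if (pthread_setaffinity_np(t, sizeof(set), &set) != 0)
        fprintf(stderr, "failed to set affinity\n");

    pthread_join(t, NULL);
    return 0;
}

A scheduler that gets placement wrong on a mixed P/E-core part pays exactly this kind of cache-locality penalty, which is why thread placement matters even more on asymmetric CPUs.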
Posted on Reply
#36
Pumper
Well, the E-cores are kind of a replacement for HT, and if P-cores can be made to run at higher clocks with HT disabled to provide superior ST performance, why not do it?
Posted on Reply
#37
BArms
I've disabled HT on all my builds since at least the 8700K. No downside at all that I can tell for normal usage.
Posted on Reply
#38
Assimilator
BArms: I've disabled HT on all my builds since at least the 8700K. No downside at all that I can tell for normal usage.
If you never enable HT, how would you know how much performance you're losing, FFS?
Posted on Reply
#39
Nekajo
BArms: I've disabled HT on all my builds since at least the 8700K. No downside at all that I can tell for normal usage.
I clearly remember seeing FPS numbers for the 9700K vs the 9900K where the 9900K performed much better in some games because of HT, especially when paired with a somewhat high-end GPU, and this was 4-5 years ago, probably.

I'd probably not disable HT/SMT on a 6-core chip today; tons of games can use more than 6 threads.
Posted on Reply
#40
HD64G
Almost surely next-gen Intel CPUs won't have HT. They might have a reverse take on the architecture Bulldozer had, combining two parts of the same unit to produce one more powerful thread for specific workloads.
Posted on Reply
#41
BArms
Nekajo: I clearly remember seeing fps numbers for 9700K vs 9900K where 9900K performed much better in some games because of HT, especially when paired with a somewhat high-end GPU and this was 4-5 years ago probably

I'd probably not disable HT/SMT on a 6 core chip today, tons of games can use more than 6 threads
To each their own, I guess. To me it's not worth the security risk, and the potential performance upside is minuscule to non-existent unless you're fully loading the CPU, which I almost never do anyway.
Posted on Reply
#42
Nekajo
BArms: To each their own I guess, to me not worth the security risk and the potential performance upside is minuscule to non-existent unless you're fully loading the CPU which I almost never do anyway.
Which security risk? Do you work for NASA?

Both Intel and AMD still use HT/SMT for a reason. Tons of software is optimized with this in mind.

HT/SMT does more good than harm. In most software you gain, not lose, performance.

People with quad- and hexa-core chips especially should enable it for sure. It will make up for the lack of real cores.
Posted on Reply
#43
ThrashZone
Hi,
Yeah, killing or just limiting E-core use would be best.
They were always supposed to be used for background tasks, and to me that means all Windows tasks, not necessarily user tasks, so limit their use to updates/security/MS services....
Then P-cores can do all user workloads, with or without HT.

Without HT they can run a little cooler, but then again P-cores can clock a little higher too.

But mainly Intel is just trying to mess with AMD by taking away HT; Intel knows they have the higher clock speeds in the bag.
Never underestimate Intel's ability to troll AMD X3D lol
Posted on Reply
#44
Assimilator
Aquinus: It's because effectively scheduling these sorts of tasks without any knowledge by the scheduler regarding data dependencies and performance for the thread it's running on, there simply isn't enough data available to the CPU scheduler to effectively divvy out these tasks or move them in efficient ways. This gets a lot harder when you start adding multiple levels because there are different tradeoffs to switching to different threads. Not to mention the cost of getting it wrong is huge because context shifts are expensive from a cache locality standpoint, so more often than not you want threads to have some level of "stickiness," to them. This poses a real problem when pairing cores with different performance characteristics as well as threads via SMT which has its own series of tradeoffs.
Yeah, but can't the problems with HT be solved in a similar manner i.e. by bundling more context along with each task such that the CPU is able to make better decisions around scheduling?
Aquinus: I'd also partially blame the NT kernel for probably not having any good mechanisms for handling this sort of topology. Linux tends to have a better handle on these sorts of things because a lot of servers have to handle CPUs being on different sockets and the cost of context switching between cores on the same package versus on another package are very real. There are some really good reasons that most servers run Linux and it's because of these sorts of things.
But there hasn't really been a concept of heterogeneous cores in the same package until now, and Windows Server has definitely encountered some of the same pains in core scheduling that Linux has. Everything I've read so far indicates that the issues desktop Windows has encountered around scheduling over heterogeneous cores are about high-performance/low-latency applications like games, not the line-of-business stuff that you'd find running on servers.
Posted on Reply
#45
Vayra86
BArms: To each their own I guess, to me not worth the security risk and the potential performance upside is minuscule to non-existent unless you're fully loading the CPU which I almost never do anyway.
Euh.. ok

Just so you know, you can swallow your pride when we're not looking and quietly enable it. It's fine. ;)
Posted on Reply
#46
londiste
Onasi: Out of interest, since this is a bit out of my field, was this also the case with the “separate cores sharing some HW” approach the like of which Bulldozer used?
Very minor difference, if any. IIRC the L1 data cache was one thing that was in separate units, but the rest of the shared parts are all the same ones that have had vulnerabilities in other architectures: caches, registers, and the associated management hardware.
ratirt: Does HT impact power consumption in any meaningful way? If it does, maybe that is Intel's goal. Substitute HT with ecores to reduce power consumption
It increases power consumption. After all, you get some 15-30% extra performance, and that happens with more of the chip being in use. How much exactly is difficult to say and varies wildly across different use cases. Also, everything is dynamic these days - clocks, voltages, etc. - with the power limit being the main factor (at some level, which is not necessarily per core).
Assimilator: That said, I'm not sure why this requires HT to go away; it feels like something that should be fixed at the scheduling level, not by completely reworking some very fundamental ways in which cores have been designed for over a decade. But considering how much pain scheduling over P- and E-cores has given Intel so far (WRT having both the CPU and OS having to be aware of which cores to schedule tasks to), they may have determined that this rather drastic approach - which seems to imply moving scheduling out of the operating system and fully back onto the CPU - is worth it.
P/E core scheduling has proven pretty problematic even up to this point. P- and E-cores are quite different in performance, and HT potentially widens that gap by another 30-35%. I bet simply reducing the variability of how long a task takes on one kind of core makes scheduling easier on an already mismatched set of cores.
Aquinus: I'd also partially blame the NT kernel for probably not having any good mechanisms for handling this sort of topology. Linux tends to have a better handle on these sorts of things because a lot of servers have to handle CPUs being on different sockets and the cost of context switching between cores on the same package versus on another package are very real. There are some really good reasons that most servers run Linux and it's because of these sorts of things.
I propose a different take on this - Microsoft's problems are twofold:
1. They want and need to put their best foot forward on desktop. This leads, and has led, to solutions optimized for presumed use cases. The Windows Server scheduler seems to work a bit differently from the desktop Windows variants.
2. Microsoft is a company that needs to cater to the needs and wants of hardware manufacturers. The best example I can think of was Bulldozer - AMD wanted it to be 8-core and Microsoft obliged, with performance issues and patches back and forth to get it working properly. What did Linux do? It did not give a crap and treated it as a 4c/8t CPU, which is what it architecturally was (or was closest to), and that worked just fine. Windows in the end went for the same solution... This is not the only example; both Intel and AMD have butted heads with Microsoft about scheduling on various strange CPUs.
Posted on Reply
#47
Redwoodz
Nekajo: Intel has the best overall CPUs in the desktop segment IMO. Performance is not Intels problem, power usage is. With AMD you have to choose depending on your needs.

I bought 7800X3D because its the best gaming CPU and uses low power but for actual work its not impressive and 13700K/14700K which is priced similar will destroy my chip in most stuff outside of gaming, while only being a few percent behind in gaming.

7950X3D was supposed to be the sweet spot with good results regardless of workload, however it loses to 7800X3D in gaming and is beaten by Intel in content creation, 7950X beats it as well.

I hope AMD will address this with Zen 5. Hopefully 3D models won't be too far off the initial release and hopefully all the cores will get 3D cache this time. 7900X3D was kinda meh and 7950X3D should have 16 cores with 3D cache for sure with that price tag. In a perfect world there would be no 3D parts. The CPUs should be good overall regardless of workload.

Intels power draw is not much of a "problem" if you look at real world watt usage instead of synthetic loads.

A friend of mine has 13700K and it hovers around 100 watts in gaming which is 40 watts lower than my 7800X3D, performance is very similar.

According to Techpowerups review of 14th gen it also seems that Intel regains the lead in minimum fps at higher resolution.

14700K generally performs on par with 7950X3D in gaming while performing similar in content creation. Not too much difference, yet the i7 is much cheaper and boards are also generally cheaper than AM5 boards especially if you choose the B650E/X670E boards. Power draw is about 100 watts higher on the i7 when peaked tho.
:rolleyes: Just go back to Intel, no one cares. You are arguing about a 2% difference in performance that no one really notices most of the time.
Posted on Reply
#48
dyonoctis
Lots of emotions around this, but Apple has managed respectable MT performance without SMT since the M1. The M3 Ultra should handily beat a 14900K. ARL is going to be a proof of concept, but down the line, Intel might have more flexibility when it comes to core config/core count to definitely make up for the lack of HT.
Posted on Reply
#49
ThrashZone
Hi,
MS doesn't give a crap about the desktop; the only thing they care about is the mobile world, since it aligns with OneDrive storage/....
Desktops are a thorn in their backside; they wish they would go away, and they'd use a lower carbon footprint as the reason why, lol.
Posted on Reply
#50
londiste
Assimilator: But there hasn't really been a concept of heterogenous cores in the same package until now, and Windows Server has definitely encountered some of the same pains in core scheduling that Linux has. Everything I've read so far indicates that the issues that desktop Windows has encountered around scheduling over heterogenous cores is regarding high-performance/low-latency applications like games, not the line-of-business stuff that you'd find running on servers.
ARM's big.LITTLE and its successor? Linux has had to deal with this sort of scheduling for a while now. So has macOS, due to Apple's M-series SoCs. This should be a familiar problem, with several approaches already taken to implement scheduling for those. Btw, with both 3D V-Cache chiplets and Zen 4c cores, AMD seems to be heading in the same direction - not as extreme a difference, but still cores with different profiles. So Microsoft and Intel, and probably AMD, had better put their heads together and figure out what works :)
Posted on Reply