Thursday, August 19th 2021
Intel Thread Director Makes "Alder Lake" Hybrid Architecture Work
Intel, in its Architecture Day presentation, detailed Thread Director, a hardware component present on the "Alder Lake" silicon, which makes the hybrid architecture of the processor work flawlessly. "Alder Lake-S" is the first desktop processor with two kinds of x86 CPU cores: the larger Performance P-cores and the smaller Efficient E-cores, which work in a setup not unlike big.LITTLE by Arm.
The x86-based "Alder Lake" processor has a much more complex ISA, and the E-cores don't have all of the instruction sets or hardware capabilities that the P-cores do. The two cores operate at very different performance/Watt bands, and are optimized for vastly different workloads. At the same time, sending a workload to the wrong kind of core could not only hurt performance, but also cause a crash, due to an ISA mismatch. Intel realized that it will take a lot more than mere OS-level awareness to solve the problem, and so innovated the Thread Director.
Put simply, Intel Thread Director is a highly specialized hardware abstraction layer (HAL) that interfaces with the operating system and software on one side, and the two groups of CPU cores on the other. Its job is to analyze a workload and distribute it among the P-core or E-core clusters at a granular (i.e. thread) level. If specific threads of an application don't invoke certain kinds of instructions and are determined to be low-priority, they're dispatched to the E-core cluster. Threads that lose priority are parked onto the E-cores from the P-cores, too.
The P-cores get priority when a thread requires instructions exclusive to P-cores (such as AVX-512 or DLBoost). Thread Director also works with the OS kernel to discern background tasks from foreground/priority ones. This probably works with a software-side component that's included with the Chipset INF software, if not an exclusive driver. Thread Director ensures that lightweight or low-priority tasks don't needlessly invoke P-cores, and when the system is idling, the processor's power management can probably gate power to P-cores for major power savings (this is assuming Alder Lake features a power-gating technology similar to "Lakefield").
Intel will recommend Windows 11 as the optimal OS for "Alder Lake," as it meets Thread Director half way with OS scheduler awareness of hybrid processor architectures. It remains to be seen, however, whether Thread Director requires this.
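The dispatch rules described above can be sketched in a few lines of Python. This is only a toy model of the article's description, not Intel's hardware logic or any OS interface: the names `Thread`, `pick_cluster` and `P_ONLY` are invented for illustration, and the "P-core exclusive" set simply repeats the article's own examples.

```python
from dataclasses import dataclass, field

# Assumption: instruction classes the article names as P-core exclusive.
# This is not a statement about what the shipping silicon supports.
P_ONLY = frozenset({"AVX-512", "DLBoost"})


@dataclass
class Thread:
    name: str
    foreground: bool  # hypothetical OS hint: foreground/priority task
    isa_used: frozenset = field(default_factory=frozenset)


def pick_cluster(thread: Thread) -> str:
    """Return "P" or "E" following the rules as the article describes them."""
    # A thread that needs a P-core-only instruction class must go to a P-core.
    if thread.isa_used & P_ONLY:
        return "P"
    # Foreground/priority work is kept on the P-cores.
    if thread.foreground:
        return "P"
    # Everything else (low priority, no special instructions) goes to E-cores.
    return "E"


threads = [
    Thread("game-render", foreground=True),
    Thread("mail-sync", foreground=False),
    Thread("batch-encode", foreground=False, isa_used=frozenset({"AVX-512"})),
]
placement = {t.name: pick_cluster(t) for t in threads}
```

In this model a thread that loses priority is just re-evaluated with `foreground=False`, which moves it to the E-cores unless it still needs a P-core-only instruction class.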
23 Comments on Intel Thread Director Makes "Alder Lake" Hybrid Architecture Work
Intel did better with Lakefield but had to learn a similar lesson.
That kind of abstraction seems needed if the common software platforms, aka OSes, aren't capable yet.
Nvidia needed to involve a bunch of blackboxing when they went to the multiple frontend approach in 2008 with the GT200 µarch (GTX 280), so abstraction is not always bad or slow.
these are big claims intel... also why does it matter that windows 11 "meets thread director half way"? is thread director so overloaded that it can't actually handle the task on its own? and if so why is it not made better then? ... like it seems to me that if there is hardware onboard that does this, then windows 10 should work just fine
In reality, the OS scheduler cannot do it, and this is merely a required solution to a problem of Intel's own making (the problem of incompatible ISAs caused because the Tremont based E-cores aren't designed from the ground up to work alongside Raptor Cove P-cores, Intel are just re-using some old Atom architecture that was designed for a very different market and purpose, originally).
big.LITTLE works on Linux because both types of ARM core use the same ISA. There is no problem that needs solving there, and if Intel had designed a smaller E-core that used the same ISA, like a proper, built-for-purpose design should, they wouldn't need this additional layer of hardware scheduler (Thread Director).
"The x86-based "Alder Lake" processor has a much more complex ISA, and the E-cores don't have all of the instruction sets or hardware capabilities that the P-cores do."
I guess, if we're arguing over semantics, I am "compatible" with pepperoni pizza, but you cannot give both me and a pepperoni pizza the same instructions and expect the same outcome.
"The P-cores get priority when a thread requires instructions exclusive to P-cores (such as AVX-512 or DLBoost)."
And yet it seems that AVX-512 is not supported at all, which makes either it or the AnandTech article I quoted invalid. Hence "as far as we know".
Edit: Also, Lakefield, Alder Lake's predecessor, used a unified architecture by disabling non-compatible parts in the P-cores.
AVX-512 is an abortion anyway; AVX2 is all that will survive now having adopted the only bit of AVX-512 worth keeping and no implementation of AVX-512 to date has been successful from a performance/Watt perspective. Not a single tear will be shed for AVX-512's hot and melty death.
Properly utilized AVX-512 has amazing perf/watt gains, but the key issue is properly utilized. You have to know why and how you're using the instructions (short computations don't make sense because of the latency, frequency and power penalties, for example):
instructions that run on E-cores can run on P-cores.
instructions that run on P-cores might run on E-cores.
instructions that use P-core exclusive ISA cannot run on E-cores.
How hard is it to understand that E-core's ISA is a subset of P-core's ISA?
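The subset argument above can be written out as a minimal sketch. The feature names below are placeholders chosen for illustration, not an actual list of what either core type implements, and `can_run` is a hypothetical helper.

```python
# Placeholder feature sets (assumption): the E-core set is a strict subset
# of the P-core set, as the comment above argues.
E_CORE_ISA = frozenset({"x86-64-base", "AVX2"})
P_CORE_ISA = E_CORE_ISA | {"P_ONLY_EXT"}


def can_run(required, core_isa):
    """True if every instruction class the code needs exists on the core."""
    return set(required) <= set(core_isa)


# Anything that runs on an E-core also runs on a P-core...
e_code_on_p = can_run({"AVX2"}, P_CORE_ISA)
# ...code written for a P-core might run on an E-core...
p_code_on_e = can_run({"x86-64-base"}, E_CORE_ISA)
# ...but code using a P-core-exclusive extension cannot.
exclusive_on_e = can_run({"P_ONLY_EXT"}, E_CORE_ISA)
```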
If they were incompatible then it would mean that operating systems that are not aware of this would not work properly on the CPU, and I don't think Intel would allow this. You know, backwards compatibility being their strong point, and all...
Edit: AnandTech actually asked Intel about Windows 10 (at the bottom), and it is able to run on the CPU while not being aware/compatible with Intel Thread Director.
THIS ARTICLE YOU'RE REPLYING TO IS THE SOURCE
"The x86-based "Alder Lake" processor has a much more complex ISA, and the E-cores don't have all of the instruction sets or hardware capabilities that the P-cores do"
I've read nothing that would suggest ITD is a hardware scheduler capable of this. (I know I sound like a broken record) The AnandTech article said: So if P-cores are full, and the E-core gets a load with an instruction it can't handle it would create a situation that an ITD-unaware OS would not expect. If ITD is capable of autonomously moving a thread/process between E- and P- cores this again would create a situation most OSes are not designed for. Such a design is a compatibility nightmare. It's not. It is an interpretation of PR slides, a poor one at that since it mentions that Alder Lake supports AVX-512 while in fact it doesn't.
www.intel.com/content/www/us/en/newsroom/resources/press-kit-architecture-day-2021.html
You won't accept a paraphrased version from someone whose literal job description is to publish summarised press releases direct from the source, but you also won't watch the source either.
I give up. Are you a millennial, perchance?
Will you or will you not provide a direct quote from Intel that E- and P- cores are ISA-incompatible?
It's the first guy that Raja introduces. If you don't understand the words he uses for about four minutes about how they whittled down Gracemont by stripping parts of the ISA that weren't essential for efficiency I cannot help you. There is no more spoonfeeding.
A very simple question. What exactly are the incompatible instructions? If it is a real thing, it should have been dug out of the Linux codebase months earlier.
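One way to check a claim like this on a Linux x86 machine is to compare the feature flags the kernel reports for each logical CPU. The sketch below parses text in the `/proc/cpuinfo` layout (`processor : N` and `flags : ...` lines) and lists any flags that are not common to all CPUs. The helper names and the sample text are made up for illustration, and whether a hybrid part would report per-core differences this way is an assumption, not something the article or Intel states.

```python
def flags_by_cpu(cpuinfo_text):
    """Map logical CPU number -> set of feature flags from cpuinfo-style text."""
    result = {}
    cpu = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between CPU entries
        key = key.strip()
        if key == "processor":
            cpu = int(value)
        elif key == "flags" and cpu is not None:
            result[cpu] = set(value.split())
    return result


def flag_differences(flags):
    """Return the flags present on some CPUs but not on all of them."""
    if not flags:
        return set()
    sets = list(flags.values())
    return set.union(*sets) - set.intersection(*sets)


# Made-up sample in the same layout; not real output from any processor.
SAMPLE = """processor\t: 0
flags\t\t: fpu sse2 avx2 hypothetical_ext

processor\t: 1
flags\t\t: fpu sse2 avx2
"""

sample_flags = flags_by_cpu(SAMPLE)
mismatch = flag_differences(sample_flags)
```

On a real system the same functions could be fed the contents of `/proc/cpuinfo`; an empty result would mean every logical CPU advertises the same flags.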
A CPU only understands an instruction AFTER the decoder stage inside the microarchitecture. Especially for a variable-length ISA such as x86, there is literally no way other than a decoder to analyze the instructions. However, the Thread Director is clearly an uncore component, and Intel mentions nothing about it having decoders, nor about reworking the microarchitectural pipeline (e.g. parking a thread when an incompatible instruction has been decoded and moving it to god-knows-which pipeline stage).
Intel would make much bigger news if they had realized your world-shaking design. Lakefield is still a single ISA.
@btarunr You should really go through your wording again to see if there is more stuff that Intel's marketing paraphrasing has manipulated you into believing. Soon the world will cite YOU as the SOURCE of a hetero-ISA architecture.
One thing good about Dr. Ian Cutress, is that he actually has a PhD degree in EECS. So his article directly dismissed the false impression of incompatible ISA that Intel tries to sell.
On its slides, Intel just said a different "mix" of instructions, NOT different instructions. Such wording is intended to confuse the audience into believing they have something more powerful. @btarunr