Thursday, June 21st 2018
OpenBSD Turns Off Hyper-Threading to Combat Intel CPU Security Issues
OpenBSD developer Mark Kettenis has announced that OpenBSD will no longer enable Hyper-Threading on Intel processors by default. The move is intended to mitigate exploits from the Spectre class of vulnerabilities as well as TLB and cache timing attacks, since important processor resources are no longer shared between threads. The developers suspect that some unreleased (or as-yet-unknown) attacks can be stopped using this approach.
This move is supported by the fact that most newer motherboards no longer provide an option to disable Hyper-Threading via the BIOS. OpenBSD users who still want Hyper-Threading can manually re-enable it using the hw.smt sysctl. The developers are also looking into extending this mechanism to CPUs from other vendors, should they be affected, too. The performance penalty from disabling Hyper-Threading depends on the software used: highly optimized HPC software might even run faster without HT, while other, more generic applications will see a performance hit. CineBench, for example, gains 30% with Hyper-Threading enabled.
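For reference, here is a minimal sketch of how re-enabling SMT looks on an OpenBSD system; hw.smt is the sysctl mentioned above, and the /etc/sysctl.conf line simply makes the choice persistent across reboots:

    # Check the current SMT state (0 = disabled, the new default)
    sysctl hw.smt

    # Re-enable Hyper-Threading on the running system (requires root)
    sysctl hw.smt=1

    # Keep the setting across reboots
    echo 'hw.smt=1' >> /etc/sysctl.conf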
Part of the reason this change is happening now is criticism towards Intel, who keep failing at properly coordinated disclosure of exploits. Intel also seems completely unresponsive to inquiries from the open-source community; only their buddies at big corporations like Apple, Google, Microsoft and Amazon get informed with enough lead time to prepare patches. That's why OpenBSD is taking the approach of immediately shipping a rough solution now, and then waiting for Intel to come up with a fix that has a smaller performance impact.
Source:
OpenBSD Changelog Entry
22 Comments on OpenBSD Turns Off Hyper-Threading to Combat Intel CPU Security Issues
trog
OpenBSD wasn't paid to insert them; they just failed to notice that paid "helpers" were slipping them into apparently useful commits.
Have a read:
www.theregister.co.uk/2010/12/15/openbsd_backdoor_claim/
/s
Anyway, why is it not applied to AMD?
I'd be interested to know which boards these are that don't allow disabling HT, which is actually within spec from INTEL.
On a side note, INTEL needs to get its act together; they keep falling over themselves at every turn with unforced errors. They are just fortunate their chief competitor is too incompetent to capitalize on these numerous shortcomings and failures from those in charge at INTEL (Threadripper sales are abysmal, when they should be killing INTEL's HEDT at every turn).
The bigger discussion we should have is whether SMT still makes sense for future CPUs. The purpose of SMT is to let other threads use the idle resources in the CPU while execution is stalled due to branch mispredictions, data dependencies or cache misses. SMT may increase total throughput across multiple threads at the price of decreased throughput for a single thread, and at a "marginal" implementation cost compared to a whole new CPU core.
Intel implemented HT at a time when their Pentium 4 ("Netburst") architecture was struggling due to an inefficient design. At that time CPUs were single-core, but multi-CPU setups existed for the enterprise market. The cost of implementation was relatively marginal, both for the front-end/prefetcher and the execution units. Making a single core behave somewhat like a "dual core" made sense at the time, and not only for marketing purposes: it made scheduling easier and systems potentially less prone to hanging. It's worth mentioning that IBM's Power CPUs support 4-/8-way SMT, mainly used for executing massive numbers of threads of enterprise Java code, which normally has huge amounts of stalls due to cache misses.
But does SMT still make sense today, at least if we narrow our scope to desktops and laptops?
Performance:
The average gain is in the range of ~5%. The cases where we see 30% gains are edge cases, and there are also many cases where we see a performance loss of >10%. Any synchronized workload risks losing performance with SMT, including gaming, audio processing, etc. SMT also introduces performance variability and higher latency.
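For what it's worth, since hw.smt can now be flipped with sysctl on OpenBSD, a rough A/B comparison on one's own workload is easy to sketch; this assumes the setting can be toggled both ways at runtime, and the benchmark binary is just a placeholder:

    # Run the same workload with SMT off and on and compare wall-clock time (needs root)
    sysctl hw.smt=0
    time ./my_benchmark    # placeholder for your own workload

    sysctl hw.smt=1
    time ./my_benchmark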
Die cost:
While the cost of implementing SMT in the Pentium 4 was relatively low, modern CPU architectures rely more and more on their front-end, the prefetcher. SMT doesn't only add security implications, but also puts a strain on the prefetcher's resources, and of course on the cache. Intel usually adds more L3 cache, but it's not enough to compensate for the performance loss. We are already at a point where the design cost of SMT is much greater than when it was introduced, and since we can easily add more cores to a design today, it makes more sense to prioritize more and faster cores rather than cores with SMT. Going forward, CPU architectures are only going to get more advanced, and SMT adds more and more restrictions on the design choices.
Software - OS:
The kernel obviously has to be aware of SMT, and treat the CPU as x "strong" cores and x "weak" cores. This might have been simple when most computers had one core, but today, with even HEDT soon getting >20 cores, this complexity becomes completely unnecessary.
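As a side note, one way to see the kernel's view of the topology on OpenBSD is to compare the configured and online CPU counts; this sketch assumes the hw.ncpuonline sysctl, which reports only the CPUs currently online, is available on the release in use:

    # All logical CPUs the kernel has configured, including any halted SMT siblings
    sysctl -n hw.ncpu

    # CPUs actually online; with hw.smt=0 this should drop to the physical core count
    sysctl -n hw.ncpuonline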
Software - programs:
It's harder for a single program with many threads to scale well with SMT than for multiple programs. E.g. on an 8-core, some workloads will scale better with 8 threads, while others scale better with 16 threads. This comes down to how efficient the code is; more efficient code causes fewer stalls in the CPU, so there are fewer idle cycles to use for other threads. If the program assumes all threads are equal, splitting a workload into 16 threads instead of 8 may reduce performance (on the same 8-core CPU). SMT only really works well when you have a mix of "heavy" threads and "light" threads, and no need to synchronize them.
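A small illustration of that point: if parallelism is sized from the number of online CPUs rather than hard-coded, the same command behaves sensibly whether hw.smt is 0 or 1. The make invocation here is just an example workload, and hw.ncpuonline is assumed as above:

    # Size parallelism from the online CPU count rather than all logical CPUs
    NCPU=$(sysctl -n hw.ncpuonline)
    make -j "$NCPU"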
My assessment is that the costs of SMT are increasing while the gains are decreasing, and we're approaching a point where it will soon be "pointless". I would much rather trade an 8-core/16-thread CPU design for one that's either a faster 8-core or a 10-core instead.
Plus, it's not like people can't just re-enable it. It is just an option that is now off by default instead of on by default.