• Welcome to TechPowerUp Forums, Guest! Please check out our forum guidelines for info related to our community.

"Indirector" is Intel's Latest Branch Predictor Vulnerability, But Patch is Already Out

AleksandarK

News Editor
Staff member
Joined
Aug 19, 2017
Messages
2,582 (0.97/day)
Researchers from the University of California, San Diego, have unveiled a significant security vulnerability affecting Intel Raptor Lake and Alder Lake processors. The newly discovered flaw, dubbed "Indirector," exposes weaknesses in the Indirect Branch Predictor (IBP) and Branch Target Buffer (BTB), potentially allowing attackers to execute precise Branch Target Injection (BTI) attacks. The published study provides a detailed look into the intricate structures of the IBP and BTB within recent Intel processors, showcasing Spectre-style attach. For the first time, researchers have mapped out the size, structure, and precise functions governing index and tag hashing in these critical components. Particularly concerning is the discovery of previously unknown gaps in Intel's hardware defenses, including IBPB, IBRS, and STIBP. These findings suggest that even the latest security measures may be insufficient to protect against sophisticated attacks.

The research team developed a tool called "iBranch Locator," which can efficiently identify and manipulate specific branches within the IBP. This tool enables highly precise BTI attacks, potentially compromising security across various scenarios, including cross-process and cross-privilege environments. One of the most alarming implications of this vulnerability is its ability to bypass Address Space Layout Randomization (ASLR), a crucial security feature in modern operating systems. By exploiting the IBP and BTB, attackers could potentially break ASLR protections, exposing systems to a wide range of security threats. Experts recommend several mitigation strategies, including more aggressive use of Intel's IBPB (Indirect Branch Prediction Barrier) feature. However, the performance impact of this solution—up to 50% in some cases—makes it impractical for frequent domain transitions, such as those in browsers and sandboxes. In a statement for Tom's Hardware, Intel noted the following: "Intel reviewed the report submitted by academic researchers and determined previous mitigation guidance provided for issues such as IBRS, eIBRS and BHI are effective against this new research and no new mitigations or guidance is required."




The company was notified in February and had room to fix it. Intel also notified its system vendors, so the security layers protecting against this are in place. Here is the BHI mitigation, and here is the IBRS/eIBRS mitigation.

View at TechPowerUp Main Site | Source
 

DutchTraveller

New Member
Joined
Aug 16, 2023
Messages
10 (0.02/day)
Hyperthreading is often a factor in these security vulnerabilities.
So getting rid of it looks like a promising way to address this and increase ST performance too.

In programs that make very frequent calls to the operating system this should make a big difference.
 
Joined
Aug 20, 2007
Messages
21,467 (3.40/day)
System Name Pioneer
Processor Ryzen R9 9950X
Motherboard GIGABYTE Aorus Elite X670 AX
Cooling Noctua NH-D15 + A whole lotta Sunon and Corsair Maglev blower fans...
Memory 64GB (4x 16GB) G.Skill Flare X5 @ DDR5-6000 CL30
Video Card(s) XFX RX 7900 XTX Speedster Merc 310
Storage Intel 905p Optane 960GB boot, +2x Crucial P5 Plus 2TB PCIe 4.0 NVMe SSDs
Display(s) 55" LG 55" B9 OLED 4K Display
Case Thermaltake Core X31
Audio Device(s) TOSLINK->Schiit Modi MB->Asgard 2 DAC Amp->AKG Pro K712 Headphones or HDMI->B9 OLED
Power Supply FSP Hydro Ti Pro 850W
Mouse Logitech G305 Lightspeed Wireless
Keyboard WASD Code v3 with Cherry Green keyswitches + PBT DS keycaps
Software Gentoo Linux x64 / Windows 11 Enterprise IoT 2024
Joined
Jun 18, 2021
Messages
2,547 (2.03/day)
So getting rid of it looks like a promising way to address this and increase ST performance too.

I'm doubtfull of the performance gains it would provide for single threaded apps, but either way in the real world it's ever more rare for an application to be single threaded and it's even rarer - or let's say impossible unless you're talking embedded socs - for it to be running alone in a system. Hyperthreading exists because it provides a huge efficiency boost and it's kind of nuts to suggest killing it.
 
Joined
Feb 1, 2019
Messages
3,592 (1.69/day)
Location
UK, Midlands
System Name Main PC
Processor 13700k
Motherboard Asrock Z690 Steel Legend D4 - Bios 13.02
Cooling Noctua NH-D15S
Memory 32 Gig 3200CL14
Video Card(s) 4080 RTX SUPER FE 16G
Storage 1TB 980 PRO, 2TB SN850X, 2TB DC P4600, 1TB 860 EVO, 2x 3TB WD Red, 2x 4TB WD Red
Display(s) LG 27GL850
Case Fractal Define R4
Audio Device(s) Soundblaster AE-9
Power Supply Antec HCG 750 Gold
Software Windows 10 21H2 LTSC
Disclosing is like a tutorial how to exploit it. o_O

Curious how much the patch nerfs the CPUs. Wont update the bios but of course windows update might force the microcode.

Reread the first post, looks like existing software mitigations already exist for it.
 
Joined
Aug 20, 2007
Messages
21,467 (3.40/day)
System Name Pioneer
Processor Ryzen R9 9950X
Motherboard GIGABYTE Aorus Elite X670 AX
Cooling Noctua NH-D15 + A whole lotta Sunon and Corsair Maglev blower fans...
Memory 64GB (4x 16GB) G.Skill Flare X5 @ DDR5-6000 CL30
Video Card(s) XFX RX 7900 XTX Speedster Merc 310
Storage Intel 905p Optane 960GB boot, +2x Crucial P5 Plus 2TB PCIe 4.0 NVMe SSDs
Display(s) 55" LG 55" B9 OLED 4K Display
Case Thermaltake Core X31
Audio Device(s) TOSLINK->Schiit Modi MB->Asgard 2 DAC Amp->AKG Pro K712 Headphones or HDMI->B9 OLED
Power Supply FSP Hydro Ti Pro 850W
Mouse Logitech G305 Lightspeed Wireless
Keyboard WASD Code v3 with Cherry Green keyswitches + PBT DS keycaps
Software Gentoo Linux x64 / Windows 11 Enterprise IoT 2024
Joined
Jan 11, 2022
Messages
872 (0.83/day)
That's how software research is done, yes.
Unless you are a contractor for governments, then you bundle it in to hacking tools and don’t speak a word about it until some scandal reveals it to be used for unethical stuff in the best case
 

DutchTraveller

New Member
Joined
Aug 16, 2023
Messages
10 (0.02/day)
I'm doubtfull of the performance gains it would provide for single threaded apps, but either way in the real world it's ever more rare for an application to be single threaded and it's even rarer - or let's say impossible unless you're talking embedded socs - for it to be running alone in a system. Hyperthreading exists because it provides a huge efficiency boost and it's kind of nuts to suggest killing it.
For servers hyperthreading is worth it because it maximizes throughput and you should optimize your application for that.
But there is Amdahl's Law to take into account: there are often parts that are limited by the single-threaded performance.

I know a large commercial and very expensive application that uses an in-memory database but most of the many theads accessing that database need to lock it. The end result is that it is very dependent on the single-threaded performance.
On the desktop the situation is different, when cpu's with hyperthreading came out there were applications (mostly games) where it was advised to turn off hyperthreading.
We will see how it works out in practice when Lunar Lake comes out..
 
Joined
Jun 18, 2021
Messages
2,547 (2.03/day)
For servers hyperthreading is worth it because it maximizes throughput and you should optimize your application for that.

The exact same thing happens in a desktop, if you open task manager you'll notice you have around 2000 threads running at any given moment. They vary wildly in the workload and resources they require but if just 1% is very active that's already 20 threads.

Hyperthreading is almost always worth it and it's very easy to explain why: imagine a laundry service only starting a new washing cycle once the previous load comes out of the dryers. A core has many different parts to it, some of them can even be redundant (i.e. to handle higher bit counts or whatever) and it doesn't make sense to have them all doing nothing when the first operation is almost out the door.
 
Joined
Aug 20, 2007
Messages
21,467 (3.40/day)
System Name Pioneer
Processor Ryzen R9 9950X
Motherboard GIGABYTE Aorus Elite X670 AX
Cooling Noctua NH-D15 + A whole lotta Sunon and Corsair Maglev blower fans...
Memory 64GB (4x 16GB) G.Skill Flare X5 @ DDR5-6000 CL30
Video Card(s) XFX RX 7900 XTX Speedster Merc 310
Storage Intel 905p Optane 960GB boot, +2x Crucial P5 Plus 2TB PCIe 4.0 NVMe SSDs
Display(s) 55" LG 55" B9 OLED 4K Display
Case Thermaltake Core X31
Audio Device(s) TOSLINK->Schiit Modi MB->Asgard 2 DAC Amp->AKG Pro K712 Headphones or HDMI->B9 OLED
Power Supply FSP Hydro Ti Pro 850W
Mouse Logitech G305 Lightspeed Wireless
Keyboard WASD Code v3 with Cherry Green keyswitches + PBT DS keycaps
Software Gentoo Linux x64 / Windows 11 Enterprise IoT 2024
Unless you are a contractor for governments, then you bundle it in to hacking tools and don’t speak a word about it until some scandal reveals it to be used for unethical stuff in the best case
That wouldn't be a software researcher now would it? That would be someone with an agenda.
 
Joined
Feb 1, 2019
Messages
3,592 (1.69/day)
Location
UK, Midlands
System Name Main PC
Processor 13700k
Motherboard Asrock Z690 Steel Legend D4 - Bios 13.02
Cooling Noctua NH-D15S
Memory 32 Gig 3200CL14
Video Card(s) 4080 RTX SUPER FE 16G
Storage 1TB 980 PRO, 2TB SN850X, 2TB DC P4600, 1TB 860 EVO, 2x 3TB WD Red, 2x 4TB WD Red
Display(s) LG 27GL850
Case Fractal Define R4
Audio Device(s) Soundblaster AE-9
Power Supply Antec HCG 750 Gold
Software Windows 10 21H2 LTSC
The exact same thing happens in a desktop, if you open task manager you'll notice you have around 2000 threads running at any given moment. They vary wildly in the workload and resources they require but if just 1% is very active that's already 20 threads.

Hyperthreading is almost always worth it and it's very easy to explain why: imagine a laundry service only starting a new washing cycle once the previous load comes out of the dryers. A core has many different parts to it, some of them can even be redundant (i.e. to handle higher bit counts or whatever) and it doesn't make sense to have them all doing nothing when the first operation is almost out the door.
The laundry aspect I feel is a bad comparison compared to the scenario you painted with all the background threads.

Where HTT is at its best is if all available cores on the system are heavily loaded, but they have gaps for i/o wait. Assigning two threads per core allows a second thread to do processing whilst the first thread is waiting. This is why HTT shines in a few workloads such as software encoding but doesnt do much for general desktop use.

Your example with the washing machine is that, the machine whilst in use is busy and cannot be used by anyone else, on a windows desktop with 2000 threads, these threads will typically only be active for very short periods of time. The contention on a low utilised CPU means it can handle multiple threads fine.

One advantage of having two classes of CPU cores in a CPU is one set of cores can have all the background stuff sent to it, which reduces thread contention for the foreground interactive app. So e.g. if you have say 2000 threads on a system, what happens if 1992 of those threads are all on the e-cores and a 8 threaded game has 8 p-cores all to itself, thats vastly superior to any legacy HTT one class CPU system.

Whether HTT is worth with it all of this going on I suppose is down to opinion, do I think an extra 10% performance for an extra 50% of power is worth it? Usually no. But for some people, they want extra performance no matter the cost, they dont care if it makes the CPU throttle due to excessive temps or requires an extra 100w of power. Every % of throughput matters. Of course HTT has extra costs now with all the security problems as well. All the security mitigations I have seen for HTT is effectively disabling HTT to mitigate it.

I think when we talk between us its down to opinions, but I think when a CPU manufacturer decides the direction needs to change, its more than an opinion, they make the chips, they know the cost in terms of silicon space, the trade offs, the gains from removing HTT and so forth, so if intel are removing HTT, then its likely a net gain from it.

I think the only workloads I have seen HTT benefit is in the server space, or on desktop CPU based encoding. Otherwise its just benchmarks. I think it can help on highly threaded games, if your physical core count is below 8, but above that the benefit seems within margin of error. HTT can actually reduce performance as well, a 2 threaded game will run better if its using 2 physical cores compared to 2 logical cores on the same physical core.
 
Joined
Aug 20, 2007
Messages
21,467 (3.40/day)
System Name Pioneer
Processor Ryzen R9 9950X
Motherboard GIGABYTE Aorus Elite X670 AX
Cooling Noctua NH-D15 + A whole lotta Sunon and Corsair Maglev blower fans...
Memory 64GB (4x 16GB) G.Skill Flare X5 @ DDR5-6000 CL30
Video Card(s) XFX RX 7900 XTX Speedster Merc 310
Storage Intel 905p Optane 960GB boot, +2x Crucial P5 Plus 2TB PCIe 4.0 NVMe SSDs
Display(s) 55" LG 55" B9 OLED 4K Display
Case Thermaltake Core X31
Audio Device(s) TOSLINK->Schiit Modi MB->Asgard 2 DAC Amp->AKG Pro K712 Headphones or HDMI->B9 OLED
Power Supply FSP Hydro Ti Pro 850W
Mouse Logitech G305 Lightspeed Wireless
Keyboard WASD Code v3 with Cherry Green keyswitches + PBT DS keycaps
Software Gentoo Linux x64 / Windows 11 Enterprise IoT 2024
You guys do realize this has nothing to do with hyperthreading, right?
 
Joined
Jun 18, 2021
Messages
2,547 (2.03/day)
The laundry aspect I feel is a bad comparison compared to the scenario you painted with all the background threads.

Where HTT is at its best is if all available cores on the system are heavily loaded, but they have gaps for i/o wait. Assigning two threads per core allows a second thread to do processing whilst the first thread is waiting. This is why HTT shines in a few workloads such as software encoding but doesnt do much for general desktop use.

Your example with the washing machine is that, the machine whilst in use is busy and cannot be used by anyone else, on a windows desktop with 2000 threads, these threads will typically only be active for very short periods of time. The contention on a low utilised CPU means it can handle multiple threads fine.

One advantage of having two classes of CPU cores in a CPU is one set of cores can have all the background stuff sent to it, which reduces thread contention for the foreground interactive app. So e.g. if you have say 2000 threads on a system, what happens if 1992 of those threads are all on the e-cores and a 8 threaded game has 8 p-cores all to itself, thats vastly superior to any legacy HTT one class CPU system.

Whether HTT is worth with it all of this going on I suppose is down to opinion, do I think an extra 10% performance for an extra 50% of power is worth it? Usually no. But for some people, they want extra performance no matter the cost, they dont care if it makes the CPU throttle due to excessive temps or requires an extra 100w of power. Every % of throughput matters. Of course HTT has extra costs now with all the security problems as well. All the security mitigations I have seen for HTT is effectively disabling HTT to mitigate it.

I think when we talk between us its down to opinions, but I think when a CPU manufacturer decides the direction needs to change, its more than an opinion, they make the chips, they know the cost in terms of silicon space, the trade offs, the gains from removing HTT and so forth, so if intel are removing HTT, then its likely a net gain from it.

I think the only workloads I have seen HTT benefit is in the server space, or on desktop CPU based encoding. Otherwise its just benchmarks. I think it can help on highly threaded games, if your physical core count is below 8, but above that the benefit seems within margin of error. HTT can actually reduce performance as well, a 2 threaded game will run better if its using 2 physical cores compared to 2 logical cores on the same physical core.

No need to tell me, it was just a very layman explanation to be more easily understood ;)
 
Joined
Aug 22, 2007
Messages
3,548 (0.56/day)
Location
Terra
System Name :)
Processor Intel 13700k
Motherboard Gigabyte z790 UD AC
Cooling Noctua NH-D15
Memory 64GB GSKILL DDR5
Video Card(s) Gigabyte RTX 4090 Gaming OC
Storage 960GB Optane 905P U.2 SSD + 4TB PCIe4 U.2 SSD
Display(s) Alienware AW3423DW 175Hz QD-OLED + Nixeus 27" IPS 1440p 144Hz
Case Fractal Design Torrent
Audio Device(s) MOTU M4 - JBL 305P MKII w/2x JL Audio 10 Sealed --- X-Fi Titanium HD - Presonus Eris E5 - JBL 4412
Power Supply Silverstone 1000W
Mouse Roccat Kain 122 AIMO
Keyboard KBD67 Lite / Mammoth75
VR HMD Reverb G2 V2
Software Win 11 Pro
No need to tell me, it was just a very layman explanation to be more easily understood ;)
sounded more like pipelining than HTT ;)
 
Top