Friday, August 11th 2023

"Downfall" Intel CPU Vulnerability Can Impact Performance By 50%

Intel has recently revealed a security vulnerability named Downfall (CVE-2022-40982) that impacts multiple generations of Intel processors. The vulnerability is linked to Intel's memory optimization feature, exploiting the Gather instruction, a function that accelerates data fetching from scattered memory locations. It inadvertently exposes internal hardware registers, allowing malicious software access to data held by other programs. The flaw affects Intel mainstream and server processors ranging from the Skylake to Rocket Lake microarchitecture. The entire list of affected CPUs is here. Intel has responded by releasing updated software-level microcode to fix the flaw. However, there's concern over the performance impact of the fix, potentially affecting AVX2 and AVX-512 workloads involving the Gather instruction by up to 50%.

Phoronix tested the Downfall mitigations and reported varying performance decreases on different processors. For instance, two Xeon Platinum 8380 processors were around 6% slower in certain tests, while the Core i7-1165G7 faced performance degradation ranging from 11% to 39% in specific benchmarks. While these reductions were less than Intel's forecasted 50% overhead, they remain significant, especially in High-Performance Computing (HPC) workloads. The ramifications of Downfall are not restricted to specialized tasks like AI or HPC but may extend to more common applications such as video encoding. Though the microcode update is not mandatory and Intel provides an opt-out mechanism, users are left with a challenging decision between security and performance. Executing a Downfall attack might seem complex, but the final choice between implementing the mitigation or retaining performance will likely vary depending on individual needs and risk assessments.
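For readers who want to see whether the mitigation is active before deciding on the opt-out, here is a minimal Python sketch for Linux. It assumes a kernel recent enough to report Gather Data Sampling (Intel's name for Downfall) status under sysfs; on older kernels the file is simply absent. The helper names `gds_status` and `describe` are made up for this illustration.

```python
from pathlib import Path
from typing import Optional

# Sysfs file where recent Linux kernels report Gather Data Sampling (Downfall) status.
GDS_STATUS = Path("/sys/devices/system/cpu/vulnerabilities/gather_data_sampling")


def gds_status(path: Path = GDS_STATUS) -> Optional[str]:
    """Return the kernel's reported GDS status line, or None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def describe(status: Optional[str]) -> str:
    """Turn the raw status line into a short human-readable verdict."""
    if status is None:
        return "unknown (kernel does not report GDS status)"
    if status.startswith("Not affected"):
        return "CPU not affected"
    if status.startswith("Mitigation"):
        return "mitigated"
    if status.startswith("Unknown"):
        return "unknown: " + status
    return "vulnerable or unmitigated: " + status


if __name__ == "__main__":
    print(describe(gds_status()))
```

On such kernels the opt-out is the boot parameter `gather_data_sampling=off`; whether that trade-off is acceptable depends on the workload, as discussed above.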
Source: Phoronix

162 Comments on "Downfall" Intel CPU Vulnerability Can Impact Performance By 50%

#26
R-T-B
chrcoluk: Any example of these JavaScript Meltdown exploits out in the wild?
I'm not going to link malware, but you can find source code for examples on sites as mainstream as reddit:

javascript/comments/7ob6a2
Posted on Reply
#27
chrcoluk
R-T-B: I'm not going to link malware, but you can find source code for examples on sites as mainstream as reddit:

javascript/comments/7ob6a2
OK, I was hoping for some known in-the-wild examples, but I understand why you wouldn't post the links. I'll see if I can find anything out.
Posted on Reply
#28
Shihab
This uptick in hardware vulnerabilities, and the subsequent performance penalties from patches, makes me wonder if operating systems in general should shift towards a Linux kernel "flavors"-like approach. Provide two kernels: a performance-oriented one with only the most critical microcode patches, and a standard/secure one with everything locked down.

As an added bonus: Makes all those system hardening guidelines a little easier to maintain...
Chaitanya: There are many, many editing tools that rely heavily on AVX in some form, and there is a whole range of applications for WS which will also be impacted by the "fix".
To be fair, the vulnerability affects only one set of instructions in AVX2+. Explicit vectorization could forgo the op in favor of alternatives (and afaik, this was the better choice back in the early days).
Could pose problems to auto vectorization tho. But people don't typically expect that much performance out of them...

I'm not an expert, but I wonder if this feared performance loss could be itself mitigated by rewriting code to do manual loads instead of relying on the faulty ops.
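
To make the "manual loads instead of gather" idea concrete, here is a small Python model of what a gather does and what the replacement looks like. It only illustrates the semantics; it says nothing about speed. In real code this would be C or assembly, for example the AVX2 intrinsic `_mm256_i32gather_ps` versus a series of ordinary scalar loads. The function names below are invented for the sketch.

```python
from typing import List, Sequence


def gather(base: Sequence[float], indices: Sequence[int]) -> List[float]:
    """Model of a gather: one operation fetches base[i] for every index."""
    return [base[i] for i in indices]


def manual_loads(base: Sequence[float], indices: Sequence[int]) -> List[float]:
    """Same result built from individual loads, one vector lane at a time."""
    out = [0.0] * len(indices)
    for lane, i in enumerate(indices):
        out[lane] = base[i]  # one ordinary load per lane
    return out


table = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]
idx = [7, 0, 3, 3]  # scattered, possibly repeated, positions

# Both approaches produce the same vector; only the instructions used differ.
assert gather(table, idx) == manual_loads(table, idx)
print(manual_loads(table, idx))
```

Whether the manual version recovers the lost performance would have to be benchmarked per workload.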
Posted on Reply
#29
chrcoluk
R-T-B: I'm not going to link malware, but you can find source code for examples on sites as mainstream as reddit:

javascript/comments/7ob6a2
Just realised I can't actually disable the Meltdown mitigation, as I am now on a CPU with a hardware mitigation. I am going to research what you said anyway out of curiosity, but on my system it's still mitigated.

Speculation control settings for CVE-2017-5754 [rogue data cache load]

Hardware requires kernel VA shadowing: False
Posted on Reply
#30
kondamin
was there a xeon rocketlake?

sounds like it's not that big a deal if it's just older hardware but new chips aren't vulnerable to it.
Posted on Reply
#31
ncrs
Shihab: This uptick in hardware vulnerabilities, and the subsequent performance penalties from patches, makes me wonder if operating systems in general should shift towards a Linux kernel "flavors"-like approach. Provide two kernels: a performance-oriented one with only the most critical microcode patches, and a standard/secure one with everything locked down.
Microsoft has been doing this for a very long time with Windows. Some mitigations for previous vulnerabilities are not active by default on consumer versions of Windows while being enabled on server editions.
Shihab: To be fair, the vulnerability affects only one set of instructions in AVX2+. Explicit vectorization could forgo the op in favor of alternatives (and afaik, this was the better choice back in the early days).
Could pose problems to auto vectorization tho. But people don't typically expect that much performance out of them...

I'm not an expert, but I wonder if this feared performance loss could be itself mitigated by rewriting code to do manual loads instead of relying on the faulty ops.
I'm simplifying this a bit here. This vulnerability allows a malicious program using the AVX2 gather instructions to snoop on other programs (within the same core, with or without HT) that utilize different types of AVX2 instructions. In modern systems there are many places those instructions are used, particularly around encryption, but not limited to it - SIMD reading, writing and memory copying are also affected. Basically high-performance code paths.
Everything using AES-NI instructions is potentially affected, and that's accelerating the AES cipher used in many protocols including HTTPS. BitLocker also uses it for disk encryption, so theoretically a malicious program could extract its encryption keys. Whether it can be achieved in practice, especially in a consumer setting, is another issue. On servers it's another can of worms, particularly in cloud or VM environments.

Optimizing for AVX2/-512 brings very measurable increases in performance, so abandoning it is not really a solution. For example on my CPU AES-NI in VeraCrypt is able to achieve 16GB/s, but when disabled the performance tanks to 2.5GB/s. This decrease exceeds even the most pessimistic "up to 50%" impact of DOWNFALL mitigations.
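To put numbers on that comparison, a quick calculation in Python using only the two throughput figures above (the function name is just for illustration):

```python
def percent_decrease(before: float, after: float) -> float:
    """Relative drop from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


aes_ni_on = 16.0   # GB/s with AES-NI, figure from the post
aes_ni_off = 2.5   # GB/s without AES-NI, figure from the post

drop = percent_decrease(aes_ni_on, aes_ni_off)
print(f"{drop:.1f}% slower without AES-NI")  # 84.4% slower without AES-NI
```

So dropping hardware acceleration entirely costs roughly 84%, well beyond the worst-case 50% quoted for the mitigation.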
kondamin: was there a xeon rocketlake?
Yes, both "small server" Xeon-E and workstation Xeon-W, but no big chips.
kondamin: sounds like it's not that big a deal if it's just older hardware but new chips aren't vulnerable to it.
Intel is still supporting Skylake Xeons launched in 2017 until the end of this year. A lot of hardware from that period is still in use, most likely along with systems running even earlier architectures.
The performance impact is really dependent on workload and the actual use cases - if you're using a rack of servers to convert video files in an isolated network segment then you can disable mitigations. If you are a cloud/server hosting facility then you have to enable them in order to avoid potential liability.
Posted on Reply
#32
efikkan
chrcoluk: OK, I was hoping for some known in-the-wild examples, but I understand why you wouldn't post the links. I'll see if I can find anything out.
Most attack vectors for Spectre etc. rely on manipulating CPU registers to read/copy data that you shouldn't have access to. There is a tiny window of nanoseconds to read out this data. To my knowledge, most interpreted languages don't allow you to even manipulate CPU registers. I know of two main ways to execute such an attack: either you read out some "random" data which happened to be there, or you target a memory address and let the CPU prefetch it, time an attack and retrieve it before it's removed. Both of these examples would also require some bug in the interpreter. Now I haven't studied what is possible through WebAssembly, so something might be possible there... But if someone shows a loop in JavaScript leaking some data from one variable to another, that's a JavaScript bug, not a CPU bug. (And I'm not surprised if there are plenty of ways to escape JavaScript's memory sandbox.)
Posted on Reply
#33
Imouto
Denver: intel is suffering a streak of bad luck. :wtf:
You are saying that as if it was a teenager on a night out who's got a STD and gave birth to a whole bunch of retarded execs.
Posted on Reply
#34
Shihab
ncrs: Microsoft has been doing this for a very long time with Windows. Some mitigations for previous vulnerabilities are not active by default on consumer versions of Windows while being enabled on server editions.
Somewhat similar but not quite what I had in mind. Server and PC Windows are de facto two separate platforms. The kernel approach I had in mind applies to the same platform depending on the use case, say for example video editing (high performance) and office work (low performance); both cases [generally] apply to the PC (and I'm including workstations in the definition) platform rather than servers. If going by the Windows SKU scheme, they'd need a new Windows version or to modify their existing home/pro structure.
ncrs: SIMD reading, writing and memory copying are also affected.
All articles I've read only mention gather instructions. Nothing I've passed mentions anything about traditional load/store instructions.

No one is arguing the benefits of vectorization in general, but there are more ways to skin this cat.
I did contest benefits of compiler auto-vectorizations in typical scenarios. Applications that do benefit from SIMD tend to have explicit implementations, no?
Posted on Reply
#35
ncrs
Shihab: All articles I've read only mention gather instructions. Nothing I've passed mentions anything about traditional load/store instructions.
Gather instructions are always used on the attacking side. The DOWNFALL whitepaper has a list of affected victim instructions. In the case of Tiger Lake, 850 instructions leaked data in an HT environment.
Shihab: No one is arguing the benefits of vectorization in general, but there are more ways to skin this cat.
I did contest benefits of compiler auto-vectorizations in typical scenarios. Applications that do benefit from SIMD tend to have explicit implementations, no?
It doesn't really matter if they are manual or automatic optimizations when, with the test suite from the paper, 53% of tested AVX2/-512 instructions leak data.
Another issue is that most programs use libraries for high performance code, often without even knowing if they are using vectorized code or not internally. You're not rolling your own AES code, or at least you shouldn't ;), you use OpenSSL or even rely on the operating system in case of Windows/macOS. Those in turn are often vectorized/using hardware instructions like AES-NI in order to increase performance and save power.
Posted on Reply
#36
Scircura
efikkan: Most attack vectors for Spectre etc. rely on manipulating CPU registers to read/copy data that you shouldn't have access to. There is a tiny window of nanoseconds to read out this data. To my knowledge, most interpreted languages don't allow you to even manipulate CPU registers. I know of two main ways to execute such an attack: either you read out some "random" data which happened to be there, or you target a memory address and let the CPU prefetch it, time an attack and retrieve it before it's removed. Both of these examples would also require some bug in the interpreter. Now I haven't studied what is possible through WebAssembly, so something might be possible there... But if someone shows a loop in JavaScript leaking some data from one variable to another, that's a JavaScript bug, not a CPU bug. (And I'm not surprised if there are plenty of ways to escape JavaScript's memory sandbox.)
As was linked upthread, it is possible to do this in JS.

However, browsers quickly mitigated it by reducing the resolution of the timers that Spectre relied on for its side channel, making data extraction impractical.
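
A rough Python sketch of why a coarser timer hurts the attacker: the side channel depends on telling a cache hit from a cache miss by timing, and rounding timestamps hides that difference. The latencies and the 100 µs resolution below are hypothetical values picked for illustration, not measurements or actual browser settings.

```python
def coarsen(timestamp_ns: int, resolution_ns: int) -> int:
    """Round a timestamp down to a multiple of the timer resolution."""
    return (timestamp_ns // resolution_ns) * resolution_ns


def measured_duration(start_ns: int, end_ns: int, resolution_ns: int) -> int:
    """Duration as seen through a timer with the given resolution."""
    return coarsen(end_ns, resolution_ns) - coarsen(start_ns, resolution_ns)


START_NS = 1_000_000   # hypothetical start time
CACHE_HIT_NS = 40      # hypothetical latency of a cached access
CACHE_MISS_NS = 200    # hypothetical latency of an uncached access

for resolution_ns in (1, 100_000):
    hit = measured_duration(START_NS, START_NS + CACHE_HIT_NS, resolution_ns)
    miss = measured_duration(START_NS, START_NS + CACHE_MISS_NS, resolution_ns)
    print(f"resolution {resolution_ns} ns: hit={hit} miss={miss} "
          f"distinguishable={hit != miss}")
```

With the 1 ns timer the two cases differ; with the coarse timer both read as zero. A single measurement like this is not the whole story, since an attacker can repeat or amplify the signal.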

All the info you need about this is in the Wikipedia article.
Posted on Reply
#37
ncrs
Scircura: However, browsers quickly mitigated it by reducing the resolution of the timers that Spectre relied on for its side channel, making data extraction impractical.
It's always a game of cat-and-mouse. Here's a paper analyzing those browser mitigations and finding them lacking in certain areas, a short excerpt:
SharedArrayBuffer have been disabled by default in Chrome 60 and Firefox 57.0.4 to mitigate Spectre. With the introduction of mitigations to transient execution attacks, they have been reimplemented. They are available by default in Firefox 79 with COOP/COEP, and by default in Chrome 68. SharedArrayBuffer based timers are, by far, the most powerful timer available in browsers.
[...]
The offered resolution is sufficient to implement all known timing attacks. In addition, they have a very low measurement overhead and do not need amplification. An attacker using SharedArrayBuffer to build a covert channel can achieve an ideal bit rate of 50 Mbit/sec on both browsers. This is 800 000 times higher than with performance.now() on Firefox 81 without COOP/COEP, and 2000 times higher than Chrome 84 and Firefox 81 with COOP/COEP.
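
Working backwards from the figures in that excerpt gives a feel for the gap. This is plain arithmetic on the quoted numbers, nothing measured here:

```python
SAB_RATE_BITS_PER_S = 50_000_000  # 50 Mbit/sec, from the excerpt

# "800 000 times higher" and "2000 times higher", from the excerpt
rate_without_isolation = SAB_RATE_BITS_PER_S / 800_000
rate_with_isolation = SAB_RATE_BITS_PER_S / 2_000

print(f"performance.now(), Firefox 81 without COOP/COEP: {rate_without_isolation} bit/s")
print(f"performance.now(), with COOP/COEP: {rate_with_isolation / 1000} kbit/s")
```

That works out to about 62.5 bit/s and 25 kbit/s respectively for the performance.now() timers, against 50 Mbit/sec for SharedArrayBuffer.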
Posted on Reply
#38
chrcoluk
efikkan: Most attack vectors for Spectre etc. rely on manipulating CPU registers to read/copy data that you shouldn't have access to. There is a tiny window of nanoseconds to read out this data. To my knowledge, most interpreted languages don't allow you to even manipulate CPU registers. I know of two main ways to execute such an attack: either you read out some "random" data which happened to be there, or you target a memory address and let the CPU prefetch it, time an attack and retrieve it before it's removed. Both of these examples would also require some bug in the interpreter. Now I haven't studied what is possible through WebAssembly, so something might be possible there... But if someone shows a loop in JavaScript leaking some data from one variable to another, that's a JavaScript bug, not a CPU bug. (And I'm not surprised if there are plenty of ways to escape JavaScript's memory sandbox.)
Yep, that's the conclusion I had already reached: difficult to do in the wild.
Posted on Reply
#39
R-T-B
chrcoluk: Just realised I can't actually disable the Meltdown mitigation, as I am now on a CPU with a hardware mitigation. I am going to research what you said anyway out of curiosity, but on my system it's still mitigated.

Speculation control settings for CVE-2017-5754 [rogue data cache load]

Hardware requires kernel VA shadowing: False
Yeah, honestly the original Meltdown only applies to like Skylake and older. Maybe Rocket Lake too, I'm not entirely sure anymore, the whole thing is just a wild table of "if this then" that could drive anyone insane lol.

My advice would be to trust your OS vendor unless it really is just a gaming and local code execution rig. In that case, feel free to turn them off.
Posted on Reply
#40
lexluthermiester
TumbleGeorge: Intel is no longer a company that I respect. :(
freeagent: -1 for engineering shortcuts.

Booo.
Seriously? :wtf: These kinds of things are not sloppy engineering. No design team is thinking about loopholes or wacky ways to exploit what they're creating. They're designing the fastest and most efficient way to do the things they're trying to do. These things are not intentional and are not a sign of incompetence.

Come on people, you're all smart enough to know how things work. See sense.
unwind-protect: Only a question of time until somebody triggers this from JavaScript or WebAssembly, so it is relevant to everybody surfing the web.
My guess is no. If it is possible from a remote vector, the difficulty will be high, if not extreme.
Posted on Reply
#41
freeagent
lexluthermiester: Seriously?
Anything to get the edge on the competition, who are we to say if shortcuts were made or not?

I am sure some concessions were made throughout product development on the entire Core line, from the first to the 14th.

What exactly they were, only someone on the inside could say.. But if a mitigation takes a certain product line back a gen or two..
Posted on Reply
#42
lexluthermiester
freeagent: who are we to say if shortcuts were made or not?
With IC design, that's not how it works.
Posted on Reply
#43
MarsM4N
lexluthermiester: Seriously? :wtf: These kinds of things are not sloppy engineering. No design team is thinking about loopholes or wacky ways to exploit what they're creating. They're designing the fastest and most efficient way to do the things they're trying to do. These things are not intentional and are not a sign of incompetence.

Come on people, you're all smart enough to know how things work. See sense.
Engineers are just humans, and humans make errors. Period. ;) I guess the only solution would be chips (and software/firmware) designed by AI. Because machines don't make errors.

Also humans are influenceable, manipulable, corrupt and ideology driven. Just ask yourself, would you pass on a truckload of cash from a GOV agency just for slipping in an exploitable bug?

Posted on Reply
#44
Tahagomizer
It requires local privileged access. If an attacker has that, the battle is already lost, so big whoop. Secure your damn systems.
Posted on Reply
#45
R-T-B
Tahagomizer: It requires local privileged access.
I keep hearing privileged but is there anywhere actually saying/confirming that? Sounds to me like you just need to be able to execute code.
Posted on Reply
#46
Tahagomizer
R-T-B: I keep hearing privileged but is there anywhere actually saying/confirming that? Sounds to me like you just need to be able to execute code.
True, it's a PR:L but also an AV:L, so you need at least authenticated local user access. In serious environments you shouldn't allow users to run arbitrary and unverified code. As for home users who tend to ignore security: it's a side-channel attack. They're difficult to execute efficiently and there is a myriad of easier ways to get what you want - hell, some users will actually give you their bank password if you ask politely. Therefore I say it's a "storm in a glass" situation. A lot of noise, realistically not a problem.
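For anyone unfamiliar with the PR:L / AV:L shorthand: those are metrics from a CVSS v3.1 vector string. Below is a small Python sketch that decodes the two metrics mentioned. Only AV:L and PR:L are taken from this post; the other metrics in the example vector are placeholders for illustration, so check the vendor advisory for the real string.

```python
from typing import Dict

# Value labels as defined by the CVSS v3.1 specification.
ATTACK_VECTOR = {"N": "Network", "A": "Adjacent", "L": "Local", "P": "Physical"}
PRIVILEGES_REQUIRED = {"N": "None", "L": "Low", "H": "High"}


def parse_cvss(vector: str) -> Dict[str, str]:
    """Split a CVSS vector like 'CVSS:3.1/AV:L/PR:L' into a metric -> value map."""
    metrics = {}
    for part in vector.split("/"):
        key, _, value = part.partition(":")
        metrics[key] = value
    return metrics


# Illustrative vector: only AV:L and PR:L come from the thread.
EXAMPLE_VECTOR = "CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:C/C:H/I:N/A:N"

metrics = parse_cvss(EXAMPLE_VECTOR)
print("Attack vector:", ATTACK_VECTOR[metrics["AV"]])
print("Privileges required:", PRIVILEGES_REQUIRED[metrics["PR"]])
```

"Local" plus "Low" is what the paragraph above describes: an ordinary logged-in user running code on the machine, not an administrator.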
Posted on Reply
#47
Patriot
mb194dc: Downfall requires admin access? and will only be relevant in use cases where multiple unconnected users share machines, ie shared server environments. So generally, it's not an issue.

There has been a trend towards security at hardware or other levels, when these are rarely (never?) exploited in the real world anyway. The best hacking tools are social engineering, user and configuration error and generally the human element. Not hardware!
So an issue for every single cloud server...
Posted on Reply
#48
AnotherReader
Patriot: So an issue for every single cloud server...
Downfall also relies on SMT as the attacker should be running on the same core as the victim. These cloud providers should stop running programs from different customers on the same cores.
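If you want to see which logical CPUs share a physical core on a Linux box, the kernel exposes that under sysfs. A minimal Python sketch, assuming the standard topology files are present; the function names are made up for the example.

```python
from pathlib import Path
from typing import List, Optional


def parse_cpu_list(text: str) -> List[int]:
    """Parse a kernel CPU list such as '0,4' or '0-1,8-9' into CPU numbers."""
    cpus: List[int] = []
    for chunk in text.strip().split(","):
        if not chunk:
            continue
        if "-" in chunk:
            low, high = chunk.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(chunk))
    return cpus


def smt_siblings(cpu: int = 0) -> Optional[List[int]]:
    """Logical CPUs sharing a core with `cpu`, or None if sysfs is unavailable."""
    path = Path(f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list")
    try:
        return parse_cpu_list(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print("CPU 0 shares its core with:", smt_siblings(0))
```

A result with more than one entry means SMT is active on that core, so pinning different tenants to different sibling groups is what keeps them off the same core.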
Posted on Reply
#49
lexluthermiester
MarsM4N: Engineers are just humans, and humans make errors. Period.
Exactly. But they're also NOT clairvoyant. There are times when it is impossible to see a problem coming until it's already behind you.

So if I seemed a bit harsh earlier, I'm sorry. The problem is that some are implying clumsy, careless or nefarious actions, when ALL of these exotic and crazy vulnerabilities in the last 7 years came about from experimentation with the hardware in ways that no one designing said hardware ever planned for, imagined or could have predicted.

We can't lay the blame at their feet and scream "Why did you do this?!?!". It just doesn't work that way.
Posted on Reply
#50
MarsM4N
lexluthermiester: Exactly. But they're also NOT clairvoyant. There are times when it is impossible to see a problem coming until it's already behind you.
They could bigly reduce such "unforeseen consequences" with proper QA. ;) But they're doing the exact opposite, cutting corners wherever they can to increase profits for shareholders. Just look at all the recent scandals, not only in tech. Food safety, drug safety, finance, you name it. And when stuff gets public they all act surprised. On top of that, governments let them off the hook way too easily "to protect jobs", which kinda encourages them to not change a thing.



Also, it's not surprising that tech security flaws stay undetected for so long. There are not many people on the planet who actually have an understanding of the tech, and those who do work either for the tech companies, the GOV or bad actors. And none of them are interested in making security flaws public; two of them even abuse them. That's why most security flaws are reported by private researchers.
Posted on Reply