Intel's Patch for Meltdown, Spectre "Complete and Utter Garbage:" Linus Torvalds

btarunr · Jan 23, 2018

Linus Torvalds, creator of Linux, the most popular datacenter operating system, proclaimed Intel's patches for the recent Meltdown and Spectre CPU vulnerabilities "complete and utter garbage." Torvalds continues to work on the innermost code of Linux, and has been closely involved with the kernel patches that are supposed to work in conjunction with updated CPU microcode to mitigate the two vulnerabilities, which threaten to severely compromise the security of datacenters and cloud-computing service providers.

Torvalds, in a heated public email thread with David Woodhouse, an Amazon engineer based in the UK, called Intel's fix "insane" and questioned the intent behind making the patch "toggle-able" (any admin can disable the patch for a seemingly cataclysmic vulnerability, one that could bring down a Fortune 500 company). Torvalds also takes issue with redundant fixes for vulnerabilities already addressed by Google's "retpoline" technique. Later in the thread, Woodhouse admits that there's no good reason for Intel's patches to be "opt-in." Intel commented on this exchange with a vanilla-flavored potato: "We take the feedback of industry partners seriously. We are actively engaging with the Linux community, including Linus, as we seek to work together on solutions."


RejZoR · Jan 23, 2018

Well, if you can just simply toggle the patch, malware can do that too. And then siphon data through a cache exploit undetected lol

Death Star · Jan 23, 2018

I'm not defending Intel's handling of this catastrophe in the slightest, but there are a few pertinent followup e-mails in the chain, which at least offer a bit of additional explanation:

http://lkml.iu.edu/hypermail/linux/kernel/1801.2/05282.html

On Sun, 2018-01-21 at 14:27 -0800, Linus Torvalds wrote:
> On Sun, Jan 21, 2018 at 2:00 PM, David Woodhouse <dwmw2@xxxxxxxxxxxxx> wrote:
> >>
> >> The patches do things like add the garbage MSR writes to the kernel
> >> entry/exit points. That's insane. That says "we're trying to protect
> >> the kernel". We already have retpoline there, with less overhead.
> >
> > You're looking at IBRS usage, not IBPB. They are different things.
>
> Ehh. Odd intel naming detail.
>
> If you look at this series, it very much does that kernel entry/exit
> stuff. It was patch 10/10, iirc. In fact, the patch I was replying to
> was explicitly setting that garbage up.
>
> And I really don't want to see these garbage patches just mindlessly
> sent around.

I think we've covered the technical part of this now, not that you like
it - not that any of us *like* it. But since the peanut gallery is
paying lots of attention it's probably worth explaining it a little
more for their benefit.

This is all about Spectre variant 2, where the CPU can be tricked into
mispredicting the target of an indirect branch. And I'm specifically
looking at what we can do on *current* hardware, where we're limited to
the hacks they can manage to add in the microcode.

The new microcode from Intel and AMD adds three new features.

One new feature (IBPB) is a complete barrier for branch prediction.
After frobbing this, no branch targets learned earlier are going to be
used. It's kind of expensive (order of magnitude ~4000 cycles).

The second (STIBP) protects a hyperthread sibling from following branch
predictions which were learned on another sibling. You *might* want
this when running unrelated processes in userspace, for example. Or
different VM guests running on HT siblings.

The third feature (IBRS) is more complicated. It's designed to be
set when you enter a more privileged execution mode (i.e. the kernel).
It prevents branch targets learned in a less-privileged execution mode,
BEFORE IT WAS MOST RECENTLY SET, from taking effect. But it's not just
a 'set-and-forget' feature, it also has barrier-like semantics and
needs to be set on *each* entry into the kernel (from userspace or a VM
guest). It's *also* expensive. And a vile hack, but for a while it was
the only option we had.

Even with IBRS, the CPU cannot tell the difference between different
userspace processes, and between different VM guests. So in addition to
IBRS to protect the kernel, we need the full IBPB barrier on context
switch and vmexit. And maybe STIBP while they're running.
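A minimal sketch (not taken from the patch series) of the three controls described above and the points at which the email says each would be applied. The MSR numbers and bit positions follow Intel's published speculation-control interface; the event labels and the `writes_for_event` helper are hypothetical names used only for illustration, and the write on kernel exit is an assumption about how "while they're running" would be handled. Real code would issue privileged `wrmsr` instructions inside the kernel rather than return tuples.

```python
# Illustrative model only; real MSR writes need ring 0 (wrmsr).
MSR_IA32_SPEC_CTRL = 0x48   # holds the IBRS and STIBP control bits
MSR_IA32_PRED_CMD = 0x49    # command register used to trigger IBPB

SPEC_CTRL_IBRS = 1 << 0     # restrict indirect branch speculation
SPEC_CTRL_STIBP = 1 << 1    # isolate predictions between HT siblings
PRED_CMD_IBPB = 1 << 0      # barrier: forget previously learned targets


def writes_for_event(event, use_ibrs=True, use_stibp=False):
    """Return the (msr, value) writes modelled for a given event.

    `event` is a hypothetical label: "kernel_entry", "kernel_exit",
    "context_switch" or "vmexit".
    """
    writes = []
    if event in ("kernel_entry", "vmexit") and use_ibrs:
        # IBRS has barrier-like semantics, so it is rewritten on *each*
        # entry into the kernel rather than set once.
        writes.append((MSR_IA32_SPEC_CTRL, SPEC_CTRL_IBRS))
    if event in ("context_switch", "vmexit"):
        # IBRS cannot tell processes or guests apart, so a full barrier
        # is issued when switching between them.
        writes.append((MSR_IA32_PRED_CMD, PRED_CMD_IBPB))
    if event == "kernel_exit" and (use_ibrs or use_stibp):
        # Assumption: clear IBRS on the way out, optionally leaving
        # STIBP set while userspace runs.
        value = SPEC_CTRL_STIBP if use_stibp else 0
        writes.append((MSR_IA32_SPEC_CTRL, value))
    return writes
```

Passing `use_ibrs=False` models the retpoline-based setup discussed later in the email, where only the IBPB barrier on context switch and vmexit remains.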

Then along came Paul with the cunning plan of "oh, indirect branches
can be exploited? Screw it, let's not have any of *those* then", which
is retpoline. And it's a *lot* faster than frobbing IBRS on every entry
into the kernel. It's a massive performance win.

So now we *mostly* don't need IBRS. We build with retpoline, use IBPB
on context switches/vmexit (which is in the first part of this patch
series before IBRS is added), and we're safe. We even refactored the
patch series to put retpoline first.

But wait, why did I say "mostly"? Well, not everyone has a retpoline
compiler yet... but OK, screw them; they need to update.

Then there's Skylake, and that generation of CPU cores. For complicated
reasons they actually end up being vulnerable not just on indirect
branches, but also on a 'ret' in some circumstances (such as 16+ CALLs
in a deep chain).
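To make the "16+ CALLs" remark concrete, here is a toy model of a return-stack buffer, assuming a 16-entry buffer (a figure inferred from the email, not a documented constant here) that simply drops its oldest entry on overflow. The function name and the drop-oldest behaviour are illustrative assumptions; real hardware is more involved. The point is only that once a call chain is deeper than the buffer, some returns find it empty and must be predicted some other way, which is the case retpoline does not cover.

```python
from collections import deque

RSB_DEPTH = 16  # assumed entry count, taken from the "16+ CALLs" remark


def count_rsb_underflows(call_depth, rsb_depth=RSB_DEPTH):
    """Count returns that find the modelled return-stack buffer empty.

    Simulates `call_depth` nested CALLs followed by the same number of
    RETs. A bounded deque discards its oldest entry when full, which is
    a simplification of real hardware behaviour.
    """
    rsb = deque(maxlen=rsb_depth)
    for frame in range(call_depth):
        rsb.append(frame)      # CALL records a predicted return target
    underflows = 0
    for _ in range(call_depth):
        if rsb:
            rsb.pop()          # RET is predicted from the buffer
        else:
            underflows += 1    # buffer empty: RET needs another predictor
    return underflows
```

In this model a chain of 16 calls unwinds without any underflow, while a chain of 20 calls leaves the last 4 returns without a buffered prediction.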

The IBRS solution, ugly though it is, did address that. Retpoline
doesn't. There are patches being floated to detect and prevent deep
stacks, and deal with some of the other special cases that bite on SKL,
but those are icky too. And in fact IBRS performance isn't anywhere
near as bad on this generation of CPUs as it is on earlier CPUs
*anyway*, which makes it not quite so insane to *contemplate* using it
as Intel proposed.

That's why my initial idea, as implemented in this RFC patchset, was to
stick with IBRS on Skylake, and use retpoline everywhere else. I'll
give you "garbage patches", but they weren't being "just mindlessly
sent around". If we're going to drop IBRS support and accept the
caveats, then let's do it as a conscious decision having seen what it
would look like, not just drop it quietly because poor Davey is too
scared that Linus might shout at him again.
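The selection policy described in that paragraph (IBRS on Skylake, retpoline everywhere else) could be sketched roughly as follows. This is a hedged illustration, not the logic of the RFC patchset: the function name, its boolean parameters, the returned labels, and the fallback order when a feature is missing are all assumptions.

```python
def choose_kernel_mitigation(is_skylake_era, has_ibrs_microcode,
                             has_retpoline_compiler):
    """Pick how the kernel protects itself against Spectre variant 2.

    Returns one of the hypothetical labels "ibrs", "retpoline" or
    "vulnerable".
    """
    if is_skylake_era and has_ibrs_microcode:
        # Retpoline does not cover the 'ret' case on these cores, and
        # IBRS is reported to be less costly there than on older CPUs.
        return "ibrs"
    if has_retpoline_compiler:
        # Everywhere else retpoline is the much faster option.
        return "retpoline"
    if has_ibrs_microcode:
        # Assumed fallback when the compiler lacks retpoline support.
        return "ibrs"
    return "vulnerable"
```

Dropping IBRS for kernel protection, as the email goes on to accept, would amount to removing the first branch and living with the Skylake caveat.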

I have seen *hand-wavy* analyses of the Skylake thing that mean I'm not
actually lying awake at night fretting about it, but nothing concrete
that really says it's OK.

If you view retpoline as a performance optimisation, which is how it
first arrived, then it's rather unconventional to say "well, it only
opens a *little* bit of a security hole but it does go nice and fast so
let's do it".

But fine, I'm content with ditching the use of IBRS to protect the
kernel, and I'm not even surprised. There's a *reason* we put it last
in the series, as both the most contentious and most dispensable part.
I'd be *happier* with a coherent analysis showing Skylake is still OK,
but hey-ho, screw Skylake.

The early part of the series adds the new feature bits and detects when
it can turn KPTI off on non-Meltdown-vulnerable Intel CPUs, and also
supports the IBPB barrier that we need to make retpoline complete. That
much I think we definitely *do* want. There have been a bunch of us
working on this behind the scenes; one of us will probably post that
bit in the next day or so.

I think we also want to expose IBRS to VM guests, even if we don't use
it ourselves. Because Windows guests (and RHEL guests; yay!) do use it.

If we can be done with the shouty part, I'd actually quite like to have
a sensible discussion about when, if ever, we do IBPB on context switch
(ptraceability and dumpable have both been suggested) and when, if
ever, we set STIBP in userspace.

R-T-B · Jan 23, 2018

This is also best taken in context: Linus is only referencing the submitted linux kernel patches. Windows and microcode patches need not apply here.

Steevo · Jan 23, 2018

Haha. Primitive hardware functions designed to make Intel faster are multi-floor screen doors in a submarine.

Darmok N Jalad · Jan 23, 2018

RejZoR said:
Well, if you can just simply toggle the patch, malware can do that too. And then siphon data through a cache exploit undetected lol

Woohoo! The return of the “Turbo” button on PCs!

AltCapwn · Jan 23, 2018

Since I installed the fix on all our enterprise PCs, we've had a fruck load of random issues. A lot of PCs slowed down too. I'm angry at Intel right now.

R-T-B · Jan 23, 2018

altcapwn said:
Since I installed the fix on all our enterprise PCs, we've had a fruck load of random issues. A lot of PCs slowed down too. I'm angry at Intel right now.

The meltdown fix should be pretty problem free.

I assume you are talking about the microcode fix for spectre?

AltCapwn · Jan 23, 2018

R-T-B said:
The meltdown fix should be pretty problem free.

I assume you are talking about the microcode fix for spectre?

Yes exactly.

Papahyooie · Jan 23, 2018

I just want to reiterate: "Vanilla-flavored potato."

Aquinus · Jan 24, 2018

R-T-B said:
I assume you are talking about the microcode fix for spectre?

Important takeaway is this part of Linus' response:

Linus Torvalds said:
That's part of the big problem here. The speculation control cpuid stuff shows that Intel actually seems to plan on doing the right thing for meltdown (the main question being _when_). Which is not a huge surprise, since it should be easy to fix, and it's a really honking big hole to drive through. Not doing the right thing for meltdown would be completely unacceptable.

So the IBRS garbage implies that Intel is _not_ planning on doing the right thing for the indirect branch speculation.

Honestly, that's completely unacceptable too.

Big Edit: More or less, I read that as Intel not making a microcode update for the indirect branch speculation stuff. I don't do kernel and system dev, but I can kind of understand what they're talking about when I read through the thread (which I did.) The main problem seems to be that there isn't a clear way to solve this issue if it's going to be fixed at the OS level in the kernel instead of as a microcode update. On one hand you have a hole that, depending on the context in which it has been run, may be a security vulnerability. On the other hand, doing a microcode update could very well mean a substantial performance hit, possibly one even bigger than retpoline (which isn't in places where it doesn't have to be, mind you.)

So I see it like this: Intel could fix it with a microcode update, which would cost more performance across the board but would patch the hole for good, or it could be left up to kernel and software developers to determine if and when protections from this kind of exploit are required. I personally think that's a big ask of the application development community because (and I say this as an application dev) I don't want to be thinking about when I need to protect hardware from an attack, and I think Linus is thinking the same thing.

Honestly, I don't care what the performance hit is. Intel needs to man up and fix this instead of trying to pass the buck. It's a problem that they need to own up to and I would hold AMD and ARM to similar standards. I understand that these things happen. At work I've spent the last several days fixing bugs and they happen more than you realize, but if something makes it to production, you fix it as quickly as possible. If it hurts performance, that can be part of the next release (for CPUs that would be next gen,) but you have to freaking fix it.

So, rant over, tl;dr: Intel needs to fix this, regardless of the performance hit. Not doing a microcode update for this is unacceptable as Linus suggests.

Katanai · Jan 24, 2018

Aquinus said:
So, rant over, tl;dr: Intel needs to fix this, regardless of the performance hit.

This would be unacceptable for me and a lot of people. Think about it this way: if you lose 5-10% performance on your CPU, it's like you swapped your CPU for an older-generation chip. Even so, let's say it might work for you and me, but how about servers with hundreds of CPUs in them? The performance loss would be massive. Maybe a company swapped 256 CPUs in a cluster for a newer generation to get a 10% CPU performance increase. What now? You take all that back? Give them back the money they spent? Believe me, they will ask for it, in court...

R-T-B · Jan 24, 2018

Katanai said:
Believe me, they will ask for it, in court...

They already are. However, for any serious company hosting a datacenter, not patching is simply not an option. You'd have your servers completely at the mercy of your users.

hat · Jan 24, 2018

I wonder when we can expect a hardware fix that works properly without degrading performance? Personally I have no issues waiting for whatever comes after the current generation...

londiste · Jan 24, 2018

As the referenced email thread states, the patches in question are for Spectre (apparently for Variant 2 of it).

RejZoR said:
Well, if you can just simply toggle the patch, malware can do that too. And then siphon data through a cache exploit undetected lol

Again, if you have this kind of access to the operating system kernel, you have no need for something like Spectre or Meltdown.

R-T-B said:
This is also best taken in context: Linus is only referencing the submitted linux kernel patches. Windows and microcode patches need not apply here.

One has to wonder if Microsoft is fighting back at Intel in the same way. The microcode patches these kernel updates rely on are common to both/all operating systems.

Aquinus said:
Honestly, I don't care what the performance hit is. Intel needs to man up and fix this instead of trying to pass the buck. It's a problem that they need to own up to and I would hold AMD and ARM to similar standards. I understand that these things happen. At work I've spent the last several days fixing bugs and they happen more than you realize, but if something makes it to production, you fix it as quickly as possible. If it hurts performance, that can be part of the next release (for CPUs that would be next gen,) but you have to freaking fix it.

So, rant over, tl;dr: Intel needs to fix this, regardless of the performance hit. Not doing a microcode update for this is unacceptable as Linus suggests.

Microcode does get updated either way. The features these kernel patches rely on should come (or get updated) with the microcode updates.
AMD and ARM are an interesting question here. Were their patches for this good?

Melvis · Jan 24, 2018

LOL That just makes me laugh when I read what Linus wrote, what a dude! Stick it to them man and make them fix their shit, intel....the company no one can trust.

xenocide · Jan 24, 2018

Aquinus said:
Honestly, I don't care what the performance hit is. Intel needs to man up and fix this instead of trying to pass the buck. It's a problem that they need to own up to and I would hold AMD and ARM to similar standards. I understand that these things happen. At work I've spent the last several days fixing bugs and they happen more than you realize, but if something makes it to production, you fix it as quickly as possible. If it hurts performance, that can be part of the next release (for CPUs that would be next gen,) but you have to freaking fix it.

So, rant over, tl;dr: Intel needs to fix this, regardless of the performance hit. Not doing a microcode update for this is unacceptable as Linus suggests.

The "fix" for this is designing a new CPU from the ground up. I'm confused by your post because you seem to acknowledge that, but also rant about how they aren't doing enough. There's not much they can do, it's an architectural problem, and they are trying to find the least negative solution to mitigate the security threat--since the places most at risk for this are datacenters which are using dozens or thousands of Intel CPUs at a time. We've already seen what can happen when all those CPUs take a hit--various games had servers that had to be taken offline the first time they pushed out a patch that involved a 5-10% performance hit. If they pushed out a roughly thrown together patch with a 25% performance hit, they would probably crash half the Internet...
