Wednesday, May 8th 2019
Some AMD Processors Have a Hardware RNG Bug, Losing Randomness After Suspend Resume
Lennart Poettering, lead developer of systemd (the system and service manager) at Red Hat, discovered that the AMD A6-6310 "Beema" SoC, popular in low-cost notebooks, has a faulty implementation of the RdRand random-number generation instruction. The processor's hardware random number generator (RNG) loses "randomness" after the machine resumes from a suspended state (for example, when a sleeping notebook is woken by opening its lid). Modern computers rely on RNGs for "entropy," which is critical to generating unpredictable keys on the fly for SSL. However, the entropy source need not be hardware, and is not by default. Software RNGs exist, and by default the Linux kernel does not use RdRand to generate entropy. Windows is not known to use RdRand for basic ACPI functions such as suspend/resume; however, a faulty hardware RNG is not without implications for the platform and the applications that run on it.
Users on GitHub and Bugzilla report that, with this bug, a machine cannot be suspended a second time after waking from a suspended state if your kernel uses RdRand. Commit cc83d51 to systemd introduced optional randomness generation based on the RdRand instruction: if the instruction is present, it is used to generate UUIDs for invocation IDs. Michael Larabel of Phoronix comments that the RdRand bug is only found on older generations of AMD processors, "Excavator" and older, and does not affect the latest "Zen" processors. This bug report chronicles what's wrong with RdRand on the affected processors, as does this Linux kernel Bugzilla thread. When RdRand is not used as part of generating a UUID, the reported systemd issue no longer occurs. Red Hat is working on a solution to this bug.
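To illustrate the idea of using a hardware RNG only when it is present and otherwise avoiding it, here is a minimal Python sketch. It is not systemd's code: the `source` callable is a hypothetical stand-in for an RdRand read, and the sanity check (rejecting all-zero, all-one, or repeated values) is an assumption about how a stuck generator might show up, not a description of the actual fault or of any published fix.

```python
import os
import uuid

# Values treated as suspicious output from a hardware RNG.
# Assumption for illustration: a stuck generator returns a constant pattern.
SUSPICIOUS = {0, 0xFFFFFFFF, 0xFFFFFFFFFFFFFFFF}


def hw_random_u64(source):
    """Return one 64-bit value from `source`, or None if it looks unusable.

    `source` is a hypothetical callable standing in for an RdRand read;
    it returns an int, or None when the instruction is absent or fails.
    """
    if source is None:
        return None
    first = source()
    second = source()
    if first is None or second is None:
        return None
    # Two identical or constant-pattern reads suggest the generator is stuck.
    if first in SUSPICIOUS or second in SUSPICIOUS or first == second:
        return None
    return first


def random_bytes16(source=None):
    """Return 16 random bytes, preferring `source` but falling back to the OS."""
    high = hw_random_u64(source)
    low = hw_random_u64(source)
    if high is None or low is None:
        # Fallback: the operating system's entropy pool.
        return os.urandom(16)
    return high.to_bytes(8, "big") + low.to_bytes(8, "big")


def invocation_id(source=None):
    """Build a random (version 4) UUID from the chosen byte source."""
    return uuid.UUID(bytes=random_bytes16(source), version=4)
```

Passing a source that always returns the same value, such as `lambda: 0xFFFFFFFFFFFFFFFF`, makes the sketch fall back to `os.urandom`, so successive IDs still differ.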
Source:
Phoronix
27 Comments on Some AMD Processors Have a Hardware RNG Bug, Losing Randomness After Suspend Resume
Family 23/17h (Zen) is not affected, Family 22/16h (Jaguar/Puma) is affected.
Everyone who uses (or used...) this RdRand now has a moment of doubt - and potentially a problem.
AMD has to check how deep this goes and either fix or disable it in an update.
Companies that use these CPUs in offline systems (once again: embedded!) will have to analyze the risks as well.
That's how things go. This is a standard procedure for most enterprises. Hardware and software faults are found all the time. I'm not sure if I understand your post, but it doesn't seem correct the way I read it.
Pseudo-random numbers generated computationally are always deterministic - hence, of limited "quality".
Of course they are perfectly fine for many applications.
However, for specific scenarios (e.g. complex financial modelling, scientific simulations) it is often recommended (sometimes: required) to use a higher quality source.
You can either go for a very complex algorithm or a hardware generator. Either way, better randomness means slower operation.
The popular fix for this, commonly used since Intel introduced RDRAND, is to keep using a good PRNG and periodically reseed it with RDRAND (RDSEED, in fact).
Now, if it turns out that AMD's RDRAND is not reliable, this (really common) approach has to be scrapped as it may even be worse than using a PRNG all the way...
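The reseeding approach described above, and the way it breaks when the seed source gets stuck, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `os.urandom` stands in for RDSEED/RDRAND, the class name and the `reseed_interval` parameter are made up for the example, and Python's `random.Random` (Mersenne Twister) is a simulation-grade PRNG, not a cryptographic one.

```python
import os
import random


class ReseedingPRNG:
    """A fast PRNG that is reseeded from an entropy source every N draws."""

    def __init__(self, reseed_interval=1024, entropy=os.urandom):
        # `entropy` is a callable taking a byte count and returning bytes;
        # os.urandom is used here as a stand-in for a hardware source.
        self._entropy = entropy
        self._interval = reseed_interval
        self._rng = random.Random()
        self._draws = 0
        self._reseed()

    def _reseed(self):
        seed = int.from_bytes(self._entropy(32), "big")
        self._rng.seed(seed)
        self._draws = 0

    def next_float(self):
        """Return a float in [0, 1), reseeding first if the interval is used up."""
        if self._draws >= self._interval:
            self._reseed()
        self._draws += 1
        return self._rng.random()


# A stuck entropy source (always the same bytes) makes every reseed restart
# the same sequence, so the output repeats with a period of reseed_interval.
stuck = ReseedingPRNG(reseed_interval=4, entropy=lambda n: b"\xff" * n)
stuck_values = [stuck.next_float() for _ in range(8)]
print(stuck_values[:4] == stuck_values[4:])
```

With a constant seed source the generator repeats itself after every reseed, which is why a broken hardware source can end up worse than never reseeding at all.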