Friday, August 4th 2017
AMD X399 Platform Lacks NVMe RAID Booting Support
AMD's connectivity-rich Ryzen Threadripper HEDT platform may have an Achilles' heel after all, with reports emerging that it lacks support for booting from NVMe RAID. You can still have bootable NVMe RAID volumes by using NVMe RAID HBAs installed as PCI-Express add-on cards. Threadripper processors feature 64-lane PCI-Express gen 3.0 root complexes, which let you run at least two graphics cards at full x16 bandwidth and drop in other bandwidth-hungry devices such as multiple PCI-Express NVMe SSDs. Unfortunately for those planning on striping multiple NVMe SSDs in RAID, the platform cannot boot from such arrays. You should still be able to build soft-RAID arrays striping multiple NVMe SSDs, just not boot from them, so prosumers will still be able to dump their heavy data-sets onto them. This limitation is probably due to PCI-Express lanes emerging from different dies on the Threadripper MCM, which could make it difficult for the system BIOS to boot from an array that spans them.
Ryzen Threadripper is a multi-chip module (MCM) of two 8-core "Summit Ridge" dies. Each 14 nm "Summit Ridge" die features 32 PCI-Express lanes. On a socket AM4 machine, 4 of those 32 lanes are used as chipset-bus, leaving 28 for the rest of the machine. 16 of those head to up to two PEG (PCI-Express Graphics) ports (either one x16 or two x8 slots), and the remaining 12 lanes are spread among M.2 slots and other onboard devices. On a Threadripper MCM, one of the two "Summit Ridge" dies has chipset-bus access; 16 lanes from each die head to PEG (a total of four PEG ports, either as two x16 or four x8 slots); and the remaining lanes are general purpose, driving high-bandwidth devices such as USB 3.1 controllers, 10 GbE interfaces, and several M.2 and U.2 ports.
There is always the likelihood of two M.2/U.2 ports being wired to different "Summit Ridge" dies, which could make it hard to get RAID working reliably, and is probably the reason why NVMe RAID booting won't work. The X399 chipset, however, does support RAID on the SATA ports it puts out. Up to four SATA 6 Gb/s ports on a socket TR4 motherboard can be wired directly to the processor, as each "Summit Ridge" die puts out two ports. This presents its own set of RAID issues. The general rule of thumb here is that you'll be able to create bootable RAID arrays only between disks connected to the exact same SATA controller. By default, you have three controllers: one from each of the two "Summit Ridge" dies, and one integrated into the X399 chipset. The platform supports up to 10 ports. You will hence be able to boot from SATA RAID arrays, provided they're built up from the same controller; however, booting from NVMe RAID arrays will not be possible.
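The lane accounting and the "same controller" rule described above can be put into a small Python sketch. This is only an illustration of the reasoning, not a tool that inspects real hardware: the disk names and controller labels are hypothetical, and the 4-lane chipset link on Threadripper is an assumption carried over from the AM4 figure quoted above.

```python
from typing import Dict, List, Tuple

# Figures quoted in the article for one "Summit Ridge" die.
LANES_PER_DIE = 32
PEG_LANES_PER_DIE = 16
# Assumption: the chipset link takes 4 lanes in total, as described for AM4.
CHIPSET_LANES = 4


def general_purpose_lanes(dies: int) -> int:
    """Lanes left over once the chipset link and PEG slots are subtracted."""
    total = dies * LANES_PER_DIE
    return total - CHIPSET_LANES - dies * PEG_LANES_PER_DIE


# Hypothetical board layout: disk name -> (bus type, controller label).
DISKS: Dict[str, Tuple[str, str]] = {
    "sata0": ("sata", "die0"),
    "sata1": ("sata", "die0"),
    "sata2": ("sata", "die1"),
    "sata4": ("sata", "x399"),
    "nvme0": ("nvme", "die0"),
    "nvme1": ("nvme", "die1"),
}


def can_boot_from_raid(members: List[str]) -> bool:
    """Apply the article's rule of thumb to a proposed array.

    Bootable only if every member is a SATA disk and all members
    sit on the same controller; NVMe arrays are never bootable.
    """
    buses = {DISKS[name][0] for name in members}
    controllers = {DISKS[name][1] for name in members}
    return buses == {"sata"} and len(controllers) == 1


if __name__ == "__main__":
    print("AM4 general-purpose lanes:", general_purpose_lanes(1))
    print("TR4 general-purpose lanes:", general_purpose_lanes(2))
    print("sata0+sata1 bootable:", can_boot_from_raid(["sata0", "sata1"]))
    print("sata0+sata2 bootable:", can_boot_from_raid(["sata0", "sata2"]))
    print("nvme0+nvme1 bootable:", can_boot_from_raid(["nvme0", "nvme1"]))
```

With one die the function reproduces the 12 leftover lanes given for AM4; with two dies it gives the general-purpose lane count under the stated chipset-link assumption.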
Source:
Tom's Hardware
75 Comments on AMD X399 Platform Lacks NVMe RAID Booting Support
"Ryzen Threadripper HEDT platform may have an Achilles's heel after all" is too far-fetched even for TechPowerUp.
Reminds me of how HardOCP used to trash AMD every chance they got, especially right after Kyle did not get invited to one of AMD's Polaris events. But the attitude changed right after they sent him a couple of samples.
For real, there might be less than 1% of TR consumers who would RAID 0 two of their PCIe x4 NVMe drives.
It takes just one person to start the fire, and headlines like this do a pretty good job.
I'm fairly sure quite the opposite is true.
I'd say that for a significant part of the people who actually buy into HEDT, disks only come in RAID setups.
It's a bit like if TR was - for whatever reason - not usable for solving PDEs. Who cares, right?
But here comes the actual problem.
Threadripper is known to be just a cut-down EPYC server CPU. Does that mean EPYC can't boot from a RAID as well? :-)
If the slot is wired to the chipset, then yes you can use any brand drive in RAID and more than just RAID-0. Of course, this has the disadvantage of routing the data through the slower chipset path instead of directly to the CPU. It will limit the access speed from the rest of the system to the NVMe array to about that of a PCI-E 3.0 x4 link. If the slot is wired to the CPU, then you have to use VROC for RAID.
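To put a rough number on "about that of a PCI-E 3.0 x4 link", here is a back-of-the-envelope Python sketch. PCIe 3.0 runs at 8 GT/s per lane with 128b/130b encoding; the per-drive throughput passed in below is a made-up illustrative figure, not a measurement of any particular SSD, and protocol overhead beyond line encoding is ignored.

```python
GT_PER_S = 8.0        # PCIe 3.0 raw signalling rate per lane (gigatransfers/s)
ENCODING = 128 / 130  # 128b/130b line encoding efficiency


def link_gb_per_s(lanes: int) -> float:
    """Theoretical one-direction bandwidth of a PCIe 3.0 link in GB/s."""
    return GT_PER_S * ENCODING / 8 * lanes


def array_throughput(drives: int, per_drive_gb_s: float, uplink_lanes: int = 4) -> float:
    """Best-case striped throughput when the array sits behind one uplink.

    The array can never deliver more than the uplink carries, however
    many drives are striped together.
    """
    return min(drives * per_drive_gb_s, link_gb_per_s(uplink_lanes))


if __name__ == "__main__":
    print(f"x4 uplink: {link_gb_per_s(4):.2f} GB/s")
    # Hypothetical 3.0 GB/s drives: two of them already saturate the uplink.
    for n in (1, 2, 4):
        print(f"{n} drive(s): {array_throughput(n, 3.0):.2f} GB/s")
```

Under these assumptions a single fast drive nearly fills the x4 uplink on its own, which is the point being made about arrays routed through the chipset.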
Sad part is that, while discussion themes changed, the people (their knowledge) didn't.
When a thread is about RAID and the majority of comments are about NVMe speed and OS boot time, it kind of says it all...
My bigger point was that disk I/O has gotten fast enough that RAID-0 doesn't make a whole lot of sense. I did it with SATA3 not even for speed but because, at the time, RAID-0 of two 120GBs was cheaper than a single 240GB and just happened to be a little faster in certain situations. However, with how cheap SSD storage has become and how fast NVMe devices are getting, I see very little reason to want to do RAID-0 with NVMe devices. If you need something that fast but require a replica, I would argue that something more eventually consistent would allow you to retain more performance while sacrificing a minute or so worth of data being written in the case of catastrophe.
All in all, I think that this article is a non-issue and isn't even worthy of the attention it is receiving. It's not a realistic need that solves any real tangible problem.
That is one scenario that would use them. Another would be deep learning, CAD/CAM, video capture etc. Plenty of uses for PCIe lanes outside of NVMe devices.
As you've said, it is servers where RAID becomes crucial. And the worst thing is that TR is an EPYC underneath. So will EPYC be able to boot from RAID?
Speaking from a workstation point of view, RAID0 on NVMe is a waste of time. I would be interested in RAID10 as it adds crucial Redundancy (that's the R in RAID, for the "RAID0 generation"), but just R0 on devices which can deal with something like 300,000 IOPS R/W? Nuts! RAID at this moment in time is ancient technology anyway, and was never designed to work with NVMe devices. RAID adds a lot of overhead on top of the far superior NVMe protocol.
What's the point?
Only for benchmark nerds and for a bigger e-pen. Nothing else.
In advanced servers, yes, you can utilize this (to a point), but in the SOHO segment... even video editing with multiple 8K live streams won't benefit much, if at all, from RAID0 on NVMe.
What I would love to see are PCIe cards with M.2 slots for up to 4 drives (don't need RAID just NVMe connectivity). Have you tried drive pooling with NVMe? No? That's quite something to behold without RAID quirks and moods.
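For anyone weighing the RAID0 versus RAID10 point made in the comments above, the trade-off in capacity and redundancy can be sketched in a few lines of Python. This only models the textbook definitions of the two levels; the function name and the drive sizes are illustrative assumptions, and real controllers or soft-RAID stacks may add their own constraints.

```python
from typing import Dict


def raid_summary(level: str, disks: int, size_gb: int) -> Dict[str, int]:
    """Usable capacity and guaranteed fault tolerance for equal-size disks.

    "Guaranteed" means the number of disk failures the array survives
    no matter which disks fail.
    """
    if level == "raid0":
        if disks < 2:
            raise ValueError("RAID0 needs at least 2 disks")
        # Pure striping: all capacity is usable, any single failure is fatal.
        return {"usable_gb": disks * size_gb, "failures_survived": 0}
    if level == "raid10":
        if disks < 4 or disks % 2:
            raise ValueError("RAID10 needs an even number of disks, at least 4")
        # Striped mirrors: half the capacity, survives any one failure.
        return {"usable_gb": disks // 2 * size_gb, "failures_survived": 1}
    raise ValueError(f"unsupported level: {level}")


if __name__ == "__main__":
    # Hypothetical 512 GB NVMe drives.
    print("RAID0  x2:", raid_summary("raid0", 2, 512))
    print("RAID10 x4:", raid_summary("raid10", 4, 512))
```

RAID10 can survive more than one failure if the failed disks belong to different mirror pairs, but only one failure is guaranteed, which is why the sketch reports 1.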