Wednesday, April 21st 2021
DirectStorage API Works Even with PCIe Gen3 NVMe SSDs
Microsoft on Tuesday, in a developer presentation, confirmed that the DirectStorage API, designed to speed up the storage sub-system, is compatible even with NVMe SSDs that use the PCI-Express Gen 3 host interface. It also confirmed that all GPUs compatible with DirectX 12 support the feature. A feature making its way to the PC from consoles, DirectStorage enables the GPU to directly access an NVMe storage device, paving the way for GPU-accelerated decompression of game assets.
This works to reduce latencies at the storage sub-system level, and offload the CPU. Any DirectX 12-compatible GPU technically supports DirectStorage, according to Microsoft. The company, however, recommends DirectX 12 Ultimate GPUs "for the best experience." The GPU-accelerated game asset decompression is handled via compute shaders. In addition to reducing latencies, DirectStorage is said to accelerate the Sampler Feedback feature in DirectX 12 Ultimate. More slides from the presentation follow.
Source: NEPBB (Reddit)
76 Comments on DirectStorage API Works Even with PCIe Gen3 NVMe SSDs
Also, for the arguments about needing a CPU-connected NVMe... nah. This is about LOAD times; all that will happen on a slower NVMe device is that it'll... load slower. It won't go "ah shit, 3GB/s instead of 3.2? NOOOOOO"
NVME is the requirement because they're programming the GPU to use the NVME driver language, whereas using AHCI would work on every SATA device and likely give some really shit user experiences when some idiot runs it off a mech drive
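To put the "3GB/s instead of 3.2" point in numbers, here is a rough sketch that treats load time as data size divided by sustained throughput. The 10 GB level size is a made-up figure for illustration, and the model ignores decompression, access patterns, and CPU work, so it is back-of-the-envelope only.

```python
def load_time_seconds(asset_gb: float, throughput_gb_per_s: float) -> float:
    """Time to stream asset_gb gigabytes at a sustained sequential rate."""
    if throughput_gb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return asset_gb / throughput_gb_per_s


ASSET_GB = 10.0  # hypothetical level size, chosen only for illustration

fast = load_time_seconds(ASSET_GB, 3.2)  # drive rated at 3.2 GB/s
slow = load_time_seconds(ASSET_GB, 3.0)  # drive rated at 3.0 GB/s

print(f"3.2 GB/s: {fast:.2f} s")
print(f"3.0 GB/s: {slow:.2f} s")
print(f"difference: {slow - fast:.2f} s")
```

Under this simple model, the slower drive costs a fraction of a second on a load of that size; it does not stop the load from working.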
Have you seen Ratchet on PS5?
We're talking about different level design... no way to run that on slower SSD.
Uh, what?
GPUs don't understand "NVMe driver language".
As someone else said, it's about direct/p2p DMA transfers straight from the SSD to the GPU (with the PCIe root complex acting as a middle-man).
The root complex is like an Ethernet switch, so ideally you need direct connection. If you want to go from place A to place B, you follow the shortest route. You don't go to place C first (which is much farther).
Unless you have a hard speed limit for this, you're just being paranoid and spreading FUD.
Tell me how an NVME PCIE 4.0 card on my x570 chipset slot is going to be slower than an NVME PCI-E 3.0 card on a CPU slot on B450?
X570 is a special case as I said, not a normal chipset.
AMD AM4 platforms should be fine (whether it's B450 or X570). That's because all of them have 4 dedicated lanes (and you should use them).
Intel platforms will experience bus bottlenecks. Do you think Intel is stupid for adding 4 dedicated lanes on Rocket Lake?
It's all about reducing bottlenecks.
They will not make this tech exist and then lock it down to a minority of their target platform.
Shit all the games that support this are still going to have fallbacks for systems with no support.
I don't expect Ratchet to run on Intel platforms with no dedicated lanes.
NVMe works fine on all AMD platforms. Just use the dedicated lanes. :)
Also, on the Xbox, whilst the built-in storage is directly connected to the CPU, the expansion port is not, and Microsoft has confirmed that the port follows the same rules as the internal storage for their software API, i.e. it can play Series X/S games.
Performance-wise, the only time an NVMe drive will be slower on a southbridge port is when the link between the chipset and the CPU is saturated enough to slow it down; in the majority of computers it won't be.
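The saturation argument can be checked with some simple arithmetic. The sketch below computes theoretical PCIe link bandwidth from the per-lane signalling rate (8 GT/s for Gen3, 16 GT/s for Gen4, both using 128b/130b encoding) and compares the combined demand of devices behind the chipset against the uplink. The Gen3 x4 uplink and the 3.2 GB/s drive speeds are assumptions picked for illustration, and real throughput is lower once protocol overhead is counted.

```python
from typing import Sequence

# Per-lane signalling rate in GT/s for each PCIe generation.
GT_PER_S = {3: 8.0, 4: 16.0}
ENCODING = 128 / 130  # 128b/130b line coding used by PCIe 3.0 and 4.0


def link_gb_per_s(gen: int, lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    return GT_PER_S[gen] * ENCODING / 8 * lanes


def uplink_saturated(uplink_gb_per_s: float, demands_gb_per_s: Sequence[float]) -> bool:
    """True if the devices behind the chipset ask for more than the uplink can carry."""
    return sum(demands_gb_per_s) > uplink_gb_per_s


gen3_x4 = link_gb_per_s(3, 4)
gen4_x4 = link_gb_per_s(4, 4)

# Assumed scenario: chipset uplink equivalent to PCIe 3.0 x4.
single = uplink_saturated(gen3_x4, [3.2])       # one drive reading flat out
two = uplink_saturated(gen3_x4, [3.2, 3.2])     # two drives reading at once

print(f"Gen3 x4: {gen3_x4:.2f} GB/s, Gen4 x4: {gen4_x4:.2f} GB/s")
print(f"one drive saturates uplink: {single}, two drives: {two}")
```

With these assumed numbers, a single Gen3 drive fits within the uplink, while two drives reading simultaneously would exceed it.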
If it does for some reason get limited to recent chipsets, then it will be an artificial restriction to sell new kit. Of course I could be wrong, it's only my opinion, but I don't see anything in hardware spec sheets as to why southbridge-based drives would not work.
The incompatibility is the SATA protocol; it's a protocol issue, not a chipset latency one.
2) Wrong. I'll have to ask you again: have you studied the console PCBs? Something tells me you haven't.
There are 4 dedicated lanes and 2 of them go to each SSD, straight from the APU. Southbridge is on a separate PCB (daughterboard).
Don't make me post pictures, they're available if you search for them...
But even if I am wrong on the PCB, the issue is a protocol one in my opinion, not a chipset latency one. Are you trying to claim a PCIe 3 NVMe drive on a CPU-based PCIe lane has a different performance metric to one connected to the chipset?
Until we are told specifically it won't work, with a reason to back it up, I am going to assume it will work. It will likely just need a minimum NVMe protocol version, plus a minimum rated drive speed.
Do you remember when AMD launched K8 with IMC (integrated memory controller) and Intel still had no IMC (until Nehalem came)?
How many Intel users said "nobody needs an IMC"? And how many people take IMC for granted now?
I expect the latency differences are insignificant. There is nothing in any DirectStorage API article that states it matters; it just needs drives that meet a certain specification and use the NVMe protocol.
Also not sure what the IMC has to do with this. NVMe latencies and bandwidth are nothing like RAM.
Not saying I am 100% right, just that until I see a statement saying it won't work I don't have reason to believe otherwise. :)
Think of all the reviewers who pushed nvme drives for long periods on chipset ports and they were not hitting saturation that affected the performance.
Not saying Ratchet will ever get ported to PC, but IF it does, expect 5.5 GB/s to be the minimum required spec. How is your southbridge going to handle that?
I understand that some of you need to justify your rigs (especially if you have an old Intel PC with PCIe 3.0/PCH NVMe), but again: try to approach what I'm saying with an open mind... otherwise it's totally pointless to even bother.
And yes, 15+ years ago people didn't believe me when I told them about the IMC benefits. The same pattern happens now with NVMe.
Technology needs to progress and leave old rigs behind. It's always been that way, but most people don't even have an open mind.
You've got some weird ideas going on here
Have you ever wondered why SATA vs NVMe SSD benchmarks show ZERO difference so far?
Maybe, just maybe the antiquated Windows I/O stack has something to do with that?
devblogs.microsoft.com/directx/directstorage-is-coming-to-pc/
Of course not every game will need high I/O. Pixelated indies will work just fine even with HDD.
But if you want next-gen games like Ratchet on your PC, you'll need to upgrade your rig. There's no other way.
What kind of "weird" ideas? Does MS have weird ideas?
Did you see the portals and how fast it changes worlds? How are you going to accomplish that with a slower medium?
I'm pretty sure you didn't even see the whole video.
*checks notes*
Oh right loading it into RAM
You're going to pay a lot more money if you follow that route, since RAM tends to be more expensive. I know, because I've had 64GB of DDR4 in my PC since 2019.
That's assuming that game devs will actually code a 2nd path (RAMdisk), while we do know (if you read what actual presentations say) that both consoles have an anti-RAMdisk philosophy this gen (small RAM + high-speed SSD to load assets on the fly).
Most PC gamers only have 16GB of RAM.
Do your homework and then we can talk.
Don't be surprised if Sony asks for 128GB of RAM for Ratchet to run on PCs that don't actually have a fast SSD... will that be cheaper? Probably not.
Ratchet is just an example, others will follow soon after that.
ps: I'm not excited about Ratchet, nor €80 games. I'm here to post technological facts, but maybe I'm in the wrong site, since I see a lot of prejudice against consoles and their paradigm shift. Stop making assumptions about people that you don't even know.
Any data going through system RAM goes through the CPU package, because that's where the memory controller is and that's what system RAM is connected to.
What's skipped is the CPU cores handling that data.
Okay, let's check technological facts:
Flash memory has latency that is orders of magnitude worse than DRAM, and far lower bandwidth.
So if that game actually needs to constantly handle that much data, it's going to have lots of issues that need hiding on consoles.
No flash-based NVMe drive is even remotely fast enough to deliver data on the fly at the moment the GPU needs it!
Just remember that if a graphics card runs out of VRAM and has to wait for data from system RAM (which is faster than any NVMe drive), that causes instant performance drops.
And courtesy of the minimal generational memory increase, those new consoles don't even have any RAM to spare for buffering.
Meanwhile, a PC owner who can afford a PCIe 4.0 NVMe drive should automatically have 32 GB of system RAM...
With likely 20 GB of it sitting there with no direct use for the game, and hence available for buffering in the background.
A game developer would only need to code the game to prefetch data when the end of one part of the level/map approaches, and the new assets would be available faster than from any NVMe drive.
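The prefetch idea described above can be sketched roughly as follows. This is a minimal illustration, not how any engine actually does it: the `AssetPrefetcher` class, its method names, and the `level2.bin` file are all hypothetical, and a temporary file stands in for real game assets.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


class AssetPrefetcher:
    """Reads asset files into RAM on background threads before they are needed."""

    def __init__(self, workers: int = 2) -> None:
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._pending = {}  # maps Path -> Future holding the file's bytes

    def prefetch(self, paths) -> None:
        # Call this when the player nears the end of the current area.
        for path in paths:
            path = Path(path)
            if path not in self._pending:
                self._pending[path] = self._pool.submit(path.read_bytes)

    def get(self, path) -> bytes:
        # Returns immediately if the background read has finished,
        # otherwise blocks until it completes.
        path = Path(path)
        if path not in self._pending:
            self.prefetch([path])
        return self._pending[path].result()

    def close(self) -> None:
        self._pool.shutdown(wait=True)


# Demo with a throwaway file standing in for the next level's assets.
with tempfile.TemporaryDirectory() as tmp:
    next_level = Path(tmp) / "level2.bin"
    next_level.write_bytes(b"\x00" * 1024)

    loader = AssetPrefetcher()
    loader.prefetch([next_level])   # player approaches the level boundary
    data = loader.get(next_level)   # level transition: bytes come from the buffer
    loader.close()

print(len(data))
```

The trade-off is the one argued in this thread: the approach only helps if there is spare system RAM to hold the buffered assets and enough warning before they are needed.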
What's holding back game loading times the most is no doubt crappy coding.
There are games whose loading times scale very nicely with transfer rates, dropping to a second or two even without any DirectStorage:
www.realhardwarereviews.com/silicon-power-us70-1tb-review/11/
Though the 24-core Threadripper in that test platform offers some serious data crunching power...
developer.nvidia.com/gpudirect
Do. Your. Homework.
Well yeah, this is what I have been trying to tell you: you are making assumptions based on theory. The speedup in the new I/O stack comes from optimisations to the software stack in how the data is read; the bottleneck is the SATA protocol and the I/O stack, not the chipset interface. Plus, GPU hardware can handle data decompression etc. faster than a typical CPU can.
Those NVMe performance reviews are relevant because they show that in a typical system the chipset link doesn't strangle an NVMe drive. You might have issues, though, if you try to read from multiple NVMe drives at the same time over the chipset, or have some other bandwidth-heavy device running there, but these are very rare cases in consumer PCs.
We're simply going to have to wait and see.