Wednesday, November 16th 2022
NVIDIA Brings the Benefits of DirectStorage 1.1 to Vulkan Under its RTX-IO Brand
NVIDIA has dusted off its RTX-IO technology moniker, which we thought it had retired in the wake of the now-standardized DirectStorage API, in an attempt to bring its benefits to games powered by the Vulkan API. Team Green was the first to introduce such a technology to the PC platform, although something functionally similar already existed on game consoles, where it plays a key role in speeding up game loading times. DirectStorage provides a means for the GPU to communicate directly with a storage device, with no round-trips to the CPU cores or main memory. This gives a game a quicker way to transfer its assets to video memory. NVIDIA introduces this as part of its latest GeForce 526.98 WHQL drivers. The same drivers also introduce official DirectStorage 1.1 support.
With DirectStorage 1.1, Microsoft went a step further and introduced GPU-accelerated game asset decompression. Game assets (such as textures) are stored on your disk in compressed form, and are decompressed as needed when your game loads. This involves the CPU cores, and tends to be slower than getting the same job done by a GPU that is not busy rendering 3D graphics. NVIDIA even developed a file-compression format optimized for highly parallelized decompression hardware such as GPUs. The standardization by Microsoft extends this feature to other brands of GPUs (such as AMD and Intel, which are confirmed to be implementing it), but games powered by the Vulkan API were left in the lurch. NVIDIA developed a Vulkan version of the original RTX-IO tech (which would go on to develop into DirectStorage), so now game developers with engines primarily designed for Vulkan (such as idTech) can speed up game load times.
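To illustrate why a compression format can be "optimized for highly parallelized decompression", here is a minimal Python sketch. It is not NVIDIA's format and not the DirectStorage API: it uses the standard zlib module and CPU threads purely as a stand-in, and the names `CHUNK_SIZE`, `compress_chunked` and `decompress_parallel` are hypothetical. The idea it shows is that when every chunk is compressed as its own independent stream, no chunk has to wait for another, so many workers (or GPU threads) can decompress at once.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

# Hypothetical chunk size, chosen only for illustration.
CHUNK_SIZE = 64 * 1024


def compress_chunked(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Compress each fixed-size chunk as its own independent zlib stream."""
    return [
        zlib.compress(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def decompress_parallel(chunks: list[bytes], workers: int = 4) -> bytes:
    """Decompress chunks concurrently; no chunk depends on any other chunk."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, so the asset is
        # reassembled correctly even if chunks finish out of order.
        return b"".join(pool.map(zlib.decompress, chunks))


# 512 KiB of stand-in "texture" data.
asset = bytes(range(256)) * 2048
chunks = compress_chunked(asset)
restored = decompress_parallel(chunks)
```

A single monolithic stream would have to be decoded from start to finish by one worker; splitting it trades a little compression ratio for the ability to spread the work out.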
25 Comments on NVIDIA Brings the Benefits of DirectStorage 1.1 to Vulkan Under its RTX-IO Brand
The rapture must be nigh...
Thanks, Microsoft. The next generation of console will be out before you give PC gaming feature parity on something you announced before the Series X even came out.
Talking about there being no round trips to the CPU cores or memory is silly. A round trip would suggest that the GPU not only had the assets loaded, but then sent a copy of them back, which is unlikely to ever happen. Computers do not work like public libraries where you need to return the copy of data you borrowed.
They developed the feature for Vulkan but are advertising it as RTX IO, so maybe it will be Nvidia-exclusive until someone else shares an implementation with Khronos?
Radeon SSG, no matter how limited it was, says hi and calls BS.
Besides, RTX-IO is just an implementation of DirectStorage. Although they were probably part of DirectStorage itself as well.
github.com/openzfs/zfs/graphs/contributors
When I first heard about DirectStorage, I looked into it since it piqued my curiosity as a filesystem developer. DirectStorage does not speak directly to hardware. It is just a marketing term for a convenience library on top of the existing OS APIs that does on-GPU decompression via a compute shader.
As for Nvidia's GPUDirect, that is potentially able to talk directly to NVMe, although you can expect to give up plenty of modern things such as software RAID, logical volume management, transparent encryption, etcetera if you are to use it. However, that is as related to DirectStorage as Chinese is related to English (i.e. they have a few superficial similarities, but that is all). It is also largely unnecessary since the CPU is the one telling the GPU what to do and it is very easy for the CPU to schedule reads into GPU VRAM. The only thing that their approach does is move the interrupt handler from the CPU to the GPU, but I have not heard anyone claim that is a bottleneck for graphics.
I honestly don't know which came first, but the PS5 technology reveal was demonstrated in realtime on ES, near-production silicon and the OS already had full software/API support for the feature way back in early 2020, before Nvidia even announced Ampere. Clearly, Ampere silicon was already in pre-production at that point, but by then they were already a whole year, perhaps even 18 months, behind Sony, who were at the full trilogy of hardware/software/developer completion and demoing the finished technology to the public.
Nvidia's adoption and contribution to DirectStorage with RTX-IO was likely them jumping in to avoid FOMO, but I'm reasonably confident it was a feature born for consoles and driven by Sony/AMD for the PS5 first.
"Action speaks louder than words" - Mark Twain
www.usenix.org/system/files/conference/atc17/atc17-bergman.pdf
That paper itself does nothing more than use existing OS operations to achieve DMA and software has been able to do this for more than a decade.
What Sony did was implement a special hardware controller that does decompression. The CPU is still initiating these IO operations and handling the interrupt from the device saying that the IO has completed. The unified memory of the PS5 means that a copy into CPU memory is a copy into GPU memory, so Sony does not need to do anything special to make sure that the data is read into a GPU mapped buffer.
That said, I was able to confirm that Sony is not having the GPU do IO to NVMe by watching this presentation that Sony gave on the PS5:
The smoking gun is when they say that the game developer does not need to know about any of this and can do things as usual.
An engineer at Nvidia with whom I am acquainted disagreed. He thought Sony did have the GPU talk to NVMe. I asked him to watch that presentation and think about whether the description they give is consistent with the GPU talking to storage. His conclusion was that Sony was not having the GPU talk to NVMe, despite initially thinking that Sony had done the same thing Nvidia did based on the hype. He was rather disappointed, since he likes the idea of having the GPU talk to NVMe directly.
As for how it works, my understanding is that the I/O complex is doing compression transparently and hides that from the OS. That is why despite the compression being done, you cannot store more than the amount of data that the NAND flash is rated to store (unlike ZFS, which allows storage to exceed the raw storage capacity via compression). While the flash itself might be limited to 5.5GB/sec, the transparent decompression allows speeds to reach 9GB/sec, which is presumably the limit of their custom Kraken decompressor.
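As a rough sanity check on those figures, here is a small Python sketch of the arithmetic. It only uses the two numbers quoted above (5.5GB/sec raw, 9GB/sec after decompression); the function names are hypothetical, and treating 9GB/sec as a hard decompressor ceiling is the same presumption made in the comment, not a confirmed spec.

```python
RAW_GBPS = 5.5        # raw flash read speed quoted above
EFFECTIVE_GBPS = 9.0  # decompressed output speed quoted above


def implied_ratio(raw_gbps: float, effective_gbps: float) -> float:
    """Average compression ratio needed to turn raw reads into the output rate."""
    return effective_gbps / raw_gbps


def effective_throughput(raw_gbps: float, ratio: float, decompressor_limit: float) -> float:
    """Output rate is raw rate times ratio, capped by the decompressor (assumed cap)."""
    return min(raw_gbps * ratio, decompressor_limit)


ratio = implied_ratio(RAW_GBPS, EFFECTIVE_GBPS)
print(f"implied average compression ratio: {ratio:.2f}:1")

# Data that compresses 2:1 would still be capped at the assumed 9GB/sec limit,
# while incompressible data (1:1) would come through at the raw flash speed.
print(effective_throughput(RAW_GBPS, 2.0, EFFECTIVE_GBPS))
print(effective_throughput(RAW_GBPS, 1.0, EFFECTIVE_GBPS))
```

In other words, the quoted speeds imply assets that compress to roughly 61% of their original size on average; better-compressing data would not go faster if the decompressor really is the limit.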
If we're talking about when this feature (which has been around since pre-2017) had its first real-world release, it's clear Sony/AMD kickstarted the process with the PS5's Kraken Engine, which keeps data compressed both in VRAM and on NVMe, which is not how Microsoft's implementation works at the moment. The distinction is kind of unimportant, IMO. What Sony did was wake up Microsoft, who realised that if Sony can do asset decompression via the GPU and read compressed assets direct from NVMe with the AMD hardware in the PS5, the incredibly similar hardware in the Xbox should be able to do it too. DirectStorage (aka Xbox Velocity Architecture) was a copy of Sony's solution, born at around the time Microsoft learned of Sony's implementation. RTX-I/O uses shaders to decompress assets read from NVMe, whilst the PS5 and Xbox hardware have GPU decode blocks dedicated to the storage.
DirectStorage, RTX-I/O, XBox Velocity Architecture, and Sony's PS5 Kraken Engine are not 1:1 directly comparable. The two most closely-related are Velocity and Kraken simply because they use very similar silicon with heavily-overlapping capabilities. RTX-I/O uses shaders not tensor cores, so in theory it's not limited to RTX hardware at all, and I suspect Vulkan will get a GPU-agnostic equivalent in due-course.
openzfs.org/w/images/6/63/Lightning_Talk-Zacodi_Labs-Maxim_Martynov.pdf DirectStorage is just a library. Anyone can write one that does what it does. These guys did:
www.usenix.org/system/files/conference/atc17/atc17-bergman.pdf
I am not sure why people are overexcited about these things. Faster hardware has always been on its way. There is no radical overhaul needed for it to work. Making better use of OS APIs has always been better too and there is nothing stopping people from doing that right now without any special libraries.
AFAIK DirectStorage requirements are just: Win11 + SSD/NVMe + GPU DX12 and Shader Model 6.0 compatible, but I'm not sure about greedy NVidiaware...
EDIT: from a brief search on the net it seems it will require RTX 3000+ at least.... which would be the usual NVidia shame
Perhaps the absolute dumpster-fire of RTX support at launch was enough to abandon that idea. I remember Nvidia promising dozens of titles with raytracing when the 20-series launched, and 18 months later it was basically still only enough titles to count on your fingers, the performance hit being crippling for all but the two best cards, and a few implementations like SOTR being almost indistinguishable from the raster version, apart from the massive hit to framerate.
Despite buying into RTX at launch, I don't think any games were truly worth using RTX on, other than enabling it briefly out of curiosity, until the next generation of games and the next generation of GPUs came along. 2018, 2019, and most of 2020 were dark days for raytraced games.