Friday, October 14th 2022

DirectStorage not used by any Games, Microsoft hopes DirectStorage 1.1 with GPU Asset Decompression can Fix This

Back in March 2022, Microsoft formally debuted the DirectStorage 1.0 API, which enables direct interactions between a GPU and a storage device, thereby reducing the processing load of the storage stack on the CPU and main memory. This release, however, lacked a killer feature that's available to consoles: asset decompression. Following a lukewarm response from game developers to DirectStorage 1.0 for PC, Microsoft has finally updated the API, introducing the feature with DirectStorage 1.1.

With this feature, your GPU can not only fetch game assets directly from the storage device (an SSD that uses either the NVMe or AHCI protocol), but also pull them in their natively stored compressed state. These assets are then decompressed by the GPU using compute shaders, and the decompressed assets remain in video memory. This will directly impact game loading times, as asset decompression no longer involves the CPU. The impact on the game's framerate will be minimal, as the API mainly accelerates loading, not gameplay itself. Game assets are organized pieces of data such as textures, 3D model files, music, and sound effects: pretty much all of the individual pieces of content that make up a 3D scene.
File compression (and decompression) remains a compute-heavy workload that benefits from parallelism, and here the GPU and its faster memory help greatly. Once the relevant assets are committed to video memory, the remaining data from the asset containers is purged from video memory to make room for the rest of the game's memory load. In its tech demo, Microsoft showed a 3D scene's assets loading in 0.8 seconds with DirectStorage 1.1, compared to 2.36 seconds without it. This is just a synthetic example; you can imagine the impact on much larger AAA games that take dozens of seconds to load levels, even with NVMe SSDs. The hold-up here is not the storage device, but the CPU trying to decompress the relevant assets.
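
To put the two demo figures in perspective, here is a quick back-of-the-envelope calculation in Python. It uses only the 0.8 s and 2.36 s load times quoted above; the variable names are ours, and the result describes this one synthetic demo rather than any shipping game.

```python
# Load times reported for Microsoft's tech demo, in seconds.
load_time_without_ds = 2.36
load_time_with_ds = 0.8

# Speedup factor: how many times faster the DirectStorage 1.1 path loaded.
speedup = load_time_without_ds / load_time_with_ds

# Fraction of the original load time that was eliminated.
time_saved_fraction = 1 - load_time_with_ds / load_time_without_ds

print(f"Speedup: {speedup:.2f}x")
print(f"Load time reduced by {time_saved_fraction:.0%}")
```

That works out to a speedup of just under 3x, or roughly two-thirds of the load time removed in this particular scene.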

Along with DirectStorage 1.1, Microsoft is introducing GDeflate, a file compression format for game assets developed by NVIDIA, and is working with all PC GPU manufacturers, including AMD and Intel, to add support for the format through graphics driver updates. The format is optimized for highly parallelized compression and decompression across a large number of threads, which makes it well suited to GPUs. This doesn't necessarily mean that all games have to use GDeflate to take advantage of GPU-accelerated asset decompression over DirectStorage 1.1; it's just an added optimization for game developers working on new projects. Patching already-released games to use GDeflate would involve redistributing the entire game asset load (which is still fine if the developer chooses to do so).
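
The general idea behind a parallel-friendly format can be illustrated with a rough Python sketch. To be clear, this is not GDeflate and does not use the DirectStorage API: it uses ordinary DEFLATE streams from Python's standard `zlib` module and CPU threads as a stand-in, and the chunk size, function names, and sample data are all our own illustrative assumptions. The point it shows is that when data is compressed as independent chunks, many workers can decompress those chunks at the same time.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

# Arbitrary illustrative chunk size; not taken from the GDeflate format.
CHUNK_SIZE = 64 * 1024


def compress_in_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Compress each fixed-size chunk as its own independent DEFLATE stream."""
    return [
        zlib.compress(data[offset:offset + chunk_size])
        for offset in range(0, len(data), chunk_size)
    ]


def decompress_in_parallel(chunks: list[bytes], workers: int = 8) -> bytes:
    """Decompress independent chunks concurrently and rejoin them in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so the pieces rejoin correctly.
        return b"".join(pool.map(zlib.decompress, chunks))


# 1 MiB of stand-in "asset" data (hypothetical, just for the demonstration).
asset = bytes(range(256)) * 4096
chunks = compress_in_chunks(asset)
restored = decompress_in_parallel(chunks)

print(len(chunks), restored == asset)
```

A single monolithic compressed stream would have to be decoded start to finish by one worker; splitting it into independent pieces is what lets the work spread across many threads, which is the property a GPU with thousands of threads can exploit.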

Microsoft plans to release DirectStorage 1.1 to game developers toward the end of 2022. The first games released or patched with DirectStorage 1.1 support should start coming out in 2023.
Source: Microsoft DirectX Blog

60 Comments on DirectStorage not used by any Games, Microsoft hopes DirectStorage 1.1 with GPU Asset Decompression can Fix This

#2
Readlight
Will it help bug Windows 10 to get faster on hard drive?
#3
Bwaze
Is this meant only for initial game load?

Or do graphics cards have free compute shaders that can run decompression to load assets while gaming?

But this is really moving very slowly - for a technology that already works on gaming consoles, which are more or less assembled from midrange PC parts...
#4
londiste
Screenshots show a big speed increase as well as a big reduction in CPU load. Would be cool to know how much load this adds to GPU though.
#5
Unregistered
Bwaze: Is this meant only for initial game load?

Or do graphics cards have free compute shaders that can run decompression to load assets while gaming?

But this is really moving very slowly - for a technology that already works on gaming consoles, which are more or less assembled from midrange PC parts...
I blame Microsoft, seems they focus more on Xbox Live than anything else, DirectX should include this and many others like RT, upscaling, VRR... Etc which would have fixed the mess we have now with G-sync, FSR, DLSS...etc
#6
Kaleid
Finally. Hopefully this will have widespread engine support soon.
Gentlemen (and ladies) patch your games!
#7
londiste
Xex360: I blame Microsoft, seems they focus more on Xbox Live than anything else, DirectX should include this and many others like RT, upscaling, VRR... Etc which would have fixed the mess we have now with G-sync, FSR, DLSS...etc
What? Where do you think Direct prefix in DirectStorage comes from?
DirectX includes DirectX Raytracing (DXR).
Upscaling/FSR/DLSS/XeSS are a bit higher level, up from base graphics APIs. VRR is on a different level as well, more on the hardware side.

As for Microsoft, the signs are more in the way that they are reconsidering their focus from Xbox to the entire ecosystem of Windows which includes Xbox. API pieces, games, services are coming from Xbox over to PC/Windows. Most notable are probably the games. Xbox Series X/S has barely any exclusives any more.
#8
TheoneandonlyMrK
Readlight: Will it help bug Windows 10 to get faster on hard drive?
Not at all no, it's not even supported on a HDD or Windows 10 afaik.
Plus the game has to be developed to use it much like Rtx etc.
So not sure there's a Game out that uses it yet.
#9
PapaTaipei
Less than 3x with DS1.1 vs no DS. Not fast enough to load tons of assets like on the UE5 engine which has amazing rendering quality but VERY bad performance.
#10
Leiesoldat
lazy gamer & woodworker
PapaTaipei: Less than 3x with DS1.1 vs no DS. Not fast enough to load tons of assets like on the UE5 engine which has amazing rendering quality but VERY bad performance.
What's the source for this information? The article states that DS1.1 will be released to developers at the end of 2022 so we don't know right now if this API is fast enough for UnrealEngine 5.
#11
Legacy-ZA
Kaleid: Finally. Hopefully this will have widespread engine support soon.
Gentlemen (and ladies) patch your games!
I upgraded to Windows 11 just for this... still waiting, maybe Windows 12? :P
#12
Kaleid
Legacy-ZA: I upgraded to Windows 11 just for this... still waiting, maybe Windows 12? :p
"Microsoft only confirms that the SDK will be made available to game developers soon, but no date was provided. This means we will have to wait even longer for the first games to support this technology, unless Microsoft has already been working with select devs behind the scenes."
videocardz.com/newz/microsoft-confirms-directstorage-1-1-with-gpu-decompression-is-coming-soon


As it has been discussed here before, it should also work in win10 although it lacks some 11 specific optimizations.
#13
Legacy-ZA
Kaleid: it lacks some 11 specific optimizations.
Precisely, in other words, don't get shafted.
#14
Punkenjoy
It's pointless anyway to want to use it on an HDD. CPUs, unless you have a really shitty one, are fast enough to handle the decompression.

Well, I hope that it will get leveraged pretty quickly, as loading times on PC suck. (But I think in some games it's just very bad coding.)

I wonder how useful it will be in real time in an open-world game. Using the compute units to decompress the assets might mean framerate dips due to fewer resources available to render. We will see. Remember that the PS5 has a dedicated chip for that.
#15
Dirt Chip
And the best way they find to express that new tech was a pic of sliced, floating in air avocados.
Right.
#16
Franzen4Real
Bwaze: Is this meant only for initial game load?

Or do graphics cards have free compute shaders that can run decompression to load assets while gaming?

But this is really moving very slowly - for a technology that already works on gaming consoles, which are more or less assembled from midrange PC parts...
Back in the old days developers had to use some tricks for asset loading. When you would move between one main area of a game to a new one, you would get either a load screen, or, they would use tricks like connecting the two areas with long hallways, elevators, etc just to buy enough time for the old area assets to be purged and the new area assets to be loaded into memory. Or in some games you would see lots of ‘pop in’ of assets that were still loading by the time you had moved in visible range of them. So to your question, yes it does run during game play. The assets are decompressed on the fly instead of loading every one of them into memory. Aside from faster initial loads and reducing the need for ‘tricks’ to swap assets, one benefit to this is that you no longer have lots of assets taking up memory while just sitting there waiting to be used, which in turn can potentially reduce the amount of VRAM required by the game.

I also agree, this has been a long time coming for PC games. We have had RTX IO since the 30 series launch and AMD has Smart Access Storage to implement DirectStorage.
#17
80251
Why would I want to utilize slow direct storage if I have a PC with 32 GiB or 64 GiB of memory? DDR4 access times and bandwidth are far greater than any NVME M.2 storage solution regardless of PCIe revision.
#18
TheoneandonlyMrK
80251: Why would I want to utilize slow direct storage if I have a PC with 32 GiB or 64 GiB of memory? DDR4 access times and bandwidth are far greater than any NVME M.2 storage solution regardless of PCIe revision.
Well, for the most part, many try a RAM cache; few can be arsed past game 2.
Secondly, it's obviously not memory, it's storage; try fitting GTA V on your 64 GB RAM cache.
#19
80-watt Hamster
80251: Why would I want to utilize slow direct storage if I have a PC with 32 GiB or 64 GiB of memory? DDR4 access times and bandwidth are far greater than any NVME M.2 storage solution regardless of PCIe revision.
So instead of loading directly to VRAM, you'd rather it make a detour to system RAM first?
#20
Wirko
80-watt Hamster: So instead of loading directly to VRAM, you'd rather it make a detour to system RAM first?
Potentially it's a long detour, with two writes and two reads.

1. DMA transfer from SSD to RAM
2. CPU reads compressed data from RAM
3. CPU writes uncompressed data to RAM in small chunks (lots of it because, well, it's uncompressed)
4. DMA transfer from RAM to VRAM in large chunks (because PCIe is inefficient for small transfers)
#21
Franzen4Real
Wirko: Potentially it's a long detour, with two writes and two reads.

1. DMA transfer from SSD to RAM
2. CPU reads compressed data from RAM
3. CPU writes uncompressed data to RAM in small chunks (lots of it because, well, it's uncompressed)
4. DMA transfer from RAM to VRAM in large chunks (because PCIe is inefficient for small transfers)
Yes, all of that. But also in addition to that the decompression is now taking place on the GPU as opposed to the CPU. nVidia stated in their RTX IO presentation that it is approximately a 20x speed up due to decompression being a highly parallel work load.
80251: Why would I want to utilize slow direct storage if I have a PC with 32 GiB or 64 GiB of memory? DDR4 access times and bandwidth are far greater than any NVME M.2 storage solution regardless of PCIe revision.
In either situation you are still streaming from the NVME. Either as Wirko listed out above, or straight to GPU/VRAM.

Here are a couple of links, the nVidia one has some flow charts showing the difference, the AMD one has a video embedded from the Computex Keynote. SmartAccess Storage explanation starts at 12:40 on the timeline.

www.nvidia.com/en-us/geforce/news/rtx-io-gpu-accelerated-storage-technology/
www.amd.com/en/events/computex
#22
Wirko
It's also worth noting that SSDs achieve very poor performance in small random reads which are not queued (4k QD1). The DirectStorage API can't automagically improve that but does it in any way facilitate programming for parallel/queued access?
#23
Franzen4Real
Wirko: does it in any way facilitate programming for parallel/queued access?
Yes, here is an overview. Unfortunately the page that provides the most detail on queuing clearly states that it pertains to the Xbox, but then from the Desktop DirectStorage page that it links to-- "The DirectStorage API already exists on Xbox and in order to ease porting of titles between Xbox and Windows, the two APIs are as similar as possible".

learn.microsoft.com/en-us/gaming/gdk/_content/gc/system/overviews/directstorage/directstorage-overview
github.com/microsoft/DirectStorage
#24
chrcoluk
Bwaze: Is this meant only for initial game load?

Or do graphics cards have free compute shaders that can run decompression to load assets while gaming?

But this is really moving very slowly - for a technology that already works on gaming consoles, which are more or less assembled from midrange PC parts...
It potentially will fix texture stuttering issues on games that use unreal engine and stream them in during game play.