Friday, October 14th 2022
DirectStorage not used by any Games, Microsoft hopes DirectStorage 1.1 with GPU Asset Decompression can Fix This
Back in March 2022, Microsoft formally debuted the DirectStorage 1.0 API, which enables direct interactions between a GPU and a storage device, thereby reducing the processing load of the storage stack on the CPU and main memory. This release, however, lacked a killer feature that's available to consoles: asset decompression. With the lukewarm response from game developers to DirectStorage 1.0 for PC, Microsoft has finally updated the API, introducing the feature with DirectStorage 1.1.
With this feature, your GPU can not only directly fetch game assets from the storage device (an SSD that uses either NVMe or AHCI protocols), but also pull them in their natively-stored compressed state. These assets are then decompressed by the GPU using compute shaders, and the decompressed assets remain in the video memory. This will directly impact game loading times, as asset decompression no longer involves the CPU. Its impact on the game's framerate will be minimal, as the API mainly accelerates game loading times, not gameplay itself. Game assets are organized pieces of data such as textures, 3D model files, music, and sound effects: pretty much all of the individual pieces of content that make up a 3D scene.
File compression (and decompression) remains a compute-heavy workload that benefits from parallelism, and here the GPU and its faster memory help greatly. Once the relevant assets are committed to video memory, the remaining data from the asset containers are purged from video memory to make room for the rest of the game's memory load. Microsoft in its tech demo example showed how a 3D scene's assets were loaded in 0.8 seconds with DirectStorage 1.1, compared to 2.36 seconds without it. This is just a synthetic example; you can imagine the impact on much larger AAA games that take dozens of seconds to load levels, even with NVMe SSDs. The hold-up here is not the storage device, but the CPU trying to decompress relevant assets.
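To put the two tech demo figures in proportion, here is a small Python calculation. It uses only the 0.8 s and 2.36 s load times quoted above; the variable names are ours, and the result says nothing about how a full-size game would scale.

```python
# Load times quoted from Microsoft's tech demo (seconds).
without_directstorage = 2.36
with_directstorage = 0.8

# How many times faster the DirectStorage 1.1 load was.
speedup = without_directstorage / with_directstorage

# Absolute and relative reduction in loading time.
saved = without_directstorage - with_directstorage
reduction_pct = 100 * saved / without_directstorage

print(f"speedup: {speedup:.2f}x")
print(f"time saved: {saved:.2f} s ({reduction_pct:.0f}% shorter)")
```

That works out to roughly a 3x faster load, or about two thirds of the loading time removed in this particular demo.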
Along with DirectStorage 1.1, Microsoft is introducing GDeflate, a file compression format for game assets developed by NVIDIA, and is working with all PC GPU manufacturers, including AMD and Intel, to add support for this file format through graphics driver updates. The format is optimized for highly parallelized compression and decompression across a large number of threads, which makes it better suited to GPUs. This doesn't necessarily mean that all games have to use GDeflate in order to take advantage of GPU-accelerated asset decompression over DirectStorage 1.1; it's just an added optimization for game developers working on new projects. Patching already-released games to use GDeflate would involve redistributing the entire game asset load (which is still fine if the developer chooses to).
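The general idea behind a parallel-friendly compression format can be sketched in a few lines of Python. This is not GDeflate and not GPU code: it uses the standard zlib module (plain DEFLATE) on CPU threads, and the 64 KiB chunk size and function names are assumptions chosen for illustration. The point is only that data compressed as independent chunks can be decompressed by many workers at once.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

# Hypothetical chunk size, picked for illustration only.
CHUNK_SIZE = 64 * 1024


def compress_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Compress each fixed-size chunk on its own, so chunks share no state."""
    return [
        zlib.compress(data[offset:offset + chunk_size])
        for offset in range(0, len(data), chunk_size)
    ]


def decompress_chunks(chunks: list[bytes], workers: int = 4) -> bytes:
    """Decompress chunks concurrently; pool.map returns results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(zlib.decompress, chunks))


# A 1 MiB stand-in for a game asset.
asset = bytes(range(256)) * 4096
chunks = compress_chunks(asset)
restored = decompress_chunks(chunks)
print(len(chunks), restored == asset)
```

A single monolithic DEFLATE stream would have to be decoded front to back by one worker; splitting it into independent chunks trades a little compression ratio for that parallelism.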
Microsoft plans to release DirectStorage 1.1 to game developers toward the end of 2022. The first games released or patched with DirectStorage 1.1 support should start coming out in 2023.
Source:
Microsoft DirectX Blog
60 Comments on DirectStorage not used by any Games, Microsoft hopes DirectStorage 1.1 with GPU Asset Decompression can Fix This
C'mon, Microsoft. If you already knew this was going to be a big part of the Xbox Series line, you should have had this ready for PC at the same time to help porting.
For instance Ratchet and Clank uses SSD fast loading. If it gets ported I'd bet it's going to use the API (but I don't think it's getting ported)
Mostly a carrot on a stick to get people to update.
DX-11 comes to mind as well.
This feature is reserved for m.2 OS and game storage, isn't it?
@human_error oh ok I guess I was wrong, it's hard to tell though, what with no games exposing its behaviour and support level.
Yeah, with m.2 prices high when this feature was revealed, it's not surprising it went nowhere fast
Just now m.2 prices have fallen, but imho not nearly enough to light any fire under anyone to use it for storage
Prices haven't compelled me to buy any for game storage, that's for sure. Hell, I don't even use the two I have for the OS, I'm happy with SATA SSDs :laugh:
It's the killer app thing again. If a game was out that was ass on everything but an nvme, it would actually push the opinion that we need nvme to game with.
imagine having a Gbit Fibre connection and waiting hours for 97GB of mandatory content….
Think it was just a sales pitch for m.2 sales personally
I mean really how much can it get faster with both os and storage on a m.2 :roll:
Moreover Microsoft IMHO are just trying to leverage console IP to enhance the user experience beyond Apple and Google on PC.
Just, unlike on console, they have no levers to pull on the hardware users have, except for their tablets, which are not useful in gaming terms anyway.
And MS don't sell nvme drives, do they?
No but they invest and insider trading is a thing see Musk :laugh:
Things don't move, make them move.
What happens when you load into a game? The game needs to pull assets from the drive into the GPU's VRAM.
How does it do it on PC? It copies the data from the drive to the system RAM. The CPU has to decompress the data while it is in system RAM. The decompressed data can then be copied to the VRAM.
How will it work with DirectStorage? The game will copy data from the drive directly into the VRAM, decompressing it on the fly using the GPU.
Once all the assets are in the VRAM, they could be cached in system memory if the game no longer needs them. Then they could be quickly pulled back from system memory. But many modern games keep data in the VRAM, using it as cache.
But this is all about the first loading process. You bypass the CPU and the system RAM and place the data directly in the GPU's memory. This makes loading data drastically faster.
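The two load paths described in this comment can be laid out as a toy Python model. Everything here is a stand-in: the drive is a dict, zlib plays the role of both the CPU decompressor and the GPU compute shader, and the function names are hypothetical. It follows the comment's description of the data flow, not the actual DirectStorage API.

```python
import zlib

# Toy "drive": asset name -> compressed bytes (hypothetical contents).
drive = {"level1.tex": zlib.compress(b"texture-data" * 1000)}


def traditional_load(drive: dict, name: str):
    """Drive -> system RAM -> CPU decompress -> copy to VRAM."""
    steps = []
    system_ram = drive[name]
    steps.append("drive -> system RAM (compressed)")
    system_ram = zlib.decompress(system_ram)  # CPU does the work
    steps.append("CPU decompresses in system RAM")
    vram = system_ram
    steps.append("system RAM -> VRAM (decompressed)")
    return vram, steps


def directstorage_style_load(drive: dict, name: str):
    """Drive -> VRAM, then decompress in place (GPU stand-in)."""
    steps = []
    vram = drive[name]
    steps.append("drive -> VRAM (compressed)")
    vram = zlib.decompress(vram)  # stand-in for a GPU compute shader
    steps.append("GPU decompresses in VRAM")
    return vram, steps


old_data, old_steps = traditional_load(drive, "level1.tex")
new_data, new_steps = directstorage_style_load(drive, "level1.tex")
print(len(old_steps), len(new_steps), old_data == new_data)
```

Both paths end with the same bytes in "VRAM"; the second one simply has one fewer hop and moves only the smaller compressed data across it.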
This also shows why PCs have had many problems with data streaming in games, which became very popular on Xbox 360, which introduced a unified memory system. While the CPU still had to decompress data, everything happened in a single memory pool, which drastically reduced latency.
In time PCs were able to brute force past this with sheer computing power and transfer speeds, but it all depends on how optimized a game is. Some games still have data streaming problems on PC even today, because of two separate memory pools.
And this is why some games on PS5 and Xbox Series consoles have unbelievably short loading times. When I played Horizon Forbidden West, I could not believe how instant everything was. And that is not even the best example.
And these consoles combine DirectStorage (PS5 has its own tech) and unified memory to get these incredible results.
If a console game using this tech can load in 2-3 seconds, how will your PrimoCache setup improve on this? It will load in 1 second? Great, definitely worth spending $30-50 on the program and hundreds of dollars on extra 32 GB of RAM.
You are probably representing less than 1% of consumers with your setup. And you know what? You can turn off DirectStorage if you do not want it. But for 99% of gamers, this can be a game changer. And it is free.
But if you want the full potential of it, you need some serious stuff: many CPU cores, a Sampler Feedback Tier_1_0 capable GPU, a fast NVMe drive, raw GPU compute power, native driver-side support and Win11.
Forspoken will be the first game on pc which will support it, so far.
One point though: it doesn't matter what you load an old game from, it will still be caching stuff across in a different section of memory, because the game designer and Microsoft made it so.
Plus most people like more than one game on a drive.
And you're not fitting some games like GTA V in a cache drive.
So your way is SHIT for most and hardly automatic.
And I know, I tried it btw.
And this isn't that, at all.
learn.microsoft.com/en-us/gaming/gdk/_content/gc/system/overviews/directstorage/directstorage-overview#memory-to-memory-decompression
If you actually used primocache you would understand it IS AUTOMATIC.
I'll bet you don't even have a copy of primocache and have never used it.
Anything that works now is going to be faster - it certainly won't get slower by cutting out the extra steps, worst case it should be equal. Ah yes, back to DX9 where they duplicated everything
Damned hard to find it documented in plainspeak, but the main thing DX10 changed was no longer requiring VRAM to be duplicated into system RAM. Big deal when we hit the 4GB x86 limits.