
Microsoft Releases DirectStorage 1.2 with HDD Speedups

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
47,300 (7.52/day)
Location
Hyderabad, India
System Name RBMK-1000
Processor AMD Ryzen 7 5700G
Motherboard ASUS ROG Strix B450-E Gaming
Cooling DeepCool Gammax L240 V2
Memory 2x 8GB G.Skill Sniper X
Video Card(s) Palit GeForce RTX 2080 SUPER GameRock
Storage Western Digital Black NVMe 512GB
Display(s) BenQ 1440p 60 Hz 27-inch
Case Corsair Carbide 100R
Audio Device(s) ASUS SupremeFX S1220A
Power Supply Cooler Master MWE Gold 650W
Mouse ASUS ROG Strix Impact
Keyboard Gamdias Hermes E2
Software Windows 11 Pro
Microsoft released a major update to DirectStorage, the API that promises to reduce game loading times. The new DirectStorage 1.2 adds the ability to speed up game loading for mechanical hard drives, a feature game developers requested from Microsoft. DirectStorage brings much of the storage sub-system secret sauce of consoles over to PC, and consoles have held on to mechanical HDDs as game storage devices longer than mainstream gaming PCs.

HDDs require buffered reads to compensate for the longer seek times, whereas DirectStorage traditionally accesses files in unbuffered mode, which disqualified HDDs for DirectStorage. With this update, HDDs can take advantage of DirectStorage, wherein game data stored on them is directly accessed by GPUs, and compressed game assets are decompressed on the fly through the compute-shader acceleration capabilities of modern GPUs.



Microsoft also added a means for a game to know whether compressed assets are being decompressed by the GPU, or whether a software (CPU) fallback is engaged for reasons such as incompatible compression/file format. This feedback mechanism allows the game to adjust its asset quality (such as texture resolution), to compensate for the reduced decompression performance.
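As a rough illustration of the feedback mechanism described above, a game could cap its texture resolution depending on which decompression path is active. This is only a minimal Python sketch: `DecompressionPath`, `pick_max_texture_size`, and the resolutions used are hypothetical names and values chosen for illustration, not part of the DirectStorage API.

```python
from enum import Enum


class DecompressionPath(Enum):
    """Hypothetical stand-in for what the runtime reports to the game."""
    GPU = "gpu"
    CPU_FALLBACK = "cpu_fallback"


def pick_max_texture_size(path, preferred_size=4096, fallback_size=2048):
    """Return the largest texture edge length (in pixels) to request.

    Assumption: when decompression falls back to the CPU, the game
    requests smaller textures to compensate for lower throughput.
    """
    if path is DecompressionPath.GPU:
        return preferred_size
    return fallback_size


if __name__ == "__main__":
    for path in DecompressionPath:
        print(path.name, pick_max_texture_size(path))
```

A real engine would likely adjust more than one setting, but the idea is the same: query the path once, then pick an asset tier.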

Microsoft has progressively relaxed the hardware requirements for DirectStorage with each major release. It was originally restricted to NVMe SSDs as the storage device, but was extended to AHCI devices such as SATA SSDs, and now with this release, support is extended to mechanical HDDs.

Many Thanks to TumbleGeorge for the tip!

 
Joined
Aug 3, 2020
Messages
7 (0.00/day)
System Name Queen-18
Processor Ryzen 7 5800X
Motherboard MSI Tomahawk Max II
Cooling NH-U9S
Memory Crucial Ballistix DDR4 3600MHz PC4-28800 16GB 2x8GB CL16
Video Card(s) MSI 3070Ti VENTUS 3X 8G OC
Storage Emtec Power Pro X300 SSD 512GB // Seagate BarraCuda 3.5" 2TB SATA 3
Display(s) LG UltraGear 27GP850-B
Case Sharkoon Pure Steel Black
Power Supply Corsair RM850 (2019)
Software Windows 11 Pro
Do developers need to update their games?
 
Joined
Nov 6, 2014
Messages
117 (0.03/day)
Processor Intel i7 13700K
Motherboard ASUS PROArt Z690 Creator WiFi
Cooling Liquid Freezer II - 280
Memory Kingston 32GB DDR5 @ 6200 MT/s
Video Card(s) Palit RTX3070 GamingPRO
Storage TrueNAS CORE
Case Phanteks ECLIPSE P600S
Audio Device(s) Creative Sound Blaster AE-5
Power Supply SEASONIC CONNECT 750W
And we're going back to pre-caching like the engines of old, instead of texture streaming
 
Joined
Sep 1, 2020
Messages
2,395 (1.52/day)
Location
Bulgaria
This new feature is aimed specifically at improving game loading from HDDs. I think it is a modernization of, as you mention, the older approach of masking the flow of data through a buffer.
 
Joined
Mar 22, 2020
Messages
27 (0.02/day)
HDDs can take advantage of DirectStorage, wherein game data stored on them is directly accessed by GPUs
This is contradicted literally a line below by your own graph. DirectStorage does not bypass system RAM when uploading data into GPU memory (yet).

And we're going back to pre-caching like the engines of old, instead of texture streaming
I think you misunderstood the news. It's about how DirectStorage manages the queues of reads that it gets, not about streamed data flow.
 

Nicholas Steel

New Member
Joined
Jan 2, 2022
Messages
13 (0.01/day)
Edited for clarity.

DirectStorage v1.0 and v1.1 work with HDDs; the problem was that commands weren't buffered, so the order of operations couldn't be optimized to minimize seek time.

* Read Sector 7
* Read Sector 7049
* Read Sector 9
* Read Sector 14
* Read Sector 3

Before v1.2, DirectStorage would process them in the order it received them. With v1.2's Buffered mode it can re-organize the commands.

* Read Sector 3
* Read Sector 7
* Read Sector 9
* Read Sector 14
* Read Sector 7049
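To put a number on the difference between the two orderings above, here is a minimal Python sketch. It assumes the head starts at sector 0 and treats the sector number as a linear position, which ignores rotational latency and real drive geometry. `total_head_travel` and `reorder_for_seek` are illustrative names, and a plain ascending sort is just the simplest possible reordering, not necessarily what DirectStorage or a drive's firmware actually does.

```python
def total_head_travel(sectors, start=0):
    """Sum of distances the head moves when serving reads in the given order."""
    position = start
    travel = 0
    for sector in sectors:
        travel += abs(sector - position)
        position = sector
    return travel


def reorder_for_seek(sectors):
    """Simplest reordering: serve the reads in one ascending sweep."""
    return sorted(sectors)


queued = [7, 7049, 9, 14, 3]  # arrival order from the example above

print(total_head_travel(queued))                    # 14105
print(total_head_travel(reorder_for_seek(queued)))  # 7049
```

Under these assumptions, the reordered queue roughly halves the head travel, because the single long trip to sector 7049 is made once instead of out and back.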

Windows has been able to buffer HDD activity for a very long time now (I think Windows 95 introduced it?), while SATA II introduced the ability for HDDs to perform their own re-ordering of commands (NCQ, Native Command Queuing) to minimize seeking. This requires a buffer that holds multiple commands at once so the drive can juggle their priority. With SSDs, the seek time is the same regardless of where data is physically located, which is why they don't need buffering.

I imagine NCQ also reduced the wear on a HDD's motors.
 
Joined
Jan 10, 2011
Messages
1,451 (0.28/day)
Location
[Formerly] Khartoum, Sudan.
System Name 192.168.1.1~192.168.1.100
Processor AMD Ryzen5 5600G.
Motherboard Gigabyte B550m DS3H.
Cooling AMD Wraith Stealth.
Memory 16GB Crucial DDR4.
Video Card(s) Gigabyte GTX 1080 OC (Underclocked, underpowered).
Storage Samsung 980 NVME 500GB && Assortment of SSDs.
Display(s) ViewSonic VA2406-MH 75Hz
Case Bitfenix Nova Midi
Audio Device(s) On-Board.
Power Supply SeaSonic CORE GM-650.
Mouse Logitech G300s
Keyboard Kingston HyperX Alloy FPS.
VR HMD A pair of OP spectacles.
Software Ubuntu 24.04 LTS.
Benchmark Scores Me no know English. What bench mean? Bench like one sit on?
And we're going back to pre-caching like the engines of old, instead of texture streaming
Not sure how you could infer that from this article. The third paragraph notes an added feature that is primarily geared towards texture streaming.
Microsoft also added a means for a game to know whether compressed assets are being decompressed by the GPU, or whether a software (CPU) fallback is engaged for reasons such as incompatible compression/file format. This feedback mechanism allows the game to adjust its asset quality (such as texture resolution), to compensate for the reduced decompression performance.
 
Joined
Jan 3, 2021
Messages
3,608 (2.49/day)
Location
Slovenia
Processor i5-6600K
Motherboard Asus Z170A
Cooling some cheap Cooler Master Hyper 103 or similar
Memory 16GB DDR4-2400
Video Card(s) IGP
Storage Samsung 850 EVO 250GB
Display(s) 2x Oldell 24" 1920x1200
Case Bitfenix Nova white windowless non-mesh
Audio Device(s) E-mu 1212m PCI
Power Supply Seasonic G-360
Mouse Logitech Marble trackball, never had a mouse
Keyboard Key Tronic KT2000, no Win key because 1994
Software Oldwin
ordering commands to minimize HDD seeking and this requires a buffer to simultaneously hold multiple commands in (NCQ, Native Command Queuing). With SSD's the Seek time is the same regardless of where data is physically located, which is why they don't need buffering.
SSDs are very slow on non-queued random reads. Seek time is around 40 µs, but much less if the access falls within the same page (16 kilobytes). That's similar to DRAM row/column access, just a thousand times slower. So it makes sense to reorder read requests to make them sequential more often, on average, just like with HDDs. The other reason for queueing is to activate as many banks as possible at once, and that's where the similarity with an HDD ends.
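The page-locality argument can be sketched in a few lines of Python. The 16 KB page size comes from the post above; the request offsets, and the simplification that every switch to a different page costs one "activation", are assumptions made for illustration only. `page_activations` is a hypothetical helper, not a model of any particular SSD controller.

```python
PAGE_SIZE = 16 * 1024  # 16 KB page, as mentioned in the post


def page_activations(offsets, page_size=PAGE_SIZE):
    """Count how often consecutive reads land on a different page."""
    activations = 0
    last_page = None
    for offset in offsets:
        page = offset // page_size
        if page != last_page:
            activations += 1
            last_page = page
    return activations


requests = [0, 40000, 4096, 50000, 8192]  # byte offsets in arrival order

print(page_activations(requests))          # 5
print(page_activations(sorted(requests)))  # 3
```

Sorting groups the three reads that fall inside page 0, so the page is opened once instead of three times.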
 
Joined
Feb 18, 2013
Messages
2,186 (0.51/day)
Location
Deez Nutz, bozo!
System Name Rainbow Puke Machine :D
Processor Intel Core i5-11400 (MCE enabled, PL removed)
Motherboard ASUS STRIX B560-G GAMING WIFI mATX
Cooling Corsair H60i RGB PRO XT AIO + HD120 RGB (x3) + SP120 RGB PRO (x3) + Commander PRO
Memory Corsair Vengeance RGB RT 2 x 8GB 3200MHz DDR4 C16
Video Card(s) Zotac RTX2060 Twin Fan 6GB GDDR6 (Stock)
Storage Corsair MP600 PRO 1TB M.2 PCIe Gen4 x4 SSD
Display(s) LG 29WK600-W Ultrawide 1080p IPS Monitor (primary display)
Case Corsair iCUE 220T RGB Airflow (White) w/Lighting Node CORE + Lighting Node PRO RGB LED Strips (x4).
Audio Device(s) ASUS ROG Supreme FX S1220A w/ Savitech SV3H712 AMP + Sonic Studio 3 suite
Power Supply Corsair RM750x 80 Plus Gold Fully Modular
Mouse Corsair M65 RGB FPS Gaming (White)
Keyboard Corsair K60 PRO RGB Mechanical w/ Cherry VIOLA Switches
Software Windows 11 Professional x64 (Update 23H2)
cool. Now I wanna see them game studios to implement this API ASAP. XDD
 
Joined
Aug 4, 2022
Messages
54 (0.06/day)
Since it supports SATA SSDs connected in AHCI mode, does that mean it also supports SSDs that are in a storage pool via Storage Spaces, or in a software RAID through the Intel driver?
 
Joined
Feb 1, 2019
Messages
3,667 (1.70/day)
Location
UK, Midlands
System Name Main PC
Processor 13700k
Motherboard Asrock Z690 Steel Legend D4 - Bios 13.02
Cooling Noctua NH-D15S
Memory 32 Gig 3200CL14
Video Card(s) 4080 RTX SUPER FE 16G
Storage 1TB 980 PRO, 2TB SN850X, 2TB DC P4600, 1TB 860 EVO, 2x 3TB WD Red, 2x 4TB WD Red
Display(s) LG 27GL850
Case Fractal Define R4
Audio Device(s) Soundblaster AE-9
Power Supply Antec HCG 750 Gold
Software Windows 10 21H2 LTSC
Interesting, recognition that DirectStorage is not just about loading times but also about shifting processing off the CPU, which is a common bottleneck.
 
Joined
Jun 6, 2021
Messages
708 (0.55/day)
System Name Red Devil
Processor AMD 5950x - Vermeer - B0
Motherboard Gigabyte X570 AORUS MASTER
Cooling NZXT Kraken Z73 360mm; 14 x Corsair QL 120mm RGB Case Fans
Memory G.SKill Trident Z Neo 32GB Kit DDR4-3600 CL14 (F4-3600C14Q-32GTZNB)
Video Card(s) PowerColor's Red Devil Radeon RX 6900 XT (Navi 21 XTX)
Storage 1 x Western Digital SN850 1GB; 1 x WD Black SN850X 4TB; 1 x Samsung SSD 870EVO 2TB
Display(s) 1 x MSI MPG 321URX QD-OLED 4K; 2 x Asus VG27AQL1A
Case Corsair Obsidian 1000D
Audio Device(s) Raz3r Nommo V2 Pro ; Steel Series Arctis Nova Pro X Wireless (XBox Version)
Power Supply AX1500i Digital ATX - 1500w - 80 Plus Titanium
Mouse Razer Basilisk V3
Keyboard Razer Huntsman V2 - Optical Gaming Keyboard
Software Windows 11
Interesting, recognition that DirectStorage is not just about loading times but also about shifting processing off the CPU, which is a common bottleneck.
I thought the entire point of DirectStorage was to bypass the CPU, or did I miss something?
 
Joined
Feb 1, 2019
Messages
3,667 (1.70/day)
Location
UK, Midlands
I thought the entire point of DirectStorage was to bypass the CPU, or did I miss something?
It is, but the marketing side of it is the lightning fast loading speeds.
 
Joined
Sep 1, 2020
Messages
2,395 (1.52/day)
Location
Bulgaria
There was a time when Nvidia wanted everything to be done by the graphics card, without a CPU. I've even forgotten when that was. On the subject: there is no way to completely exclude the CPU from any activity in a computer. In this case, the aim is to have part of the asset decompression work performed on the GPU. But DirectStorage performs several different tasks simultaneously.
 

Mussels

Freshwater Moderator
Joined
Oct 6, 2004
Messages
58,413 (7.91/day)
Location
Oystralia
System Name Rainbow Sparkles (Power efficient, <350W gaming load)
Processor Ryzen R7 5800x3D (Undervolted, 4.45GHz all core)
Motherboard Asus x570-F (BIOS Modded)
Cooling Alphacool Apex UV - Alphacool Eisblock XPX Aurora + EK Quantum ARGB 3090 w/ active backplate
Memory 2x32GB DDR4 3600 Corsair Vengeance RGB @3866 C18-22-22-22-42 TRFC704 (1.4V Hynix MJR - SoC 1.15V)
Video Card(s) Galax RTX 3090 SG 24GB: Underclocked to 1700Mhz 0.750v (375W down to 250W))
Storage 2TB WD SN850 NVME + 1TB Samsung 970 Pro NVME + 1TB Intel 6000P NVME USB 3.2
Display(s) Philips 32 32M1N5800A (4k144), LG 32" (4K60) | Gigabyte G32QC (2k165) | Philips 328m6fjrmb (2K144)
Case Fractal Design R6
Audio Device(s) Logitech G560 | Corsair Void pro RGB |Blue Yeti mic
Power Supply Fractal Ion+ 2 860W (Platinum) (This thing is God-tier. Silent and TINY)
Mouse Logitech G Pro wireless + Steelseries Prisma XL
Keyboard Razer Huntsman TE ( Sexy white keycaps)
VR HMD Oculus Rift S + Quest 2
Software Windows 11 pro x64 (Yes, it's genuinely a good OS) OpenRGB - ditch the branded bloatware!
Benchmark Scores Nyooom.
So, they're pre-loading some content instead of all of it on the fly?
It made sense to preload some of it regardless

There was a time when Nvidia wanted everything to be done by the graphics card, without a CPU. I've even forgotten when that was. On the subject: there is no way to completely exclude the CPU from any activity in a computer. In this case, the aim is to have part of the asset decompression work performed on the GPU. But DirectStorage performs several different tasks simultaneously.
back when they first introduced programmable shaders, with their dream of servers and enterprise setups using their GPUs instead of CPUs

DirectStorage v1.0 and v1.1 work with HDDs; the problem was that commands weren't buffered, so the HDD couldn't optimize the order of operations to minimize seek time. The newly added Buffered mode enables the HDD to optimize the operations fed to it in order to minimize seek time.

Read Sector 7
Read Sector 7049
Read Sector 9
Read Sector 14
Read Sector 3

Before, it would process them in the order it received them. With Buffered mode it can re-organize the commands.

Read Sector 3
Read Sector 7
Read Sector 9
Read Sector 14
Read Sector 7049

SATA II originally introduced the ability to re-order commands to minimize HDD seeking (NCQ, Native Command Queuing), which requires a buffer that holds multiple commands at once. With SSDs, the seek time is the same regardless of where data is physically located, which is why they don't need buffering.
Good description of the change and how it works, a completely logical change at their end - sounds like NCQ or a way to make sure NCQ works correctly

Since it supports SATA SSDs connected in AHCI mode, does that mean it also supports SSDs that are in a storage pool via Storage Spaces, or in a software RAID through the Intel driver?
Any kind of software layer involved is going to add a CPU burden. Why would you want to run games off such a thing?
 

Nicholas Steel

New Member
Joined
Jan 2, 2022
Messages
13 (0.01/day)
So, they're pre-loading some content instead of all of it on the fly?
It made sense to preload some of it regardless


back when they first introduced programmable shaders, with their dream of servers and enterprise setups using their GPUs instead of CPUs


Good description of the change and how it works, a completely logical change at their end - sounds like NCQ or a way to make sure NCQ works correctly


any kind of software involved, is going to add a CPU burden - why would you want to run games off such a thing?
I've just now updated my message to clarify some things.
 

Mussels

Freshwater Moderator
Joined
Oct 6, 2004
Messages
58,413 (7.91/day)
Location
Oystralia
Joined
Sep 4, 2022
Messages
348 (0.41/day)
While this is not directly related, does anyone know why we don't have variable textures (like variable shaders), where the texture assets load in as a percentage of the full asset based on the VRAM the end user has, instead of the potato graphics that Steve at Hardware Unboxed found in some titles on the 8 GB VRAM 3070? Or textures acting like Nanite, where they load more efficiently in your direct visual periphery, instead of potato graphics?
 
Joined
Jan 3, 2021
Messages
3,608 (2.49/day)
Location
Slovenia
I thought the entire point of DirectStorage was to bypass the CPU, or did I miss something?
Also to bypass the system RAM - at least the uncompressed assets don't have to be written to it and read back from it.
 
Joined
Mar 22, 2020
Messages
27 (0.02/day)
While this is not directly related, does anyone know why we don't have variable textures (like variable shaders), where the texture assets load in as a percentage of the full asset based on the VRAM the end user has, instead of the potato graphics that Steve at Hardware Unboxed found in some titles on the 8 GB VRAM 3070? Or textures acting like Nanite, where they load more efficiently in your direct visual periphery, instead of potato graphics?
That is exactly what we have; it's called mips (mipmaps). This is the basis of texture streaming.
This can be further refined using sampler feedback to upload only parts of a mip into VRAM, but I don't think many games implement this yet.
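As a rough sketch of how mips let a streamer scale a texture to the available memory, the Python below picks the most detailed mip level whose remaining chain fits a given budget. It assumes an uncompressed 4-bytes-per-pixel format and a single texture with its own budget; real engines use block-compressed formats and share the budget across many textures. `mip_chain_bytes` and `highest_resident_mip` are hypothetical names.

```python
def mip_chain_bytes(width, height, first_level, bytes_per_pixel=4):
    """Bytes needed for mip `first_level` and every smaller mip down to 1x1."""
    total = 0
    level = first_level
    while True:
        w = max(1, width >> level)
        h = max(1, height >> level)
        total += w * h * bytes_per_pixel
        if w == 1 and h == 1:
            break
        level += 1
    return total


def highest_resident_mip(width, height, budget_bytes, bytes_per_pixel=4):
    """Return the most detailed mip level that fits, or None if nothing fits."""
    level = 0
    while True:
        if mip_chain_bytes(width, height, level, bytes_per_pixel) <= budget_bytes:
            return level
        if (width >> level) <= 1 and (height >> level) <= 1:
            return None  # even the 1x1 mip exceeds the budget
        level += 1


MIB = 1024 * 1024
print(highest_resident_mip(4096, 4096, 128 * MIB))  # 0 (full resolution fits)
print(highest_resident_mip(4096, 4096, 32 * MIB))   # 1 (drop the top mip)
```

Each level dropped cuts the memory needed to about a quarter, which is why skipping only the top mip or two is usually enough to fit a smaller VRAM budget.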
 

Mussels

Freshwater Moderator
Joined
Oct 6, 2004
Messages
58,413 (7.91/day)
Location
Oystralia
While this is not directly related, does anyone know why we don't have variable textures (like variable shaders), where the texture assets load in as a percentage of the full asset based on the VRAM the end user has, instead of the potato graphics that Steve at Hardware Unboxed found in some titles on the 8 GB VRAM 3070? Or textures acting like Nanite, where they load more efficiently in your direct visual periphery, instead of potato graphics?
we do, many technologies exist

what you're imagining would have to compress the high-res textures in real time and shrink them down, and that'd be slower than sending the full-res ones. Hence, they are pre-compressed

Consoles lack CPU power for example, so all the textures became really large disk-space wise to avoid any issues with decompression.
 