Saturday, April 1st 2023
DirectX 12 API New Feature Set Introduces GPU Upload Heaps, Enables Simultaneous Access to VRAM for CPU and GPU
Microsoft has implemented two new features into its DirectX 12 API - GPU Upload Heaps and Non-Normalized sampling have been added via the latest Agility SDK 1.710.0 preview, and the former looks to be the more intriguing of the pair. The SDK preview is only accessible to developers at the present time, since its official introduction on Friday 31 March. Support has also been initiated via the latest graphics drivers issued by NVIDIA, Intel, and AMD. The Microsoft team has this to say about the preview version of GPU upload heaps feature in DirectX 12: "Historically a GPU's VRAM was inaccessible to the CPU, forcing programs to have to copy large amounts of data to the GPU via the PCI bus. Most modern GPUs have introduced VRAM resizable base address register (BAR) enabling Windows to manage the GPU VRAM in WDDM 2.0 or later."
They continue to describe how the update allows the CPU to gain access to the pool of VRAM on the connected graphics card: "With the VRAM being managed by Windows, D3D now exposes the heap memory access directly to the CPU! This allows both the CPU and GPU to directly access the memory simultaneously, removing the need to copy data from the CPU to the GPU increasing performance in certain scenarios." This GPU optimization could offer many benefits in the context of computer games, since memory requirements continue to grow in line with an increase in visual sophistication and complexity.A shared pool of memory between the CPU and GPU will eliminate the need to keep duplicates of the game scenario data in both system memory and graphics card VRAM, therefore resulting in a reduced data stream between the two locations. Modern graphics cards have tended to feature very fast on-board memory standards (GDDR6) in contrast to main system memory (DDR5 at best). In theory the CPU could benefit greatly from exclusive access to a pool of ultra quick VRAM, perhaps giving an early preview of a time when DDR6 becomes the daily standard in main system memory.
Sources:
Microsoft Dev Blogs, Zhang Doa
They continue to describe how the update allows the CPU to gain access to the pool of VRAM on the connected graphics card: "With the VRAM being managed by Windows, D3D now exposes the heap memory access directly to the CPU! This allows both the CPU and GPU to directly access the memory simultaneously, removing the need to copy data from the CPU to the GPU increasing performance in certain scenarios." This GPU optimization could offer many benefits in the context of computer games, since memory requirements continue to grow in line with an increase in visual sophistication and complexity.A shared pool of memory between the CPU and GPU will eliminate the need to keep duplicates of the game scenario data in both system memory and graphics card VRAM, therefore resulting in a reduced data stream between the two locations. Modern graphics cards have tended to feature very fast on-board memory standards (GDDR6) in contrast to main system memory (DDR5 at best). In theory the CPU could benefit greatly from exclusive access to a pool of ultra quick VRAM, perhaps giving an early preview of a time when DDR6 becomes the daily standard in main system memory.
54 Comments on DirectX 12 API New Feature Set Introduces GPU Upload Heaps, Enables Simultaneous Access to VRAM for CPU and GPU
Direct Storage can exist on it's own. Else, Direct Storage would have been released after this feature. This is for both CPU and GPU to be able to access the same VRAM region without having to copy it back to main ram.
:toast:
Simultaneous access is a latency benefit, since it's been accesible from the CPU and GPU for a while now (at least in DX12/vulkan)
Old (DX9) setup:
Compressed Storage -> RAM -> CPU changes things as requested -> Duplicate to VRAM -> GPU processes 3D goodness
Newer DirectStorage (when it eventually gets used correctly)
NVME storage -> VRAM, GPU harddware decompresses (Compressed NVME -> VRAM)
VRAM is now accessible by the CPU and GPU for either to read/write to as needed. No redundant "flush the whole thing and try again" steps
"Hey CPU can you hash this file and verify it's the one we need already decompressed and avoid loading the whole thing from scratch again? Cheers"
Consoles have shared RAM and VRAM, so this is primarily for them.
On PC it just means game engines will use less RAM, it does not mean they're going to use more VRAM.