Earlier this year, Microsoft finalized the DirectX 12 Ultimate API subset, which enables improved visual fidelity over conventional DirectX 12. Microsoft chose four features that GPU vendors must support to qualify for the new logo: real-time raytracing through the DirectX Raytracing (DXR) API, Mesh Shaders, Sampler Feedback, and Variable Rate Shading (VRS). AMD worked toward ticking off all four features, and the Xbox Series X/S, launched earlier this year, became the first DirectX 12 Ultimate device powered by AMD hardware. NVIDIA's "Turing" graphics architecture from 2018 already meets all these requirements.
AMD's implementation of DirectX Raytracing is slightly different from NVIDIA's. The RDNA 2 graphics architecture uses Ray Accelerators, fixed-function hardware that calculates ray intersections with boxes and triangles (four box intersections per clock, or one triangle intersection per clock). Intersection is the most math-intensive step, which warranted special hardware. Most other stages of the raytracing pipeline leverage the vast SIMD resources at the GPU's disposal, whereas NVIDIA's RT core handles full BVH traversal in dedicated hardware. The two companies also take different approaches to de-noising, an important stage of raytracing that seeks to remove the "noise" resulting from the sparsity of the rays being used. Remember, we're not quite there with fully raytraced 3D scenes, but raster 3D with select raytraced elements is an option. While NVIDIA uses an AI-based de-noiser that leverages tensor cores, AMD's de-noiser leverages compute shaders. The company claims to have innovated an efficient compute-based de-noising solution.
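To make the box-intersection step concrete, here is a minimal software sketch of the classic "slab" ray-versus-box test, applied to four child boxes at once to mirror the four-boxes-per-clock figure. This is only an illustration of the math involved, not AMD's hardware logic; the function names and the idea of a node holding four child boxes are assumptions made for the example.

```python
import math

def ray_hits_box(origin, direction, box_min, box_max):
    """Slab test: does a ray (t >= 0) intersect an axis-aligned box?"""
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this pair of planes: it must start between them.
            if o < lo or o > hi:
                return False
            continue
        t0 = (lo - o) / d
        t1 = (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        # Shrink the interval in which the ray is inside all slabs so far.
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return False
    return True

def intersect_child_boxes(origin, direction, boxes):
    """Test one ray against every child box of a (hypothetical) BVH node."""
    return [ray_hits_box(origin, direction, lo, hi) for lo, hi in boxes]

# Hypothetical node with four child boxes, ray travelling along +X.
children = [
    ((1, -1, -1), (2, 1, 1)),    # straight ahead
    ((1, 2, 2), (2, 3, 3)),      # off to the side
    ((-3, -1, -1), (-2, 1, 1)),  # behind the ray origin
    ((5, -1, -1), (6, 1, 1)),    # further ahead
]
hits = intersect_child_boxes((0, 0, 0), (1, 0, 0), children)
```

A traversal loop running on the shader cores would then descend only into the children flagged as hits, which is the part AMD leaves to the SIMD units.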
AMD also implemented support for Mesh Shaders as a geometry front end, which again rely heavily on compute shaders, along with Sampler Feedback and Variable Rate Shading (VRS). VRS is a key feature that allows different regions of a 3D scene to be shaded at different rates, letting the GPU conserve resources. RDNA 2 supports both VRS tier 1 and tier 2. VRS, along with dynamic resolution, makes up much of the secret sauce that lets next-gen consoles offer 4K UHD gaming.
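A rough way to see where the savings come from is to count pixel-shader invocations per screen tile under different shading rates. The sketch below is a back-of-the-envelope model only: the 8x8 tile size, the tiny 2x2-tile "screen," and the function name are assumptions for illustration, and real hardware has additional rules this ignores.

```python
def shader_invocations(rate_image, tile_size=8):
    """Count pixel-shader invocations for a grid of tiles.

    rate_image is a 2D list of (x_rate, y_rate) tuples, one per tile.
    A rate of (2, 2) means one shader invocation covers a 2x2 pixel block.
    """
    total = 0
    for row in rate_image:
        for x_rate, y_rate in row:
            total += (tile_size // x_rate) * (tile_size // y_rate)
    return total

# Hypothetical 2x2-tile screen: full rate where detail matters,
# coarse 2x2 shading everywhere else.
full_rate = [[(1, 1), (1, 1)],
             [(1, 1), (1, 1)]]
mixed_rate = [[(1, 1), (2, 2)],
              [(2, 2), (2, 2)]]

baseline = shader_invocations(full_rate)
with_vrs = shader_invocations(mixed_rate)
savings = 1 - with_vrs / baseline
```

In this toy case, shading three of the four tiles at 2x2 cuts invocations from 256 to 112. Per-region rates like this correspond to what tier 2 adds; tier 1 only lets the application pick a rate per draw call.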
Smart Access Memory and DirectStorage
By default, your CPU can only address up to 256 MB of video memory at once, a legacy from the days when systems operated in 32-bit mode with 4 GB of address space. Modern graphics cards come with a lot more memory, and when the CPU needs to address more VRAM, a windowing mechanism is used: the GPU sets aside a 256 MB chunk of its memory as a transfer area, and the data the CPU requests is juggled in and out of it. The 256 MB aperture size was arbitrarily decided on in the 32-bit days, when address space was at a premium. Video cards have plenty of memory bandwidth (compared to main memory), so this arrangement didn't really bottleneck anything. AMD, however, is already running a tight ship with its relatively narrow memory bus and Infinity Cache. As such, it sought to change this by using the resizable base address register (BAR) capability standardized by the PCI-SIG, which AMD and NVIDIA hadn't leveraged until now.
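The cost of a small aperture can be pictured with a toy model that counts how often the window has to be repositioned as the CPU touches different parts of VRAM. This is a deliberately simplified sketch and not how any real driver is implemented; the access pattern, the 16 GiB card size, and the function name are all assumptions for illustration.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB

def count_window_moves(offsets, window_size):
    """Count how many times a fixed-size, aligned window must be (re)mapped
    so that each accessed VRAM offset falls inside it."""
    moves = 0
    window_base = None
    for offset in offsets:
        base = (offset // window_size) * window_size
        if base != window_base:
            window_base = base
            moves += 1
    return moves

# Hypothetical CPU access pattern across a 16 GiB card.
accesses = [0, 100 * MIB, 300 * MIB, 50 * MIB, 5000 * MIB]

small_aperture = count_window_moves(accesses, 256 * MIB)
full_aperture = count_window_moves(accesses, 16 * GIB)
```

With the 256 MB window, this pattern forces four mappings; with an aperture the size of the whole VRAM, only the initial mapping is needed. How much that matters in practice depends on how often a given game engine has the CPU touch video memory directly.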
AMD simply branded resizable BAR "Smart Access Memory." This feature requires an AMD Ryzen 5000 series processor, an AMD 500-series chipset motherboard, and a UEFI firmware update that adds a toggle for the feature, though in theory any modern chipset could support it. With Smart Access Memory, the CPU is able to access the full VRAM as one contiguous block of memory. AMD claims that in specific game engines that rely on heavy CPU access of video memory, Smart Access Memory can improve performance by up to six percent. We put these claims to the test in our dedicated AMD Smart Access Memory article.
AMD is also introducing support for the DirectStorage API, which can accelerate game-loading times by giving the GPU direct access to game resource data from an NVMe SSD in its native compressed format, and performing the decompression on the GPU, leveraging compute shaders. Since the GPU isn't performing much 3D rendering during level-loading scenes, you won't feel its impact on frame rates, but loading times will be cut down.
Display and Media
AMD substantially updated the display and multimedia engines with RDNA 2. The Radeon RX 6800 series cards come with two DisplayPort 1.4 connectors, one HDMI 2.1 port, and a USB-C port with DisplayPort and USB 3.1 Gen 2, along with up to 27 W of USB-PD. Thanks to HDMI 2.1 and Display Stream Compression (DSC), the card now supports 8K resolution at up to 120 Hz, along with FreeSync and Variable Refresh Rate. The multimedia engine now has AV1 hardware decode at up to 8K resolution and HEVC hardware encode at up to 8K, along with H.264 B-frame support.
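Some quick arithmetic shows why DSC is needed at these resolutions. The sketch below estimates the uncompressed pixel data rate and compares it against HDMI 2.1's 48 Gbps maximum link rate. It assumes 10-bit RGB (30 bits per pixel) and ignores blanking intervals and link-encoding overhead, so real requirements are somewhat higher; the function names are made up for the example.

```python
HDMI_2_1_MAX_GBPS = 48.0  # maximum HDMI 2.1 link rate, before encoding overhead

def uncompressed_gbps(width, height, refresh_hz, bits_per_pixel):
    """Raw pixel data rate in Gbps (active pixels only, no blanking)."""
    return width * height * refresh_hz * bits_per_pixel / 1e9

def needs_compression(width, height, refresh_hz, bits_per_pixel=30):
    """True if the raw stream alone already exceeds the HDMI 2.1 link rate."""
    rate = uncompressed_gbps(width, height, refresh_hz, bits_per_pixel)
    return rate > HDMI_2_1_MAX_GBPS

rate_8k60 = uncompressed_gbps(7680, 4320, 60, 30)    # about 59.7 Gbps
rate_8k120 = uncompressed_gbps(7680, 4320, 120, 30)  # about 119.4 Gbps
```

Both figures overshoot what the cable can carry, which is where DSC's compression comes in.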
Radeon Boost and Rage Mode
If you're unfamiliar with Radeon Boost, it is a variable resolution feature that relies on motion, or inputs, to dynamically reduce the resolution of the game being rendered for increased frame rates. The idea is that you're more focused on the action than on the details when in motion. Rage Mode, on the other hand, is a one-click automated overclocking feature that seeks to step up engine clocks by 100 MHz beyond the advertised maximum boost frequency.
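The general idea behind motion-driven resolution scaling can be sketched in a few lines. This is purely illustrative and not AMD's actual algorithm: the linear mapping, the normalized motion value, the 50 percent floor, and the function names are all assumptions.

```python
def boost_render_scale(motion, full_motion=1.0, min_scale=0.5):
    """Map an input-motion magnitude to a render-resolution scale.

    No motion keeps full resolution (1.0); motion at or above
    full_motion drops to min_scale; values in between scale linearly.
    """
    m = min(max(motion / full_motion, 0.0), 1.0)
    return 1.0 - m * (1.0 - min_scale)

def render_resolution(width, height, scale):
    """Apply the scale to both axes of the output resolution."""
    return (round(width * scale), round(height * scale))

# Standing still, a moderate camera pan, and a fast flick at 4K output.
still = render_resolution(3840, 2160, boost_render_scale(0.0))
panning = render_resolution(3840, 2160, boost_render_scale(0.5))
flick = render_resolution(3840, 2160, boost_render_scale(2.0))
```

Under these assumptions, a fast flick renders at 1920x1080 and is upscaled to the display, while standing still keeps the full 3840x2160.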