Having spoken to RTG members over the course of the weekend, it seems clear that a good fraction of the 12.5 billion transistor count was dedicated to raising the base and boost clocks Vega is capable of relative to Polaris and also Fiji, the latter being the more direct comparison given the die sizes. As such, AMD is targeting GPU core frequencies on the order of 1.7 GHz with Vega 10, which is a huge improvement over the previous two microarchitectures and can help tremendously if IPC is on par with the highly overclockable NVIDIA GP104-based cards. The increased transistor count, coupled with a smaller die size relative to Fiji, is the result of an optimized general-purpose register design in Vega, and AMD claims that collaboration with its Ryzen CPU team helped with both transistor density and power savings. For example, the company leveraged Globalfoundries' FinFET transistor technologies, which enable the use of wider transistors where the designer chooses to do so without compromising leakage or parasitics. AMD has also used improved synthesis tools for its circuit design and paid closer attention to its cell library.
AMD also updated the GPU's cache hierarchy to improve the performance of programs that use deferred shading. The geometry pipeline, the compute engine, and the pixel engine, which output to the ROPs (L1 cache), are now tied to the L2 cache, which has in turn been doubled from 2 MB to 4 MB to cater to these changes.
It was only a few weeks ago that AMD launched the Radeon Vega Frontier Edition, and tests quickly revealed that the draw-stream binning rasterizer (DSBR) was not enabled on it despite the Vega architecture supporting it. AMD today confirmed that Vega 10 does indeed support it and that RX Vega SKUs should too. At this point, we are not sure whether a Radeon Pro software driver update will enable it on the prosumer Vega Frontier Edition.
DSBR is a tile-based approach to rasterization and pixel shading that lets the GPU render complex scenes more efficiently than previous generations could. The GPU fetches overlapping primitives only once, and then shades each pixel in the overlap only once. As such, overlapped or otherwise invisible pixels are never shaded or rendered, which saves both power and time. AMD has provided some internal testing results to help demonstrate these power savings and performance benefits in synthetic and real-world rendering loads.
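To make the "shade once" idea concrete, here is a minimal toy sketch in Python. It is not AMD's hardware algorithm: it assumes opaque, axis-aligned rectangles instead of triangles, and every name in it (`Rect`, `immediate_shade_count`, `bin_primitives`, `binned_shade_count`) is hypothetical. It simply counts how many pixel-shader invocations a plain depth-tested rasterizer needs versus one that bins primitives into tiles and resolves visibility before shading.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Opaque axis-aligned rectangle; x1/y1 are exclusive, smaller depth is nearer."""
    x0: int
    y0: int
    x1: int
    y1: int
    depth: float


def immediate_shade_count(prims, width, height):
    """Shade in submission order with a depth test (no binning)."""
    zbuf = [[float("inf")] * width for _ in range(height)]
    shaded = 0
    for p in prims:
        for y in range(max(p.y0, 0), min(p.y1, height)):
            for x in range(max(p.x0, 0), min(p.x1, width)):
                if p.depth < zbuf[y][x]:
                    zbuf[y][x] = p.depth
                    shaded += 1  # shaded now, possibly overwritten later
    return shaded


def bin_primitives(prims, width, height, tile):
    """Map each tile coordinate to the primitives that touch it."""
    bins = {}
    for p in prims:
        x0, y0 = max(p.x0, 0), max(p.y0, 0)
        x1, y1 = min(p.x1, width), min(p.y1, height)
        if x0 >= x1 or y0 >= y1:
            continue  # entirely off-screen
        for ty in range(y0 // tile, (y1 - 1) // tile + 1):
            for tx in range(x0 // tile, (x1 - 1) // tile + 1):
                bins.setdefault((tx, ty), []).append(p)
    return bins


def binned_shade_count(prims, width, height, tile):
    """Resolve visibility per tile first, then shade each covered pixel once."""
    visible = {}
    for (tx, ty), batch in bin_primitives(prims, width, height, tile).items():
        for y in range(ty * tile, min((ty + 1) * tile, height)):
            for x in range(tx * tile, min((tx + 1) * tile, width)):
                covering = [p for p in batch
                            if p.x0 <= x < p.x1 and p.y0 <= y < p.y1]
                if covering:
                    # Only the nearest primitive at this pixel gets shaded.
                    visible[(x, y)] = min(covering, key=lambda p: p.depth)
    return len(visible)


# Three stacked layers submitted back to front: the worst case for overdraw.
prims = [
    Rect(0, 0, 16, 16, 0.9),
    Rect(4, 4, 12, 12, 0.5),
    Rect(6, 6, 10, 10, 0.1),
]
print(immediate_shade_count(prims, 16, 16))   # 336
print(binned_shade_count(prims, 16, 16, 8))   # 256
```

In this toy scene the immediate path shades 336 pixels for a 256-pixel frame, while the binned path shades exactly 256. Submitting the same layers front to back would bring the immediate path down to 256 as well, which is why the benefit of this kind of approach depends heavily on draw order and scene overlap.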
The power savings with Vega continue with the addition of an updated micro-controller unit (SMC MCU) dedicated to power management. Vega supports Infinity Fabric, although exactly how is not yet fully known, and one of the ways the MCU helps is by improving idle-state power draw: it switches the GPU core over to a sleep state and drops the HBM2 memory to an ultra-low operating frequency. AMD is using a very old 3DMark application, Perlin Noise, to generate solid procedural textures in order to demonstrate the power savings in action. This does seem like a stretch, but quantifying idle behavior is not easy to begin with.