Gigabyte GeForce RTX 2080 Gaming OC 8 GB Review

Name: Gigabyte GeForce RTX 2080 Gaming OC 8 GB
Brand: GIGABYTE

W1zzard

on Oct 9th, 2018,

in Graphics Cards.

Manufacturer: GIGABYTE

(22 Comments) »

Introduction

NVIDIA launched the GeForce RTX 20-series with the introduction of the GeForce RTX 2080 and RTX 2080 Ti. It comes at a time when the silicon fabrication technology isn't advancing at the rate it used to four years ago, wrecking the architecture roadmaps of several semiconductor giants, including Intel, NVIDIA, AMD, and Qualcomm, which is forcing them to design innovative new architectures on existing foundry nodes. Brute transistor-count increases, as would have been the case with "Volta," are no longer a viable option, and NVIDIA needed a killer feature to sell new GPUs. That killer feature is the RTX Technology. This feature is so big for NVIDIA that it has changed the nomenclature of its client-segment graphics cards with the introduction of the GeForce RTX 20-series.

NVIDIA RTX is a near-turnkey real-time ray-tracing model for game developers that lets them fuse real-time ray-traced objects into 3D scenes that have been rasterized. Ray-tracing the whole scene in existence isn't quite possible yet, but the results with using RTX are still better-looking than anything rasterizing can achieve. To even get those few bits of ray-tracing done right, an enormous amount of compute power is required. NVIDIA has hence deployed purpose-built hardware components on its GPUs that sit alongside all-purpose CUDA cores, called RT cores.

NVIDIA invested heavily to stay at the bleeding edge of the hardware that drives pioneering AI research and has, over the years, developed Tensor cores, specialized components that are tasked with matrix multiplication, which speeds up deep-learning neural-net building and training via Tensor ops. Although it's a client-segment GPU for gaming, NVIDIA feels GPU-accelerated AI could play an increasingly big role in the company's turnkey GameWorks effects, and a new image quality enhancement called Deep-Learning Super-Sampling (DLSS). The chips are hence endowed with Tensor cores, just like the TITAN Volta. All that it lacks compared to the $3,000 graphics card from last year is FP64 CUDA cores.

NVIDIA GeForce RTX 20-series graphics cards debut at unusually high prices compared to their predecessors, perhaps because NVIDIA doesn't count the GTX 10-series as a predecessor to begin with. These chips pack not just CUDA cores, but also RT cores and Tensor cores, adding to the transistor count which, along with generational increases in performance, contributes to scorching 15%–70% increases in launch prices over the GTX 10-series. The GeForce RTX 2080 is the second-fastest graphics card from the series and is priced at $700 for the base model.

We have with us today the Gigabyte GeForce RTX 2080 Gaming OC, which is the company's highest-clocked RTX 2080 variant. It features an overclock out of the box, to 1815 MHz GPU Boost. The card comes with a large triple-fan cooling solution and is priced at $830.

On their website, Gigabyte has published an updated BIOS for the RTX 2080 Gaming OC, which increases the board power limit from 225 W to 245 W, which should provide a little bit of extra performance. That's why we present results with both BIOSes, as that also provides useful insight into what to expect from cards with a higher power limit.

GeForce RTX 2080 Market Segment Analysis
	Price	Shader Units	ROPs	Core Clock	Boost Clock	Memory Clock	GPU	Transistors	Memory
GTX 1070	$390	1920	64	1506 MHz	1683 MHz	2002 MHz	GP104	7200M	8 GB, GDDR5, 256-bit
RX Vega 56	$400	3584	64	1156 MHz	1471 MHz	800 MHz	Vega 10	12500M	8 GB, HBM2, 2048-bit
GTX 1070 Ti	$400	2432	64	1607 MHz	1683 MHz	2000 MHz	GP104	7200M	8 GB, GDDR5, 256-bit
GTX 1080	$470	2560	64	1607 MHz	1733 MHz	1251 MHz	GP104	7200M	8 GB, GDDR5X, 256-bit
RX Vega 64	$570	4096	64	1247 MHz	1546 MHz	953 MHz	Vega 10	12500M	8 GB, HBM2, 2048-bit
GTX 1080 Ti	$675	3584	88	1481 MHz	1582 MHz	1376 MHz	GP102	12000M	11 GB, GDDR5X, 352-bit
RTX 2070	$499	2304	64	1410 MHz	1620 MHz	1750 MHz	TU106	10800M	8 GB, GDDR6, 256-bit
RTX 2070 FE	$599	2304	64	1410 MHz	1710 MHz	1750 MHz	TU106	10800M	8 GB, GDDR6, 256-bit
RTX 2080	$699	2944	64	1515 MHz	1710 MHz	1750 MHz	TU104	13600M	8 GB, GDDR6, 256-bit
RTX 2080 FE	$799	2944	64	1515 MHz	1800 MHz	1750 MHz	TU104	13600M	8 GB, GDDR6, 256-bit
Gigabyte RTX 2080 Gaming OC	$829	2944	64	1515 MHz	1815 MHz	1750 MHz	TU104	13600M	8 GB, GDDR6, 256-bit
RTX 2080 Ti	$999	4352	64	1350 MHz	1545 MHz	1750 MHz	TU102	18600M	11 GB, GDDR6, 352-bit
RTX 2080 Ti FE	$1199	4352	64	1350 MHz	1635 MHz	1750 MHz	TU102	18600M	11 GB, GDDR6, 352-bit

Architecture

On the 14th of September, we published a comprehensive NVIDIA "Turing" architecture deep-dive article including coverage of its three new silicon implementations and the new RTX Technology. Be sure to catch that article for more technical details.

The "Turing" architecture caught many of us by surprise because it wasn't visible on GPU architecture roadmaps until a few quarters ago. NVIDIA took this roadmap detour over carving out client-segment variants of "Volta" because it realized it had achieved sufficient compute power to bring its ambitious RTX Technology to the client segment. NVIDIA RTX is an all-encompassing real-time ray-tracing model for consumer graphics that seeks to bring a semblance of real-time ray tracing to 3D games.

To enable RTX, NVIDIA has developed an all new hardware component that sits next to CUDA cores, called the RT core. An RT core is a fixed-function hardware that does what the spiritual ancestor of RTX, NVIDIA OptiX, did over CUDA cores. You input the mathematical representation of a ray, and it will transverse the scene to calculate the point of intersection with any triangle in the scene. This is a computationally heavy task that would have otherwise bogged down the CUDA cores.

The other major introduction is the Tensor Core, which made its debut with the "Volta" architecture. These too are specialized components tasked with 3x3x3 matrix multiplication, which speed up AI deep learning neural net building and training. Its relevance to gaming is limited at this time, but NVIDIA is introducing a few AI-accelerated image quality enhancements that could leverage Tensor operations.

The component hierarchy of a "Turing" GPU isn't much different from its predecessors, but the new-generation Streaming Multiprocessor is significantly different. It packs 64 CUDA cores, 8 Tensor Cores, and a single RT core.

TU104 Silicon

The TU104 is the second-largest silicon based on the "Turing" architecture and powers the GeForce RTX 2080. It's also significantly larger than its predecessor, holding 13.6 billion transistors.

The essential component hierarchy on the "Turing" architecture hasn't changed. What has changed, however, is that the Streaming Multiprocessor (SM), the indivisible sub-unit of the GPU, now packs CUDA cores, RT cores, and Tensor cores, orchestrated by a new Warp Scheduler that supports concurrent INT and FP32 ops, which should improve the GPU's asynchronous compute performance.

At the topmost level, the GPU takes host connectivity from PCI-Express 3.0 x16, an NVLink interface, and connects to GDDR6 memory across a 256-bit wide memory bus. On the RTX 2080, this bus drives 8 GB of memory clocked at 14 Gbps. The GigaThread engine marshals load between six GPCs (graphics processing clusters), unlike predecessors of the TU104, which generally only had 4 GPCs. Each GPC has a dedicated raster engine and four TPCs (texture processing clusters). A TPC shares a PolyMorph engine between two SMs. Each SM packs 64 CUDA cores, 8 Tensor cores, and an RT core. There are hence 512 CUDA cores, 64 Tensor cores, and 8 RT cores per GPC for a grand total of 3,072 CUDA cores, 384 Tensor cores, and 48 RT cores across the TU104 silicon.

The GeForce RTX 2080 is carved out of the TU104 by disabling two SMs or one TPC, resulting in 2,944 CUDA cores, 368 Tensor cores, and 46 RT cores. The GPU is endowed with 184 TMUs and 64 ROPs.

The GeForce RTX 2080 maxes out the 256-bit GDDR6 memory bus width of the TU104 silicon, wiring it to 8 GB of memory. Ticking at 14 Gbps, this setup belts out a memory bandwidth of 448 GB/s.

Features

Again, we highly recommend you to read our article from the 14th of September for intricate technical details about the "Turing" architecture feature set, which we are going to briefly summarize here.

NVIDIA RTX is a brave new feature that has triggered a leap in GPU compute power, just like other killer real-time consumer graphics features, such as anti-aliasing, programmable shading, and tessellation. It provides a programming model for 3D scenes with ray-traced elements that improve realism. RTX introduces several turnkey effects that game developers can implement with specific sections of their 3D scenes, rather than ray-tracing everything on the screen (we're not quite there yet). A plethora of next-generation GameWorks effects could leverage RTX.

Perhaps more relevant architectural features to gamers come in the form of improvements to the GPU's shaders. In addition to concurrent INT and FP32 operations in the SM, "Turing" introduces Mesh Shading, Variable Rate Shading, Content-Adaptive Shading, Motion-Adaptive Shading, Texture-Space Shading, and Foveated Rendering.

Deep Learning Anti-Aliasing (DLSS) is an ingenious new post-processing AA method that leverages deep-neural networks built ad hoc with the purpose of guessing how an image could look upscaled. DNNs are built on-chip, accelerated by Tensor cores. Ground-truth data on how objects in most common games should ideally look upscaled are fed via driver updates, or GeForce Experience. The DNN then uses this ground-truth data to reconstruct details in 3D objects. 2x DLSS image quality is comparable to 64x "classic" super sampling.