Introduction
Back in September, NVIDIA launched its GeForce RTX 20-series graphics card family with the RTX 2080 and RTX 2080 Ti. Close to a month later, the company launched the third-fastest card in the family, the GeForce RTX 2070. This is an important product for NVIDIA because even at a relatively steep price of $500, it is the most affordable card offering real-time ray tracing in games, or at least a semblance of it. The RTX 2070 targets the vast bulk of gamers, who play at 1440p resolution or lower.
NVIDIA has also made certain interesting design choices for the RTX 2070. Predecessors of this card, such as the GTX 1070 and GTX 970, have been based on the same chips as the SKU just above them, such as the GTX 1080 and GTX 980. NVIDIA is basing the RTX 2070 on its third-largest "Turing" chip, the TU106, instead of the TU104.
It's important to mention here, though, that the TU106 isn't a successor to chips such as the GP106 or GM206 in the usual sense. While those two have exactly half the muscle of the GP104 and GM204 respectively, the TU106 has half the muscle of the top-dog TU102 rather than the TU104. It also keeps the 256-bit wide GDDR6 memory interface of the TU104 unchanged. The philosophy behind the TU106 may have been to design a lean chip that is cheaper to build simply because its die is smaller than that of the TU104.
Today, we have for review the ZOTAC GeForce RTX 2070 AMP Extreme, which is the company's flagship RTX 2070 variant. It is built on a custom PCB with revamped VRM circuitry and upgraded power input capability. The clocks have been increased significantly, to 1860 MHz Boost, which is 150 MHz higher than the 1710 MHz of the NVIDIA Founders Edition. ZOTAC also upgraded the cooler to a large triple-slot, triple-fan solution and fitted an 8+6-pin power input configuration.
The ZOTAC RTX 2070 AMP Extreme is currently available online for $640.
GeForce RTX 2070 Market Segment Analysis
| Card | Price | Shader Units | ROPs | Core Clock | Boost Clock | Memory Clock | GPU | Transistors | Memory |
|---|---|---|---|---|---|---|---|---|---|
| GTX 1050 | $135 | 640 | 32 | 1354 MHz | 1455 MHz | 1752 MHz | GP107 | 3300M | 2 GB, GDDR5, 128-bit |
| GTX 1050 Ti | $170 | 768 | 32 | 1290 MHz | 1392 MHz | 1752 MHz | GP107 | 3300M | 4 GB, GDDR5, 128-bit |
| RX 470 | $165 | 2048 | 32 | 932 MHz | 1216 MHz | 1650 MHz | Ellesmere | 5700M | 4 GB, GDDR5, 256-bit |
| RX 570 | $190 | 2048 | 32 | 1168 MHz | 1244 MHz | 1750 MHz | Ellesmere | 5700M | 4 GB, GDDR5, 256-bit |
| GTX 970 | $235 | 1664 | 56 | 1051 MHz | 1178 MHz | 1750 MHz | GM204 | 5200M | 4 GB, GDDR5, 256-bit |
| RX 480 | $230 | 2304 | 32 | 1120 MHz | 1266 MHz | 2000 MHz | Ellesmere | 5700M | 8 GB, GDDR5, 256-bit |
| RX 580 | $230 | 2304 | 32 | 1257 MHz | 1340 MHz | 2000 MHz | Ellesmere | 5700M | 8 GB, GDDR5, 256-bit |
| GTX 1060 3 GB | $220 | 1152 | 48 | 1506 MHz | 1708 MHz | 2002 MHz | GP106 | 4400M | 3 GB, GDDR5, 192-bit |
| GTX 1060 | $260 | 1280 | 48 | 1506 MHz | 1708 MHz | 2002 MHz | GP106 | 4400M | 6 GB, GDDR5, 192-bit |
| GTX 980 Ti | $390 | 2816 | 96 | 1000 MHz | 1075 MHz | 1750 MHz | GM200 | 8000M | 6 GB, GDDR5, 384-bit |
| R9 Fury X | $380 | 4096 | 64 | 1050 MHz | N/A | 500 MHz | Fiji | 8900M | 4 GB, HBM, 4096-bit |
| GTX 1070 | $390 | 1920 | 64 | 1506 MHz | 1683 MHz | 2002 MHz | GP104 | 7200M | 8 GB, GDDR5, 256-bit |
| RX Vega 56 | $400 | 3584 | 64 | 1156 MHz | 1471 MHz | 800 MHz | Vega 10 | 12500M | 8 GB, HBM2, 2048-bit |
| GTX 1070 Ti | $400 | 2432 | 64 | 1607 MHz | 1683 MHz | 2000 MHz | GP104 | 7200M | 8 GB, GDDR5, 256-bit |
| GTX 1080 | $470 | 2560 | 64 | 1607 MHz | 1733 MHz | 1251 MHz | GP104 | 7200M | 8 GB, GDDR5X, 256-bit |
| RX Vega 64 | $570 | 4096 | 64 | 1247 MHz | 1546 MHz | 953 MHz | Vega 10 | 12500M | 8 GB, HBM2, 2048-bit |
| GTX 1080 Ti | $675 | 3584 | 88 | 1481 MHz | 1582 MHz | 1376 MHz | GP102 | 12000M | 11 GB, GDDR5X, 352-bit |
| RTX 2070 | $499 | 2304 | 64 | 1410 MHz | 1620 MHz | 1750 MHz | TU106 | 10800M | 8 GB, GDDR6, 256-bit |
| RTX 2070 FE | $599 | 2304 | 64 | 1410 MHz | 1710 MHz | 1750 MHz | TU106 | 10800M | 8 GB, GDDR6, 256-bit |
| ZOTAC RTX 2070 AMP Extreme | $640 | 2304 | 64 | 1410 MHz | 1860 MHz | 1860 MHz | TU106 | 10800M | 8 GB, GDDR6, 256-bit |
| RTX 2080 | $699 | 2944 | 64 | 1515 MHz | 1710 MHz | 1750 MHz | TU104 | 13600M | 8 GB, GDDR6, 256-bit |
| RTX 2080 FE | $799 | 2944 | 64 | 1515 MHz | 1800 MHz | 1750 MHz | TU104 | 13600M | 8 GB, GDDR6, 256-bit |
| RTX 2080 Ti | $999 | 4352 | 88 | 1350 MHz | 1545 MHz | 1750 MHz | TU102 | 18600M | 11 GB, GDDR6, 352-bit |
| RTX 2080 Ti FE | $1199 | 4352 | 88 | 1350 MHz | 1635 MHz | 1750 MHz | TU102 | 18600M | 11 GB, GDDR6, 352-bit |
Architecture
On the 14th of September, we published a comprehensive NVIDIA "Turing" architecture deep-dive article including coverage of its three new silicon implementations and the new RTX Technology. Be sure to catch that article for more technical details.
The "Turing" architecture caught many of us by surprise because it wasn't visible on GPU architecture roadmaps until a few quarters ago. NVIDIA took this roadmap detour, rather than carving out client-segment variants of "Volta," because it realized it had achieved sufficient compute power to bring its ambitious RTX Technology to the client segment. NVIDIA RTX is an all-encompassing, real-time ray-tracing model for consumer graphics that seeks to bring a semblance of real-time ray tracing to 3D games.
To enable RTX, NVIDIA has developed an all-new hardware component that sits next to the CUDA cores, called the RT core. An RT core is a fixed-function hardware unit that does what the spiritual ancestor of RTX, NVIDIA OptiX, did over CUDA cores. You input the mathematical representation of a ray and it will traverse the scene to calculate the point of intersection with any triangle in the scene. This is a computationally heavy task that would have otherwise bogged down the CUDA cores.
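To give a rough idea of the work an RT core takes off the CUDA cores, the ray-triangle test can be sketched in software. The Python below uses the well-known Möller-Trumbore algorithm; that choice, the function names, and the brute-force loop over triangles are our own illustration, since NVIDIA has not published how the RT core implements this in hardware. A practical ray tracer would also walk an acceleration structure such as a BVH instead of testing every triangle.

```python
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]
Triangle = Tuple[Vec3, Vec3, Vec3]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle_intersect(origin: Vec3, direction: Vec3, tri: Triangle,
                           eps: float = 1e-9) -> Optional[float]:
    """Moller-Trumbore test: distance t along the ray to the hit, or None."""
    v0, v1, v2 = tri
    edge1, edge2 = sub(v1, v0), sub(v2, v0)
    pvec = cross(direction, edge2)
    det = dot(edge1, pvec)
    if abs(det) < eps:
        return None  # ray is parallel to the triangle's plane
    inv_det = 1.0 / det
    tvec = sub(origin, v0)
    u = dot(tvec, pvec) * inv_det  # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, edge1)
    v = dot(direction, qvec) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(edge2, qvec) * inv_det
    return t if t > eps else None  # ignore hits behind the ray origin


def closest_hit(origin: Vec3, direction: Vec3,
                triangles: List[Triangle]) -> Optional[Tuple[int, float]]:
    """Brute-force search for the nearest triangle hit: (index, distance)."""
    best: Optional[Tuple[int, float]] = None
    for index, tri in enumerate(triangles):
        t = ray_triangle_intersect(origin, direction, tri)
        if t is not None and (best is None or t < best[1]):
            best = (index, t)
    return best
```

Every ray cast into a scene repeats this arithmetic against many candidate triangles, which is why doing it on general-purpose shader cores is so costly.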
The other major introduction is the Tensor Core, which made its debut with the "Volta" architecture. These too are specialized components, tasked with multiply-accumulate operations on small 4x4 matrices, which speeds up the building and training of deep-learning neural networks. Its relevance to gaming is limited at this time, but NVIDIA is introducing a few AI-accelerated image-quality enhancements that could leverage Tensor operations.
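The matrix operation a Tensor Core accelerates is a fused multiply-accumulate, D = A x B + C, on small matrix tiles. Below is a minimal Python sketch of that operation on 4x4 tiles, the tile size NVIDIA describes for "Volta" Tensor Cores. The function name is ours, and the sketch uses ordinary Python floats, whereas the hardware works with reduced-precision inputs such as FP16.

```python
from typing import List

Matrix = List[List[float]]
TILE = 4  # tile edge length; 4x4 as described for "Volta" Tensor Cores


def mma_tile(a: Matrix, b: Matrix, c: Matrix) -> Matrix:
    """Return D = A x B + C for TILE x TILE matrices (hypothetical helper)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(TILE)) + c[i][j]
         for j in range(TILE)]
        for i in range(TILE)
    ]


if __name__ == "__main__":
    identity = [[1.0 if i == j else 0.0 for j in range(TILE)] for i in range(TILE)]
    ones = [[1.0] * TILE for _ in range(TILE)]
    # Identity x ones + ones: every element of the result is 2.0.
    print(mma_tile(identity, ones, ones))
```

Larger matrix products, such as those in neural-network layers, are built by tiling the operands and accumulating many of these small operations.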
The component hierarchy of a "Turing" GPU isn't much different from its predecessors, but the new-generation Streaming Multiprocessor is significantly different. It packs 64 CUDA cores, 8 Tensor Cores, and a single RT core.
TU106 Graphics Processor
The TU106 is the third-largest chip based on the "Turing" architecture, and as we mentioned earlier, it diverges from chips such as the GP106 in that it has half the number-crunching machinery of the largest chip, the TU102, and not half that of the TU104. This allows NVIDIA to give the RTX 2070 over three-quarters of the CUDA cores of the RTX 2080 (2,304 versus 2,944) without wasting valuable TU104 dies by disabling CUDA cores that are sometimes perfectly functional.
At the topmost level, the GPU takes host connectivity from PCI-Express 3.0 x16 and connects to GDDR6 memory across a 256-bit wide memory bus, which is exactly the same memory interface as that of the RTX 2080 and the TU104 chip that card is based on.
The GigaThread engine marshals load between three GPCs (graphics processing clusters). Each GPC has a dedicated raster engine and six TPCs (texture processing clusters). A TPC shares a PolyMorph engine between two SMs. Each SM packs 64 CUDA cores, 8 Tensor cores, and an RT core.
There are, hence, 768 CUDA cores, 96 Tensor cores, and 12 RT cores per GPC, and a grand total of 2,304 CUDA cores, 288 Tensor cores, and 36 RT cores across the TU106 silicon. The GeForce RTX 2070 maxes out this silicon with no disabled components. The GPU is endowed with 144 TMUs and 64 ROPs. You'll notice that the composition of each GPC, with its six TPCs, matches that of the TU102 rather than that of the TU104.
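The unit counts follow directly from multiplying out the hierarchy. Here is a small Python sketch of that arithmetic; the class and method names are ours, and the per-SM figures are the ones quoted in this review.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TuringLayout:
    """Hypothetical container for the GPC > TPC > SM hierarchy."""
    gpcs: int
    tpcs_per_gpc: int
    sms_per_tpc: int = 2      # two SMs share a PolyMorph engine
    cuda_per_sm: int = 64
    tensor_per_sm: int = 8
    rt_per_sm: int = 1

    def _counts(self, sms: int) -> Dict[str, int]:
        return {
            "cuda": sms * self.cuda_per_sm,
            "tensor": sms * self.tensor_per_sm,
            "rt": sms * self.rt_per_sm,
        }

    def per_gpc(self) -> Dict[str, int]:
        """Unit counts inside a single GPC."""
        return self._counts(self.tpcs_per_gpc * self.sms_per_tpc)

    def totals(self) -> Dict[str, int]:
        """Unit counts across the whole chip."""
        return self._counts(self.gpcs * self.tpcs_per_gpc * self.sms_per_tpc)


TU106 = TuringLayout(gpcs=3, tpcs_per_gpc=6)
RTX_2080_CUDA_CORES = 2944  # shader count from the market segment table

if __name__ == "__main__":
    print(TU106.per_gpc())
    print(TU106.totals())
    share = TU106.totals()["cuda"] / RTX_2080_CUDA_CORES
    print(f"RTX 2070 has {share:.0%} of the RTX 2080's CUDA cores")
```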
At its memory data rate of 14 Gbps per pin across the 256-bit bus, the RTX 2070 has the same 448 GB/s of memory bandwidth on tap as the RTX 2080.
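The bandwidth figure is simply the bus width in bytes multiplied by the per-pin data rate. A short Python sketch of that calculation, with a function name of our own choosing:

```python
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s.

    bus_width_bits: width of the memory interface in bits (e.g. 256)
    data_rate_gbps: per-pin data rate in Gbps (e.g. 14 for this GDDR6)
    """
    return bus_width_bits * data_rate_gbps / 8  # 8 bits per byte


if __name__ == "__main__":
    # RTX 2070 and RTX 2080 share the same 256-bit, 14 Gbps configuration.
    print(memory_bandwidth_gbs(256, 14.0))
```

The same formula applied to the 352-bit bus of the RTX 2080 Ti at 14 Gbps gives 616 GB/s.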
Features
Again, we highly recommend you read our article from the 14th of September for intricate technical details about the "Turing" architecture feature set, which we are going to briefly summarize here.
NVIDIA RTX is a brave new feature that has triggered a leap in GPU compute power, just like other killer real-time consumer graphics features, such as anti-aliasing, programmable shading, and tessellation. It provides a programming model for 3D scenes with ray-traced elements that improve realism. RTX introduces several turnkey effects that game developers can implement with specific sections of their 3D scenes, rather than ray-tracing everything on the screen (we're not quite there yet). A plethora of next-generation GameWorks effects could leverage RTX.
Architectural features that are perhaps more relevant to gamers come in the form of improvements to the GPU shaders. In addition to concurrent INT and FP32 operations in the SM, "Turing" introduces Mesh Shading, Variable Rate Shading, Content-Adaptive Shading, Motion-Adaptive Shading, Texture-Space Shading, and Foveated Rendering.
Deep Learning Super-Sampling (DLSS) is an ingenious new post-processing AA method that leverages deep neural networks (DNNs) built ad hoc with the purpose of guessing how an image could look upscaled. DNNs are built on-chip, accelerated by Tensor cores. Ground-truth data on how objects in most common games should ideally look upscaled is fed via driver updates or GeForce Experience. The DNN then uses this ground-truth data to reconstruct detail in 3D objects. 2x DLSS image quality is comparable to 64x "classic" super sampling.
Packaging and Contents
You will receive:
- Graphics card
- Documentation
- 2x PCIe power cable