Palit GeForce RTX 5070 GamingPro OC Review

Introduction

The Palit GeForce RTX 5070 GamingPro OC is the company's most premium custom-design rendition of NVIDIA's new performance-segment GPU, one that's arguably the most important SKU in the generation, given that all its predecessors sold in large volumes. The Palit GamingPro brand strikes a balance of aesthetics and enthusiast-friendly features. It's positioned below the GameRock and JetStream brands, but Palit hasn't launched an RTX 5070 GameRock, making this the company's best RTX 5070, for now. With the RTX 50-series, Palit also introduced the Infinity series of value custom-design cards priced closest to MSRP. The GeForce RTX 5070 is a lean mean machine designed to max out gameplay at 1440p, including with ray tracing enabled. You can take advantage of features such as DLSS, including the latest DLSS 4 Multi Frame Generation, to enable new use-cases, such as 1440p with high refresh-rates or 4K.



The GeForce RTX 5070 is powered by the new GeForce Blackwell graphics architecture. NVIDIA built this generation of GPUs on the same NVIDIA 4N foundry node as the previous RTX 40-series Ada generation—a node that went into mass-production in 2022. As a mature node in 2025, it gives NVIDIA the best manufacturing costs, so it could keep supplies of its GeForce RTX products unaffected by rising demand for its Blackwell AI GPUs. Whatever generational energy efficiency gains you see with the RTX 50-series are hence purely a function of the architecture and new power management technologies introduced with it.

The GeForce Blackwell graphics architecture introduces a new concept to consumer 3D graphics, called Neural Rendering. It aims to bring the power of generative AI models into the gaming graphics workflow, with an AI model running in tandem with the conventional raster 3D graphics stack, and neural objects being combined with raster 3D much in the same way RTX brings real time ray traced objects to it. NVIDIA even worked with Microsoft to standardize this at the DirectX 12 API level, letting 3D applications directly address the Tensor cores on the GPU, and for the Shader Execution Reordering component to be aware of neural shaders. Neural Rendering is only possible on the RTX 50-series, and not older generations of GeForce, because NVIDIA relies on a new hardware scheduler to manage the various AI-related compute resources on the silicon, called the AI Management Processor (AMP).

The RTX 50-series also introduces DLSS 4 and Multi Frame Generation. DLSS 4 sees the company debut a new transformer-based AI model replacing the older one based on convolutional neural networks (CNN). This is more accurate, and vastly improves image quality at every performance preset. Transformer models replace CNN-based ones for super resolution, ray reconstruction, as well as frame generation. Speaking of which, the new Multi Frame Generation (MFG) technology lets the GPU generate up to three frames entirely using AI, following a conventionally rendered one, letting you nearly quadruple framerates. MFG is exclusive to Blackwell because it relies on hardware flip-metering to accurately pace frames, which is introduced with the updated display engine of Blackwell.

The new Blackwell streaming multiprocessor (SM) features concurrent FP32 and INT32 math capability on all its CUDA cores—Ada could only have this on half its CUDA cores per SM. The new 5th Gen Tensor core comes with FP4 data format capability to increase throughput by trading in precision. The 4th Gen RT core comes with even more specialized hardware, including components that enable Mega Geometry, or the ability for ray traced objects to have exponentially higher triangle counts (and the need for rays to interact with all of those triangles). The generational increase in the use of AI models in consumer graphics warrants increases in memory bandwidth. NVIDIA implemented the new GDDR7 memory standard, with the RTX 5070 coming with 12 GB of 28 Gbps GDDR7 memory across a 192-bit wide memory interface for a 33% increase in memory bandwidth over the previous RTX 4070.

The RTX 5070 introduces the new GB205 silicon, nearly maxing it out by utilizing 48 of the 50 SM units available. This leads to 6,144 CUDA cores, 192 Tensor cores, 48 RT cores, and 192 TMUs. In a notable upgrade, the RTX 5070 benefits from all 80 ROPs available on the GB205 silicon. Remember, the RTX 4070 only utilized 64 of the 80 ROPs on the AD104 silicon. Additionally, the RTX 5070 gets the full 48 MB of L2 cache on the silicon, compared to the 36 MB seen on the RTX 4070. A 192-bit GDDR7 memory interface drives 12 GB of memory.

Palit's GeForce RTX 5070 GamingPro OC comes with a fairly heavy triple-slot cooling solution, with an aluminium fin-stack heatsink that's ventilated by a trio of TurboFan 4.0 axial airflow fans that come with dual ball bearings. Unlike the company's latest GameRock graphics cards (available on RTX 5070 Ti and above), the GamingPro OC comes with a touch of tastefully executed RGB lighting, and offers a 3-pin ARGB header to sync your build's lighting with the card's. It also offers dual-BIOS, with the default Performance BIOS running the card at 2572 MHz boost (compared to 2512 MHz reference), while the second Quiet BIOS drops it to reference speeds to bring down fan noise. Palit graphics cards are rarely available in the US. We found the RTX 5070 GamingPro OC listed online (and in-stock) in Europe for €750, including VAT, which converts to USD 675 without VAT, or $125 higher than the NVIDIA baseline MSRP.

NVIDIA GeForce RTX 5070 Market Segment Analysis
| Card | Price | Cores | ROPs | Core Clock | Boost Clock | Memory Clock | GPU | Transistors | Memory |
|---|---|---|---|---|---|---|---|---|---|
| RTX 3080 | $420 | 8704 | 96 | 1440 MHz | 1710 MHz | 1188 MHz | GA102 | 28000M | 10 GB, GDDR6X, 320-bit |
| RTX 4070 | $490 | 5888 | 64 | 1920 MHz | 2475 MHz | 1313 MHz | AD104 | 35800M | 12 GB, GDDR6X, 192-bit |
| RX 7800 XT | $440 | 3840 | 96 | 2124 MHz | 2430 MHz | 2425 MHz | Navi 32 | 28100M | 16 GB, GDDR6, 256-bit |
| RX 6900 XT | $450 | 5120 | 128 | 2015 MHz | 2250 MHz | 2000 MHz | Navi 21 | 26800M | 16 GB, GDDR6, 256-bit |
| RX 6950 XT | $630 | 5120 | 128 | 2100 MHz | 2310 MHz | 2250 MHz | Navi 21 | 26800M | 16 GB, GDDR6, 256-bit |
| RTX 3090 | $900 | 10496 | 112 | 1395 MHz | 1695 MHz | 1219 MHz | GA102 | 28000M | 24 GB, GDDR6X, 384-bit |
| RTX 4070 Super | $590 | 7168 | 80 | 1980 MHz | 2475 MHz | 1313 MHz | AD104 | 35800M | 12 GB, GDDR6X, 192-bit |
| RX 7900 GRE | $530 | 5120 | 160 | 1880 MHz | 2245 MHz | 2250 MHz | Navi 31 | 57700M | 16 GB, GDDR6, 256-bit |
| RTX 4070 Ti | $700 | 7680 | 80 | 2310 MHz | 2610 MHz | 1313 MHz | AD104 | 35800M | 12 GB, GDDR6X, 192-bit |
| RTX 5070 | $550 | 6144 | 80 | 2325 MHz | 2512 MHz | 1750 MHz | GB205 | 31100M | 12 GB, GDDR7, 192-bit |
| Palit RTX 5070 GamingPro OC | $675 | 6144 | 80 | 2325 MHz | 2572 MHz | 1750 MHz | GB205 | 31100M | 12 GB, GDDR7, 192-bit |
| RTX 4070 Ti Super | $750 | 8448 | 112 | 2340 MHz | 2610 MHz | 1313 MHz | AD103 | 45900M | 16 GB, GDDR6X, 256-bit |
| RX 7900 XT | $620 | 5376 | 192 | 2000 MHz | 2400 MHz | 2500 MHz | Navi 31 | 57700M | 20 GB, GDDR6, 320-bit |
| RTX 5070 Ti | $750 | 8960 | 96 | 2295 MHz | 2452 MHz | 1750 MHz | GB203 | 45600M | 16 GB, GDDR7, 256-bit |
| RTX 3090 Ti | $1000 | 10752 | 112 | 1560 MHz | 1950 MHz | 1313 MHz | GA102 | 28000M | 24 GB, GDDR6X, 384-bit |
| RTX 4080 | $940 | 9728 | 112 | 2205 MHz | 2505 MHz | 1400 MHz | AD103 | 45900M | 16 GB, GDDR6X, 256-bit |
| RTX 4080 Super | $990 | 10240 | 112 | 2295 MHz | 2550 MHz | 1438 MHz | AD103 | 45900M | 16 GB, GDDR6X, 256-bit |
| RX 7900 XTX | $820 | 6144 | 192 | 2300 MHz | 2500 MHz | 2500 MHz | Navi 31 | 57700M | 24 GB, GDDR6, 384-bit |
| RTX 5080 | $1000 | 10752 | 112 | 2295 MHz | 2617 MHz | 1875 MHz | GB203 | 45600M | 16 GB, GDDR7, 256-bit |
| RTX 4090 | $2400 | 16384 | 176 | 2235 MHz | 2520 MHz | 1313 MHz | AD102 | 76300M | 24 GB, GDDR6X, 384-bit |
| RTX 5090 | $2000 | 21760 | 176 | 2017 MHz | 2407 MHz | 1750 MHz | GB202 | 92200M | 32 GB, GDDR7, 512-bit |

NVIDIA Blackwell Architecture


NVIDIA does not provide a block diagram for the GB205 GPU (we asked), so we had to quickly hack one out from the GB202 diagram. It is accurate, just not as pretty.

The GeForce Blackwell graphics architecture heralds NVIDIA's 4th generation of RTX, the late-2010s re-invention of the modern GPU that sees a fusion of real time ray traced objects with conventional raster 3D graphics. With Blackwell, NVIDIA is helping add another dimension, neural rendering, the ability for the GPU to leverage a generative AI to create portions of a frame. This is different from DLSS, where an AI model is used to reconstruct details in an upscaled frame based on its training data, temporal frames, and motion vectors. Today we are reviewing NVIDIA's fourth GPU from this generation, the RTX 5070. At the heart of this graphics card is the new 5 nm GB205 silicon. This chip has a unique die-size and SM count that doesn't have a predecessor from the previous Ada generation. NVIDIA skipped a direct successor to the AD104 in the Blackwell generation, instead building the RTX 5070 Ti on the larger GB203 silicon and the RTX 5070 on the technically smaller GB205. The chip measures 263 mm² in die-area, with a transistor count of 31.1 billion, both of which are smaller than those of the AD104, which had to part with nearly a fifth of its shaders to yield an RTX 4070. Given its volumes, NVIDIA would probably have had to disable perfectly good portions of AD104 chips to carve out the RTX 4070. The company set out to create the GB205 precisely to minimize this die-area wastage in this generation.

The GB205 silicon is laid out essentially in the same component hierarchy as past generations of NVIDIA GPUs, but with a few notable changes. The GPU features a PCI-Express 5.0 x16 host interface. PCIe Gen 5 has been around since Intel's 12th Gen Core "Alder Lake" and AMD's Ryzen 7000 "Zen 4," so there is a sizable install-base of systems that can take advantage of it. The GPU is of course compatible with older generations of PCIe. The GB205 also features the new GDDR7 memory interface that's making its debut with this generation. The chip features a 192-bit wide memory bus. NVIDIA is using this to drive 12 GB of memory at 28 Gbps speeds, yielding 672 GB/s of memory bandwidth, which is a 33% increase over the RTX 4070 and its 21 Gbps GDDR6X.
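The bandwidth figure above follows from simple arithmetic: bus width times per-pin data rate, divided by eight bits per byte. The short Python sketch below reproduces it; the function name `memory_bandwidth_gbs` is our own for illustration, and the inputs are the figures quoted in this review.

```python
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s.

    bus width (bits) x per-pin data rate (Gbps) / 8 bits per byte.
    """
    return bus_width_bits * data_rate_gbps / 8


# Figures quoted in the review text.
rtx_5070 = memory_bandwidth_gbs(192, 28)  # 28 Gbps GDDR7
rtx_4070 = memory_bandwidth_gbs(192, 21)  # 21 Gbps GDDR6X

# Generational increase in percent.
increase_pct = (rtx_5070 / rtx_4070 - 1) * 100

print(f"RTX 5070: {rtx_5070:.0f} GB/s")
print(f"RTX 4070: {rtx_4070:.0f} GB/s")
print(f"Increase: {increase_pct:.1f}%")
```

Since both cards use the same 192-bit bus, the whole gain comes from the faster per-pin rate of GDDR7.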

The GigaThread Engine is the main graphics rendering workload allocation logic on the GB205, but there's a new addition: a dedicated serial processor for managing all AI acceleration resources on the GPU, which NVIDIA calls AMP (AI management processor). Other components at the global level are the Optical Flow Processor, a component involved in older versions of DLSS frame generation and in video encoding, and an updated media acceleration engine consisting of one each of NVDEC and NVENC video accelerators. The new 9th Gen NVENC video encode accelerator comes with 4:2:2 AV1 and HEVC encoding support. The central region of the GPU has the single largest common component, the 48 MB L2 cache, which the RTX 5070 maxes out. This is an increase over the 36 MB that the RTX 4070 has.


There are five graphics processing clusters (GPC) on the GB205. Each of these contains 10 streaming multiprocessors (SM) across 5 texture processing clusters (TPCs), and a raster engine consisting of 16 ROPs. Each SM contains 128 CUDA cores. Unlike the Ada generation SM that each had 64 FP32+INT32 and 64 purely-FP32 SIMD units, the new Blackwell generation SM features concurrent FP32+INT32 capability on all 128 SIMD units. These 128 CUDA cores are arranged in four slices, each with a register file, a level-0 instruction cache, a warp scheduler, two sets of load-store units, and a special function unit (SFU) handling some special math functions such as trigonometry, exponents, logarithms, reciprocals, and square-root. The four slices share a 128 KB L1 data cache, and four TMUs. The most exotic components of the Blackwell SM are the four 5th Gen Tensor cores, and a 4th Gen RT core.

With 5 GPCs containing 5 TPCs each, and two SM per TPC, there are a total of 50 SM, worth 6,400 CUDA cores, 200 Tensor cores, 50 RT cores, and 200 TMUs, on the GB205 silicon. The RTX 5070 doesn't max out the silicon; it gets 48 out of the 50 SM, resulting in 6,144 CUDA cores, 192 Tensor cores, 48 RT cores, and 192 TMUs. The GB205 silicon is endowed with 80 ROPs, all of which are enabled on the RTX 5070. This is a step up from the RTX 4070, which only had 64 out of 80 ROPs present on the AD104 silicon. The RTX 5070 also maxes out all 48 MB of L2 cache present on the die, while the RTX 4070 only had 36 MB out of the 48 MB present.
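The unit counts in this section can be derived from the hierarchy described above. The following is a minimal Python sketch of that arithmetic; the `UnitCounts` class and `counts_for` helper are hypothetical names, and the per-SM figures (128 CUDA cores, four Tensor cores, one RT core, four TMUs) and the two-SM-per-TPC ratio are taken from the text (10 SM across 5 TPCs per GPC).

```python
from dataclasses import dataclass

# Per-hierarchy constants as described in the review.
GPCS = 5
TPCS_PER_GPC = 5
SMS_PER_TPC = 2       # 10 SM across 5 TPCs in each GPC
ROPS_PER_GPC = 16

CUDA_PER_SM = 128
TENSOR_PER_SM = 4
RT_PER_SM = 1
TMUS_PER_SM = 4


@dataclass(frozen=True)
class UnitCounts:
    sms: int
    cuda_cores: int
    tensor_cores: int
    rt_cores: int
    tmus: int


def counts_for(sms: int) -> UnitCounts:
    """Scale the per-SM resources by the number of enabled SMs."""
    return UnitCounts(
        sms=sms,
        cuda_cores=sms * CUDA_PER_SM,
        tensor_cores=sms * TENSOR_PER_SM,
        rt_cores=sms * RT_PER_SM,
        tmus=sms * TMUS_PER_SM,
    )


full_gb205 = counts_for(GPCS * TPCS_PER_GPC * SMS_PER_TPC)  # all 50 SM
rtx_5070 = counts_for(48)                                   # 48 of 50 SM enabled
rops = GPCS * ROPS_PER_GPC

print("Full GB205:", full_gb205)
print("RTX 5070:  ", rtx_5070)
print("ROPs:      ", rops)
```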


Perhaps the biggest change to the way the SM handles work introduced with Blackwell is the concept of neural shaders—treating portions of the graphics rendering workload done by a generative AI model as shaders. Microsoft has laid the groundwork for standardization of neural shaders with its Cooperative Vectors API, in the latest update to DirectX 12. The Tensor cores are now accessible for workloads through neural shaders, and the shader execution reordering (SER) engine of the Blackwell SM is able to more accurately reorder workloads for the CUDA cores and the Tensor core in an SM.


The new 5th Gen Tensor core introduces support for the FP4 data format (1/8 precision) for fast moving atomic workloads, providing 32 times the throughput of the very first Tensor core introduced with the Volta architecture. Over the generations, AI models have leveraged lower-precision data formats, and sparsity, to improve performance. The AI management processor (AMP) is what enables simultaneous AI and graphics workloads at the highest levels of the GPU, so it can render real time graphics for a game while running an LLM, without either affecting the performance of the other. AMP is a specialized hardware scheduler for all the AI acceleration resources on the silicon, and it plays a crucial role in making DLSS 4 Multi Frame Generation work.


The 4th Gen RT core not only offers a generational increase in ray testing and ray intersection performance, which lowers the performance cost of enabling path tracing and ray traced effects, but also offers a potential generational leap in performance with the introduction of Mega Geometry. This allows for ray traced objects with extremely high polygon counts, increasing their detail. Polygon count and ray tracing present linear increases in performance costs, as each triangle has to intersect with a ray, and there should be sufficient rays to intersect with each of them. Mega Geometry addresses this cost by adopting clusters of triangles in an object as first-class primitives, along with cluster-level acceleration structures. The new RT cores introduce a component called a triangle cluster intersection engine, designed specifically for handling mega geometry. The integration of a triangle cluster compression format and a lossless decompression engine allows for more efficient processing of complex geometry.


The GB205 and the rest of the GeForce Blackwell GPU family are built on the exact same TSMC "NVIDIA 4N" foundry node as previous-generation Ada, which is actually a 5 nm-class node, so NVIDIA directed efforts to finding innovative new ways to manage power and thermals. This is done through a re-architected power management engine that relies on clock gating, power gating, and rail gating of the individual GPCs and other top-level components. NVIDIA also worked on the speed at which the GPU makes power-related decisions.


The quickest way to drop power is by adjusting the GPU clock speed, and with Blackwell, NVIDIA introduced a means for rapid clock adjustments at the SM-level.


NVIDIA updated both the display engine and the media engine of Blackwell over the previous generation Ada, which drew some flak for holding on to older display I/O standards such as DisplayPort 1.4, while AMD and Intel had moved on to DisplayPort 2.1. The good news is that Blackwell supports DP 2.1 with UHBR20, enabling 8K 60 Hz with a single cable. The company also updated NVDEC and NVENC, which now support AV1 UHQ, double the H.264 decode performance, MV-HEVC, and 4:2:2 formats.

Neural Rendering


Neural Rendering promises to be as transformative to modern graphics as programmable shaders themselves. 3D graphics rendering evolved from fixed-function hardware around the turn of the century, to programmable shaders, HLSL, geometry shaders, compute shaders, and ray tracing, over the past couple of decades. In 2025, NVIDIA is writing the next chapter in this journey with Blackwell neural shaders. This allows for a host of neural-driven effects, including neural materials, neural volumes, and even neural radiance fields. Microsoft introduced the new Cooperative Vectors API for DirectX in a recent update, making it possible to access Tensor cores within a graphics API. Combined with a new shading language, Slang, this breakthrough enables developers to integrate neural techniques directly into their workflows, potentially replacing parts of the traditional graphics pipeline. Slang splits large, complex functions into smaller pieces that are easier to handle. Given that this is a DirectX standard API feature, there is nothing that stops AMD and Intel from integrating Neural Rendering (Cooperative Vectors) into their graphics drivers.

RTX Neural Materials works to significantly reduce the memory footprint of materials in 3D scenes. Under conventional rendering, the memory footprint of a material is bloated from complex shader code. Neural materials convert shader code and texture layers into a compressed neural representation. This results in up to a 7:1 compression ratio and enables small neural networks to generate stunning, film-like materials in real-time. For example, silk rendered with traditional shaders might lack the multicolored sheen seen in real life. Neural materials, however, capture intricate details like color variation and reflections, bringing such surfaces to life with unparalleled realism—and at a fraction of the memory cost.


The new Neural Radiance Cache (NRC) dynamically trains a neural network during gameplay using the user's GPU, allowing light transport to be cached spatially and enabling near-infinite light bounces in a scene. This results in realistic indirect lighting and shadows with minimal performance impact. NRC partially traces 1 or 2 rays before storing them in a radiance cache, and infers an infinite amount of rays and bounces for a more accurate representation of indirect lighting in the game scene.

DLSS 4 and Multi Frame Generation


DLSS 4 introduces a major leap in image quality and performance. It isn't just a version bump with the introduction of a new feature, namely Multi Frame Generation, but introduces updates to nearly all DLSS sub-features. DLSS from its very beginning relied on AI to reconstruct details in super resolution, and with DLSS 4, NVIDIA is introducing a new transformer-based AI model to succeed the convolutional neural networks previously used, for double the parameters, four times the compute performance, and significantly improved image quality. Ray Reconstruction, introduced with DLSS 3.5, gets a significant image quality update with the new transformer-based model.


To understand Multi Frame Generation, you need to understand how DLSS Frame Generation, introduced with GeForce Ada, works. An Optical Flow Accelerator component gives the DLSS algorithm data to generate an entire frame using a neural network, using information from a previous rendered frame, effectively doubling frame rate. In Multi Frame Generation, AI takes over the functions of optical flow, to predict up to three frames following a conventionally rendered frame, effectively drawing four frames from the rendering effort of one.


Now, assuming this rendered frame is a product of Super Resolution, with the maximum performance setting generating 4x the pixels from a single rendered pixel, you're looking at a possibility where the rendering effort of 1/4th a frame goes into drawing 4 frames, or 15 in every 16 pixels being generated entirely by DLSS. When generating so many frames, Frame Pacing becomes a problem—irregular frame intervals impact smoothness. DLSS 4 addresses these issues by using a dedicated hardware unit inside Blackwell, which takes care of flip metering, reducing frame display variability by 5-10x. The Display Engine of Blackwell contains the hardware for flip metering.
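The "15 in every 16 pixels" figure can be checked with a few lines of Python. The sketch below also shows what perfectly even frame pacing would look like; it is an idealized illustration only, not a model of NVIDIA's flip-metering hardware, and the function names are our own.

```python
from fractions import Fraction


def rendered_pixel_fraction(upscale_factor: int, generated_frames: int) -> Fraction:
    """Share of displayed pixels that were conventionally rendered.

    upscale_factor:   output pixels per rendered pixel (4 for the case in the text)
    generated_frames: AI-generated frames following each rendered frame
    """
    return Fraction(1, upscale_factor) * Fraction(1, 1 + generated_frames)


def ideal_frame_offsets_ms(render_interval_ms: float, generated_frames: int) -> list[float]:
    """Evenly spaced display times within one rendered-frame interval (idealized)."""
    step = render_interval_ms / (1 + generated_frames)
    return [i * step for i in range(1 + generated_frames)]


rendered = rendered_pixel_fraction(upscale_factor=4, generated_frames=3)
generated = 1 - rendered

print(f"Rendered:  {rendered}")   # fraction of pixels from the raster pipeline
print(f"Generated: {generated}")  # fraction of pixels produced by DLSS

# Example: a game rendering one frame every 40 ms (25 FPS) with 3 generated frames
# would ideally present a frame every 10 ms (100 FPS).
print(ideal_frame_offsets_ms(40.0, 3))
```

Any deviation from those evenly spaced display times is what shows up as uneven smoothness, which is the variability flip metering is meant to reduce.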

NVIDIA Reflex 2


The original NVIDIA Reflex brought about a significant improvement to the responsiveness of maxed out graphics in competitive online gameplay, by compacting the rendering queue with the goal of reducing whole-system latency by up to 50%. Reflex is mandatory in DLSS 3 Frame Generation, given the latency cost imposed by the technology. Multi Frame Generation calls for an equally savvy piece of technology, hence Reflex 2. NVIDIA claims to have achieved a 75% reduction in latency with Frame Warp, which updates the camera (viewport) positions based on user inputs in real-time, and then uses temporal information to reconstruct the frame to display.

Packaging

Package Front
Package Back


The Card

Graphics Card Front
Graphics Card Back
Graphics Card Height

With the GeForce RTX 50 Series, Palit introduces a new GamingPro theme. The main color is black, with white highlights on both the front and the back. On the back you get a high quality metal backplate with a cutout for air to flow through, while the front cooler shroud is made from plastic.

Graphics Card Dimensions

Dimensions of the card are 33.0 x 13.0 cm, and it weighs 1528 g.

Graphics Card Front Angled

Installation requires three slots in your system. We measured the card's width to be 61 mm.

Monitor Outputs, Display Connectors

Display connectivity includes three standard DisplayPort 2.1b and one HDMI 2.1b.

Standard for all GeForce RTX 50-series Blackwell cards is a new display engine that supports three DisplayPort 2.1b outputs, each capable of UHBR20, and one HDMI 2.1b. Both interfaces support DSC (display stream compression). With DSC enabled, a single DisplayPort on this card can drive 4K 12-bit HDR at 480 Hz, or 8K 12-bit HDR at up to 165 Hz. The RTX 5070 features an updated media acceleration engine with support for 4:2:2 video formats, AV1 UHQ, and MV-HEVC. Unlike the bigger RTX 50 models, which have two, there is a single NVENC and NVDEC unit each.
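To see why DSC is required for those headline display modes, compare their uncompressed pixel data rates against the raw UHBR20 link rate of 80 Gbps (four lanes at 20 Gbps each). The Python sketch below is a rough estimate under stated simplifications: it ignores blanking intervals and link encoding overhead, both of which would only widen the gap, and the function name is our own.

```python
UHBR20_RAW_GBPS = 4 * 20  # four lanes at 20 Gbps each


def uncompressed_gbps(width: int, height: int, refresh_hz: int,
                      bits_per_channel: int, channels: int = 3) -> float:
    """Uncompressed RGB pixel data rate in Gbps (active pixels only)."""
    bits_per_second = width * height * refresh_hz * bits_per_channel * channels
    return bits_per_second / 1e9


# The two modes quoted in the review, both at 12 bits per color channel.
modes = {
    "4K 480 Hz 12-bit": (3840, 2160, 480, 12),
    "8K 165 Hz 12-bit": (7680, 4320, 165, 12),
}

for name, mode in modes.items():
    rate = uncompressed_gbps(*mode)
    ratio = rate / UHBR20_RAW_GBPS
    print(f"{name}: {rate:.1f} Gbps uncompressed, {ratio:.2f}x the raw UHBR20 rate")
```

Both modes exceed what the link can carry uncompressed, which is why they are only reachable with DSC enabled.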

Graphics Card Power Plugs

The card uses a single 16-pin connector, which allows a maximum power draw of 600 W, but the board power limit is set much lower of course. Next to the power input is the RGB header which lets you synchronize the graphics card RGB with other components in your system.


Palit has installed an RGB lighting zone along the top edge of the card, lighting up the "GamingPro" logo, which is also seen when the card is in an upright position.


This BIOS switch lets you toggle between the default Performance BIOS and an optional "Quiet" BIOS.

Apr 9th, 2025 20:05 EDT