News Posts matching #NVDEC


NVIDIA GB202 "Blackwell" Die Exposed, Shows the Massive 24,576 CUDA Core Configuration

A die shot of NVIDIA's GB202, the silicon powering the RTX 5090, has surfaced online, providing detailed insight into the physical layout of the "Blackwell" architecture. The annotated images, shared by hardware analyst Kurnal and provided by ASUS China general manager Tony Yu, compare the GB202 to its AD102 predecessor and outline key architectural components. The die's central region houses 128 MB of L2 cache (96 MB enabled on RTX 5090), surrounded by memory interfaces. Eight 64-bit memory controllers support the 512-bit GDDR7 interface, with physical interfaces positioned along the top, left, and right edges of the die. Twelve graphics processing clusters (GPCs) surround the central cache. Each GPC contains eight texture processing clusters (TPCs) of two streaming multiprocessors (SMs) each, for 16 SMs per GPC. The complete die configuration enables 24,576 CUDA cores, arranged as 128 cores per SM across 192 SMs. With the RTX 5090 offering "only" 21,760 CUDA cores, the full GB202 die appears to be reserved for workstation GPUs.
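The core and bus-width totals follow directly from the per-unit figures quoted above. A quick arithmetic check in Python, using only numbers from the article (the variable names are illustrative, and the RTX 5090 SM count is derived here by assuming the same 128 cores per SM):

```python
# Figures quoted in the article for the full GB202 die.
GPCS = 12
SMS_PER_GPC = 16
CUDA_CORES_PER_SM = 128
MEM_CONTROLLERS = 8
BITS_PER_CONTROLLER = 64

total_sms = GPCS * SMS_PER_GPC
total_cuda_cores = total_sms * CUDA_CORES_PER_SM
bus_width_bits = MEM_CONTROLLERS * BITS_PER_CONTROLLER

# RTX 5090 CUDA core count quoted in the article.
RTX_5090_CUDA_CORES = 21_760
# Derived, assuming the RTX 5090 keeps 128 CUDA cores per SM.
rtx_5090_sms = RTX_5090_CUDA_CORES // CUDA_CORES_PER_SM

print(f"Full die: {total_sms} SMs, {total_cuda_cores} CUDA cores")
print(f"Memory bus: {bus_width_bits}-bit")
print(f"RTX 5090: {rtx_5090_sms} of {total_sms} SMs enabled")
```

Under that assumption, the RTX 5090 leaves 22 of the die's 192 SMs disabled.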

The SM design includes four slices sharing 128 KB of L1 cache and four texture mapping units (TMUs). Individual SM slices contain dedicated register files, L0 instruction caches, warp schedulers, load-store units, and special function units. Central to the die's layout is a vertical strip, running from top to bottom, that contains the media processing components (the NVENC and NVDEC units). The RTX 5090 implementation enables three of four available NVENC encoders and two of four NVDEC decoders. The die includes twelve raster engine/3D FF blocks for geometry processing. At the bottom edge sit the PCIe 5.0 x16 interface and display controller components. Despite its substantial size, the GB202 remains smaller than NVIDIA's previous GH100 and GV100 dies, which measure about 814 mm² and 815 mm², respectively. Each SM integrates specialized hardware, including new 5th-generation Tensor cores and 4th-generation RT cores, contributing to the die's total of 192 RT cores, 768 Tensor cores, and 768 texture units.
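The die-wide totals can be cross-checked against the per-SM layout. The sketch below is a rough consistency check, not an official specification: the TMU count per SM comes from the article, while the per-SM RT and Tensor core counts are assumptions inferred from the quoted totals (192 RT cores and 768 Tensor cores across 192 SMs).

```python
TOTAL_SMS = 192            # from the article
TMUS_PER_SM = 4            # from the article
RT_CORES_PER_SM = 1        # assumption: 192 RT cores / 192 SMs
TENSOR_CORES_PER_SM = 4    # assumption: 768 Tensor cores / 192 SMs

totals = {
    "RT cores": TOTAL_SMS * RT_CORES_PER_SM,
    "Tensor cores": TOTAL_SMS * TENSOR_CORES_PER_SM,
    "Texture units": TOTAL_SMS * TMUS_PER_SM,
}

# Media engines on the die vs. enabled on the RTX 5090, per the article.
media_engines = {"NVENC": (3, 4), "NVDEC": (2, 4)}

for name, count in totals.items():
    print(f"{name}: {count}")
for name, (enabled, available) in media_engines.items():
    print(f"{name}: {enabled} of {available} enabled on RTX 5090")
```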

NVIDIA Launches the RTX A400 and A1000 Professional Graphics Cards

AI integration across design and productivity applications is becoming the new standard, fueling demand for advanced computing performance. This means professionals and creatives will need to tap into increased compute power, regardless of the scale, complexity or scope of their projects. To meet this growing need, NVIDIA is expanding its RTX professional graphics offerings with two new NVIDIA Ampere architecture-based GPUs for desktops: the NVIDIA RTX A400 and NVIDIA RTX A1000.

They expand access to AI and ray tracing technology, equipping professionals with the tools they need to transform their daily workflows. The RTX A400 GPU introduces accelerated ray tracing and AI to the RTX 400 series GPUs. With 24 Tensor Cores for AI processing, it surpasses traditional CPU-based solutions, enabling professionals to run cutting-edge AI applications, such as intelligent chatbots and copilots, directly on their desktops. The GPU delivers real-time ray tracing, so creators can build vivid, physically accurate 3D renders that push the boundaries of creativity and realism.

NVIDIA Enables More Encoding Streams on GeForce Consumer GPUs

NVIDIA has quietly removed some video encoding limitations on its consumer GeForce graphics processing units (GPUs), allowing encoding of up to five simultaneous streams. Previously, NVIDIA's consumer GeForce GPUs were limited to three simultaneous NVENC encodes. The same limitation did not apply to professional GPUs.

According to NVIDIA's own Video Encode and Decode GPU Support Matrix document, the number of concurrent NVENC encodes on consumer GPUs has been increased from three to five. This includes certain GeForce GPUs based on the Maxwell 2nd Gen, Pascal, Turing, Ampere, and Ada Lovelace GPU architectures. While the number of concurrent NVDEC decodes was never limited, there is still a limit on how many streams a given GPU can encode, depending on the resolution of the stream and the codec.
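An application that launches several encode jobs may want to stay within this session cap on its own side rather than rely on a failed session open. The following is a minimal sketch of that idea using a standard-library semaphore; `EncodeSlotPool` and `MAX_CONCURRENT_ENCODES` are hypothetical names, and the code makes no NVENC calls and does not model how the driver itself enforces the limit.

```python
import threading

# Consumer GeForce limit per the article (previously 3).
MAX_CONCURRENT_ENCODES = 5


class EncodeSlotPool:
    """Hypothetical helper that caps how many encode jobs an app starts at once."""

    def __init__(self, limit=MAX_CONCURRENT_ENCODES):
        self._slots = threading.BoundedSemaphore(limit)

    def try_acquire(self):
        # Non-blocking: returns False when every slot is already in use.
        return self._slots.acquire(blocking=False)

    def release(self):
        # Return a slot once an encode job has finished.
        self._slots.release()


pool = EncodeSlotPool()
# Try to start six jobs; only the first five should be granted a slot.
granted = [pool.try_acquire() for _ in range(6)]
print(granted)
```

A job that is refused a slot can be queued and retried after another job calls `release()`.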
Feb 1st, 2025 15:58 EST
