The TITAN X Pascal is built on the GP102 silicon, NVIDIA's new second-biggest chip based on the "Pascal" architecture. The biggest is the GP100, which drives the Tesla P100 HPC processor. The GP102 is essentially the GP100 minus its FP64 CUDA cores, featuring only the 3,840 FP32 ones, with the 4096-bit HBM2 memory interface making way for 384-bit GDDR5X. As mentioned earlier, the GP102 is a 1.5X upscale of the GP104 silicon, which powers the GeForce GTX 1080.
With each successive architecture since "Fermi," NVIDIA has been enriching the streaming multiprocessor (SM) by adding more dedicated resources and reducing shared resources within the graphics processing cluster (GPC), which leads to big performance gains. The story continues with "Pascal." Like the GM200 before it, the GP102 features six GPCs, super-specialized subunits of the GPU that share the PCI-Express 3.0 x16 host interface and the 384-bit memory interface. That interface is split across twelve controllers, which drive GDDR5X memory ticking at 10 Gbps.
Workload across the six GPCs is shared by the GigaThread Engine cushioned by 3 MB of cache. Each GPC holds five streaming multiprocessors (SMs), which is an increase from the four SMs each GPC held on the GM200. The GPC shares a raster engine between these five SMs. The "Pascal" streaming multiprocessor features a 4th generation PolyMorph Engine, a component for key render setup operations. With "Pascal," the PolyMorph Engine includes specialized hardware for the new Simultaneous MultiProjection feature. Each SM also holds a block of eight TMUs.
Each SM continues to feature 128 CUDA cores. The GP102 hence features a total of 3,840 CUDA cores, of which 3,584 across 28 SMs are enabled on the TITAN X Pascal. Other vital specifications include 224 TMUs and 96 ROPs. NVIDIA claims to have worked on a new GPU internal circuit design and board channel paths to facilitate significantly higher clock speeds than what the GM204 is capable of. The TITAN X Pascal ships with a 1417 MHz GPU clock speed and a maximum GPU Boost frequency of 1531 MHz. These look like lower clock speeds than what the GTX 1080 ships with, but one has to understand that the GP102 is a much bigger chip, and even at these speeds, its TDP is rated at 250 W. For comparison, the original GeForce GTX TITAN X (Maxwell) was clocked at 1000 MHz, with 1089 MHz boost.
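The unit counts quoted above follow directly from the GPC/SM hierarchy, and a few lines of Python make the arithmetic explicit. This is only an illustrative sketch: the constant and function names are made up for this example, and the inputs are the per-unit figures given in the text.

```python
# Per-unit figures taken from the text; all names here are illustrative only.
GPCS = 6              # graphics processing clusters on the GP102
SMS_PER_GPC = 5       # streaming multiprocessors per GPC
CORES_PER_SM = 128    # CUDA cores per SM
TMUS_PER_SM = 8       # texture mapping units per SM
ENABLED_SMS = 28      # SMs enabled on the TITAN X Pascal


def unit_counts(sm_count):
    """Return (CUDA cores, TMUs) for a given number of active SMs."""
    return sm_count * CORES_PER_SM, sm_count * TMUS_PER_SM


full_sms = GPCS * SMS_PER_GPC
full_cores, full_tmus = unit_counts(full_sms)
card_cores, card_tmus = unit_counts(ENABLED_SMS)

print(f"Full GP102:     {full_sms} SMs, {full_cores} CUDA cores, {full_tmus} TMUs")
print(f"TITAN X Pascal: {ENABLED_SMS} SMs, {card_cores} CUDA cores, {card_tmus} TMUs")
```

With two of the 30 SMs disabled, the card ends up with the 3,584 CUDA cores and 224 TMUs listed in the specifications.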
The TITAN X Pascal features the new GDDR5X memory standard. The interface enables effective data-rates as high as 14 Gbps, and although it has many bare-metal specifications in common with GDDR5, minimizing R&D for its implementation, the memory chip design is improved with higher pin counts to support these higher data-rates. On this card, the memory runs at an effective 10 Gbps. Over a 384-bit memory interface, this works out to a memory bandwidth of 480 GB/s. NVIDIA has also optimized the usage of that bandwidth with more advanced 4th generation lossless Delta Color Compression. In the best-case scenario, Delta Color Compression provides an "effective" memory bandwidth uplift of 20 percent, which results in 576 GB/s.
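The bandwidth figures can be checked with the usual back-of-the-envelope formula: bus width times per-pin data rate, divided by eight bits per byte. The sketch below uses the numbers from the text; the names are hypothetical, and the 20 percent uplift is NVIDIA's best-case claim rather than something measured here.

```python
BUS_WIDTH_BITS = 384       # memory interface width
DATA_RATE_GBPS = 10        # effective per-pin data rate in Gbps
BEST_CASE_UPLIFT = 0.20    # best-case Delta Color Compression gain quoted in the text


def memory_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak bandwidth in GB/s: pins times per-pin rate, over 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


raw = memory_bandwidth_gbs(BUS_WIDTH_BITS, DATA_RATE_GBPS)
effective = raw * (1 + BEST_CASE_UPLIFT)

print(f"Raw bandwidth:         {raw:.0f} GB/s")
print(f"Best-case 'effective': {effective:.0f} GB/s")
```

The second figure is not extra physical bandwidth; it only expresses how much uncompressed traffic the same 480 GB/s could stand in for when compression works at its best.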
The "Pascal" architecture supports Asynchronous Compute as standardized by Microsoft. It builds on that with its own variation of the concept, called "Dynamic Load Balancing."