NVIDIA Hopper Features "SM-to-SM" Comms Within GPC That Minimize Cache Roundtrips and Boost Multi-Instance Performance

btarunr · Aug 24, 2022

NVIDIA in its HotChips 34 presentation revealed a defining feature of its "Hopper" compute architecture that works to increase parallelism and help the H100 processor better perform in a multi-instance environment. The hardware component hierarchy of "Hopper" is typical of NVIDIA architectures, with GPCs, SMs, and CUDA cores forming a hierarchy. The company is introducing a new component it calls "SM to SM Network." This is a high-bandwidth communications fabric inside the Graphics Processing Cluster (GPC), which facilitates direct communication among the SMs without making round-trips to the cache or memory hierarchy, play a significant role in NVIDIA's overarching claim of "6x throughput gain over the A100."

Direct SM-to-SM communication not just impacts latency, but also unburdens the L2 cache, letting NVIDIA's memory-management free up the cache of "cooler" (infrequently accessed) data. CUDA sees every GPU as a "grid," every GPC as a "Cluster," every SM as a "thread block," and every lane of SIMD units as a "lane." Each lane has a 64 KB of shared memory, which makes up 256 KB of shared local storage per SM as there are four lanes. The GPCs interface with 50 MB of L2 cache, which is the last-level on-die cache before the 80 GB of HBM3 serves as main memory.

View at TechPowerUp Main Site | Source

tpu7887 · Aug 24, 2022

Very nice. I wonder how much it'll help.

napata · Aug 24, 2022

Hopper only or is it also going to be implemented into Lovelace?

System Name	RBMK-1000
Processor	AMD Ryzen 7 5700G
Motherboard	ASUS ROG Strix B450-E Gaming
Cooling	DeepCool Gammax L240 V2
Memory	2x 8GB G.Skill Sniper X
Video Card(s)	Palit GeForce RTX 2080 SUPER GameRock
Storage	Western Digital Black NVMe 512GB
Display(s)	BenQ 1440p 60 Hz 27-inch
Case	Corsair Carbide 100R
Audio Device(s)	ASUS SupremeFX S1220A
Power Supply	Cooler Master MWE Gold 650W
Mouse	ASUS ROG Strix Impact
Keyboard	Gamdias Hermes E2
Software	Windows 11 Pro

NVIDIA Hopper Features "SM-to-SM" Comms Within GPC That Minimize Cache Roundtrips and Boost Multi-Instance Performance

btarunr

Editor & Senior Moderator

tpu7887

napata