| BLACKWELL (PRO) | BLACKWELL (RTX) | HOPPER | ADA | AMPERE (PRO) | AMPERE (RTX) | TURING (RTX) | VOLTA |
Example chipset | GB100 | GB202 | GH100 | AD102 | GA100 | GA102 | TU102 | GV100 |
Partitions | ? | 12 GPCs | 8 GPCs | 12 GPCs | 8 GPCs | 7 GPCs | 6 GPCs | 6 GPCs |
Clusters | ? | 96 TPCs | 72 TPCs | 72 TPCs | 64 TPCs | 42 TPCs | 36 TPCs | 42 TPCs |
Cores | ? | 192 SM | 144 SM | 144 SM | 128 SM | 84 SM | 72 SM | 84 SM |
SIMD | ? | 4xSIMD32 (FP32/INT32) + 4xSIMD4 (SFU) + 2xFP64 | 4xSIMD32 (FP32) + 4xSIMD16 (INT32) + 4xSIMD4 (SFU) + 4xSIMD16 (FP64) | 4xSIMD16 (FP32) + 4xSIMD16 (FP32/INT32) + 4xSIMD4 (SFU) + 2xFP64 | 4xSIMD16 (FP32) + 4xSIMD16 (INT32) + 4xSIMD4 (SFU) + 4xSIMD8 (FP64) | 4xSIMD16 (FP32) + 4xSIMD16 (FP32/INT32) + 4xSIMD4 (SFU) + 2xFP64 | 4xSIMD16 (FP32) + 4xSIMD16 (INT32) + 4xSIMD4 (SFU) + 2xFP64 | 4xSIMD16 (FP32) + 4xSIMD16 (INT32) + 4xSIMD4 (SFU) + 4xSIMD8 (FP64) |
Max ALU Vector | ? | 28032 (24576 FP32/INT32 + 3072 SFU + 384 FP64) | 39168 (18432 FP32 + 9216 INT32 + 2304 SFU + 9216 FP64) | 21024 (9216 FP32 + 9216 FP32/INT32 + 2304 SFU + 288 FP64) | 22528 (8192 FP32 + 8192 INT32 + 2048 SFU + 4096 FP64) | 12264 (5376 FP32 + 5376 FP32/INT32 + 1344 SFU + 168 FP64) | 10512 (4608 FP32 + 4608 INT32 + 1152 SFU + 144 FP64) | 14784 (5376 FP32 + 5376 INT32 + 1344 SFU + 2688 FP64) |
Matrix ALU | ? | 768 Gen5 | 576 Gen4 | 576 Gen4 | 512 Gen3 | 336 Gen3 | 576 Gen2 | 672 Gen1 |
RTU | - | 192 Gen4 | - | 144 Gen3 | - | 84 Gen2 | 72 Gen1 | - |
Scalar ALU | ? | 768 (4/SM) | 576 (4/SM) | 576 (4/SM) | 512 (4/SM) | 336 (4/SM) | 288 (4/SM) | 336 (4/SM) |
Raster Engine | ? | 12 | 8 | 12 | 8 | 7 | 6 | 6 |
Tessellator | ? | 96 | 72 | 72 | 64 | 42 | 36 | 84 |
TMU | ? | 768 | 576 | 576 | 512 | 336 | 288 | 336 |
ROP | ? | 192 | 24 | 192 | 192 | 112 | 96 | 128 |
Clock max | ? | 2407 MHz | 1980 MHz | 2520 MHz | 1440 MHz | 1860 MHz | 1770 MHz | 1627 MHz |
INT8 Vector | ? | 473,23 TOPs | | 185,79 TOPs | 94,37 TOPs | 79,99 TOPs | 65,25 TOPs | 69,97 TOPs |
INT16 Vector | ? | ? | | ? | ? | ? | ? | ? |
INT24 Vector | ? | ? | | 46,44 TOPs | 23,59 TOPs | 19,99 TOPs | 16,31 TOPs | 17,49 TOPs |
INT32 Vector | ? | ? | | 46,44 TOPs | 23,59 TOPs | 19,99 TOPs | 16,31 TOPs | 17,49 TOPs |
INT64 Vector | ? | ? | | 11,61 TOPs | 5,89 TOPs | 4,99 TOPs | 4,08 TOPs | 4,37 TOPs |
BF16 Vector | ? | 118,30 TFLOPs | 145,98 TFLOPs | 92,89 TFLOPs or 46,44 TFLOPs | 47,18 TFLOPs | 39,99 TFLOPs or 19,99 TFLOPs | - | - |
FP16 Vector | ? | 118,30 TFLOPs | 145,98 TFLOPs | 92,89 TFLOPs or 46,44 TFLOPs | 94,37 TFLOPs | 39,99 TFLOPs or 19,99 TFLOPs | 32,62 TFLOPs | 34,98 TFLOPs |
FP32 Vector | ? | 118,30 TFLOPs | 72,99 TFLOPs | 92,89 TFLOPs or 46,44 TFLOPs | 23,59 TFLOPs | 39,99 TFLOPs or 19,99 TFLOPs | 16,31 TFLOPs | 17,49 TFLOPs |
FP64 Vector | ? | 1,85 TFLOPs | 36,49 TFLOPs | 1,45 TFLOPs | 11,79 TFLOPs | 624,9 GFLOPs | 509,76 GFLOPs | 8,74 TFLOPs |
Transcendental Vector | ? | 14,79 TFLOPs | 9,12 TFLOPs | 11,61 TFLOPs | 5,89 TFLOPs | 4,99 TFLOPs | 4,08 TFLOPs | 4,37 TFLOPs |
INT4 Matrix (Sparsity) | ? | - | - | 1486,35 TOPs (2972,71 TOPs) | 1509,94 TOPs (3019,89 TOPs) | 639,95 TOPs (1279,91 TOPs) | 521,99 TOPs | - |
INT8 Matrix (Sparsity) | ? | 946,47 TOPs (1892,94 TOPs) | - | 743,17 TOPs (1486,35 TOPs) | 754,97 TOPs (1509,94 TOPs) | 319,97 TOPs (639,95 TOPs) | 260,99 TOPs | - |
FP4 w/ FP32 accumulate Matrix (Sparsity) | ? | 1892,94 TFLOPs (3785,88 TFLOPs) | - | - | - | - | - | - |
FP8 w/ FP16 accumulate Matrix (Sparsity) | ? | 946,47 TFLOPs (1892,94 TFLOPs) | 1751,77 TFLOPs (3503,55 TFLOPs) | 743,17 TFLOPs (1486,35 TFLOPs) | - | - | - | - |
FP8 w/ FP32 accumulate Matrix (Sparsity) | ? | 473,23 TFLOPs (946,47 TFLOPs) | 1751,77 TFLOPs (3503,55 TFLOPs) | 743,17 TFLOPs (1486,35 TFLOPs) | - | - | - | - |
FP16 w/ FP16 accumulate Matrix (Sparsity) | ? | 473,23 TFLOPs (946,47 TFLOPs) | 875,88 TFLOPs (1751,77 TFLOPs) | 371,58 TFLOPs (743,17 TFLOPs) | 377,48 TFLOPs (754,97 TFLOPs) | 159,98 TFLOPs (319,97 TFLOPs) | 130,49 TFLOPs | - |
FP16 w/ FP32 accumulate Matrix (Sparsity) | ? | 236,62 TFLOPs (473,23 TFLOPs) | 875,88 TFLOPs (1751,77 TFLOPs) | 185,79 TFLOPs (371,58 TFLOPs) | 377,48 TFLOPs (754,97 TFLOPs) | 79,99 TFLOPs (159,98 TFLOPs) | 130,49 TFLOPs | 139,94 TFLOPs |
BF16 w/ FP32 accumulate Matrix (Sparsity) | ? | 236,62 TFLOPs (473,23 TFLOPs) | 875,88 TFLOPs (1751,77 TFLOPs) | 185,79 TFLOPs (371,58 TFLOPs) | 377,48 TFLOPs (754,97 TFLOPs) | 79,99 TFLOPs (159,98 TFLOPs) | - | - |
TF32 Matrix (Sparsity) | ? | 118,30 TFLOPs (236,62 TFLOPs) | 437,94 TFLOPs (875,88 TFLOPs) | 92,89 TFLOPs (185,79 TFLOPs) | 188,74 TFLOPs (377,48 TFLOPs) | 39,99 TFLOPs (79,99 TFLOPs) | - | - |
FP64 Matrix | ? | | 72,99 TFLOPs | - | 23,59 TFLOPs | - | - | - |
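
The FP32 vector figures in the table appear to follow from three rows: the SM count ("Cores"), the FP32 lanes per SM ("SIMD"), and "Clock max". The sketch below reproduces that arithmetic for three of the chips. It assumes each lane retires one fused multiply-add (counted as 2 operations) per cycle at the maximum clock; that factor of 2 and the `Chip` class with its field names are illustrative assumptions, not something the table states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    """Hypothetical container for the table rows used in the calculation."""
    name: str
    sm_count: int            # "Cores" row
    fp32_lanes_per_sm: int   # FP32-capable lanes per SM, from the "SIMD" row
    clock_mhz: int           # "Clock max" row

    @property
    def fp32_lanes(self) -> int:
        # Total FP32 lanes on the full chip (the FP32 part of "Max ALU Vector").
        return self.sm_count * self.fp32_lanes_per_sm

    def peak_fp32_tflops(self) -> float:
        # Assumption: 2 operations (one FMA) per lane per cycle.
        # lanes * 2 * MHz gives MFLOPs; divide by 1e6 to get TFLOPs.
        return self.fp32_lanes * 2 * self.clock_mhz / 1e6


# Lane counts per SM are read off the "SIMD" row:
# 4xSIMD32 (FP32/INT32) = 128 lanes, 4xSIMD16 (FP32) = 64 lanes.
GB202 = Chip("GB202", sm_count=192, fp32_lanes_per_sm=4 * 32, clock_mhz=2407)
GA100 = Chip("GA100", sm_count=128, fp32_lanes_per_sm=4 * 16, clock_mhz=1440)
TU102 = Chip("TU102", sm_count=72, fp32_lanes_per_sm=4 * 16, clock_mhz=1770)

for chip in (GB202, GA100, TU102):
    print(f"{chip.name}: {chip.fp32_lanes} lanes, "
          f"{chip.peak_fp32_tflops():.3f} TFLOPs peak FP32")
```

Under these assumptions the results agree with the "FP32 Vector" row to the two decimals shown there. The other vector rows look like fixed multiples of the same product, but that pattern is inferred from the numbers rather than documented.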