
TPU's GPU Database Portal & Updates

W1zzard

Administrator
Staff member
Joined
May 14, 2004
Messages
28,497 (3.73/day)
Processor Ryzen 7 5700X
Memory 48 GB
Video Card(s) RTX 4080
Storage 2x HDD RAID 1, 3x M.2 NVMe
Display(s) 30" 2560x1600 + 19" 1280x1024
Software Windows 10 64-bit
Hello @eidairaman1
I would love to upload the vBIOS of my new office graphics card.
It's an NVIDIA GeForce GT 1030 4GHD4 LP OC with 4 GiB of DDR4 as VRAM.

It's already in the GPU database, but its vBIOS is not in the Video BIOS Collection.

Thanks
Upload it with GPU-Z; this submits some additional information that helps us categorize the BIOS correctly.
 
Joined
Jul 31, 2023
Messages
61 (0.10/day)
So I guess 50 series BIOS extraction is still in the works? There's no mention of it in the GPU-Z release notes, but it doesn't seem to be working.
 
Joined
Nov 5, 2012
Messages
49 (0.01/day)
Location
France
System Name Game computer
Processor AMD Ryzen 7 5800X3D 4.35 GHz
Motherboard ASRock X470 Taichi
Cooling be quiet! Pure Rock 2 Black
Memory 32768 MB DDR4-3200 G.Skill CL16
Video Card(s) AMD Radeon RX 7900 GRE (x2)
Storage SSD Samsung 970 EVO M.2 250 GB, Samsung 970 EVO M.2 500 GB, Samsung 850 EVO SATA 500 GB, Toshiba 4 TB
Display(s) AOC 24" 1440p 144 Hz DisplayPort + Acer KG251Q 24" 1080p 144 Hz DisplayPort
Case NZXT Phantom Black
Audio Device(s) Corsair Gaming VOID Pro RGB Wireless Special Edition
Power Supply BeQuiet Straight Power 11 1000W
Mouse Roccat Kone XTD
Keyboard BTC USB
Software Windows 11 24H2 Pro x64
Hi,

I saw that you don't have the Chinese MTT video cards, so I gathered the information I could find on them :)

Spec | MTT S80 | MTT S70 | MTT S50 | MTT S30 | MTT S10
Architecture | MTT MUSA-Chunxiao | MTT MUSA-Chunxiao | MTT MUSA-Chunxiao | MTT MUSA-Chunxiao | MTT MUSA-Chunxiao
Process size | 12 nm | 12 nm | 12 nm | 12 nm | 12 nm
Die size | ? | ? | ? | ? | ?
Transistors | ~22,000M | ~22,000M | ? | ? | ?
MPC (NVIDIA GPC equivalent) | 8 | 7 | 4 | 2 | 2
MPX (NVIDIA TPC equivalent) | 16 | 14 | 8 | 4 | 4
MP (SM/CU equivalent) | 32 | 28 | 16 | 8 | 8
FP32 ALUs / INT32 ALUs | 4096 / 1024 | 3584 / 896 | 2048 / 512 | 1024 / 256 | 1024 / 256
FP64 ALUs | 64 | 56 | 32 | 16 | 16
SFU ALUs | 32 | 28 | 16 | 8 | 8
Matrix ALUs | 32 | 28 | 16 | 8 | 8
TMU | 256 | 224 | 128 | 64 | 64
ROP | 256 | 224 | 128 | 64 | 64
L0 cache | ?/MP | ?/MP | ?/MP | ?/MP | ?/MP
L1 cache | ? | ? | ? | ? | ?
L2 cache | 4 MB | 3.5 MB | ? | ? | ?
Clock | 1800 MHz | 1600 MHz | 1200 MHz | 1300 MHz | 1000 MHz
FP32 | 14.74 TFLOPs | 11.46 TFLOPs | 4.91 TFLOPs | 2.66 TFLOPs | 2.04 TFLOPs
FP16 (2:1) | 29.49 TFLOPs | 22.93 TFLOPs | 9.83 TFLOPs | 5.32 TFLOPs | 4.09 TFLOPs
FP64 (1:64) | 230.4 GFLOPs | 179.2 GFLOPs | 76.8 GFLOPs | 41.6 GFLOPs | 32.0 GFLOPs
VRAM | 16 GB GDDR6 | 7 GB GDDR6 | 8 GB GDDR6 | 4 GB GDDR6 | 2 GB GDDR6
VRAM clock | 14 Gbps (1750 MHz) | 14 Gbps (1750 MHz) | ? (? MHz) | ? (? MHz) | ? (? MHz)
Bus | 256-bit | 224-bit | 256-bit | 128-bit | 64-bit
Bandwidth | 448 GB/s | 392 GB/s | ? | ? | ?
PCIe | x16 5.0 | x16 4.0 | x16 3.0 | x8 4.0 | x8 4.0
TBP | 255 W (1x 8-pin) | 220 W (1x 8-pin) | 85 W (1x 6-pin) | 40 W | 30 W
Outputs | 3x DP 1.4a + 1x HDMI 2.1 | 3x DP 1.4a + 1x HDMI 2.1 | 2x DP 1.4a + 1x HDMI 2.0 | 1x HDMI + 1x VGA | 1x HDMI + 1x VGA
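
The throughput and bandwidth figures above look like they follow from the ALU counts, clocks and bus widths. Here is a minimal Python sketch of that arithmetic, assuming 2 FLOPs per FP32 ALU per clock (one fused multiply-add), the 2:1 and 1:64 ratios quoted in the list, and bandwidth = bus width x per-pin data rate / 8. The function names are made up for illustration and are not part of any TPU tooling.

```python
def fp32_tflops(fp32_alus: int, clock_mhz: float) -> float:
    """Peak FP32 rate in TFLOPs, assuming 2 FLOPs (one FMA) per ALU per clock."""
    return fp32_alus * 2 * clock_mhz / 1e6


def bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Memory bandwidth in GB/s from bus width and per-pin data rate in Gbit/s."""
    return bus_width_bits * data_rate_gbps / 8


# MTT S80 inputs taken from the post: 4096 FP32 ALUs at 1800 MHz,
# 256-bit bus with a 14 Gbit/s per-pin data rate.
fp32 = fp32_tflops(4096, 1800)
fp16 = fp32 * 2                  # 2:1 ratio quoted in the post
fp64_gflops = fp32 * 1000 / 64   # 1:64 ratio quoted in the post

print(f"FP32 {fp32:.4f} TFLOPs, FP16 {fp16:.4f} TFLOPs, FP64 {fp64_gflops:.1f} GFLOPs")
print(f"Bandwidth {bandwidth_gb_s(256, 14):.0f} GB/s")
```

The listed values appear to be truncated rather than rounded to two decimals, so the last digit can differ from what this prints.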
 

T4C Fantasy

CPU & GPU DB Maintainer
Staff member
Joined
May 7, 2012
Messages
2,593 (0.55/day)
Location
Rhode Island
System Name Whaaaat Kiiiiiiid!
Processor Intel Core i9-14900K @ Default
Motherboard Gigabyte Z690 AORUS Elite AX DDR4
Cooling Corsair H150i AIO Cooler
Memory Corsair Dominator Platinum 128GB DDR4-3200
Video Card(s) EVGA GeForce RTX 3080 FTW3 ULTRA @ Default
Storage Samsung 970 PRO 512GB + Crucial MX500 2TB x3 + Crucial MX500 4TB + Samsung 980 PRO 1TB
Display(s) 27" LG 27MU67-B 4K, + 27" Acer Predator XB271HU 1440P
Case Thermaltake Core X9 Snow
Audio Device(s) Logitech G PRO X 2 Lightspeed
Power Supply SeaSonic Platinum 1050W Snow Silent
Mouse Logitech G903 Lightspeed
Keyboard Logitech G915 X Lightspeed
Software Windows 11 Pro
Benchmark Scores FFXV: 19329
@W1zzard

Still waiting for it to be added, will add once implemented
 

T4C Fantasy

CPU & GPU DB Maintainer
Staff member
Joined
May 7, 2012
Messages
2,593 (0.55/day)
Location
Rhode Island
I added a couple. I still need more info on formal naming, encode/decode support, and APIs.

 
Joined
Nov 5, 2012
Messages
49 (0.01/day)
Location
France
I'll take this opportunity to add more detailed information about GPU architectures: SIMD organization :)

=>NVIDIA Blackwell: 4xSIMD32 (FP32/INT32) + 4xSIMD4 (SFU) + 4xMatrix ALU + 2xALU FP64 / SM
=>NVIDIA Ada: 4xSIMD16 (FP32) + 4xSIMD16 (FP32/INT32) + 4xSIMD4 (SFU) + 4xMatrix ALU + 2xALU FP64 / SM
=>NVIDIA Hopper: 4xSIMD32 (FP32) + 4xSIMD16 (INT32) + 4xSIMD16 (FP64) + 4xSIMD4 (SFU) + 4xMatrix ALU / SM
=>NVIDIA Ampere GA10x: 4xSIMD16 (FP32) + 4xSIMD16 (FP32/INT32) + 4xSIMD4 (SFU) + 4xMatrix ALU + 2xALU FP64 / SM
=>NVIDIA Ampere GA100: 4xSIMD16 (FP32) + 4xSIMD16 (FP32/INT32) + 4xSIMD8 (FP64) + 4xSIMD4 (SFU) + 4xMatrix ALU / SM
=>NVIDIA Turing: 4xSIMD16 (FP32) + 4xSIMD16 (INT32) + 4xSIMD4 (SFU) + 4xMatrix ALU + 2xALU FP64 / SM
=>NVIDIA Volta: 4xSIMD16 (FP32) + 4xSIMD16 (INT32) + 4xSIMD8 (FP64) + 4xSIMD4 (SFU) + 4xMatrix ALU / SM
=>NVIDIA Pascal GP100: 4xSIMD16 (FP32/INT32) + 4xSIMD8 (FP64) + 4xSIMD8 (SFU) / SM
=>NVIDIA Pascal GP10x: 4xSIMD32 (FP32/INT32) + 4xSIMD8 (SFU) + 4xALU FP64 / SM
=>NVIDIA Maxwell: 4xSIMD32 (FP32/INT32) + 4xSIMD8 (SFU) + 4xALU FP64 / SM
=>NVIDIA Kepler: 6xSIMD32 (FP32/INT32) + 4xSIMD8 (SFU) + 4xALU FP64 / SM


=>AMD RDNA3: 2xSIMD32 (FP32/INT32/WMMA) + 2xSIMD32 (FP32/WMMA) + 2xSIMD8 (SFU) + 2xALU FP64 / CU
=>AMD RDNA2/1: 2xSIMD32 (FP32/INT32) + 2xSIMD8 (SFU) + 2xALU FP64 / CU
=>AMD CDNA: 4xSIMD16 (FP32/INT32) + 4xSIMD8 (SFU) + 4xMatrix ALU / CU
=>AMD GCN: 4xSIMD16 (FP32/INT32) + 4xSIMD8 (SFU) / CU


=>INTEL Xe2-Battlemage: 8xSIMD16 (FP32) + 8xSIMD16 (INT32) + 8xSIMD4 (SFU) + 8xSIMD2 (FP64) / Xe
=>INTEL Xe-Alchemist: 16xSIMD8 (FP32) + 16xSIMD8 (INT32) + 16xSIMD2 (SFU) / Xe

=>MTT MUSA: 1xSIMD128 (FP32) + 1xSIMD32 (INT32) + 1xALU SFU + 1xMatrix ALU + 1x2 ALU FP64 / MP
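
For anyone who wants to turn the notation above into per-core lane counts, here is a small Python sketch. It assumes each term has the form `NxSIMDW (KIND)` and simply multiplies N by W; terms written differently (such as `4xMatrix ALU` or `2xALU FP64`) are skipped by this parser. The function names are hypothetical.

```python
import re
from collections import defaultdict

# Matches terms such as "4xSIMD16 (FP32/INT32)": count, SIMD width, unit kind.
TERM = re.compile(r"(\d+)xSIMD(\d+) \(([^)]+)\)")


def lanes_per_core(organization: str) -> dict[str, int]:
    """Sum count * width for every SIMD term, grouped by the unit kind label."""
    lanes = defaultdict(int)
    for count, width, kind in TERM.findall(organization):
        lanes[kind] += int(count) * int(width)
    return dict(lanes)


def fp32_lanes(organization: str) -> int:
    """Lanes able to issue FP32, counting shared kinds like 'FP32/INT32'."""
    return sum(n for kind, n in lanes_per_core(organization).items()
               if "FP32" in kind.split("/"))


ada = ("4xSIMD16 (FP32) + 4xSIMD16 (FP32/INT32) + 4xSIMD4 (SFU) "
       "+ 4xMatrix ALU + 2xALU FP64")
turing = ("4xSIMD16 (FP32) + 4xSIMD16 (INT32) + 4xSIMD4 (SFU) "
          "+ 4xMatrix ALU + 2xALU FP64")

print(lanes_per_core(ada))
print(fp32_lanes(ada), fp32_lanes(turing))
```

With the strings from the list, Ada comes out at twice the FP32 lanes per SM of Turing, which matches the shared FP32/INT32 datapath described above.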
 

Sweepi

New Member
Joined
Feb 14, 2025
Messages
3 (0.05/day)
Maybe this is a better place for feedback for the GPU Database:

I could not find a newer feedback thread for the GPU database, and wasn't in the mood to send an email:

For Navi III GPUs (e.g. 7800XT), FP32 Performance is[1] calculated by the following formula:
FP32 = Shading Units * Boost Clock * 4
Examples:
7900 XTX: 6144 * 2498 * 4 = 61390848 -> 61.39 TFLOPS
7800 XT: 3840 * 2430 * 4 = 37324800 -> 37.32 TFLOPS

However, for the preliminary Navi IV entries (e.g. 9070XT), FP32 Performance is calculated by the following formula:
FP32 = Shading Units * Boost Clock * 2
9070 XT: 4096 * 2970 * 2 = 24330240 -> 24.33 TFLOPS

This seems like an error to me, or is this intended?


[1] just a guess, but the math checks out
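
To make the comparison easy to reproduce, here is a short Python sketch of the guessed formula. The per-unit factor is a parameter because that is exactly what differs between the two sets of entries; the function name is made up, and reading the factor 4 as 2 FLOPs per FMA times 2 for dual-issue is an assumption, not something the database states.

```python
def fp32_tflops_from_factor(shading_units: int, boost_mhz: float,
                            flops_per_unit_per_clock: int) -> float:
    """FP32 = shading units * boost clock * factor, converted from MFLOPS to TFLOPS."""
    return shading_units * boost_mhz * flops_per_unit_per_clock / 1e6


cards = [
    ("7900 XTX", 6144, 2498, 4),
    ("7800 XT", 3840, 2430, 4),
    ("9070 XT (as listed)", 4096, 2970, 2),
    ("9070 XT (with factor 4)", 4096, 2970, 4),
]

for name, units, clock, factor in cards:
    print(f"{name}: {fp32_tflops_from_factor(units, clock, factor):.2f} TFLOPS")
```

The last row only shows what the 9070 XT entry would read if the Navi III factor were applied; whether that is the right factor for Navi IV is the open question.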

Question: If I were to provide values for a 'Theoretical Tensor Performance' section (FP4/8/16, BF16, INT4/8), would that be of interest?
I have collected them here: https://ethercalc.net/ih5riaqsy7i1 . Please let me know if a different format or additional graphics cards (such as AMD or Quadro) would be more useful.

Question2: Should there be anything explaining the max FP32 to max INT32 (non-Tensor) performance ratio, like it is already done for FP64?
A table is worth 1000 words:
[Unfortunately German:
"reine" -> "pure/mere"
"Einheiten pro" -> "Units/Cores per"]
[Attached image: 1740383421506.png]


Maybe "SIMD organization" by @TRINITAS already covers Question2.
 
Joined
Nov 5, 2012
Messages
49 (0.01/day)
Location
France
[Attached image: 1740427527490.png]


Full info on RTX GPUs (I don't have info on Blackwell INT vector calculations yet) :)
 
Joined
May 8, 2016
Messages
2,019 (0.62/day)
System Name BOX
Processor Core i7 6950X @ 4,26GHz (1,28V)
Motherboard X99 SOC Champion (BIOS F23c + bifurcation mod)
Cooling Thermalright Venomous-X + 2x Delta 38mm PWM (Push-Pull)
Memory Patriot Viper Steel 4000MHz CL16 4x8GB (@3240MHz CL12.12.12.24 CR2T @ 1,48V)
Video Card(s) Titan V (~1650MHz @ 0.77V, HBM2 1GHz, Forced P2 state [OFF])
Storage WD SN850X 2TB + Samsung EVO 2TB (SATA) + Seagate Exos X20 20TB (4Kn mode)
Display(s) LG 27GP950-B
Case Fractal Design Meshify 2 XL
Audio Device(s) Motu M4 (audio interface) + ATH-A900Z + Behringer C-1
Power Supply Seasonic X-760 (760W)
Mouse Logitech RX-250
Keyboard HP KB-9970
Software Windows 10 Pro x64
@TRINITAS How does Volta (full GV100) fit into all of this?
 
Joined
Nov 5, 2012
Messages
49 (0.01/day)
Location
France
Spec | BLACKWELL (PRO) | BLACKWELL (RTX) | HOPPER | ADA | AMPERE (PRO) | AMPERE (RTX) | TURING (RTX) | VOLTA
Example chip | GB100 | GB202 | GH100 | AD102 | GA100 | GA102 | TU102 | GV100
Partitions | ? | 12 GPCs | 8 GPCs | 12 GPCs | 8 GPCs | 7 GPCs | 6 GPCs | 6 GPCs
Clusters | ? | 96 TPCs | 72 TPCs | 72 TPCs | 64 TPCs | 42 TPCs | 36 TPCs | 42 TPCs
Cores | ? | 192 SM | 144 SM | 144 SM | 128 SM | 84 SM | 72 SM | 84 SM
SIMD | ? | 4xSIMD32 (FP32/INT32) + 4xSIMD4 (SFU) + 2xFP64 | 4xSIMD32 (FP32) + 4xSIMD16 (INT32) + 4xSIMD4 (SFU) + 4xSIMD16 (FP64) | 4xSIMD16 (FP32) + 4xSIMD16 (FP32/INT32) + 4xSIMD4 (SFU) + 2xFP64 | 4xSIMD16 (FP32) + 4xSIMD16 (INT32) + 4xSIMD4 (SFU) + 4xSIMD8 (FP64) | 4xSIMD16 (FP32) + 4xSIMD16 (FP32/INT32) + 4xSIMD4 (SFU) + 2xFP64 | 4xSIMD16 (FP32) + 4xSIMD16 (INT32) + 4xSIMD4 (SFU) + 2xFP64 | 4xSIMD16 (FP32) + 4xSIMD16 (INT32) + 4xSIMD4 (SFU) + 4xSIMD8 (FP64)
Max ALU Vector | ? | 28032 (24576 FP32/INT32 + 3072 SFU + 384 FP64) | 39168 (18432 FP32 + 9216 INT32 + 2304 SFU + 9216 FP64) | 21024 (9216 FP32 + 9216 FP32/INT32 + 2304 SFU + 288 FP64) | 22528 (8192 FP32 + 8192 INT32 + 2048 SFU + 4096 FP64) | 12264 (5376 FP32 + 5376 FP32/INT32 + 1344 SFU + 168 FP64) | 10512 (4608 FP32 + 4608 INT32 + 1152 SFU + 144 FP64) | 14784 (5376 FP32 + 5376 INT32 + 1344 SFU + 2688 FP64)
Matrix ALU | ? | 768 Gen5 | 576 Gen4 | 576 Gen4 | 512 Gen3 | 336 Gen3 | 576 Gen2 | 672 Gen1
RTU | - | 192 Gen4 | - | 144 Gen3 | - | 84 Gen2 | 72 Gen1 | -
Scalar ALU | ? | 768 (4/SM) | 576 (4/SM) | 576 (4/SM) | 512 (4/SM) | 336 (4/SM) | 288 (4/SM) | 336 (4/SM)
Raster Engine | ? | 12 | 8 | 12 | 8 | 7 | 6 | 6
Tessellator | ? | 96 | 72 | 72 | 64 | 42 | 36 | 84
TMU | ? | 768 | 576 | 576 | 512 | 366 | 288 | 336
ROP | ? | 192 | 24 | 192 | 192 | 112 | 96 | 128
Clock max | ? | 2407 MHz | 1980 MHz | 2520 MHz | 1440 MHz | 1860 MHz | 1770 MHz | 1627 MHz
INT8 Vector | ? | 473.23 TOPs | | 185.79 TOPs | 94.37 TOPs | 79.99 TOPs | 65.25 TOPs | 69.97 TOPs
INT16 Vector | ? | ? | | ? | ? | ? | ? | ?
INT24 Vector | ? | ? | | 46.48 TOPs | 23.59 TOPs | 19.99 TOPs | 16.31 TOPs | 17.49 TOPs
INT32 Vector | ? | ? | | 46.48 TOPs | 23.59 TOPs | 19.99 TOPs | 16.31 TOPs | 17.49 TOPs
INT64 Vector | ? | ? | | 11.61 TOPs | 5.89 TOPs | 4.99 TOPs | 4.08 TOPs | 4.37 TOPs
BF16 Vector | ? | 118.30 TFLOPs | 145.98 TFLOPs | 92.89 TFLOPs or 46.48 TFLOPs | 47.18 TFLOPs | 39.99 TFLOPs or 19.99 TFLOPs | - | -
FP16 Vector | ? | 118.30 TFLOPs | 145.98 TFLOPs | 92.89 TFLOPs or 46.48 TFLOPs | 94.37 TFLOPs | 39.99 TFLOPs or 19.99 TFLOPs | 32.62 TFLOPs | 34.98 TFLOPs
FP32 Vector | ? | 118.30 TFLOPs | 72.99 TFLOPs | 92.89 TFLOPs or 46.48 TFLOPs | 23.59 TFLOPs | 39.99 TFLOPs or 19.99 TFLOPs | 16.31 TFLOPs | 17.49 TFLOPs
FP64 Vector | ? | 1.85 TFLOPs | 36.49 TFLOPs | 1.45 TFLOPs | 11.79 TFLOPs | 624.9 GFLOPs | 509.76 GFLOPs | 8.74 TFLOPs
Transcendental Vector | ? | 14.79 TFLOPs | 9.12 TFLOPs | 11.61 TFLOPs | 5.89 TFLOPs | 4.99 TFLOPs | 4.08 TFLOPs | 4.37 TFLOPs
INT4 Matrix (Sparsity) | ? | - | - | 1486.35 TOPs (2972.71 TOPs) | 1509.94 TOPs (3019.89 TOPs) | 639.95 TOPs (1279.91 TOPs) | 521.99 TOPs | -
INT8 Matrix (Sparsity) | ? | 946.47 TOPs (1892.94 TOPs) | - | 743.17 TOPs (1486.35 TOPs) | 754.97 TOPs (1509.94 TOPs) | 319.97 TOPs (639.95 TOPs) | 260.99 TOPs | -
FP4 w/ FP32 accumulate Matrix (Sparsity) | ? | 1892.94 TFLOPs (3785.88 TFLOPs) | - | - | - | - | - | -
FP8 w/ FP16 accumulate Matrix (Sparsity) | ? | 946.47 TFLOPs (1892.94 TFLOPs) | 1751.77 TFLOPs (3503.55 TFLOPs) | 743.17 TFLOPs (1486.35 TFLOPs) | - | - | - | -
FP8 w/ FP32 accumulate Matrix (Sparsity) | ? | 473.23 TFLOPs (946.47 TFLOPs) | 1751.77 TFLOPs (3503.55 TFLOPs) | 743.17 TFLOPs (1486.35 TFLOPs) | - | - | - | -
FP16 w/ FP16 accumulate Matrix (Sparsity) | ? | 473.23 TFLOPs (946.47 TFLOPs) | 875.88 TFLOPs (1751.77 TFLOPs) | 371.58 TFLOPs (743.17 TFLOPs) | 377.48 TFLOPs (754.97 TFLOPs) | 159.98 TFLOPs (319.97 TFLOPs) | 130.49 TFLOPs | -
FP16 w/ FP32 accumulate Matrix (Sparsity) | ? | 236.62 TFLOPs (473.23 TFLOPs) | 875.88 TFLOPs (1751.77 TFLOPs) | 185.79 TFLOPs (371.58 TFLOPs) | 377.48 TFLOPs (754.97 TFLOPs) | 79.99 TFLOPs (159.98 TFLOPs) | 130.49 TFLOPs | 139.94 TFLOPs
BF16 w/ FP32 accumulate Matrix (Sparsity) | ? | 236.62 TFLOPs (473.23 TFLOPs) | 875.88 TFLOPs (1751.77 TFLOPs) | 185.79 TFLOPs (371.58 TFLOPs) | 377.48 TFLOPs (754.97 TFLOPs) | 79.99 TFLOPs (159.98 TFLOPs) | - | -
TF32 Matrix (Sparsity) | ? | 118.30 TFLOPs (236.62 TFLOPs) | 437.94 TFLOPs (875.88 TFLOPs) | 92.89 TFLOPs (185.79 TFLOPs) | 188.74 TFLOPs (377.48 TFLOPs) | 39.99 TFLOPs (79.99 TFLOPs) | - | -
FP64 Matrix | ? | | 72.99 TFLOPs | - | 23.59 TFLOPs | - | - | -

Here :)

I added Volta, Ampere PRO and Hopper. For Blackwell B100, there is no information yet.
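
As a cross-check of the table, the "Max ALU Vector" and "FP32 Vector" rows can be recomputed from the SM count, the per-SM SIMD organization and the max clock. The Python sketch below assumes 2 FLOPs per FP32 lane per clock (one FMA) and uses the per-SM lane counts implied by the SIMD row. The `Chip` class and its field names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    name: str
    sm_count: int
    clock_mhz: float
    fp32_lanes: int        # per SM, including shared FP32/INT32 lanes
    int32_only_lanes: int  # per SM, lanes that only execute INT32
    sfu_lanes: int         # per SM
    fp64_lanes: int        # per SM

    def max_vector_alus(self) -> int:
        """Total vector lanes on the chip: SM count * lanes per SM."""
        per_sm = (self.fp32_lanes + self.int32_only_lanes
                  + self.sfu_lanes + self.fp64_lanes)
        return self.sm_count * per_sm

    def fp32_tflops(self) -> float:
        """Peak FP32 vector rate, assuming one FMA (2 FLOPs) per lane per clock."""
        return self.sm_count * self.fp32_lanes * 2 * self.clock_mhz / 1e6


# Lane counts read from the SIMD row: 4xSIMD32 = 128, 4xSIMD16 = 64, 4xSIMD4 = 16.
gb202 = Chip("GB202", 192, 2407, fp32_lanes=128, int32_only_lanes=0,
             sfu_lanes=16, fp64_lanes=2)
gv100 = Chip("GV100", 84, 1627, fp32_lanes=64, int32_only_lanes=64,
             sfu_lanes=16, fp64_lanes=32)

for chip in (gb202, gv100):
    print(chip.name, chip.max_vector_alus(), f"{chip.fp32_tflops():.3f} TFLOPs")
```

Both chips reproduce the lane totals in the table; the TFLOPs values match up to the last digit, which the table seems to truncate instead of round.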
 

T4C Fantasy

CPU & GPU DB Maintainer
Staff member
Joined
May 7, 2012
Messages
2,593 (0.55/day)
Location
Rhode Island
I love the tables. I have data tables with partial data in the first post under Graphics IP. Love the information; feel free to add on to it.


I need to revamp the L2 cache stat in the chip database and GPU database for Ada and Blackwell. Can you help me out?
 
Joined
Nov 5, 2012
Messages
49 (0.01/day)
Location
France
For Ada: 96 MB for AD102, 64 MB for AD103, 48 MB for AD104, 32 MB for AD106/107
For Blackwell RTX: 128 MB for GB202, 64 MB for GB203, 48 MB for GB205, 32 MB for GB206/207
For Blackwell GB100: no information.
 
Joined
May 8, 2016
Messages
2,019 (0.62/day)
Thank you!

Q: Shouldn't the Tessellator count be linked to the TPC count (42) instead of SMs (84) on Volta?
 
Joined
Nov 5, 2012
Messages
49 (0.01/day)
Location
France
Oh yes, sorry.
It's 42 indeed :)

Spec | CDNA4 | CDNA3 | CDNA2 | CDNA
Example chip | ? | AQUA VANJARAM | ALDEBARAN | ARCTURUS
Partitions | ? | 32 Shader Engines | 8 Shader Engines | 4 Shader Engines
Clusters | ? | - | - | -
Cores | ? | 320 CU | 240 CU | 128 CU
SIMD | ? | 4xSIMD16 (FP32/INT32) + 4xSIMD4 (SFU) | 4xSIMD16 (FP32/INT32) + 4xSIMD4 (SFU) | 4xSIMD16 (FP32/INT32) + 4xSIMD4 (SFU)
Max ALU Vector | ? | 25600 (20480 FP32/INT32 + 5120 SFU) | 19200 (15360 FP32/INT32 + 3840 SFU) | 10240 (8192 FP32/INT32 + 2048 SFU)
Matrix ALU | ? | 1280 Gen3 | 960 Gen2 | 512 Gen1
RTU | ? | - | - | -
Scalar ALU | ? | 320 (1/CU) | 240 (1/CU) | 128 (1/CU)
Raster Engine | ? | - | - | -
Tessellator | ? | - | - | -
TMU | ? | - | - | -
ROP | ? | - | - | -
Clock max | ? | 2100 MHz | 1700 MHz | 1500 MHz
INT4 Vector | ? | 344.06 TOPs | 208.89 TOPs | 98.30 TOPs
INT8 Vector | ? | 172.03 TOPs | 104.44 TOPs | 49.15 TOPs
INT16 Vector | ? | 172.03 TOPs | 104.44 TOPs | 49.15 TOPs
INT24 Vector | ? | 86.01 TOPs | 52.22 TOPs | 24.57 TOPs
INT32 Vector | ? | 86.01 TOPs | 52.22 TOPs | 24.57 TOPs
INT64 Vector | ? | 21.50 TOPs | 13.05 TOPs | 6.14 TOPs
BF16 Vector | ? | - | - | -
FP16 Vector (With Packed Math) | ? | 344.06 TFLOPs | 104.44 TFLOPs (208.89 TFLOPs) | 49.15 TFLOPs
FP32 Vector (With Packed Math) | ? | 172.03 TFLOPs | 52.22 TFLOPs (104.44 TFLOPs) | 24.57 TFLOPs
FP64 Vector | ? | 86.01 TFLOPs | 52.22 TFLOPs | 12.28 TFLOPs
Transcendental Vector | ? | 21.50 TFLOPs | 13.05 TFLOPs | 6.14 TFLOPs
INT4 Matrix (Sparsity) | ? | - | - | -
INT8 Matrix (Sparsity) | ? | 2752.51 TOPs (5505.02 TOPs) | 417.79 TOPs | -
FP4 w/ FP32 accumulate Matrix (Sparsity) | ? | - | - | -
FP8 w/ FP16 accumulate Matrix (Sparsity) | ? | 2752.51 TFLOPs (5505.02 TFLOPs) | - | -
FP8 w/ FP32 accumulate Matrix (Sparsity) | ? | 2752.51 TFLOPs (5505.02 TFLOPs) | - | -
FP16 w/ FP16 accumulate Matrix (Sparsity) | ? | 1376.25 TFLOPs (2752.51 TFLOPs) | 417.79 TFLOPs | 196.60 TFLOPs
FP16 w/ FP32 accumulate Matrix (Sparsity) | ? | 1376.25 TFLOPs (2752.51 TFLOPs) | 417.79 TFLOPs | 196.60 TFLOPs
BF16 w/ FP32 accumulate Matrix (Sparsity) | ? | 1376.25 TFLOPs (2752.51 TFLOPs) | 417.79 TFLOPs | 98.30 TFLOPs
FP32 Matrix (Sparsity) | ? | 172.03 TFLOPs | 104.44 TFLOPs | 49.15 TFLOPs
TF32 Matrix (Sparsity) | ? | 688.12 TFLOPs (1376.25 TFLOPs) | - | -
FP64 Matrix | ? | 172.03 TFLOPs | 104.44 TFLOPs | -

For AMD Instinct CDNA :)
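
The integer vector rows in the CDNA table appear to scale from the INT32 rate by fixed ratios (x4 for INT4, x2 for INT8 and INT16, x1 for INT24, x1/4 for INT64), and the sparsity values in parentheses are double the dense ones. Here is a rough Python sketch of that pattern. The ratios are read off the table, not taken from AMD documentation, and the names are hypothetical.

```python
# Ratios relative to the INT32 vector rate, as inferred from the table rows.
INT_RATIOS = {"INT4": 4.0, "INT8": 2.0, "INT16": 2.0,
              "INT24": 1.0, "INT32": 1.0, "INT64": 0.25}


def int_vector_tops(fp32_int32_lanes: int, clock_mhz: float) -> dict[str, float]:
    """Integer vector rates in TOPs, assuming 2 ops per lane per clock for INT32."""
    int32_tops = fp32_int32_lanes * 2 * clock_mhz / 1e6
    return {name: int32_tops * ratio for name, ratio in INT_RATIOS.items()}


def with_sparsity(dense: float) -> float:
    """Sparse matrix rate, assuming the 2x factor shown in parentheses in the table."""
    return dense * 2


# CDNA3 column: 20480 FP32/INT32 lanes at 2100 MHz.
cdna3 = int_vector_tops(20480, 2100)
for name, tops in cdna3.items():
    print(f"{name}: {tops:.3f} TOPs")

print(f"FP16 matrix with sparsity: {with_sparsity(1376.25):.2f} TFLOPs")
```

Small last-digit differences against the table are expected, since the table values look truncated.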
 
Joined
Oct 19, 2022
Messages
442 (0.49/day)
Location
Los Angeles, CA
Processor AMD Ryzen 7 9800X3D (+PBO 5.4GHz)
Motherboard MSI MPG X870E Carbon Wifi
Cooling ARCTIC Liquid Freezer II 280 A-RGB
Memory 2x32GB (64GB) G.Skill Trident Z Royal @ 6200MHz 1:1 (30-38-38-30)
Video Card(s) MSI GeForce RTX 4090 SUPRIM Liquid X
Storage Crucial T705 4TB (PCIe 5.0) w/ Heatsink + Samsung 990 PRO 2TB (PCIe 4.0) w/ Heatsink
Display(s) AORUS FO32U2P 4K QD-OLED 240Hz (DP 2.1 UHBR20 80Gbps)
Case CoolerMaster H500M (Mesh)
Audio Device(s) AKG N90Q w/ AudioQuest DragonFly Red (USB DAC)
Power Supply Seasonic Prime TX-1600 Noctua Edition (1600W 80Plus Titanium) ATX 3.1 & PCIe 5.1
Mouse Logitech G PRO X SUPERLIGHT
Keyboard Razer BlackWidow V3 Pro
Software Windows 10 64-bit
96 MB for AD102
128 MB for GB202

It's crazy how NVIDIA feels the need to shrink the L2 cache on all the new x90 variants! Those GPUs cost a fortune (even at MSRP), and they cheap out everywhere they can: L2 cache, GDDR6X/GDDR7 speeds (never the fastest grades), fewer shunt resistors than the 3090 Ti, etc.
 