
AMD "Vega 20" with 32 GB HBM2 3DMark 11 Score Surfaces

Joined
Nov 3, 2011
Messages
697 (0.14/day)
Location
Australia
System Name Eula
Processor AMD Ryzen 9 7900X PBO
Motherboard ASUS TUF Gaming X670E Plus Wifi
Cooling Corsair H150i Elite LCD XT White
Memory Trident Z5 Neo RGB DDR5-6000 64GB (4x16GB F5-6000J3038F16GX2-TZ5NR) EXPO II, OCCT Tested
Video Card(s) Gigabyte GeForce RTX 4080 GAMING OC
Storage Corsair MP600 XT NVMe 2TB, Samsung 980 Pro NVMe 2TB, Toshiba N300 10TB HDD, Seagate Ironwolf 4T HDD
Display(s) Acer Predator X32FP 32in 160Hz 4K FreeSync/GSync DP, LG 32UL950 32in 4K HDR FreeSync/G-Sync DP
Case Phanteks Eclipse P500A D-RGB White
Audio Device(s) Creative Sound Blaster Z
Power Supply Corsair HX1000 Platinum 1000W
Mouse SteelSeries Prime Pro Gaming Mouse
Keyboard SteelSeries Apex 5
Software MS Windows 11 Pro
That's not how it works, ALUs are ALUs. Vega 10 is GP100 class.
Wrong. The ALUs' main read/write operations go through the TMUs or the ROPs.

The compute shader path uses the TMUs' read/write functions.
The pixel shader path uses the ROPs' read/write functions.

Vega 64 has GP104-class classic GPU hardware. GP102 has more rasterization and ROP hardware than Vega 64. Vega GPUs gained ROPs connected to the L2 cache, as in the Maxwell/Pascal designs.

Vega 64 has GP102-class FP32 compute hardware.

Both Vega 64 and GP104 have quad rasterization/quad GPC units and 64 ROPs.
 
Joined
Jan 8, 2017
Messages
9,587 (3.27/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
The ALUs' main read/write operations go through the TMUs or the ROPs.

What are you talking about? Vertex, primitive and fragment processing all happen on the unified shaders, aka the FP32 ALUs. That's also where the kernels run if you use CUDA/OpenCL; a shader is the analog of a compute kernel, and they use exactly the same resources.

TMUs and ROPs are not, by any stretch of the imagination, the only classifiers as far as graphics processing goes. That's the reason they were decoupled from the graphics pipeline more than a decade ago, and now shaders far outnumber TMUs and ROPs. It used to be that for each shader you would have one ROP and one TMU, but that proved wasteful, and it turned out you can extract more performance by adding more shaders, since rasterization does not require the same scaling in terms of performance.

The ROP-to-shader ratio of the new RTX 2080 Ti is 1 to 49; that's how important they are to graphics performance.
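For anyone who wants to check that ratio, here is a minimal sketch of the arithmetic. The shader and ROP counts are the commonly published RTX 2080 Ti spec figures, not numbers taken from this thread, and the function name is just illustrative.

```python
# Assumed public spec figures for the RTX 2080 Ti (not stated in this thread).
SHADERS = 4352  # FP32 shader ALUs (CUDA cores)
ROPS = 88       # render output units


def shaders_per_rop(shaders: int, rops: int) -> float:
    """Return how many shader ALUs there are for each ROP."""
    return shaders / rops


ratio = shaders_per_rop(SHADERS, ROPS)
print(f"1 ROP per {ratio:.1f} shaders")
```

Rounded down, that gives the "1 to 49" quoted above.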

Seriously, ROPs and TMUs are the only hardware that dictates traditional GPU performance? I feel like I would have heard that 20 years ago.
 
Joined
Nov 3, 2011
Messages
697 (0.14/day)
Location
Australia
System Name Eula
Processor AMD Ryzen 9 7900X PBO
Motherboard ASUS TUF Gaming X670E Plus Wifi
Cooling Corsair H150i Elite LCD XT White
Memory Trident Z5 Neo RGB DDR5-6000 64GB (4x16GB F5-6000J3038F16GX2-TZ5NR) EXPO II, OCCT Tested
Video Card(s) Gigabyte GeForce RTX 4080 GAMING OC
Storage Corsair MP600 XT NVMe 2TB, Samsung 980 Pro NVMe 2TB, Toshiba N300 10TB HDD, Seagate Ironwolf 4T HDD
Display(s) Acer Predator X32FP 32in 160Hz 4K FreeSync/GSync DP, LG 32UL950 32in 4K HDR FreeSync/G-Sync DP
Case Phanteks Eclipse P500A D-RGB White
Audio Device(s) Creative Sound Blaster Z
Power Supply Corsair HX1000 Platinum 1000W
Mouse SteelSeries Prime Pro Gaming Mouse
Keyboard SteelSeries Apex 5
Software MS Windows 11 Pro
What are you talking about? Vertex, primitive and fragment processing all happen on the unified shaders, aka the FP32 ALUs. That's also where the kernels run if you use CUDA/OpenCL; a shader is the analog of a compute kernel, and they use exactly the same resources.

TMUs and ROPs are not, by any stretch of the imagination, the only classifiers as far as graphics processing goes. That's the reason they were decoupled from the graphics pipeline more than a decade ago, and now shaders far outnumber TMUs and ROPs. It used to be that for each shader you would have one ROP and one TMU, but that proved wasteful, and it turned out you can extract more performance by adding more shaders, since rasterization does not require the same scaling in terms of performance.

The ROP-to-shader ratio of the new RTX 2080 Ti is 1 to 49; that's how important they are to graphics performance.

Seriously, ROPs and TMUs are the only hardware that dictates traditional GPU performance? I feel like I would have heard that 20 years ago.
That's an argument from ignorance. OpenCL/compute shaders don't use the ROPs as their primary read/write units. Note why AMD is pushing for compute-shader-based optimizations.

GTX 1080 Ti has more rasterization (6 GPCs) and ROP (88 ROPs at 1600 MHz to 1800 MHz) power than Vega 64, which has the equivalent of 4 GPCs and 64 ROPs at 1550 MHz to 1600 MHz.
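To put rough numbers on that gap, here is a back-of-the-envelope sketch of theoretical peak pixel fill rate, using the ROP counts and clock ranges quoted above. It assumes one pixel per ROP per clock and ignores everything else that limits real throughput (rasterizer rate, memory bandwidth, blending, MSAA), so treat it as an upper bound, not a benchmark. The function and dictionary names are made up for illustration.

```python
def peak_pixel_rate_gpix(rops: int, clock_mhz: float, pixels_per_rop_clk: int = 1) -> float:
    """Theoretical peak fill rate in gigapixels per second.

    Assumes every ROP writes `pixels_per_rop_clk` pixels each clock cycle.
    """
    return rops * pixels_per_rop_clk * clock_mhz / 1000.0


# (ROP count, low clock in MHz, high clock in MHz), as quoted in the post above.
CARDS = {
    "GTX 1080 Ti": (88, 1600, 1800),
    "Vega 64": (64, 1550, 1600),
}

fill_rates = {
    name: (peak_pixel_rate_gpix(rops, low), peak_pixel_rate_gpix(rops, high))
    for name, (rops, low, high) in CARDS.items()
}

for name, (low_rate, high_rate) in fill_rates.items():
    print(f"{name}: {low_rate:.1f} to {high_rate:.1f} Gpix/s")
```

Under those assumptions the GTX 1080 Ti comes out at roughly 141 to 158 Gpix/s against roughly 99 to 102 Gpix/s for Vega 64.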

With AMD GPUs, the pixel shader and compute shader read/write paths are different, i.e. common ALUs with different read/write paths. AMD's pixel shader read/write path is bottlenecked by the 64 ROP units.
RTG changed their CU/TMU vs ROP ratio with the recent Vega M GH, i.e. 24 CUs (96 TMUs) vs 64 ROPs (64 pix/clk).
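A quick sketch of how far that ratio moved. The 4 TMUs per CU figure follows from the 24 CU / 96 TMU numbers above; the 64 CU count for Vega 64 is an assumption based on its public specs, and the helper name is hypothetical.

```python
TMUS_PER_CU = 4  # implied by 24 CUs -> 96 TMUs in the post above


def tmu_to_rop_ratio(cus: int, rops: int, tmus_per_cu: int = TMUS_PER_CU) -> float:
    """Return the number of TMUs per ROP for a given CU and ROP count."""
    return cus * tmus_per_cu / rops


# The 24 CU part with 64 ROPs described above.
vega_m_gh = tmu_to_rop_ratio(cus=24, rops=64)

# Vega 64, assuming 64 CUs (public spec, not stated in this thread) and 64 ROPs.
vega_64 = tmu_to_rop_ratio(cus=64, rops=64)

print(f"24 CU part: {vega_m_gh:.2f} TMUs per ROP")
print(f"Vega 64:    {vega_64:.2f} TMUs per ROP")
```

So the smaller part has far more ROP throughput available per CU/TMU than Vega 64 does.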


[Attached images: Vega Final Presentation-33.png, Vega RenderL2.jpg]

NVIDIA Maxwell/Pascal GPUs still have split TMU and ROP read/write paths with common shader ALUs.

The RTX cores in the RTX 2070/2080/2080 Ti mark the return of non-unified shader/compute cores.
 