Running CUDA apps disables delta color compression.
I concur; however, I was pointing out that the IMC has fewer consequences in a TBR and L2-ROP design. AMD would certainly be able to clock the GPU higher if they integrated TBR, but most of Nvidia's advantage comes from read:write amplification through TBR, not from frequency alone. They can only write 616GB/s, yes, but setup occurs in reference to texture reads at 1.5TB/s.
NVIDIA Maxwell/Pascal/Turing GPUs don't have PowerVR's "deferred tile render", but they do have immediate-mode tile cache rendering.
For my GTX 1080 Ti and 980 Ti GPUs, I can increase L2 cache bandwidth with an overclock.
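A rough sketch of why that follows, assuming the L2 cache runs in the core clock domain so its peak bandwidth is just bytes per clock times clock speed. The `BYTES_PER_CLOCK` width and both clock values are made-up illustration numbers, not published figures for either card.

```python
def l2_bandwidth_gb_s(core_clock_mhz, bytes_per_clock):
    """Peak bandwidth in GB/s for a cache clocked together with the core."""
    return core_clock_mhz * 1e6 * bytes_per_clock / 1e9


# Hypothetical aggregate L2 width in bytes per clock (assumption, not a spec).
BYTES_PER_CLOCK = 1024

stock = l2_bandwidth_gb_s(1500, BYTES_PER_CLOCK)        # hypothetical stock clock
overclocked = l2_bandwidth_gb_s(1800, BYTES_PER_CLOCK)  # hypothetical overclock

# Under this model the bandwidth gain equals the clock gain (20% here),
# while external memory bandwidth is untouched.
print(stock, overclocked, overclocked / stock)
```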
Vega 56 at a higher clock speed still gains performance without any increase in memory bandwidth, and Vega's ROPs have a multi-MB L2 cache connection like the Maxwell/Pascal ROP designs.
The fastest Turing GPU with 64 ROPs for the Radeon VII to rival would be the RTX 2080.
Battlefield series games are well known for software tiled compute render techniques, which get the most out of older AMD GCN GPUs that connect the L2 cache to the TMUs.
For the Vega architecture, from AMD's white paper (https://radeon.com/_downloads/vega-whitepaper-11.6.17.pdf):
Vega uses a relatively small number of tiles, and it operates on primitive batches of limited size compared with those used in previous tile-based rendering architectures. This setup keeps the costs associated with clipping and sorting manageable for complex scenes while delivering most of the performance and efficiency benefits.
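To make the "small number of tiles, limited batch size" idea concrete, here is a minimal sketch of binning a stream of triangles into screen tiles one batch at a time. It is not AMD's implementation: `bin_batch`, `bin_stream`, the tile size and the batch size are all hypothetical, and it uses conservative bounding boxes with non-negative screen coordinates.

```python
def bin_batch(triangles, tile_size):
    """Map each tile (tx, ty) to the batch-local indices of triangles
    whose bounding box touches that tile (conservative binning)."""
    bins = {}
    for index, tri in enumerate(triangles):
        xs = [x for x, _ in tri]
        ys = [y for _, y in tri]
        tx0, tx1 = int(min(xs)) // tile_size, int(max(xs)) // tile_size
        ty0, ty1 = int(min(ys)) // tile_size, int(max(ys)) // tile_size
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                bins.setdefault((tx, ty), []).append(index)
    return bins


def bin_stream(triangles, tile_size, batch_size):
    """Bin the draw stream in fixed-size batches, so sorting cost is
    bounded by batch_size rather than by the whole scene."""
    for start in range(0, len(triangles), batch_size):
        yield bin_batch(triangles[start:start + batch_size], tile_size)


# Three small triangles in pixel coordinates (illustrative data).
stream = [
    [(0, 0), (10, 0), (0, 10)],
    [(60, 60), (70, 60), (60, 70)],
    [(30, 30), (40, 30), (30, 40)],
]
batches = list(bin_stream(stream, tile_size=32, batch_size=2))
for number, bins in enumerate(batches):
    print(number, sorted(bins.items()))
```

Each batch is binned and rasterized on its own, which is the difference from a full-scene tiler that has to sort every primitive in the frame before shading anything.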
AMD Vega Whitepaper:
The Draw-Stream Binning Rasterizer (DSBR) is an important innovation to highlight. It has been designed to reduce unnecessary processing and data transfer on the GPU, which helps both to boost performance and to reduce power consumption. The idea was to combine the benefits of a technique already widely used in handheld graphics products (tiled rendering) with the benefits of immediate-mode rendering used in high-performance PC graphics.
Pixel shading can also be deferred until an entire batch has been processed, so that only visible foreground pixels need to be shaded. This deferred step can be disabled selectively for batches that contain polygons with transparency. Deferred shading reduces unnecessary work by reducing overdraw (i.e., cases where pixel shaders are executed multiple times when different polygons overlap a single screen pixel).
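The overdraw saving described there can be illustrated with a toy count of pixel shader invocations. This is a sketch under stated assumptions, not the DSBR hardware logic: the fragment tuples, `shade_immediate` and `shade_deferred` are hypothetical, all fragments are opaque, and a smaller depth value means closer to the camera.

```python
def shade_immediate(fragments):
    """Shade every fragment that passes the depth test at arrival time."""
    depth = {}
    shaded = 0
    for x, y, z, _prim in fragments:
        if z < depth.get((x, y), float("inf")):
            depth[(x, y)] = z
            shaded += 1  # shader runs even if a closer fragment arrives later
    return shaded


def shade_deferred(fragments):
    """Resolve visibility for the whole batch first, then shade once per pixel."""
    nearest = {}
    for x, y, z, prim in fragments:
        if (x, y) not in nearest or z < nearest[(x, y)][0]:
            nearest[(x, y)] = (z, prim)
    return len(nearest)


# One batch as (x, y, depth, primitive id), submitted back to front.
batch = [
    (0, 0, 0.9, 0),
    (0, 0, 0.5, 1),
    (0, 0, 0.1, 2),
    (1, 0, 0.3, 1),
]
print(shade_immediate(batch), shade_deferred(batch))
```

With back-to-front submission the immediate path shades pixel (0, 0) three times, while the deferred path shades each covered pixel once. A batch containing transparent polygons would need the immediate path, which matches the selective disable the white paper mentions.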
PowerVR's deferred tile rendering is heavily patented.