This release got kneecapped by NVIDIA cheaping out on the memory setup and leaning on DLSS to paper over it.
GDDR6 now comes in 2GB-per-chip (16Gb) densities, at higher speeds than the older 1GB (8Gb) modules.
That means you can hit the same VRAM capacity with half as many chips and half the complexity... but since every chip brings its own 32-bit channel, half the chips also means half the bus width.
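To put quick numbers on that, here's a back-of-the-napkin sketch in Python - the 32-bit channel per chip is standard GDDR6, and the two configs are the cards discussed below:

```python
# Each GDDR6 chip hangs off its own 32-bit channel, so total bus width
# is just chip count x 32, while capacity is chip count x density.
def memory_config(chips: int, gb_per_chip: int) -> tuple[int, int]:
    """Return (capacity in GB, bus width in bits) for a chip layout."""
    return chips * gb_per_chip, chips * 32

print(memory_config(chips=8, gb_per_chip=1))  # (8, 256) - eight old 1GB chips
print(memory_config(chips=4, gb_per_chip=2))  # (8, 128) - four new 2GB chips: same capacity, half the bus
```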
Energy efficiency: sure, this came out great.
But did it come out great because the new silicon is actually that good? Or because last gen was volted to the moon (yes), and because moving to higher-capacity VRAM modules let them massively cut back the memory bus (also yes)?
The 3090 Ti used a ton more power than the 3090, yet was also more efficient.
[Attached: power draw and efficiency charts, snipped from TPU's 3090 Ti review]
But it's more efficient? How, asks the voice in my head.
[Attached chart]
Why, little voicey... because it uses half as many memory modules, so more of that power budget goes to the GPU instead of the VRAM.
The difference on the Ti is that each memory chip has twice the capacity, so only half as many are required - which is also why there are no longer memory chips on the back side of the card.
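A rough sketch of why that matters under a fixed board power limit - the per-chip wattage here is purely an illustrative assumption, not a measured figure:

```python
# Illustrative numbers only: under a fixed board power limit, every
# watt the VRAM doesn't draw is a watt the GPU core can spend on clocks.
BOARD_LIMIT_W = 450       # 3090 Ti reference board power
WATTS_PER_CHIP = 2.5      # assumed GDDR6X draw per chip - a guess for illustration

for name, chips in [("3090 (24x 1GB)", 24), ("3090 Ti (12x 2GB)", 12)]:
    vram_w = chips * WATTS_PER_CHIP
    print(f"{name}: ~{vram_w:.0f}W on VRAM, ~{BOARD_LIMIT_W - vram_w:.0f}W left for the core")
```

Same board limit, ~30W freed up just by halving the chip count.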
During the ETH mining boom, my 3090 would hit its 375W limit with the GPU clocked at just 1.4GHz - way down from the 2.1GHz it boosts to when the RAM is barely in use.
Undervolting the GPU let me reach 1.6GHz at 370W, so it's easy to imagine the 3090 Ti's higher-capacity VRAM letting it clock even higher with no other changes.
These new GPUs follow the same pattern:
3060 Ti: 256-bit bus, 8x GDDR6 modules
4060 Ti: 128-bit bus, 4x GDDR6 modules
It's almost like halving the number of memory chips freed up wattage for the GPU - which massively boosts efficiency, but hurts performance in bandwidth-intensive situations.
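The bandwidth cost is simple arithmetic - bus width in bytes times per-pin data rate (the chip speeds here are the stock specs, 14Gbps vs 18Gbps):

```python
def bandwidth_gb_s(bus_bits: int, gbps_per_pin: float) -> float:
    """Peak memory bandwidth: (bus width in bytes) x (data rate per pin)."""
    return bus_bits / 8 * gbps_per_pin

print(bandwidth_gb_s(256, 14.0))  # 448.0 GB/s - 3060 Ti
print(bandwidth_gb_s(128, 18.0))  # 288.0 GB/s - 4060 Ti: faster chips, still ~36% less bandwidth
```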
At lower resolutions these changes don't really bite, and the cards look pretty good.
[Attached: performance chart at lower resolutions]
But at 4K it's worse than the card it replaced. Every VRAM-heavy title will behave this way, and the gap gets bigger if you aren't running a top-of-the-line system like TPU's review rig with its stupid fast DDR5-6000 CL36 and PCIe 5.0 - you need every nanosecond of latency savings to feed the card new data, since it no longer has the raw bandwidth to move that data across quickly.
[Attached: 4K performance chart]
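For a sense of scale on the "feeding it data" problem, here's a rough sketch - the Gen4 x8 link is the 4060 Ti's actual interface, while the spill size is a made-up number for illustration:

```python
# If the working set spills past VRAM, the overflow streams over PCIe,
# which is an order of magnitude slower than the card's own memory.
PCIE4_GB_S_PER_LANE = 1.97          # 16 GT/s with 128b/130b encoding
pcie_bw = PCIE4_GB_S_PER_LANE * 8   # 4060 Ti runs a Gen4 x8 link
vram_bw = 288.0                     # GB/s, from the bandwidth math above

spill_mb = 256                      # hypothetical per-frame overflow
ms = spill_mb / 1024 / pcie_bw * 1000
print(f"PCIe 4.0 x8: ~{pcie_bw:.1f} GB/s ({vram_bw / pcie_bw:.0f}x slower than VRAM)")
print(f"Streaming {spill_mb}MB over the bus takes ~{ms:.1f}ms - nearly a whole 16.7ms frame at 60fps")
```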
But who cares, right? DLSS saves the day by rendering at a lower resolution.
Edited in: I didn't even notice an 8GB variant existed - that's how little I cared about these cards.
The 8GB and 16GB variants sit on the same 128-bit bus - 4x 2GB chips on the 8GB card, doubled up to 8x 2GB in clamshell mode on the 16GB one - so capacity doubles while bandwidth doesn't move an inch.
Power consumption barely changed between the two, because the memory interface is unchanged.
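Quick sketch of why the 16GB card doesn't buy you any bandwidth, assuming the clamshell layout above (two chips sharing each 32-bit channel):

```python
# Clamshell hangs two chips off each 32-bit channel: capacity doubles,
# but the channel count - and therefore the bus width - stays put.
def clamshell_config(chips: int, gb_per_chip: int, chips_per_channel: int):
    capacity = chips * gb_per_chip
    bus_bits = chips // chips_per_channel * 32
    return capacity, bus_bits

print(clamshell_config(4, 2, chips_per_channel=1))  # (8, 128)  - 4060 Ti 8GB
print(clamshell_config(8, 2, chips_per_channel=2))  # (16, 128) - 4060 Ti 16GB: same 288 GB/s
```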
The 3060 Ti needs a hecking lot more power - and roughly 15W of that is just the extra RAM chips. Route that 15W to the GPU instead and it makes a world of difference.
[Attached: power consumption chart]
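Even treating performance as identical, shaving memory power off a fixed workload shows up directly as "efficiency" - illustrative numbers only:

```python
# Same fps, ~15W less board power spent on RAM -> free perf-per-watt gain.
fps = 100.0                      # made-up frame rate, held constant
for board_w in (200.0, 185.0):   # with vs without the extra RAM's ~15W
    print(f"{board_w:.0f}W board power: {fps / board_w:.3f} fps/W")
# ~8% better efficiency without the GPU getting any faster at all
```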
Throw in the smaller gains from a newer GPU design (even if it's little more than a refresh), a less complex PCB with half the memory traces, and cheaper VRMs - all of that SHOULD have added up to a much cheaper card that was good value for money... and it wasn't. If you game at higher resolutions, you're paying more for less.