You know, it's amazing. People in this forum are falling over themselves saying how the 3070 should have had 16 GB at launch. Now, if NVIDIA had launched a 16 GB 3070, with the associated $80+ price increase to cover the additional memory cost, the same people would REEEE about the price and how it's too expensive and how NVIDIA is da big evil guy.
Allocation.
Is.
Not.
Utilization.
We need that clapping hand emoji.
Unless you are seeing consistent microstuttering at higher framerates, you are not running out of VRAM. So far nobody has displayed that with TW3.
Allocation may not be utilization, but we've reached a point where games in general are beginning to no longer run adequately on 8 GB GPUs or 16 GB RAM PCs. People who make that argument often overlook or reject the concept of memory pressure. As physical memory nears exhaustion, data will first be compressed (which costs CPU cycles, but should still be manageable in most cases) and then, in order of priority, swapped onto slower memory tiers (whenever available) until it reaches storage, which is comparably glacial even on high-speed NVMe SSDs.
Apple offers an easy way to read memory pressure in macOS's Activity Monitor, but Microsoft has yet to do something like this on Windows. A dead giveaway that you are short on RAM is when you begin to see the Compressed Memory figure rise (ideally you want this at 0 MB), and you'll be practically out of RAM when your commit charge exceeds your physical RAM capacity. This is also the reason why you will never see RAM usage maxed out in the Windows Task Manager: it will attempt to conserve about 1 GB of RAM for emergency use at all times. This leads many people with the mindset of "I paid for 16 GB of RAM and use 16 GB of RAM I shall" to think that their computers are doing OK and that they aren't short at all.
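To make those two warning signs concrete, here's a rough sketch in Python. The function name and the status labels are made up for illustration, and it assumes you read the three figures off Task Manager by hand; it doesn't query Windows itself.

```python
def ram_status(physical_gb, commit_charge_gb, compressed_mb):
    """Classify memory pressure using the two rules of thumb from this post.

    All three inputs are assumed to be read manually from Task Manager;
    this is a hypothetical helper, not a Windows API.
    """
    # Commit charge above physical RAM: the system is leaning on the page file.
    if commit_charge_gb > physical_gb:
        return "practically out of RAM"
    # Any compressed memory at all means the OS has started squeezing pages.
    if compressed_mb > 0:
        return "running short"
    return "ok"


if __name__ == "__main__":
    # Example figures, invented for illustration only.
    print(ram_status(physical_gb=16, commit_charge_gb=12.5, compressed_mb=0))
    print(ram_status(physical_gb=16, commit_charge_gb=14.0, compressed_mb=850))
    print(ram_status(physical_gb=16, commit_charge_gb=19.2, compressed_mb=2100))
```

The commit charge check comes first on purpose: once you're past physical capacity, the compressed figure doesn't tell you anything new.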
A similar concept applies to GPU memory allocation on Windows. As you may know by now, Windows doesn't treat memory as absolute values, but rather as an abstract concept of addressable pages, with the MB values reported by the OS being more estimates than a precise, accurate metric. Normally, the WDDM graphics driver will allocate the physical memory present in the graphics adapter plus up to 50% of system RAM, so for example, if you have a 24 GB GPU plus 32 GB of RAM, you will have a maximum of 24 + 16 = around 40 GB of addressable GPU memory.
This means that an 8 GB GPU such as the RTX 3070 on a PC with 32 GB of RAM actually has around 8 + 16 = 24 GB of addressable memory. However, at that point, the graphics subsystem is no longer interested in performance but rather in preventing crashes, as it's fighting for resources demanded by programs in main memory. Run games that reasonably demand a GPU with that much dedicated memory to begin with, and you can see where this is going, fast.
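If you want to plug in your own numbers, the rule of thumb above (dedicated VRAM plus up to 50% of system RAM) works out like this. It's a back-of-the-envelope sketch: the function name is made up, and the 0.5 default is just the "up to 50%" figure from this post, not something read from the driver.

```python
def addressable_gpu_memory_gb(dedicated_vram_gb, system_ram_gb, shared_fraction=0.5):
    """Rough upper bound on GPU-addressable memory under the rule of thumb above.

    shared_fraction is the assumed share of system RAM the driver may borrow
    (50% per this post); real systems may report slightly different values.
    """
    return dedicated_vram_gb + system_ram_gb * shared_fraction


if __name__ == "__main__":
    print(addressable_gpu_memory_gb(24, 32))  # 24 GB card, 32 GB RAM -> 40.0
    print(addressable_gpu_memory_gb(8, 32))   # 8 GB card, 32 GB RAM -> 24.0
```

Keep in mind that only the first term is fast memory. Everything past the dedicated VRAM is system RAM reached over PCIe, which is why hitting it hurts so much.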
I believe we may be seeing this symptom in Hardware Unboxed's video, in The Last of Us, where the computer is attempting to conserve memory at all costs.
This is, of course, my personal understanding of things. Don't take it as gospel, I might be talking rubbish - but one thing's for sure, by ensuring that I always have more RAM than an application demands, I have dodged that performance problem for many years now.