I guess I was just caught up in the idea of more cache, more better.
There's this knee-jerk reaction you see on the Internet, that companies have their engineers neuter their products for various reasons (depending on what you read, someone will have that opinion about any company). That leaves an impression and can bias your thinking. And while I'm sure in some cases it's true, what usually happens is that engineers come up with something on the drawing board and that something gets adjusted because of a) real-world testing and b) production/supply realities.
In particular "the more cache, the better" was never true. Because the more cache, the higher the latency. Correctly sizing a cache for an architecture is so complicated that, as you can see, landed us with 4 levels of cache, each still able to put more performance on the table. Without the latency problem, we'd have one huge 1st level cache (kidding, that's not possible, for different reasons).