The only 'facts' I could find were your opinions on how they could have doubled the core count? Let's dissect:
So you want them to remove the ray accelerators from the CUs and put them on the backend as a dedicated block? That's expensive in die area, since you now need to physically map and wire out all their supporting cache twice instead of giving them the shared data cache. Also, BVH traversal is a shader-parallel operation, so why would you remove the ray accelerators from the CU and have to store ray data in a secondary location to fetch halfway through? Just to increase latency for funsies? It makes no sense.
You want to create a new clock domain ONLY for the ray accelerators within the shader engine, and keep the shader engine clock domain linked to the front end. Sounds great, but those ray accelerators are with utmost certainty NOT what drives up core power draw. So pushing them off onto their own domain isn't going to save you any power. You'll go back to having the problem RDNA2 has where the front end is outrun by the CUs, and entire units stall for multiple cycles waiting for wavefronts.
The shaders are underclocked to save power. I guarantee you when reviews come out, people will attempt to overclock the shader engines and they will see performance scale with clocks, as well as significant increases in load power draw. The front end has been left alone because it needs the extra bandwidth to feed the CUs; this so far is true.
Double the I$ to improve ray tracing on the backend (ray tracing happens in the CUs, not on the backend) without adding shaders to each unit? What? The ray accelerators reside within the CU, so how would they add more ray accelerators without adding more shaders? Empty CUs? Oh, wait, I think you're back on that idea of removing the ray accelerators from the CUs and grouping them on the backend again. We've been over this.
Correct, and they more than doubled the die size in turn. The beauty in this comparison is that the 5700 XT was pushed to the absolute limit of what it could endure, and a properly tuned (980mV-1050mV) 5700 XT runs more like 170W. Unsurprisingly, if you look at the 6900 XT, it's tuned for that exact voltage range and uses almost exactly 2x the power of the 5700 XT at the same voltage.
[Attachments 269003 and 269004: 5700 XT reference on the left, 6900 XT reference on the right.]
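To put a rough model behind the "same voltage, about 2x the power" point, here is a minimal sketch. It assumes the textbook CMOS approximation that dynamic power scales with switched capacitance times voltage squared times clock, and that switched capacitance scales linearly with the number of active CUs. The function name and the sample ratios are illustrative, not measurements, and leakage, memory and board power are ignored.

```python
def relative_dynamic_power(cu_ratio: float, voltage_ratio: float, clock_ratio: float) -> float:
    """Estimate new power / old power under P ~ N_cu * V^2 * f.

    All three arguments are ratios of new to old. This is a first-order
    approximation only (assumption: capacitance scales with CU count).
    """
    return cu_ratio * voltage_ratio ** 2 * clock_ratio


if __name__ == "__main__":
    # 40 CUs -> 80 CUs at the same voltage and the same clock.
    print(relative_dynamic_power(cu_ratio=80 / 40, voltage_ratio=1.0, clock_ratio=1.0))

    # Hypothetical undervolt: same CU count and clock, 10% lower voltage.
    print(relative_dynamic_power(cu_ratio=1.0, voltage_ratio=0.9, clock_ratio=1.0))
```

Under those assumptions, doubling the CU count at a fixed voltage and clock doubles dynamic power, while voltage enters squared, which is why the tuning range matters so much for the comparison.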
The front end CAN feed the shader engines because it runs at higher clocks. Also, doubled RT cores? Where did you read that? Navi 31's ray accelerators were only increased by 50% according to the top post detailing the block diagram, and AMD themselves don't even mention a full 2x performance increase of the cores. Where did you get "double" from?
The reason there has only been a 20% increase in TMU count is that there has only been a 20% increase in CU count, from 80 to 96. It's the same reason the TMU count doubled from the 5700 XT to the 6900 XT: they went from 40 to 80 CUs. That's how the architecture is laid out, I'm sorry you don't like it?
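The TMU arithmetic can be checked with a quick sketch. It assumes 4 TMUs per CU, which matches the published TMU counts for these three cards (160, 320 and 384); the helper names are made up for illustration.

```python
TMUS_PER_CU = 4  # assumption: fixed ratio, consistent with published specs for these cards


def tmu_count(cu_count: int, tmus_per_cu: int = TMUS_PER_CU) -> int:
    """TMUs are part of each CU, so the total scales linearly with CU count."""
    return cu_count * tmus_per_cu


def percent_increase(old: int, new: int) -> float:
    """Percentage increase going from old to new."""
    return (new - old) * 100 / old


if __name__ == "__main__":
    for cus in (40, 80, 96):
        print(cus, "CUs ->", tmu_count(cus), "TMUs")

    # 40 -> 80 CUs doubles the TMU count; 80 -> 96 CUs adds 20%.
    print(percent_increase(tmu_count(40), tmu_count(80)))
    print(percent_increase(tmu_count(80), tmu_count(96)))
```

Because the per-CU ratio is fixed, the TMU percentage increase is always identical to the CU percentage increase.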