Sorry, but what are these numbers you're working with? Are you inventing them out of thin air? And what's the relation between the different numbers? You also seem to be mixing up power and performance. Remember, performance (clocks) and power do not scale linearly, and any interconnect will consume power. You're making this out to be far simpler than it is. Other than that, all you're really saying here seems to be the age-old truism that wide and slow chips are generally more efficient. And, of course, you're completely ignoring the cost of using two dice to deliver the performance of one.
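To put rough numbers on the "wide and slow" point, here's a minimal sketch using the textbook dynamic power approximation (power proportional to capacitance times voltage squared times frequency). Everything in it is an assumption for illustration: the linear voltage/clock relationship, the 0.05 interconnect figure, and the function names are made up, and leakage is ignored entirely. Real V/f curves are chip-specific.

```python
def relative_dynamic_power(units: float, clock: float, voltage: float) -> float:
    """Dynamic power relative to a baseline of 1 unit at clock 1.0, voltage 1.0.

    Textbook approximation: P ~ C * V^2 * f, with switched capacitance taken
    as proportional to the number of active units. Leakage is ignored.
    """
    return units * voltage ** 2 * clock


def voltage_for_clock(clock: float) -> float:
    # ASSUMPTION: voltage scales linearly with clock. Real chips have a
    # voltage floor and a nonlinear V/f curve, so this overstates the gains.
    return clock


def config_power(units: float, clock: float, interconnect: float = 0.0) -> float:
    # interconnect is a flat, made-up overhead for die-to-die links.
    return relative_dynamic_power(units, clock, voltage_for_clock(clock)) + interconnect


# Both configurations have the same nominal throughput (units * clock = 1.0).
narrow_fast = config_power(units=1.0, clock=1.0)
wide_slow = config_power(units=2.0, clock=0.5, interconnect=0.05)  # 0.05 is hypothetical

print(f"narrow and fast: {narrow_fast:.2f}")
print(f"wide and slow:   {wide_slow:.2f}")
```

Under these idealized assumptions the wide configuration draws far less power for the same nominal throughput, but only because it has twice the silicon and half the clock. Keep the total area and clocks the same and the only thing left is the added interconnect term.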
What? A 1/6th (16.67%) area reduction from a node change will be 16.67% no matter how large your die is, and no matter how many dice you combine. A percentage reduction in area doesn't add up as you add parts together - that number is relative, not absolute.
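A quick arithmetic check of this, as a sketch. The 240 mm² die size is a made-up figure, not a real product; only the 1/6th reduction comes from the discussion above.

```python
def shrunk_area(area_mm2: float, reduction: float) -> float:
    """Area after a node shrink that removes `reduction` (a fraction) of it."""
    return area_mm2 * (1.0 - reduction)


def fractional_saving(before: float, after: float) -> float:
    """Fraction of the original area that was saved."""
    return (before - after) / before


REDUCTION = 1 / 6      # the 16.67% node shrink
DIE_MM2 = 240.0        # hypothetical die size, for illustration only

# One die.
one_before = DIE_MM2
one_after = shrunk_area(DIE_MM2, REDUCTION)

# Two identical dice on one package.
two_before = 2 * DIE_MM2
two_after = 2 * shrunk_area(DIE_MM2, REDUCTION)

print(f"one die:  {fractional_saving(one_before, one_after):.2%} saved")
print(f"two dice: {fractional_saving(two_before, two_after):.2%} saved")
```

The absolute area saved doubles with two dice, but so does the starting area, so the fraction stays at 1/6th either way.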
It's absolutely possible that an MCM approach can allow for power savings, but only if it allows for larger total die sizes and lower clocks. Otherwise it's no different from a monolithic die, except for the added interconnect power. And, of course, larger dice are themselves a fundamental problem when per-transistor costs are no longer dropping noticeably, which is leading to rapidly rising chip prices.
Again, this isn't accurate. A GPU die has its heat spread very evenly across the entire die (unlike CPUs, where heat is concentrated in small hotspots), as most of the die is compute cores. Spreading this across two dice won't affect thermals much, as both dice will still be connected to the same cooler - it's not like you're running them independently of each other. Assuming the same power draw and total area for a monolithic and an MCM solution, the thermal difference between the two will be minimal. And, crucially, you want the distance between dice on the package to be as small as possible to keep latencies low.
Fans generally run directly off 12V and don't rely on VRMs on the GPU, just a fan controller IC sending out PWM signals (unless the fans are for some reason controlled through voltage, which is rather unlikely).
Idk, I think the truth is somewhere in the middle. Both chips have distinct qualities and deficiencies. The 6800 is fantastically efficient; the 6700 XT gets a lot of performance out of a relatively small die. Now, the 6700 XT is indeed rather poor in terms of efficiency for an RDNA2 chip, but it still beats out the majority of Ampere GPUs, so ... meh. (The 6500 XT is another matter entirely.)
I still can't wrap my head around AMD's RDNA2 segmentation though. The 16-32-40-80 CU lineup just doesn't make sense IMO, and kind of forced them to tune the 6700 XT the way they did. 20-32-48-80 or something like that would have made a lot more sense. It's also weird just how few SKUs Navi 22 has been used in overall.