It won't be. It's not on the CPUs, so why would this be different?
CPUs use an IF interconnect through a PCB substrate. This uses a non-IF interconnect through an active (presumably silicon) interposer. It would thus be different because it's using different technology. The use of an interposer is likely to allow for far higher bandwidth, lower latency, and better signal integrity.
Is there a good reason for them to use the same design on the GPU as the CPU, or are they just stroking themselves because it seems to be working so well on the CPUs?
They aren't. See above. Also, CPU cores are individually visible to the system; GPU cores are hidden behind one monolithic "device". This seems to be continuing the latter, just across several chiplets, not implementing a form of mGPU (several GPU devices visible to the system).
Is there actually an advantage to this on the GPU? Why is monolithic GPU design suddenly cack? Or is everyone gonna just go hoorah for AMD as Intel are now the "underdog"?
Advantage? Well, current GPUs are really expensive to manufacture, as they are huge high-performance, high-power dice on cutting-edge production nodes. Navi 21 with 80 CUs is ~520mm² on 7nm, making it very, very expensive to produce. Now imagine next-gen 100-120 CU GPUs on 5nm. Sure, density will increase, but you'll be talking about far more transistors and still a huge area. Splitting that into, for example, three ~250mm² dice would dramatically lower production costs, while likely allowing for higher overall CU counts (as you get more functional parts with a lower risk of defects). Smaller dice also let you utilize more of the wafer, further increasing yields. Chiplets are undoubtedly the way of the future for all high-performance computing; it's just a question of getting them to work as well as monolithic parts.
An example - these die sizes are made up, and I'm assuming linear CU scaling per area, which is a bit on the optimistic side but reasonably representative. The yield rate is based on TSMC's published yield data of 0.09 defects/cm²; the yield calculations are from here.
- A 15x15mm die (225mm²) fits 256 dice per 300mm wafer with ~46 (partially or fully) defective dice.
- A 15x30mm die (450mm²) fits 116 dice per 300mm wafer with ~38 (partially or fully) defective dice.
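If anyone wants to sanity-check those defect counts, here's a rough Python sketch. I'm assuming a simple Poisson yield model (yield = exp(-area × defect density)); the calculator linked above may use a slightly different model, so expect small rounding differences.

```python
# Rough check of the defect numbers above, assuming a Poisson yield model.
# The dice-per-wafer figures are taken from the example, not recomputed.
from math import exp

D0 = 0.09 / 100  # 0.09 defects/cm² converted to defects/mm²

for name, area_mm2, dice_per_wafer in [("15x15mm", 225, 256), ("15x30mm", 450, 116)]:
    yield_rate = exp(-area_mm2 * D0)               # fraction of defect-free dice
    defective = dice_per_wafer * (1 - yield_rate)  # expected defective dice per wafer
    print(f"{name}: {yield_rate:.1%} defect-free -> ~{defective:.0f} defective out of {dice_per_wafer}")
```

That prints ~47 defective out of 256 for the small die and ~39 out of 116 for the large one, which lines up with the ~46/~38 figures above give or take rounding.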
In other words, halving the die size increases dice per wafer by 120%, while defective dice only increase by 21%. You're left with far more fault-free silicon, giving a lot more flexibility in binning and product segmentation. If 10% of error-free dice hit the top clock speed/voltage bin, that's ~7.8 dice per wafer for the larger die or ~21 dice per wafer for the smaller die. If two of the smaller dice can be combined to work as the equivalent of one of the larger, you then have ~2.7 more usable flagship GPUs per wafer, or a ~35% increase in yields for the top bin, and far more flexibility with the remainder, since it can be used for either 1- or 2-die configurations. That gives a much wider possible range of configurations and thus a higher chance of utilizing faulty dice too (a >50% cut-down SKU of a large die is immensely wasteful and extremely unlikely to happen, after all).
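To spell that binning math out (the good-die counts come straight from the wafer example above, and the 10% top-bin rate is just an assumption for illustration):

```python
# Binning math: good dice per wafer from the example (256 - 46 and 116 - 38),
# with an assumed 10% of good dice hitting the top clock/voltage bin.
good_small, good_large = 256 - 46, 116 - 38  # 210 and 78 defect-free dice per wafer

top_bin = 0.10
top_small = good_small * top_bin             # ~21 top-bin small dice per wafer
top_large = good_large * top_bin             # ~7.8 top-bin large dice per wafer

flagships_small = top_small / 2              # two small dice make one flagship GPU
flagships_large = top_large                  # one large die makes one flagship GPU

print(f"Flagship GPUs per wafer: {flagships_small:.1f} (2x small die) vs {flagships_large:.1f} (1x large die)")
print(f"That's {flagships_small - flagships_large:.1f} more per wafer, "
      f"a {flagships_small / flagships_large - 1:.0%} increase")
```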
Also, cost goes way down. Assuming a per-wafer cost of $20,000, the smaller die ends up at ~$95/die (not counting defective dice), while the larger die ends up at ~$256/die (same). Taking into account the higher likelihood of being able to salvage defective dice from the smaller design, prices come down further: if 50% of defective small dice can be used vs. 30% of defective large dice, that's ~$86 vs. ~$224.
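And the cost arithmetic, with the $20,000 wafer cost being a round-number assumption rather than an actual TSMC quote:

```python
# Per-die cost with an assumed $20,000 wafer; the 50%/30% salvage rates for
# defective dice are the same assumptions as in the paragraph above.
wafer_cost = 20_000
good_small, defective_small = 210, 46
good_large, defective_large = 78, 38

small_cost = wafer_cost / good_small                                     # ~$95
small_cost_salvaged = wafer_cost / (good_small + 0.5 * defective_small)  # ~$86
large_cost = wafer_cost / good_large                                     # ~$256
large_cost_salvaged = wafer_cost / (good_large + 0.3 * defective_large)  # ~$224

print(f"Small die: ${small_cost:.0f}/die, ${small_cost_salvaged:.0f}/die with salvage")
print(f"Large die: ${large_cost:.0f}/die, ${large_cost_salvaged:.0f}/die with salvage")
```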
Nobody is saying monolithic GPU designs are cack, but we're reaching the practical and economic upper limits of monolithic GPU dice. If we want better performance at even remotely accessible prices in the future, we need new ways of making these chips.