The AnandTech article starts off describing them as exclusive, but then in a separate section below it mirrors what NVIDIA says. Pretty shoddy journalism, that.
Did you consider that Nvidia is fairly open about what they are doing with their cards technically? That means even an objective story will either use their slides or reproduce imagery and text that ends up closely following the Nvidia marketing line. This is true not only for Nvidia, by the way.
Keep in mind that all GPU architectures have INT32 units for addressing memory. The only thing unique about Turing is that they're directly addressable. What I find very interesting about that PNG you referenced is how the INT32 units aren't very tasked when the RT core is enabled but are when it is not. Obviously they're doing a lot of RT operations in INT32, which raises the question: is the RT core really just a dense integer ASIC with intersection detection? Integer math would explain the apparent performance boost from such a tiny part of the silicon. It would also explain why Radeon Rays has much lower performance: it uses FP32 or FP16 (Vega) math. And it would explain why RTX has such a bad noise problem: their rays are imprecise.
Considering all of this, it's impossible to know what approach AMD will take with DXR. NVIDIA is cutting so many corners and AMD has never been a fan of doing that. I think it's entirely possible AMD will just ramp up the FP16 capabilities and forgo exposing the INT32 addressability. I don't know that they'll do it via tensor cores though. AMD has always been in favor of bringing sledgehammers to fistfights. Why? Because a crapload of FP16 units can do all sorts of things. Tensor cores and RT cores are fixed function.
- AGUs (address generation units) usually support fewer operations than Turing's INT32 units, on top of not being directly exposed.
- The same slide clearly shows INT32 units being intermittently tasked throughout the frame. RT is computation heavy and is fairly lenient about what type of compute is used, so INT32 cores are more effective than usual. Note that FP compute is also very heavy and consistent during the RT part of the frame.
- The RT core is a dense, specialized ASIC. According to Nvidia (and at least indirectly confirmed by devs and by the operations exposed in APIs), RT cores do ray-triangle intersection and BVH traversal.
- RT is not only INT work; it involves both INT and FP. The share of each depends on a bunch of things: the algorithm, which part of the RT is being done, etc. RT cores in Turing are more specialized than simply generic INT compute. That is actually very visible empirically in the same frame rendering comparison.
- Radeon Rays has selectable precision. FP16 is implemented for it because it gives a very significant speed increase over FP32. In terms of RTRT (or otherwise quick RT), precision has little meaning when rays are sparse and are denoised anyway. The denoising algorithm, along with ray placement, plays a much larger role here.
- As for AMD's approach, this is not easy to say. The short-term solution would be Radeon Rays implemented for DXR. When and if AMD wants to come out with that is in question, but I suppose the answer is when it is inevitable. Today, AMD has no reason to get into this, as DXR and RTRT are too new, with too few games/demos. This matches what they have said, along with the fact that AMD only has Vegas that are likely to be effective enough for it (RX5x0 lacks RPM - FP16). Long term - I am speculating here, but I am willing to bet that AMD will also do an implementation with specialized hardware.
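
For anyone wondering what "ray-triangle intersection and BVH traversal" boils down to, here is a minimal Python sketch of the two tests involved: the Möller-Trumbore ray-triangle test and the slab test against an axis-aligned bounding box (the check done at each BVH node). This is only an illustration of the math, not a description of how Turing's RT cores implement it; the function names `ray_triangle` and `ray_aabb` and the `eps` tolerance are my own assumptions. Note that both tests as written here are floating-point work.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle(origin: Vec3, direction: Vec3,
                 v0: Vec3, v1: Vec3, v2: Vec3,
                 eps: float = 1e-9) -> Optional[float]:
    """Moller-Trumbore test: distance t along the ray, or None on a miss."""
    e1 = sub(v1, v0)
    e2 = sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:          # ray is parallel to the triangle plane
        return None
    inv_det = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv_det     # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv_det
    return t if t > eps else None    # ignore hits behind the origin


def ray_aabb(origin: Vec3, direction: Vec3, lo: Vec3, hi: Vec3) -> bool:
    """Slab test: does the ray (t >= 0) overlap the box [lo, hi]?"""
    t_near, t_far = 0.0, float("inf")
    for axis in range(3):
        if direction[axis] == 0.0:
            # Ray is parallel to this slab: it must start inside it.
            if origin[axis] < lo[axis] or origin[axis] > hi[axis]:
                return False
            continue
        t0 = (lo[axis] - origin[axis]) / direction[axis]
        t1 = (hi[axis] - origin[axis]) / direction[axis]
        if t0 > t1:
            t0, t1 = t1, t0
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return False
    return True


if __name__ == "__main__":
    tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
    print(ray_triangle((0.25, 0.25, 1.0), (0.0, 0.0, -1.0), *tri))
    print(ray_aabb((0.25, 0.25, 1.0), (0.0, 0.0, -1.0),
                   (0.0, 0.0, 0.0), (1.0, 1.0, 0.5)))
```

A real traversal loop would wrap `ray_aabb` in a stack-based walk over node indices, which is where the integer bookkeeping comes in; the intersection math itself stays in floating point.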