Monday, May 13th 2024
AMD RDNA 5 a "Clean Sheet" Graphics Architecture, RDNA 4 Merely Corrects a Bug Over RDNA 3
AMD's future RDNA 5 graphics architecture will bear a "clean sheet" design, and may not even carry the RDNA branding, says WJM47196, a source of AMD leaks on ChipHell. Two generations ahead of the current RDNA 3 architecture powering the Radeon RX 7000 series discrete GPUs, RDNA 5 could see AMD reimagine the GPU and its key components, much in the same way RDNA did over the former "Vega" architecture, bringing in a significant performance/Watt jump, which AMD went on to build upon with its successful RDNA 2 powered Radeon RX 6000 series.
Performance per Watt is the biggest metric on which a generation of GPUs can be assessed, and analysts believe that RDNA 3 missed the mark with generational gains in performance/Watt despite the switch to the advanced 5 nm EUV process from 7 nm DUV. AMD's decision to disaggregate the GPU, with some of its components being built on the older 6 nm node, may have also impacted the performance/Watt curve. The leaker also makes a sensational claim that "Navi 31" was originally supposed to feature 192 MB of Infinity Cache, which would have meant 32 MB segments of it per memory cache die (MCD). The company instead went with 16 MB per MCD, or just 96 MB per GPU, which only gets reduced as AMD segmented the RX 7900 XT and RX 7900 GRE by disabling one or two MCDs.
The upcoming RDNA 4 architecture will correct some of the glaring component-level problems causing the performance/Watt curve to waver on RDNA 3, and the top RDNA 4 part could end up with performance comparable to the current RX 7900 series, while being from a segment lower, and a smaller GPU overall. In case you missed it, AMD will not make a big GPU that succeeds "Navi 31" and "Navi 21" for the RDNA 4 generation, but will rather focus on the performance segment, offering more bang for the buck well under the $800 mark, so it could claw back some market share from NVIDIA in the performance, mid-range, and mainstream product segments. While it remains to be seen if RDNA 5 will get AMD back into the enthusiast segment, it is expected to bring a significant gain in performance due to the re-architected design.
One rumored aspect of RDNA 4 that even this source agrees with is that AMD is working to significantly improve its performance with ray tracing workloads by redesigning its hardware. RDNA 3 builds on the Ray Accelerator component AMD introduced with RDNA 2, with certain optimizations yielding a 50% generational improvement in ray testing and intersection performance. RDNA 4 could see AMD put more of the ray tracing workload through fixed-function accelerators, unburdening the shader engines. This significant improvement in ray tracing performance, performance/Watt improvements at an architectural level, and the switch to a newer foundry node such as 4 nm or 3 nm, are how AMD ends up with a new generation on its hands.
AMD is expected to unveil RDNA 4 this year, and if we're lucky, we might see a teaser at Computex 2024 next month.
Sources:
wjm47196 (ChipHell), VideoCardz
169 Comments on AMD RDNA 5 a "Clean Sheet" Graphics Architecture, RDNA 4 Merely Corrects a Bug Over RDNA 3
If I had to shoot blindly, I'd shoot at around 400 USD for a finished, packaged 4090 FE. Well, if it provides a sufficient lead, efficiency or feature set, it definitely will.
Hard to tell what's going to happen this gen although I do agree it's likely AMD and Nvidia price around each other again instead of competing. Intel is another wildcard as well, might have some presence in the midrange if they get a decent uArch out the door.
A quick Google search shows Nvidia has 29,600 employees, while AMD has 26k employees and Intel has 120k employees (that's some impressive revenue generated per employee at Nvidia). This means Nvidia's software development cost should not be that much higher than AMD's.
TL;DR: Nvidia can maintain super high margins because they use everything effectively.
It's really complicated.
10k USD for 7 nm
16k USD for 5 nm
At 70% yield, each AD102 should cost ~260 USD
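For anyone who wants to check that ~260 USD figure, here is a rough sketch of the arithmetic. It assumes the 16k USD number is a price per 300 mm wafer and that AD102 is about 608.5 mm² (a commonly cited die size that is not stated in this thread). It also uses a standard dies-per-wafer approximation instead of a real defect-density yield model, so treat the function names and inputs as illustrative only.

```python
import math


def gross_dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300.0) -> int:
    """Approximate candidate dies per wafer: area ratio minus an edge-loss term."""
    radius = wafer_diameter_mm / 2
    area_term = math.pi * radius ** 2 / die_area_mm2
    edge_term = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(area_term - edge_term)


def cost_per_good_die(wafer_price_usd: float, die_area_mm2: float, yield_fraction: float) -> float:
    """Spread the wafer price over the dies expected to pass at the given yield."""
    good_dies = gross_dies_per_wafer(die_area_mm2) * yield_fraction
    return wafer_price_usd / good_dies


# Assumptions: per-wafer price and yield from the comment above; die area from public listings.
WAFER_PRICE_5NM_USD = 16_000
AD102_AREA_MM2 = 608.5
YIELD = 0.70

estimate = cost_per_good_die(WAFER_PRICE_5NM_USD, AD102_AREA_MM2, YIELD)
print(f"Estimated silicon cost per good AD102: {estimate:.0f} USD")
```

With these inputs the result comes out a little under 260 USD, so the figure above is consistent with this simple model. It covers raw silicon only, not packaging, testing, or partially defective dies sold as cut-down SKUs.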
2. Like I said previously, they can use older processes with higher IPC architectures in order to offset the transistor count deficit linked to using an older process.
And still, nothing stops them from making a second revision of Navi 31 as a larger ~700 mm^2 monolithic die, passing as much of the cost as possible on to gamers, and then running the profit margin at or below zero, like they have already done with the consoles.
$120 for the PCB and all components
$300 for the chip
$150 for the memory
$60 for the cooler
$20 for the packaging
$5 accessories
$15 shipping
=$670
Obviously, this is manufacturer specific, as the larger ones will get these items cheaper, and we don't know what the relationship is between nGreedia and the OEMs, and how much they charge to supply a GPU chip. But we can easily assume nGreedia charge at least double to the OEMs. So $1100 could well be it for the OEMs, as they probably get the PCB, components and memory far cheaper than I stated.
90 PCB and all components
200 the chip
100 the memory
60 the cooler
10 the packaging
10 accessories
15 shipping
=$485
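The two bill-of-materials guesses above can be tallied side by side with a short script. This is only a sketch: the dollar figures are copied from the two comments, while the dictionary names and the "chip share" comparison are added here for illustration.

```python
# Line items copied from the two estimates in this thread (USD).
first_estimate = {
    "PCB and all components": 120,
    "chip": 300,
    "memory": 150,
    "cooler": 60,
    "packaging": 20,
    "accessories": 5,
    "shipping": 15,
}
second_estimate = {
    "PCB and all components": 90,
    "chip": 200,
    "memory": 100,
    "cooler": 60,
    "packaging": 10,
    "accessories": 10,
    "shipping": 15,
}


def total(bom: dict) -> int:
    """Sum every line item in a bill of materials."""
    return sum(bom.values())


def chip_share(bom: dict) -> float:
    """Fraction of the total that goes to the GPU chip itself."""
    return bom["chip"] / total(bom)


for name, bom in (("first", first_estimate), ("second", second_estimate)):
    print(f"{name}: total ${total(bom)}, chip share {chip_share(bom):.0%}")
print(f"difference: ${total(first_estimate) - total(second_estimate)}")
```

Both totals match the sums posted above ($670 and $485). In either estimate the chip is the largest single line item, so the final number is most sensitive to what one assumes the GPU die costs.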
I'd love to be wrong but Nvidia are charging what people will pay and people will continue to pay $2000 for a 4090 regardless of what else is out there. Nvidia are not going to, say, undermine their own profits on their current highest-margin part just because they've developed an even more profitable one! That's basic free-market capitalism, which is Nvidia's bible.
Also I guess AI isn't that sensitive, because their products have tons of HBM memory, which mitigates part of the issue. Indeed. They already have so much wealth that even if the bubble burst today, they would be able to calmly sip their drinks while having a warm bath. They just want to increase the margins even more, while they can.
Also, OpenAI recently got some HW from JHH, so I doubt they are that "Open" after all. Not to mention the data sellout to MS, etc. If the AI guys want any progress, they should do something really independent, as the cartel lobby has already been established. True. But if you recall events that old, you can also see that these lower nodes were always the bread and butter, at least for AMD, and for nVidia until the Ada generation. There's nothing wrong with having simpler SKUs, made from lower-end chips on cheaper, stable nodes. Heck, even nVidia managed to produce and sell tens of millions of hot garbage chips on Samsung's dog-shit 8 nm (10 nm) node. Not for 20 years, but if the older, less refined node doesn't hinder performance and power efficiency, then IMHO it's quite a viable solution. It's better to sell more of something like the 7600 on N6 than to make a few expensive, broken top-end chips on the finest node that nobody would want to buy. Why not? At least for AMD, it's still relevant, since they've hit an invisible wall/threshold in their GPU architecture, where the node doesn't bring an advantage anymore. At least for current products. I would even dare to say that if AMD had made the entire Radeon RX 7000 series monolithic and on 6 nm, it would have been more viable than the broken 5 nm MCM. And it would have bought them time to fix and refine their MCM approach, so that RDNA 4 would have been bug-free.
This is especially essential in view of the current horrible situation with TSMC allocations, where all the top nodes were completely consumed by Apple and nVidia with its "AI" oriented chips. So, e.g., making something decent that is not sensitive to the older nodes.
Don't get me wrong. I'm all for stopping manufacturers from farting out broken, inferior chips and products for the sake of profits. Especially since it takes a lot of materials and resources, which could otherwise be put into more advanced, more stable and more powerful products. But there should be some middle ground.
At least some portion of "inferior" older N6 etc. products could be made, for reasonable prices, just to meet the demand for a temporary solution. So many people are sitting on ancient HW that needs to be replaced, but hold off on the purchase, as only overpriced and pointless products fill the market. Everyone would. But that won't happen anywhere soon. There's a reason why margins are about 60% for nVidia, and were for AMD until recently.
They won't disclose it as it would shatter their "premium" brand image, which they both managed to maintain despite being called out for their shenanigans. It has happened many times that nVidia ended up cheaping out on the design while still asking a huge premium. Until both nVidia's and AMD's reputation, public image and blind following shatter, nothing will change. I guess nVidia won't make it "cheaper"/sell it for the same price. They made it perfectly clear about five years ago that they would stack their newer and more powerful solutions above previous-gen stuff, while keeping the price of the previous gen. Since the newer ones are greater, they are more expensive. I can't find the reference, but AFAIR it was during the RTX inception. Regular consumers: no. But there are a lot of crypto-substitutes, AKA "AI", that would gladly buy any compute power for any money. As would the dumb rich folks and YT influencers, who would create a public image of "acceptable". Sadly, MCM won't go anywhere, since it means higher profit margins for AMD. They would do anything to keep it this way. Although it is still cheaper to produce, that doesn't show up in the final price.
For large silicon products, chiplets are a must-have, as the cost savings, increased yields, ability to modularize your product (the benefits of which could be its own subject alone), and better binning (you bin out of every chiplet you make and not just a specific SKU), among other things, provide a revolution in chip design.
Interposers will also prove a very fruitful field for performance gains in the near future as latency, bandwidth, and energy usage, among other factors, are improved over time, enabling more chiplets at lower latencies using less energy to transmit data between said chiplets.
2. Older processes would throw efficiency out of the window. Do you want a 400-450+ Watt 7900 XT? I don't.
Edit: Maybe AMD saw that chiplets aren't good for a GPU, yet, so they put the project on hold until they figure things out (speculation).