AMD RDNA3 Offers Over 50% Perf/Watt Uplift Akin to RDNA2 vs. RDNA; RDNA4 Announced

Joined
Jul 21, 2016
Messages
144 (0.05/day)
Processor AMD Ryzen 5 5600
Motherboard MSI B450 Tomahawk
Cooling Alpenföhn Brocken 3 140mm
Memory Patriot Viper 4 - DDR4 3400 MHz 2x8 GB
Video Card(s) Radeon RX460 2 GB
Storage Samsung 970 EVO PLUS 500, Samsung 860 500 GB, 2x Western Digital RED 4 TB
Display(s) Dell UltraSharp U2312HM
Case be quiet! Pure Base 500 + Noiseblocker NB-eLoop B12 + 2x ARCTIC P14
Audio Device(s) Creative Sound Blaster ZxR,
Power Supply Seasonic Focus GX-650
Mouse Logitech G305
Keyboard Lenovo USB
I'd like that Navi 33, the question is what will be their price.. because that will be the deciding factor for most of us.
 
Joined
Feb 20, 2019
Messages
8,339 (3.91/day)
System Name Bragging Rights
Processor Atom Z3735F 1.33GHz
Motherboard It has no markings but it's green
Cooling No, it's a 2.2W processor
Memory 2GB DDR3L-1333
Video Card(s) Gen7 Intel HD (4EU @ 311MHz)
Storage 32GB eMMC and 128GB Sandisk Extreme U3
Display(s) 10" IPS 1280x800 60Hz
Case Veddha T2
Audio Device(s) Apparently, yes
Power Supply Samsung 18W 5V fast-charger
Mouse MX Anywhere 2
Keyboard Logitech MX Keys (not Cherry MX at all)
VR HMD Samsung Oddyssey, not that I'd plug it into this though....
Software W10 21H1, barely
Benchmark Scores I once clocked a Celeron-300A to 564MHz on an Abit BE6 and it scored over 9000.
I do not trust manufacturers using this metric.

The TDP is arbitrary at a level they set. I can get 50% performance/Watt gains from any AMD GPU I've used in the last half decade simply by tuning it less aggressively. Polaris, Vega, and Navi all reached stratospheric TDPs but dial the power consumption back by 40% and you still had ~90% of the performance. Voila, instant 50% perf/W gain by messing with a couple of sliders and pressing "apply"

Ampere and Turing are similar, I cram a 3600XT and RTX 3060 into a tiny cramped HTPC case and want to keep them quiet. The CPU has 30W cut from its PPT using PBO and the GPU power limit is set to 75% in Afterburner. The underclocked result gets over 90% of the original performance and that's a tiny price to pay for near-silence at full load.
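To make the arithmetic above explicit, here is a minimal sketch. The function name is made up for illustration, and the inputs are only the rough figures quoted in this post (about 90% of the performance at 60% or 75% of the power), not measurements.

```python
def perf_per_watt_gain(perf_fraction, power_fraction):
    """Relative perf/W after tuning, compared to stock (1.0 means unchanged)."""
    return perf_fraction / power_fraction


# Power cut by 40%, roughly 90% of the performance kept (figures from the post).
print(f"{perf_per_watt_gain(0.90, 0.60):.2f}x perf/W")  # 1.50x

# Power limit at 75%, roughly 90% of the performance kept (the HTPC example).
print(f"{perf_per_watt_gain(0.90, 0.75):.2f}x perf/W")  # 1.20x
```

The HTPC case lands closer to a 20% gain, since the power cut there is smaller.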
 

ARF

Joined
Jan 28, 2020
Messages
4,670 (2.61/day)
Location
Ex-usa | slava the trolls
I'd like that Navi 33, the question is what will be their price.. because that will be the deciding factor for most of us.

If lucky, around 300 bucks, if not, 450-500 bucks.
 
Joined
Sep 17, 2014
Messages
22,673 (6.05/day)
Location
The Washing Machine
System Name Tiny the White Yeti
Processor 7800X3D
Motherboard MSI MAG Mortar b650m wifi
Cooling CPU: Thermalright Peerless Assassin / Case: Phanteks T30-120 x3
Memory 32GB Corsair Vengeance 30CL6000
Video Card(s) ASRock RX7900XT Phantom Gaming
Storage Lexar NM790 4TB + Samsung 850 EVO 1TB + Samsung 980 1TB + Crucial BX100 250GB
Display(s) Gigabyte G34QWC (3440x1440)
Case Lian Li A3 mATX White
Audio Device(s) Harman Kardon AVR137 + 2.1
Power Supply EVGA Supernova G2 750W
Mouse Steelseries Aerox 5
Keyboard Lenovo Thinkpad Trackpoint II
VR HMD HD 420 - Green Edition ;)
Software W11 IoT Enterprise LTSC
Benchmark Scores Over 9000
Just pretty much every rumor out there about performance some are crazy like 2.5x and 3x..... It was leaked ages ago that it would be MCM that turned out to be true so I guess we will see. Similar rumors about the 4090 being 1.8x to 2x over the 3090.

Myeah... rumor mill and Youtube hype.
Ampere was x2 as well. But oh yeah, only best case with RT because performance is now half as abysmal.

Let's stay sane. 50% is a realistic gen-to-gen leap, and a pretty big one already at that. We used to be happy with 30% on the same tier.
 
Joined
May 31, 2016
Messages
4,440 (1.42/day)
Location
Currently Norway
System Name Bro2
Processor Ryzen 5800X
Motherboard Gigabyte X570 Aorus Elite
Cooling Corsair h115i pro rgb
Memory 32GB G.Skill Flare X 3200 CL14 @3800Mhz CL16
Video Card(s) Powercolor 6900 XT Red Devil 1.1v@2400Mhz
Storage M.2 Samsung 970 Evo Plus 500MB/ Samsung 860 Evo 1TB
Display(s) LG 27UD69 UHD / LG 27GN950
Case Fractal Design G
Audio Device(s) Realtec 5.1
Power Supply Seasonic 750W GOLD
Mouse Logitech G402
Keyboard Logitech slim
Software Windows 10 64 bit
What? Of course it's MCM. MCM = multi chip module, i.e. a single package with more than one piece of silicon on it. Chiplet = a single piece of silicon that works together with others on the same package to form a single "chip". If AMD is saying RDNA3 uses "Advanced chiplet packaging", they are saying RDNA3 GPUs are MCM. In this context, the two are synonymous.

Heck, the article even says as much (even if the sentence is a bit garbled):
Whether that's one processing die, two, or more is currently unknown, but regardless of that it will be MCM. And, crucially, once you're disaggregating the die, the difference between running one and several processing dice is far smaller than going from a monolithic design. Given that VRAM and PCIe are on the IOD, all chiplets will have equal access to the same data - and if IC is on the IOD as has been speculated, this will also ensure a fully coherent and very fast cache between processing chiplets, ensuring that they don't need to wait on each other for data like in ordinary mGPU setups.
Yes we all know that but there is a slight difference here. Unlike zen arch the chiplets do not contain cores on each chiplet. Only one has the cores.
Nobody doubts the goodness of the chiplet approach, bro.
 

ARF

Joined
Jan 28, 2020
Messages
4,670 (2.61/day)
Location
Ex-usa | slava the trolls
Yes we all know that but there is a slight difference here. Unlike zen arch the chiplets do not contain cores on each chiplet. Only one has the cores.
Nobody doubts the goodness of the chiplet approach, bro.

Yeah, MCM but more like 3D stacking rather than multi-GPU which is generally understood with MCM.
 
D

Deleted member 185088

Guest
Myeah... rumor mill and Youtube hype.
Ampere was x2 as well. But oh yeah, only best case with RT because performance is now half as abysmal.

Let's stay sane. 50% is a realistic gen-to-gen leap, and a pretty big one already at that. We used to be happy with 30% on the same tier.
nVidia's comparisons are awfully misleading with no context, massive RT improvement from our unnecessary tensor cores:
Pascal: 2 FPS
Turing: 10 FPS
Ampere: 15 FPS
But the worst part is those so-called reviewers accepting and even praising nVidia.
 
Joined
Dec 5, 2020
Messages
203 (0.14/day)
Looking at Nvidia's near zero performance/Watt uplift from Turing to Ampere, then the new 1.21 Jiggawatt Lovelace architecture, I'm happy with any sort of efficiency increase at this point.
The problem is that they're using a suboptimal node and their memory config is bad for efficiency. Saying near zero performance/Watt uplift is wrong, as it varies quite a bit depending on the SKU. For example, when it comes to performance/Watt the 3060Ti is at the top of the chart (although it can vary somewhat depending on which games you test), while the 3080 is worse than Turing.

 

ARF

Joined
Jan 28, 2020
Messages
4,670 (2.61/day)
Location
Ex-usa | slava the trolls
^^^^^ The graph is inaccurate since it doesn't count the cards on the underclocking merits but rather uses the wrong overvolted factory settings.

I do not trust manufacturers using this metric.

The TDP is arbitrary at a level they set. I can get 50% performance/Watt gains from any AMD GPU I've used in the last half decade simply by tuning it less aggressively. Polaris, Vega, and Navi all reached stratospheric TDPs but dial the power consumption back by 40% and you still had ~90% of the performance. Voila, instant 50% perf/W gain by messing with a couple of sliders and pressing "apply"

Ampere and Turing are similar, I cram a 3600XT and RTX 3060 into a tiny cramped HTPC case and want to keep them quiet. The CPU has 30W cut from its PPT using PBO and the GPU power limit is set to 75% in Afterburner. The underclocked result gets over 90% of the original performance and that's a tiny price to pay for near-silence at full load.
 
Joined
Jan 4, 2013
Messages
1,184 (0.27/day)
Location
Denmark
System Name R9 5950x/Skylake 6400
Processor R9 5950x/i5 6400
Motherboard Gigabyte Aorus Master X570/Asus Z170 Pro Gaming
Cooling Arctic Liquid Freezer II 360/Stock
Memory 4x8GB Patriot PVS416G4440 CL14/G.S Ripjaws 32 GB F4-3200C16D-32GV
Video Card(s) 7900XTX/6900XT
Storage RIP Seagate 530 4TB (died after 7 months), WD SN850 2TB, Aorus 2TB, Corsair MP600 1TB / 960 Evo 1TB
Display(s) 3x LG 27gl850 1440p
Case Custom builds
Audio Device(s) -
Power Supply Silverstone 1000watt modular Gold/1000Watt Antec
Software Win11pro/win10pro / Win10 Home / win7 / wista 64 bit and XPpro
RDNA4 before end of 2024 - I like it a lot :D
 
Joined
May 2, 2017
Messages
7,762 (2.78/day)
Location
Back in Norway
System Name Hotbox
Processor AMD Ryzen 7 5800X, 110/95/110, PBO +150Mhz, CO -7,-7,-20(x6),
Motherboard ASRock Phantom Gaming B550 ITX/ax
Cooling LOBO + Laing DDC 1T Plus PWM + Corsair XR5 280mm + 2x Arctic P14
Memory 32GB G.Skill FlareX 3200c14 @3800c15
Video Card(s) PowerColor Radeon 6900XT Liquid Devil Ultimate, UC@2250MHz max @~200W
Storage 2TB Adata SX8200 Pro
Display(s) Dell U2711 main, AOC 24P2C secondary
Case SSUPD Meshlicious
Audio Device(s) Optoma Nuforce μDAC 3
Power Supply Corsair SF750 Platinum
Mouse Logitech G603
Keyboard Keychron K3/Cooler Master MasterKeys Pro M w/DSA profile caps
Software Windows 10 Pro
I do not trust manufacturers using this metric.

The TDP is arbitrary at a level they set. I can get 50% performance/Watt gains from any AMD GPU I've used in the last half decade simply by tuning it less aggressively. Polaris, Vega, and Navi all reached stratospheric TDPs but dial the power consumption back by 40% and you still had ~90% of the performance. Voila, instant 50% perf/W gain by messing with a couple of sliders and pressing "apply"

Ampere and Turing are similar, I cram a 3600XT and RTX 3060 into a tiny cramped HTPC case and want to keep them quiet. The CPU has 30W cut from its PPT using PBO and the GPU power limit is set to 75% in Afterburner. The underclocked result gets over 90% of the original performance and that's a tiny price to pay for near-silence at full load.
You're not wrong, but there is some important nuance here, mainly in that they are presenting this to investors, which effectively obligates them to make these numbers appear in real life SKUs. So, saying this means that for some RDNA3 SKU, there needs to be a reasonable comparison to an RDNA2 SKU where it demonstrates 50% higher perf/W at stock. Of course they could do this by using the garbage tuned 6500 XT, though even that would leave them at risk for shareholder lawsuits due to how misleading it would be. And all other RDNA2 SKUs are in roughly the same range of efficiency. This obviously doesn't mean that every RDNA3 SKU will deliver a 50% perf/W uplift - not at all - but it does mean that at least one reasonable SKU must do so.
Yes we all know that but there is a slight difference here. Unlike zen arch the chiplets do not contain cores on each chiplet. Only one has the cores.
That's exactly the same as most Ryzen MCM CPUs. Only a couple of SKUs per generation have dual CCDs, after all.
Yeah, MCM but more like 3D stacking rather than multi-GPU which is generally understood with MCM.
If that is what is generally understood by MCM, then that's a misunderstanding of the term - it just means having multiple chips on the same package after all. It's important to differentiate between concepts/ideas (the "MCM GPU" with a bunch of chips) and concrete implementations (MCM GPUs, i.e. a GPU with MCM packaging for any number of chips above 1, whatever their function). And IMO it's more reasonable to see these things as developing along a line - jumping from monolithic to MCM with multiple compute chiplets in one go might not be feasible, but separating out IO and cache likely is. But once that's done, the next logical step is to enable multiple compute chiplets as well. As for "more like 3D stacking" - what does that even mean? 3D stacking is just one of many possible packaging methods for MCM packaging. Multi-GPU chiplets can be 3D stacked; non-multi GPU chiplets can be not 3D stacked. You're drawing up a distinction here that doesn't make sense.
The problem is that they're using a suboptimal node and their memory config is bad for efficiency. Saying near zero performance/Watt uplift is wrong, as it varies quite a bit depending on the SKU. For example, when it comes to performance/Watt the 3060Ti is at the top of the chart (although it can vary somewhat depending on which games you test), while the 3080 is worse than Turing.

While your overall point that Ampere has a wide range of efficiencies depending on the implementation is accurate, saying the 3060Ti is "towards the top of the chart" is ... well, let's call it selective. That's only because that chart predates most of the RDNA2 series - it only has the 6800 and 6800 XT on it, after all. A more up-to-date chart:

The 3060 Ti is still a good implementation overall, but AMD has a firm lead on efficiency still.

^^^^^ The graph is inaccurate since it doesn't count the cards on the underclocking merits but rather uses the wrong overvolted factory settings.
That is ... uh ... just as arbitrary. The results that matter are the results of products that people are actually able to buy - the retail configurations. Whatever underclocking and undervolting results you end up with vary based on the silicon lottery, your willingness to sacrifice performance, and a bunch of other variables. How on earth would you settle on a standard for comparison at that point? Those graphs are literally the only reasonable way of comparing this, unless you're going to spend weeks on end tuning every single GPU to find its peak perf/W point in order to compare their peak efficiency.
 
Joined
May 31, 2016
Messages
4,440 (1.42/day)
Location
Currently Norway
System Name Bro2
Processor Ryzen 5800X
Motherboard Gigabyte X570 Aorus Elite
Cooling Corsair h115i pro rgb
Memory 32GB G.Skill Flare X 3200 CL14 @3800Mhz CL16
Video Card(s) Powercolor 6900 XT Red Devil 1.1v@2400Mhz
Storage M.2 Samsung 970 Evo Plus 500MB/ Samsung 860 Evo 1TB
Display(s) LG 27UD69 UHD / LG 27GN950
Case Fractal Design G
Audio Device(s) Realtec 5.1
Power Supply Seasonic 750W GOLD
Mouse Logitech G402
Keyboard Logitech slim
Software Windows 10 64 bit
That's exactly the same as most Ryzen MCM CPUs. Only a couple of SKUs per generation have dual CCDs, after all.
Yes, you have an IO chip and a cores chip, and that is an MCM design. We all got it.
With RDNA3 there won't be even a couple that will accommodate more than 1 core chiplet. That was my point. Perhaps I should have been more precise with my statements.

Yeah, MCM but more like 3D stacking rather than multi-GPU which is generally understood with MCM.
3D stacking is a different story.
 

Mussels

Freshwater Moderator
Joined
Oct 6, 2004
Messages
58,413 (7.91/day)
Location
Oystralia
System Name Rainbow Sparkles (Power efficient, <350W gaming load)
Processor Ryzen R7 5800x3D (Undervolted, 4.45GHz all core)
Motherboard Asus x570-F (BIOS Modded)
Cooling Alphacool Apex UV - Alphacool Eisblock XPX Aurora + EK Quantum ARGB 3090 w/ active backplate
Memory 2x32GB DDR4 3600 Corsair Vengeance RGB @3866 C18-22-22-22-42 TRFC704 (1.4V Hynix MJR - SoC 1.15V)
Video Card(s) Galax RTX 3090 SG 24GB: Underclocked to 1700Mhz 0.750v (375W down to 250W))
Storage 2TB WD SN850 NVME + 1TB Sasmsung 970 Pro NVME + 1TB Intel 6000P NVME USB 3.2
Display(s) Phillips 32 32M1N5800A (4k144), LG 32" (4K60) | Gigabyte G32QC (2k165) | Phillips 328m6fjrmb (2K144)
Case Fractal Design R6
Audio Device(s) Logitech G560 | Corsair Void pro RGB |Blue Yeti mic
Power Supply Fractal Ion+ 2 860W (Platinum) (This thing is God-tier. Silent and TINY)
Mouse Logitech G Pro wireless + Steelseries Prisma XL
Keyboard Razer Huntsman TE ( Sexy white keycaps)
VR HMD Oculus Rift S + Quest 2
Software Windows 11 pro x64 (Yes, it's genuinely a good OS) OpenRGB - ditch the branded bloatware!
Benchmark Scores Nyooom.
I find it hard to believe AMD will put 2 chiplets and just because it says chiplet design doesnt mean there will be two like ZEN. Although, it worked for ZEN with yields so who knows.
If they can remove various slower, low wattage functions from the GPU die (Video encoding/decoding for example, or the audio processor) it might actually make sense

Anything to make the GPU die smaller and cheaper to make is a win for them, and there are functions of a GPU that won't get penalised from being external.
 

Nkd

Joined
Sep 15, 2007
Messages
364 (0.06/day)
I'm starting to believe, that the perf/watt is a dead end in the graphics and CPU industries. It no longer satisfies me when companies say that and obviously the growing power consumption for these has a lot to do with it. I'm looking forward for the new tech but if the power consumption is through the roof, I will literally skip buying and investing in graphics cards and CPUs for that matter.


where do you have 2x performance increase over RDNA2? AMD said 50% increase.
Look at RDNA 2 vs RDNA 1. That was literally 2x. They are just measuring performance/watt and said >50% not 50%. Now you take IPC and other changes. Doubling CUs or Multi chip efficiency you can easily get 2x+. Yes rumors already say it could be up to 450w for top end that just equals more performance.
 
Joined
Jan 24, 2011
Messages
289 (0.06/day)
Processor AMD Ryzen 5900X
Motherboard MSI MAG X570 Tomahawk
Cooling Dual custom loops
Memory 4x8GB G.SKILL Trident Z Neo 3200C14 B-Die
Video Card(s) AMD Radeon RX 6800XT Reference
Storage ADATA SX8200 480GB, Inland Premium 2TB, various HDDs
Display(s) MSI MAG341CQ
Case Meshify 2 XL
Audio Device(s) Schiit Fulla 3
Power Supply Super Flower Leadex Titanium SE 1000W
Mouse Glorious Model D
Keyboard Drop CTRL, lubed and filmed Halo Trues
With the ever-so increasing difficulty of being able to source new, power-efficient nodes, this is very hard to believe. I'd love for it to be true, but I'm going to keep my expectations in check until independent reviews come out.
AMD almost doubled perf/W on the same node going from Vega to RDNA2, so obviously there are other pathways to increased perf/W than node shrinks.
 

ARF

Joined
Jan 28, 2020
Messages
4,670 (2.61/day)
Location
Ex-usa | slava the trolls
Look at RDNA 2 vs RDNA 1. That was literally 2x.

RDNA 1 was a small chip. RX 5700 XT was a mid-range card. RX 6900 XT and RX 6950 XT are enthusiast-tier.

They are just measuring performance/watt and said >50% not 50%. Now you take IPC and other changes. Doubling CUs or Multi chip efficiency you can easily get 2x+.

Yes, I also expect more than 50%, or at least 50%. It can be 60 or 75%, we shall see.
 
Joined
Apr 21, 2005
Messages
185 (0.03/day)
AMD have been pretty reliable with their perf/watt increases over the last few years so I anticipate this will reflect the perf / watt gain for similar tiers of cards.

As for the 2x gains the math checks out if 450W is correct. 1.5x perf/watt * 1.5x more power = 2.25x performance.

RDNA2 was actually around a 54% perf/watt improvement over RDNA1 so if that was the case this time around then that would be a 2.3x performance gain.
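The math above can be sketched as follows. The 450 W figure is the rumor mentioned in this thread; the 300 W baseline is an assumption here (it is what makes the "1.5x more power" factor work out), and the function name is purely illustrative.

```python
def relative_performance(perf_per_watt_gain, power_ratio):
    """Performance relative to the old card: perf/W gain times power ratio."""
    return perf_per_watt_gain * power_ratio


# Rumored 450 W top-end card vs. an assumed 300 W previous-gen baseline.
power_ratio = 450 / 300  # 1.5x more power

print(relative_performance(1.50, power_ratio))            # 2.25
print(round(relative_performance(1.54, power_ratio), 2))  # 2.31
```

This only holds if the perf/W gain is still valid at the higher power level, which is not guaranteed.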

I expect such gains will only materialise at 4K due to CPU limits at lower resolutions but that is to be expected.

RT will probably actually matter far more this go around so 4K + RT might be where the proper high end battle arises.
 
Joined
May 2, 2017
Messages
7,762 (2.78/day)
Location
Back in Norway
System Name Hotbox
Processor AMD Ryzen 7 5800X, 110/95/110, PBO +150Mhz, CO -7,-7,-20(x6),
Motherboard ASRock Phantom Gaming B550 ITX/ax
Cooling LOBO + Laing DDC 1T Plus PWM + Corsair XR5 280mm + 2x Arctic P14
Memory 32GB G.Skill FlareX 3200c14 @3800c15
Video Card(s) PowerColor Radeon 6900XT Liquid Devil Ultimate, UC@2250MHz max @~200W
Storage 2TB Adata SX8200 Pro
Display(s) Dell U2711 main, AOC 24P2C secondary
Case SSUPD Meshlicious
Audio Device(s) Optoma Nuforce μDAC 3
Power Supply Corsair SF750 Platinum
Mouse Logitech G603
Keyboard Keychron K3/Cooler Master MasterKeys Pro M w/DSA profile caps
Software Windows 10 Pro
Look at RDNA 2 vs RDNA 1. That was literally 2x. They are just measuring performance/watt and said >50% not 50%. Now you take IPC and other changes. Doubling CUs or Multi chip efficiency you can easily get 2x+. Yes rumors already say it could be up to 450w for top end that just equals more performance.
That's not quite right. IPC is not separate from any measure of perf/W - IPC*clocks*core count is (roughly) how you find "perf" after all. That >50% number is for the architecture as implemented, including what you're saying here - anything else would make the statement meaningless. Thus, doubling performance also necessitates an increase in power draw.
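A rough sketch of that relationship, with entirely hypothetical numbers (the IPC, clock, CU count and wattage values below are placeholders, not real specs, and the model ignores memory bandwidth and scaling losses):

```python
def perf(ipc, clock_ghz, cores):
    """Very rough throughput model: IPC x clock x core count."""
    return ipc * clock_ghz * cores


def power_ratio_needed(perf_target, perf_per_watt_gain):
    """Power (relative to the old card) needed to reach a relative perf target."""
    return perf_target / perf_per_watt_gain


# Hypothetical old and new parts at the same board power.
old_ppw = perf(ipc=1.0, clock_ghz=2.0, cores=80) / 300
new_ppw = perf(ipc=1.2, clock_ghz=2.5, cores=80) / 300

# The IPC and clock gains are already inside the perf/W ratio.
ppw_gain = new_ppw / old_ppw
print(f"perf/W gain: {ppw_gain:.2f}x")  # 1.50x

# Doubling performance with a 1.5x perf/W gain still needs more power.
print(f"power needed for 2x perf: {power_ratio_needed(2.0, ppw_gain):.2f}x")  # 1.33x
```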
 
Joined
Mar 28, 2020
Messages
1,761 (1.02/day)
I feel AMD managed to meet the perf/watt target with RDNA and RDNA2, but this does not apply across the board, with the RX 6700 XT being an example where I think it failed. The problem with the way AMD segments their cards is how heavy-handed they are in slashing the specs within their RX 6000 series. The second most jarring cut was on the RX 6700 XT, with the RX 6500/6400 taking the top spot. Taking the Navi 22 (RX 6700 XT) again as an example, the number of CUs got a nice 50% cut from the Navi 21. The nearest RX 6800 is quite a lot faster than the RX 6700 XT. The latter has to make up the difference in specs by pushing clock speed hard. As a result, I feel it is inefficient for its target performance, relative to the RX 6800.
 
Joined
Mar 21, 2016
Messages
2,508 (0.78/day)
Two 100w chips reduce performance per chip 50% add together 50% + 50% = 100% you're still at 100w and double the performance. Of course, that raises the question: how can they get there? Well, a node shrink. I'm not sure how much that accounts for on the chip die, but there could be a bit higher performance per watt from advancements on the DRAM side as well. Still, 50% seems like a lot, but then you've always got Infinity Cache, which makes a huge difference to performance per watt, especially since it benefits from compression as well. Perhaps the I/O die controlling the two MCM chips also handles a bit of decompression, intelligently, at the same time.

The node reduction itself from 6nm down to 5nm is what 1/6? across two chip dies which works out to 1/3. They also shuffle logic to the I/O die, and I'm not sure how much that occupies offhand, but say it bumps up to 40% more silicon space; that is pretty good. The other good aspect is heat is spread out more between two chip dies which is better than a one chip die the size of 2 all condensed in one spot. It's much better for the heat load to be spread apart and radiate more to the cooler. That even reduces stress on the VRM's that have to power the fans for the GPU. Something interesting is if a AIB's were to ever put a fan header on the side that could be plugged into a system header instead shifting more stress to the MB VRM's and off of the GPU's VRM's given they can consume a few watts.

It seems pretty reasonable and plausible. Let's not forget there could be a bit more room to increase the die size to make room for more of that cache over the previous chip dies. In fact, even without taking that into account, if the cache is on the die and you pair two, you double the cache. This isn't SLI/CF either, plus it's got a dedicated I/O die as well. Just moving logic to the I/O die will free up silicon space on the chip die. It might not be 50% in all instances, but up to 50% in the right scenario I can see. Lastly, FSR is another factor in all of this and gives an uplift in efficiency per watt. You can certainly argue it's important to consider the performance-per-watt scenario that a company, be it AMD, Intel, Nvidia or others, is talking about.

I'm going to go out on a limb on this one and say it could be 50% performance per watt or greater across the entire RDNA3 product segment under the right circumstances. You also have to consider, along with all the other parts mentioned, that power scales with the square of voltage, and smaller dies running at lower wattage require lower voltage, increasing efficiency per watt as a whole. So I'm pretty certain this can be very much realistic. I'm not going to say I'm 100% sure about 50% performance per watt across the entire SKU lineup, but you can argue AMD hints at it without explicitly going into detail. AMD neither indicates nor rules out that it's for a particular RDNA3 SKU, but rather lists RDNA3 as a whole, which could be either, though it may be subtly pointing out that it's across the product lineup, or at least the initial launch product lineup.
 
Joined
Nov 23, 2013
Messages
359 (0.09/day)
Processor AMD Ryzen 7 3700X
Motherboard MSI B350 Tomahawk Arctic
Memory 4x8GB Corsair Vengeance LPX DDR4 3200Mhz
Video Card(s) Gigabyte 6700XT Gaming OC (2.80Ghz core / 2.15Ghz mem)
Storage Corsair MP510 NVMe 960GB; Samsung 850 Evo 250GB; Samsung 860 Evo 500GB;
Display(s) Dell S2721DGFA; Iiyama ProLite B2783QSU;
Case Cooler Master Elite 361
Power Supply Cooler Master G750M
I feel AMD managed to meet the perf/watt target with RDNA and RDNA2, but this does not apply across the board, with the RX 6700 XT being an example where I think it failed. The problem with the way AMD segments their cards is how heavy-handed they are in slashing the specs within their RX 6000 series. The second most jarring cut was on the RX 6700 XT, with the RX 6500/6400 taking the top spot. Taking the Navi 22 (RX 6700 XT) again as an example, the number of CUs got a nice 50% cut from the Navi 21. The nearest RX 6800 is quite a lot faster than the RX 6700 XT. The latter has to make up the difference in specs by pushing clock speed hard. As a result, I feel it is inefficient for its target performance, relative to the RX 6800.
You've gotten the whole thing backwards. The 6700XT is a very well optimized chip and runs at 2.5+ghz not because they were pushing it hard, but because the silicon was good and could deal with these clocks without any issues. 6800 was produced from (badly) failed Navi 21s and thus its clocks were the lowest out of the whole RDNA2 series.
Sure, ideally Navi 22 would have been a 60CU design and thus with only 25% cuts applied to computation, i.e. the same as the cuts it got to its memory and IC subsystems. But I imagine when AMD had to plan for how many chips they can get from their 7nm wafers they underestimated their yields - or just decided to err on the side of caution - and we never got a symmetrical trimming down the product lineup.
 
Joined
May 2, 2017
Messages
7,762 (2.78/day)
Location
Back in Norway
System Name Hotbox
Processor AMD Ryzen 7 5800X, 110/95/110, PBO +150Mhz, CO -7,-7,-20(x6),
Motherboard ASRock Phantom Gaming B550 ITX/ax
Cooling LOBO + Laing DDC 1T Plus PWM + Corsair XR5 280mm + 2x Arctic P14
Memory 32GB G.Skill FlareX 3200c14 @3800c15
Video Card(s) PowerColor Radeon 6900XT Liquid Devil Ultimate, UC@2250MHz max @~200W
Storage 2TB Adata SX8200 Pro
Display(s) Dell U2711 main, AOC 24P2C secondary
Case SSUPD Meshlicious
Audio Device(s) Optoma Nuforce μDAC 3
Power Supply Corsair SF750 Platinum
Mouse Logitech G603
Keyboard Keychron K3/Cooler Master MasterKeys Pro M w/DSA profile caps
Software Windows 10 Pro
Two 100w chips reduce performance per chip 50% add together 50% + 50% = 100% you're still at 100w and double the performance.
Sorry, but what are these numbers you're working with? Are you inventing them out of thin air? And what's the relation between the different numbers? You also seem to be mixing power and performance? Remember, performance (clocks) and power do not scale linearly, and any interconnect will consume power. You're making this out to be far simpler than it is. Other than that all you're really saying here seems to be the age-old truism of wide and slow chips generally being more efficient. And, of course, you're completely ignoring the cost of using two dice to deliver the performance of one.
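To put rough numbers on the non-linear scaling, here is a minimal sketch using the textbook CMOS dynamic-power approximation, P ≈ C·V²·f. That formula is brought in here for illustration and is not from the post above. The voltage figures are made up, all values are normalized, and leakage and interconnect power are ignored, so this flatters the two-die case.

```python
def dynamic_power(capacitance, voltage, frequency):
    """Textbook dynamic-power approximation: P ~ C * V^2 * f (normalized units)."""
    return capacitance * voltage ** 2 * frequency


# One die at full clock and full voltage (normalized to 1.0).
narrow_fast = dynamic_power(1.0, 1.0, 1.0)

# Two identical dies at half the clock; assume (hypothetically) that the lower
# clock allows 0.8x the voltage. Ideal combined throughput is 2 * 0.5 = 1.0.
wide_slow = 2 * dynamic_power(1.0, 0.8, 0.5)

print(f"one die:  {narrow_fast:.2f}")  # 1.00
print(f"two dies: {wide_slow:.2f}")    # 0.64
```

The saving comes entirely from the assumed voltage drop. With the voltage unchanged, the two-die setup draws the same power as the single die, before counting interconnect overhead and the cost of the extra silicon.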
The node reduction itself from 6nm down to 5nm is what 1/6? across two chip dies which works out to 1/3
What? A 1/6th/16.67% area reduction from a node change will be 16.67% no matter how large your die, no matter how many of them you combine. A percentage/fractional reduction in area doesn't add up as you add parts together - that number is relative, not absolute.
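A quick sketch of why the fraction stays the same. The die sizes are hypothetical, and the 1/6 reduction is just the figure from the quoted post, not a verified number for any real node.

```python
def shrink(area_mm2, reduction):
    """Area after applying a fractional (relative) reduction."""
    return area_mm2 * (1 - reduction)


reduction = 1 / 6  # figure from the quoted post

# Hypothetical layout: two 150 mm^2 dies.
dies_before = [150.0, 150.0]
dies_after = [shrink(area, reduction) for area in dies_before]

# Total saving as a fraction of the total area: still 1/6, not 1/3.
saved_fraction = 1 - sum(dies_after) / sum(dies_before)
print(f"{saved_fraction:.4f}")  # 0.1667
```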

It's absolutely possible that an MCM approach can allow for power savings, but only if it allows for larger total die sizes and lower clocks. Otherwise it's no different from a monolithic die, except for the added interconnect power. And, of course, larger dice are themselves a fundamental problem when per-transistor costs are no longer dropping noticeably, which is leading to rapidly rising chip prices.
The other good aspect is heat is spread out more between two chip dies which is better than a one chip die the size of 2 all condensed in one spot. It's much better for the heat load to be spread apart and radiate more to the cooler.
Again, this isn't accurate. A GPU die has its heat very evenly spread across the entire die (unlike CPUs which are very concentrated), as most of the die is compute cores. Spreading this across two dice won't affect thermals much, as both dice will still be connected to the same cooler - it's not like you're running them independently of each other. Assuming the same power draw and area for a monolithic and MCM solution, the thermal difference between the two will be minimal. And, crucially, you want the distance between dice on package to be as small as possible to keep latencies low.
That even reduces stress on the VRM's that have to power the fans for the GPU. Something interesting is if a AIB's were to ever put a fan header on the side that could be plugged into a system header instead shifting more stress to the MB VRM's and off of the GPU's VRM's given they can consume a few watts.
Fans generally run directly off 12V and don't rely on VRMs on the GPU, just a fan controller IC sending out PWM signals (unless the fans are for some reason controlled through voltage, which is rather unlikely).

You've gotten the whole thing backwards. The 6700XT is a very well optimized chip and runs at 2.5+ghz not because they were pushing it hard, but because the silicon was good and could deal with these clocks without any issues. 6800 was produced from (badly) failed Navi 21s and thus its clocks were the lowest out of the whole RDNA2 series.
Sure, ideally Navi 22 would have been a 60CU design and thus with only 25% cuts applied to computation, i.e. the same as the cuts it got to its memory and IC subsystems. But I imagine when AMD had to plan for how many chips they can get from their 7nm wafers they underestimated their yields - or just decided to err on the side of caution - and we never got a symmetrical trimming down the product lineup.
Idk, I think the truth is somewhere in the middle. Both chips have distinct qualities and deficiencies. The 6800 is fantastically efficient; the 6700 XT gets a lot of performance out of a relatively small die. Now, the 6700 XT is indeed rather poor in terms of efficiency for an RDNA2 chip, but it still beats out the majority of Ampere GPUs, so ... meh. (The 6500XT is another matter entirely.)

I still can't wrap my head around AMD's RDNA2 segmentation though. The 16-32-40-80CU lineup just doesn't make sense IMO, and kind of forced them to tune the 6700XT the way they did. 20-32-48-80 or something like that would have made a lot more sense. It's also weird just how few SKUs Navi 22 has been used in overall.
 
Joined
Nov 23, 2013
Messages
359 (0.09/day)
Processor AMD Ryzen 7 3700X
Motherboard MSI B350 Tomahawk Arctic
Memory 4x8GB Corsair Vengeance LPX DDR4 3200Mhz
Video Card(s) Gigabyte 6700XT Gaming OC (2.80Ghz core / 2.15Ghz mem)
Storage Corsair MP510 NVMe 960GB; Samsung 850 Evo 250GB; Samsung 860 Evo 500GB;
Display(s) Dell S2721DGFA; Iiyama ProLite B2783QSU;
Case Cooler Master Elite 361
Power Supply Cooler Master G750M
Idk, I think the truth is somewhere in the middle. Both chips have distinct qualities and deficiencies. The 6800 is fantastically efficient; the 6700 XT gets a lot of performance out of a relatively small die. Now, the 6700 XT is indeed rather poor in terms of efficiency for an RDNA2 chip, but it still beats out the majority of Ampere GPUs, so ... meh. (The 6500XT is another matter entirely.)

I still can't wrap my head around AMD's RDNA2 segmentation though. The 16-32-40-80CU lineup just doesn't make sense IMO, and kind of forced them to tune the 6700XT the way they did. 20-32-48-80 or something like that would have made a lot more sense. It's also weird just how few SKUs Navi 22 has been used in overall.
Obviously I cannot know this for certain, but it's gotta be all about the yields. Navi 22 is only in the 6700XT because they clearly have near-perfect yields on that chip. Meanwhile Navi 21 6900XTs outnumber the 6800s by a factor of [I don't know how much exactly, but it must be a lot lol] for the exact same reasons. Had they known how few failed 21s they'd be getting, they would have designed a Navi that slots in between the current 21s and 22s, but since mid-range is where AMD usually gets their sales, and with all the supply shortages that were rearing their heads at the time, they just doubled down on the smaller chips and called it a day.
 
Joined
Dec 30, 2021
Messages
394 (0.36/day)
MCM isn't a magical secret sauce that will miraculously make RDNA3 way more power efficient; it's simply a way to make larger GPUs without running into the issue of yields or other limiting factors (such as reticle size). So think of the benefits of MCM as just the benefits of having larger silicon areas and nothing more. But having a larger total silicon area can bring a lot of obvious benefits, and the fact that AMD cites MCM as a factor in their supposed 50% power efficiency boost tells us how they plan to utilize that extra silicon... for now, at least.
 