Monday, November 7th 2022

AMD RDNA3 Navi 31 GPU Block Diagram Leaked, Confirmed to be PCIe Gen 4
An alleged leaked company slide details AMD's upcoming 5 nm "Navi 31" GPU powering the next-generation Radeon RX 7900 XTX and RX 7900 XT graphics cards. The slide details the "Navi 31" MCM, with its central graphics compute die (GCD) chiplet that's built on the 5 nm EUV silicon fabrication process, surrounded by six memory cache dies (MCDs), each built on the 6 nm process. The GCD interfaces with the system over a PCI-Express 4.0 x16 host interface. It features the latest-generation multimedia engine with dual-stream encoders, and the new Radiance display engine with DisplayPort 2.1 and HDMI 2.1a support. Custom interconnects tie it to the six MCDs.
Each MCD has 16 MB of Infinity Cache (L3 cache) and a 64-bit GDDR6 memory interface (two 32-bit GDDR6 paths). Six of these add up to the GPU's 384-bit GDDR6 memory interface. In practical terms the GPU still presents a contiguous, monolithic 384-bit memory bus, since every modern GPU already uses multiple on-die memory controllers to achieve a wide bus. "Navi 31" hence has a total Infinity Cache size of 96 MB, which is smaller than the 128 MB on "Navi 21," but AMD has shored up cache sizes elsewhere across the GPU: the L0 caches on the compute units grow by 240%, the L1 caches by 300%, and the L2 cache shared among the shader engines by 50%. The slide confirms the RX 7900 XTX uses 20 Gbps GDDR6 memory, for 960 GB/s of memory bandwidth.
The GCD features six Shader Engines, each with 16 compute units (or 8 dual compute units), which works out to 1,024 stream processors per shader engine, or 6,144 in total. AMD claims to have doubled the IPC of these stream processors over RDNA2, and the new RDNA3 ALUs also support BF16 instructions. The SIMD engine of "Navi 31" has an FP32 throughput of 61.6 TFLOP/s, a 168% increase over the 23 TFLOP/s of "Navi 21." The slide doesn't detail the new ray tracing engine, but references new RT features, larger caches, and a 50% higher ray intersection rate, for up to a 1.8X RT performance increase over the RX 6950 XT at 2.505 GHz engine clocks. There are other major upgrades to the GPU's raster 3D capabilities, including a 50% increase in prims/clock and a 100% increase in prim/vertex cull rates. The pixel pipeline sees similar 50% increases in rasterized prims/clock and pixels/clock, along with synchronous pixel-wait.
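For readers who want to sanity-check the slide's figures, here is a minimal back-of-the-envelope calculation in Python. It assumes the conventional 64 stream processors per compute unit, an FMA counted as two operations, RDNA3's dual-issue doubling of FP32 throughput, and that the 61.6 TFLOP/s figure is quoted at the 2.505 GHz engine clock; none of these conventions is spelled out on the slide itself.

```python
# Back-of-the-envelope check of the figures quoted in the slide.
# Assumptions: 64 SPs per CU, FMA = 2 ops, RDNA3 dual-issue doubles FP32
# throughput, and the TFLOP/s figure is taken at the 2.505 GHz engine clock.

MCD_COUNT        = 6
BUS_PER_MCD_BITS = 64      # two 32-bit GDDR6 channels per MCD
GDDR6_GBPS       = 20      # 20 Gbps memory speed
IC_PER_MCD_MB    = 16

SHADER_ENGINES   = 6
CUS_PER_ENGINE   = 16
SPS_PER_CU       = 64
CLOCK_GHZ        = 2.505

bus_width_bits = MCD_COUNT * BUS_PER_MCD_BITS                   # 384-bit
bandwidth_gbs  = bus_width_bits * GDDR6_GBPS / 8                # 960 GB/s
infinity_cache = MCD_COUNT * IC_PER_MCD_MB                      # 96 MB
stream_procs   = SHADER_ENGINES * CUS_PER_ENGINE * SPS_PER_CU   # 6,144
fp32_tflops    = stream_procs * 2 * 2 * CLOCK_GHZ / 1000        # ~61.6

print(f"{bus_width_bits}-bit bus, {bandwidth_gbs:.0f} GB/s, {infinity_cache} MB Infinity Cache")
print(f"{stream_procs} stream processors, {fp32_tflops:.1f} FP32 TFLOP/s")
```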
Source: VideoCardz
79 Comments on AMD RDNA3 Navi 31 GPU Block Diagram Leaked, Confirmed to be PCIe Gen 4
If you really need it, Threadripper is your friend, or even a decent EPYC platform. They all excel at offering more lanes.
If you wait for that, you might end up with grey hairs; don't expect any 8900 before 2025 at the earliest :D
You know Moore's law is dead; it's extremely expensive to use state-of-the-art manufacturing processes IF they exist at all :D
But good design by AMD - don't waste your time chasing ridiculous and needless PCIe version number inflation :D
If AMD is able to sell the 7900 XTX in volume at $900 with huge margins, they will become even more profitable: more R&D budget for future generations, more budget to buy new wafers (and they need fewer of them). They will be able to produce more cards for less, and they will be able to wage price wars and win long term.
Right now, Nvidia must keep their mindshare as intact as possible and must win the flagship no matter the cost to preserve it. Their current chip is huge, and that hurts the defect rate. With TSMC's currently advertised defect density for 5/4 nm, they probably get around 56% yield, or 46 good dies out of 82 candidates. Depending on where a defect lands, yield could be a bit higher if they can disable a portion of the chip, but don't expect anything huge.
AMD, on the other hand, can produce 135 good dies per wafer out of 180 candidates, for a 74.6% yield. Then you need the MCDs, which are on a much cheaper node: there you get 1,556 good dies out of 1,614 candidates, for a 96.34% yield. Another advantage is that it's probably the same MCD for Navi 31 and Navi 32, allowing AMD to quickly shift them to wherever they are most needed.
AMD can produce around 260 7900 XTXs with two 5 nm wafers and one 6 nm wafer.
Nvidia can produce around 150 4090s with three 4 nm wafers.
The best comparison would be with AD103: out of 139 candidates there would be 96 good dies, for a 69.3% yield. With three 4 nm wafers, that means 288 cards.
Then we add in the cost of each node: 5 and 4 nm cost way more than 6/7 nm. I was not able to find accurate 2022/2023 pricing, and in the end it comes down to whatever AMD and Nvidia negotiated. But these days people are moving to 5 nm and there is capacity available on 7/6 nm (it's also why AMD is still producing Zen 3 in volume). Rumors are that a TSMC 5 nm wafer costs around $17k and a 6 nm wafer around $7-9k, but take this with a grain of salt (and each vendor has their own negotiated price too). A rough sketch of the yield math follows below.
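For what it's worth, here is a minimal sketch of how those yield figures fall out of a simple Poisson defect model. The defect density (~0.095 defects/cm², roughly in line with what TSMC has advertised for its 5 nm class nodes), the die areas, and the candidates-per-wafer counts are assumptions pulled from public estimates and from the numbers quoted above, so treat the output as illustrative only.

```python
from math import exp

# Simple Poisson yield model: yield = exp(-defect_density * die_area).
# Defect density and die areas are rough public estimates, not official
# figures; candidate counts per wafer are the ones quoted above. The same
# density is applied to the 6 nm MCD purely to reproduce the figures above;
# a mature node would likely do better.
DEFECT_DENSITY = 0.095  # defects per cm^2, assumed

dies = {
    # name:           (area_cm2, candidates_per_wafer)
    "AD102 (4090)":   (6.08, 82),
    "AD103 (4080)":   (3.79, 139),
    "Navi 31 GCD":    (3.06, 180),
    "Navi 31 MCD":    (0.375, 1614),
}

for name, (area, candidates) in dies.items():
    y = exp(-DEFECT_DENSITY * area)
    good = candidates * y
    print(f"{name:14s} yield {y:5.1%}  ~{good:6.0f} good dies / wafer")

# Cards per small wafer bundle, ignoring binning and salvage parts:
gcd_good = 180 * exp(-DEFECT_DENSITY * 3.06)    # per 5 nm wafer
mcd_good = 1614 * exp(-DEFECT_DENSITY * 0.375)  # per 6 nm wafer
xtx_cards = min(2 * gcd_good, mcd_good / 6)     # 2x 5 nm + 1x 6 nm wafers
print(f"~{xtx_cards:.0f} 7900 XTX per two 5 nm wafers plus one 6 nm wafer")
```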
So right now, it looks like, ray tracing aside, AMD can produce nearly as many of these cards as Nvidia can produce 4080s, with one of the wafers on a much cheaper node. They can also reuse a portion of that wafer and allocate it to other GPU models easily.
This is what is disruptive. It's disruptive for AMD, not for the end users (sadly), but over time those advantages will add up and Nvidia will feel it: their margins will shrink, etc. But Nvidia is huge and rich. This will take many generations, and Nvidia will probably have time to get their own chiplet strategy in place by then.
Intel is in slightly hotter water because the money is already drying up: they are barely competitive on the desktop, they struggle in laptops/notebooks, and they suffer in data centers. They still compete on performance, but their margins have shrunk so much that they need to cancel products to stay afloat.
I think it's easy to miss that with all the hype and the clickbait titles we've had. But it's also normal for end users to look at what benefits them the most.
Actually we had "chiplets" called Radeon HD 3870 X2, Radeon HD 4870 X2, Radeon HD 6990, Radeon HD 7990, Radeon R9 295X2, Radeon Pro Duo. And their support was abandoned.
ARF, all your examples are actually just CrossFire on a single board. They are not even chiplets.
But the main problem is how you make the system see only one GPU. On a CPU, you can add as many cores as you want and the OS will be able to use them all. With the I/O die you can have a single memory domain, and even if you don't, operating systems have supported Non-Uniform Memory Access (NUMA) for decades.
For it to be possible on a GPU, you would need one common front end with compute dies attached to it. I am not sure whether the MCDs would need to connect to the front end or to the compute dies, but I am under the impression it would be the front end.
The thing is, you want edge space to communicate with your other dies: MCDs, display outputs, PCIe. There is no more edge space available. The biggest consumer of area is the compute units, not the front end. I think this is why they did it this way.
In the future we could see a large front end with a significant amount of cache, with multiple independent compute dies connected to it. Cache does not scale well on newer nodes, so the front end could be made on an older node (if that is worth it; some areas might need to be faster).
Anyway, design is all about tradeoffs. Multiple compute chiplets could be done, but the tradeoffs are probably not worth it yet, so they went with this imperfect design that still does not allow multiple GCDs.
It was called crossfire-on-a-stick because it was just that. You could actually expose each GPU separately in some OpenCL workloads and have them run independently.
How well are AMD's "huge margins" going with the release of Zen 4? Remind me of the economics of a product with a 50% margin that sells 1,000 units compared to a product with a 30% margin that sells 10,000 units. Defective 4090 dies can become a 4080 Ti. Their defect rate also doesn't matter that much when they can charge whatever they like due to the lack of competition, and that's before you consider RTRT and a driver and software development team that is competent and huge (compared to AMD's).
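To make the margin-versus-volume point concrete, a trivial sketch; the $1,000 selling price is purely hypothetical and only serves to compare the two scenarios:

```python
# Hypothetical example: same selling price, different margin and volume.
PRICE = 1_000  # USD, assumed purely for illustration

scenarios = {
    "50% margin, 1,000 units":  (0.50, 1_000),
    "30% margin, 10,000 units": (0.30, 10_000),
}

for name, (margin, units) in scenarios.items():
    gross_profit = PRICE * margin * units
    print(f"{name}: ${gross_profit:,.0f} gross profit")
# 50% margin, 1,000 units:  $500,000 gross profit
# 30% margin, 10,000 units: $3,000,000 gross profit
```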
I guess we'll see what the performance is like when the reviews come out.
AMD doesn't seem to understand their position in the GPU market - they're the underdog, with worse development resources, technology (primarily software, but also dedicated hardware units) that is several years behind the competition, significantly lower market share etc. It's not a reasonable prediction to say "NVIDIA will suffer", it's AMD that needs to actually compete. They needed to go balls to the wall and release a multi die card that blew the 4090 out of the water, while being the same price or cheaper to change this "mindshare", instead they've released another generation of cards that on paper is "almost as good", while being a bit cheaper and without the NVIDIA software stack advantage.
Not all defects are equal. If you have a slight defect in one of the compute units, you can just disable it; same thing with one of the memory controllers. But if it's in an area with no redundancy, you cannot reuse the chip. And that assumes there is only one defect per chip.
Cheaper SKUs can also just be chips that cannot clock as high as others. Nothing says, for example, that a 6800 has a defect; they might simply have found that it didn't clock high enough to be a 6800 XT or 6900 XT/6950 XT. It's also possible that one of the compute units really didn't clock well, so they had to disable it to reach the target clocks (a good chip without defects, but not good enough).
That affects everyone, but bigger chips are more at risk of defects. In the end the numbers can roughly be taken at face value, because the larger the chip, the higher the chance that something is wrong in a spot that can't be salvaged by deactivating a portion of it.
2. Perhaps the Infinity Fabric isn't quite there yet to let them put the CUs on multiple dies? Or perhaps the front end requires 5 nm to perform effectively, which would mean AMD would have to do two different designs on 5 nm wafers, complicating their production schedule and increasing costs?
In other words, AMD is taking advantage of their cost savings and releasing their top-tier product at $600 less than Nvidia's top-tier product, and at $200 less than what the apparently competitive GPU is launching at. Things get even worse for Nvidia in markets outside of the USA. For example, here in Australia, if AMD does not price gouge us like Nvidia is doing, then the 7900 XTX will be roughly half the price of a 4090, and that price saving is enough to build the rest of your PC (e.g. 5800X3D, 16 GB/32 GB DDR4, B550 motherboard, 850 W PSU, $200 case).
Their top-tier product competes in raster with Nvidia's 3rd/4th product down the stack (4090 Ti, 4090, 4080 Ti, 4080), therefore the fact that it's cheaper is borderline irrelevant. That's without getting started on the non-raster advantages NVIDIA has.
In other words, the situation hasn't changed since the 6950 XT vs the 3090 Ti, and the 6900 XT vs the 3090.
Next year I am planning on building a separate strictly gaming PC and I want to convert my current build to an HTPC that will handle everything else, including recording with a capture card. I would definitely want a cheap graphics card with AV1 encoding. If NVIDIA or AMD do not offer such a product, I might go with Intel (A380 or whatever).
I personally won't trade a tiny amount of RT cores, a newer encoder/decoder, and HDMI 2.1 at 40 Gbps for their predecessor, the 5500 XT 8 GB. Even a 470 4 GB from 2016 brutalises the 6400 in all scenarios.