Friday, May 24th 2024
AMD Adds RDNA 4 Generation Navi 44 and MI300X1 GPUs to ROCm Software
AMD has quietly added some interesting codenames to its ROCm hardware support list. The biggest surprise is the appearance of the "RDNA 4" and "Navi 44" codenames, hinting at a potential successor to the current RDNA 3 GPU architecture powering AMD's Radeon RX 7000 series graphics cards. The upcoming Radeon RX 8000 series could see a Navi 44 SKU with the codename "gfx1200". While details are scarce, the inclusion of RDNA 4 and Navi 44 in the ROCm list suggests AMD is working on a new GPU microarchitecture that could bring significant performance and efficiency gains. While RDNA 4 may be destined for future Radeon gaming GPUs, in the data center GPU compute market AMD is preparing CDNA 4-based successors to the MI300 series. However, it appears we haven't yet seen all the MI300 variants. Equally intriguing is the "MI300X1" codename, which appears to reference an upcoming AI-focused accelerator from AMD.
While we wait for more information, we can't yet tell whether the Navi 44 GPU SKU targets the high-end or low-end segment. If previous generations are any reference, Navi 44 would sit at the low end of the GPU performance spectrum: RDNA 3 had Navi 33 as its entry-level model, and RDNA 2 before it had Navi 24 for entry-level GPUs. We have previously reported that RDNA 4 may merely be a "bug correction" generation that fixes the perf/Watt curve and offers better efficiency overall. What finally happens remains to be seen; AMD could announce more details in its upcoming Computex keynote.
Sources:
Kepler_L2, via Tom's Hardware
22 Comments on AMD Adds RDNA 4 Generation Navi 44 and MI300X1 GPUs to ROCm Software
But that aside, if RDNA 4 is any good, meaning power-efficient enough with decent specs, I would still consider upgrading from my 6600 XT.
I was wondering if that was the number of ASICs dedicated to RT, but that doesn't make sense either: if one is a successor to the RX 7600, the gap in "RT core count" wouldn't be that great.
Edit: I found a good summation of everything RDNA4. The article places the Navi 48 ahead of the Navi 44 but claims 64 CUs instead of my estimate of 80 CUs. Here is the performance prediction:
"Comparing the flagship RDNA 4 card to the rest of the competition tells a different story. Most sources estimate that the rumored RX 8800 XT should be close to the RX 7900 XT in performance. Moore’s Law Is Dead says that it’ll be around 10% slower than the RX 7900 XTX, putting it close to Navi 31 and Nvidia’s RTX 4080. It should outperform the RTX 4070 Ti Super by a small margin, and RedGamingTech predicts that it’ll be slightly faster than the recent RX 7900 GRE."
AMD RDNA 4: Everything we know about the RX 8000 series | Digital Trends
64 CUs, 80 CUs, either way, predictions are saying 7900 level performance.
This is so true that AMD found itself with room (it's still 4nm) to add dedicated RT ASICs and perhaps even AI-related paraphernalia.
Examining AMD’s RDNA 4 Changes in LLVM – Chips and Cheese
AMD RDNA 4: Everything we know about the RX 8000 series | Digital Trends
TSMC Details N4X Process for HPC: Extreme Performance at Minimum Leakage (anandtech.com)
N4P brings +8-11% higher clocks (depending on which company you ask); N4X adds another 3-4% on top of that.
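As a rough sanity check, here's a minimal sketch of how those quoted uplifts compound over an assumed N5 baseline; both the baseline clock and the uplift ranges are assumptions taken from figures in this thread, not confirmed specs.

```python
# Rough compounding of the quoted process clock uplifts over an assumed N5 baseline.
# All inputs are assumptions from the discussion, not confirmed specs.

base_clock_ghz = 2.9        # assumed N5 baseline (roughly where RDNA3's curve sat)
n4p_uplift = (1.08, 1.11)   # +8-11% quoted for N4P
n4x_uplift = (1.03, 1.04)   # +3-4% quoted for N4X on top of N4P

lo = base_clock_ghz * n4p_uplift[0] * n4x_uplift[0]
hi = base_clock_ghz * n4p_uplift[1] * n4x_uplift[1]
print(f"N4P+N4X projected clock range: {lo:.2f}-{hi:.2f} GHz")
# -> roughly 3.23-3.35 GHz, which happens to line up with the 3.2-3.4 GHz guess below
```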
I've long looked to the A16/M2 for guidance on what to expect; Snapdragon is similar (3.2-3.4 GHz mobile, 3.8 GHz SDX). It may be lower, but I still think 3.2-3.4 GHz stock is likely, with at least 3.5 GHz OC; maybe higher.
Do we even know for certain it's N4P? Perhaps that's true, but when I see something like the N4X process even existing...I instantly think of RV790. 55HP was a partnership with TSMC for (only?) that chip.
A thing to note about RV790 (vs RV770) was, and I quote: "AMD revised the core by adding decoupling capacitors, new timing algorithms, and altered the ASIC power distribution for enhanced operation".
Hmm. Yep, that pretty much sounds like what they need to do (AFAICT).
I can't think of many companies that would desire, let alone use, an "X" process, which is about clocks (and perhaps conserving die size vs a larger unit design), not power savings. Certainly not Apple; likely not nVIDIA.
As I mentioned in another thread, its existence reminds me of nothing other than that particular instance, and it really fits the lineage of chips dating back to RV670 (which launched one year after AMD acquired ATi).
That has been and continues to be my prediction/speculation, but some or all of it could be wrong.
Assuming they enhanced (perhaps 4-bit, but) 8-bit operations (AI/RT), would you not consider that a high-end chip (if, say, it could [even overclocked] average 1440p120 mins across TPU's suite...just like a 4080)?
Similarly, I think the opportunity is there not only to absolutely destroy the 4060 Ti(/S?) 16GB in value, but compete with (XSX/PS5 vanilla) consoles w/ higher perf for less money; maybe similar dosh to a XSS.
Also, it's interesting AMD's market-share generally goes up when they release value-oriented products (RX480, 5700xt, 6600/6700/6800-series price-drops when 6650/6750/6950xt launched; ~$200-600).
Wrt consumers though, it could actually be a logical tipping point back to PC (as generally happens around this point in the cycle; perhaps this time nVIDIA be-damned) if you ONLY need a GPU upgrade.
Why buy a (7800xt-like) PS5 pro if there is a similar-priced 4080 competitor? Why buy a PS5/XSX if there is a faster product for cheaper? Why buy a XSS if there is a XSX-level card for a similar price?
I suppose it's foolish to go so hard into speculation when the reality isn't far away, but it's what I would do...while there doesn't appear to be evidence against it.
Not only is it feasible, those are the only two markets that are really important (wrt volume). I understand that margins are key (that's why people like nVIDIA, right?), but wrt AMD, they need mindshare/market-share uptake.
I can't think of a better way to do that than to create and sell parts that are tangibly better than console options for a similar price, or sometimes lower, depending upon the comparison.
That said, I don't see why they would use more than 8192sp. Even if AI/RT scales linearly with RDNA2, that's still 128 AI Engines and 64 RT accelerators.
They trail nVIDIA by what, ~20% clock-per-clock in RT (I forget)? You can see why they would want to boost the units slightly (even with a clockspeed bump); otherwise they would require a clockspeed of 3300MHz to even compete with AD104. That's simply not good enough; they have to beat AD104 (old 4070 Ti) regardless of how well clockspeeds turn out, and this affords them a couple hundred MHz of leeway.
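To make that leeway argument concrete, here's an illustrative sketch; the competitor clock, the 20% per-clock deficit, and the 10% unit bump are all assumptions drawn from the surrounding discussion, not known specs.

```python
# Illustrative: the clock needed to match a competitor despite a per-clock deficit,
# and how shipping a few more RT units relaxes that clock target.
# All inputs are assumptions from the discussion above.

competitor_clock_mhz = 2730   # assumed Ada stock clock (mentioned later in this thread)
per_clock_ratio = 0.80        # assumed: ~20% slower clock-per-clock in RT

# Clock required with an identical unit count:
required = competitor_clock_mhz / per_clock_ratio
print(f"Required clock, same unit count: {required:.0f} MHz")   # ~3400 MHz

# With ~10% more RT units, per-clock throughput rises, so the clock target drops:
unit_advantage = 1.10
required_with_units = competitor_clock_mhz / (per_clock_ratio * unit_advantage)
print(f"Required clock, +10% units: {required_with_units:.0f} MHz")  # ~3100 MHz
# The ~300 MHz difference is the "couple hundred MHz of leeway" described above.
```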
Also, in the past, the golden ratio to 64 ROPs would've been something like 7511-7512 units. While the 7800xt already has 96 ROPs, it makes sense to add the units because of the space afforded by 4nm, especially if they are able to do that and still ramp the clockspeed to its greatest potential within 375W.
This is just speculation on my part, but I have to wonder if the N41/N42/N43 'designs' were based on N4C configurations; perhaps refreshes/shrinks were (internally) N45/46/47.
When they decided to shift that whole chiplet design to RDNA5, I wonder if the monolithic chips they replaced them with inevitably became N44 and N48 simply as they went into design much later.
It's kind of like how we had R300 (9700)->R420 (x800). Why was there no R400? Because it got scrapped for being kinda-sorta in a similar situation to N4C (although it did go into the Xbox 360).
Also, why is RV530 slower than R520? Because the refreshed arch was newer, and the new R520 was R580. The same holds true for then going to RV670 for the higher end and lower numbers for the lower end.
Of course, then we got trees, islands, and all that stuff.
I'm sure there's some rhyme or reason to do it, but it's not like AMD hasn't changed up their internal naming schemes numerous times in the past.
One can only assume it's because the original versions that followed the previous convention were scrapped, as that's what has happened before, but I can't speak for certain.
Hopefully some day the complete story will come to be told, as those are usually pretty interesting, imo.
Navi 33 → 7600
Navi 32 → 7800
Navi 31 → 7900
Also the numbers are no longer 1,2,3.
Navi 48 → 8700/8800
Navi 44 → 8500/8600
A design aimed at competing in the most price/performance-sensitive market (mid/low-end) has to be lean and extract every possible % of performance from the silicon.
64 CUs in a more efficient architecture at higher frequency can easily be equivalent to or faster than 80 RDNA 3 CUs. We are talking about a gap of only 25%.
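A quick back-of-the-envelope on that 25% gap; the clock and per-CU gains below are purely assumed for illustration:

```python
# Back-of-the-envelope: can 64 CUs at higher clocks/IPC match 80 RDNA3 CUs?
# The uplift figures are illustrative assumptions, not leaked specs.

rdna3_cus, rdna4_cus = 80, 64
cu_gap = rdna3_cus / rdna4_cus      # 1.25 -> the "gap of only 25%"

clock_gain = 1.15   # assumed clock uplift (e.g. ~2.5 GHz -> ~2.9 GHz)
ipc_gain = 1.10     # assumed per-CU architectural gain

effective = clock_gain * ipc_gain   # 1.265
print(f"CU deficit to cover: {cu_gap:.2f}x; covered by clock+IPC: {effective:.3f}x")
# 1.265 > 1.25, so 64 CUs can plausibly match or beat 80 RDNA3 CUs
```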
Navi 48: 32 WGP + 48MB of Infinity Cache + GDDR7 memory + 192-bit memory bus + PCIe 5.0 x16
Navi 44: 20 WGP + 32MB of Infinity Cache + GDDR7 memory + 128-bit memory bus + PCIe 5.0 x8
or
Navi 48: 32 WGP + 32MB of Infinity Cache + GDDR7 memory + 128-bit memory bus + PCIe 5.0 x8
Navi 44: 20 WGP + 24MB of Infinity Cache + GDDR7 memory + 96-bit memory bus + PCIe 5.0 x8
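For context, here's a tiny bandwidth calculator for the bus widths in those hypothetical configurations; the GDDR7 per-pin data rate is purely an assumption, since no speed grade is confirmed.

```python
# Raw memory bandwidth for the hypothetical bus widths listed above.
# The 28 Gbps GDDR7 data rate is an assumption; actual speed grades are unknown.

GDDR7_GBPS = 28  # assumed per-pin data rate

def bandwidth_gbs(bus_width_bits: int, gbps_per_pin: float = GDDR7_GBPS) -> float:
    return bus_width_bits * gbps_per_pin / 8  # bits -> bytes

for bus in (192, 128, 96):
    print(f"{bus}-bit @ {GDDR7_GBPS} Gbps -> {bandwidth_gbs(bus):.0f} GB/s")
# 192-bit -> 672 GB/s, 128-bit -> 448 GB/s, 96-bit -> 336 GB/s
```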
www.tweaktown.com/news/94533/amds-next-gen-rdna-4-navi-44-and-48-from-radeon-rx-8000-series-gpus-appear-in-linux/index.html
It is literally the benchmark for what people will pay in that segment, hence you've seen it drastically fluctuate in price over time.
People forget the 7800xt runs at 2425MHz stock. While W1zzard showed 2551MHz, most report 2448MHz for the 7900xt. We've seen the 7900xtx [2631MHz avg wrt W1zzard] at 3200MHz OC (and someone recently froze it to hit 3390MHz).
This is closer to what the arch was intended to do, imho, but it didn't pan out and/or AMD held it back for whatever reason (probably power/heat/yield/price).
The jumps (on the stock charts) will appear huge to the avg Joe because RDNA3 ran at very low clocks (and even 7900gre was limited wrt overclocking not only through PL, but actual bios-limiting of clocks).
To people that overclocked the 7800xt/7900xt to ~2900mhz, these lifts will not look as substantial, but it still matters because of relative performance to 4080, and price/perf to 4070 Super/4070 Ti Super.
The same will (prob) be true of N44 vs the 4060 Ti 16GB (XSX): whereas the 7600 was just about getting perf similar to a PS5, this is about PC gamers keeping up with relevant gaming on standard console platforms for cheap.
The efficiency yield/curve (of 5nm) appeared to be around 2900MHz. You can see this is not only around the stock clock of Ada (~2730MHz), but its overclock potential. It sits fairly perfectly on the curve.
The point of RDNA3 was to maximize clock potential with less silicon (over the perf/w curve), but for w/e reason that didn't work (wrt power/[heat?] efficiency) and/or was held back in certain circumstances.
I would imagine RDNA4 not only fixes this, but also adds the performance additions of 4nm (P or X).
You also have to remember the SRAM (non-logic) bitcell size on 5nm shrinks to 0.021µm², from 0.027µm² on 7nm.
Yes, they (the MCDs) were 6nm and this is 4nm, but remember that even N3B only shrinks non-logic/SRAM cells to 0.0199µm² (and N3E regresses back to the 5nm level).
While I have no idea how SRAM size was affected on 6nm (from 7nm) and/or how it is on 4nm (from 5nm), incorporating it on-chip *may* make it just about as small as it can get (similar to N3B).
This is important, because (as shown with RDNA2) AMD truly does need that extra cache; it would make sense if L3/IC were doubled from 32/64 to 64/128MB, but ofc there are other ways to go about it too.
If you look at a MCD, or even v-cache on their CPUs, 16MB of cache is not very large...so it makes sense, especially when compared to perhaps expensive/less power-efficient 24gbps GDDR6 to increase bw.
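To put rough numbers on that, here's a sketch of raw bitcell area for a few cache sizes using the cell sizes quoted earlier; real SRAM macros carry substantial peripheral overhead (sense amps, decoders, redundancy), so treat these as loose lower bounds.

```python
# Raw SRAM bitcell area for various cache sizes, using the cell sizes quoted above.
# Real macros add significant overhead, so actual die area is considerably larger;
# these are illustrative lower bounds only.

BITCELL_UM2 = {"7nm": 0.027, "5nm/4nm": 0.021, "N3B": 0.0199}

def sram_mm2(megabytes: int, cell_um2: float) -> float:
    bits = megabytes * 1024 * 1024 * 8
    return bits * cell_um2 / 1e6  # um^2 -> mm^2

for node, cell in BITCELL_UM2.items():
    for mb in (32, 64, 128):
        print(f"{node}: {mb} MB -> {sram_mm2(mb, cell):.1f} mm^2 of raw bitcells")
# e.g. 64 MB on 5nm/4nm -> ~11.3 mm^2 of raw bitcells
```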
I find it interesting when people mention it could use 18Gbps GDDR6. While I support the idea of the larger cache and cheapest memory (making the product sustainable wrt bw but also cheap), IDK about 18Gbps.
This is because Samsung literally EOL'ed 18gbps RAM, and Hynix 20gbps is abundant.
I understand most use Hynix, but it would still be weird to cut out a whole supplier. I think sustaining 20Gbps makes the most sense.
If it clocks well, it's possible we may see a (refresh/later) iteration with 24gbps from Samsung. It's unknown at this time if it will require that (given likely cache improvements), but it likely could utilize 20gbps regardless, which again they could dual-source and hence makes more sense. 18Gbps only makes sense if it doesn't clock as well (as I hope), cache improvements are beyond what I anticipate, or they/AIBs struck one hell of a deal with Hynix for 18gbps chips, which appears to not be the case given even low-end and/or stock products with <20gbps stock clockspeeds in-fact use 20Gbps-rated chips.
TLDR: RDNA3 was partly fubar, but was also held-back/positioned VERY OBVIOUSLY at stock to be replaced by a refresh (by both atypically low stock core/memory clocks and power/bios limitations).
RDNA4 should allow the architecture to breathe and live its best life; hopefully scaling better with voltage/power and more evenly-spread heat, with sustainable bandwidth to perform the best it can.
But also, you know, be cheap, while performing as well as it needs to in order to satisfy the required performance per segment (in theory).

That is very old/third-hand (or worse) information. I respect Paul, and sometimes he is fed good information, but sometimes he isn't. Also, things can change in development (as mentioned earlier).
That isn't his fault; he certainly *tries* to get info where he can (which is appreciated), and can sometimes be very much in the forefront of getting certain particulars of future products to the public.
That said, I don't think that information is correct.
I think you must prepare for a flop, rather than an RX 7900 level of performance.
It's why Ada doesn't really scale over ~1.07V; it likely saved them area. They sacrificed clockspeed potential, but likely gained power savings.
It's likely why Ada is efficient and AMD with higher clock potential is leaky at higher clocks. nVIDIA literally has a gigantic supercomputer that they use to optimize configurations and libraries on a process.
This is why you hear them complain so much when a process isn't perfect (Fermi; 40nm vias) or now even Samsung's HBM. It's because they have everything figured out down to the smallest detail.
Wrt AMD, I think they often use something closer to the more 'generic' version of a process, simply due to engineering budget.
N4P/X is an actual improvement to the inherent process, and hasn't been used in a GPU yet (afaik).
Goddamn it, I get so pissed at how successful nVIDIA is with marketing; Huang truly is like Jobs, and nVIDIA truly is like Apple. People see what they want to see/believe, and those people sell it to them.
Do not be confused; I'm not angry with you, nor do I intend to come across aggressive. You only know what you know, and you know what is marketed, but the reality is that is often bullshit.
Second, are we sure about 240mm²? I know people keep saying all kinds of weird numbers, from 200-250mm² or so, but I don't think this will necessarily be the case. I *could* be wrong.
I think N48 will be around the size of AD104 (counting the cache, which may or may not be incorporated into the main die), while N44 will be around the size of AD106/107; somewhere around there.
I believe AMD will fight a chip one size larger from nVIDIA by using higher clockspeeds/voltage and a similar power configuration (ex: 1x8-pin, 2x8-pin), but be less power-efficient...which imho doesn't matter.
It doesn't matter because the chips may reach the performance thresholds they need in order to make sense in the market (especially at their comparative price versus other options) with similar connectors.
Will [not any particular aqua-colored youtube channel] probably complain about power/heat? Probably. Will millions of people regurgitate that? Probably. Does it ACTUALLY matter to most people? No.
I'm not comparing it to 7900 levels of performance because 7900 is an extremely vague expression that creates different expectations for different people...which later people will use to bitch and complain.
It's almost-certainly a replacement for the 7900GRE...because that is where they need to compete (with the 4070 Ti Super); they need to beat it in perf...which means competing with 4070 Super in price.
That's because AMD doesn't have a Huang. Well, kinda-sorta technically they do, but let's not get into it.
But anyway, 7900GRE is a really weird SKU slapped together using all sorts of less-than-optimal configurations (like cache/bus) and (clock/power) limitations so it doesn't compete with 7900xt.
I don't think this will be the case. In essence, its *potential* should be similar to that of a 7900xt (granted, with less memory). It will probably be clocked conservatively at stock so they can still sell the 7900xt as an upgrade.
I apologize, but some people simply do not understand. Looking at W1zzard's stock graphs does not tell the whole story of chips' potential or relative value. Too many people think that...and it's ignorant AF.
It's like...for example, and as I've said before...some people will continue to believe a 3070 and/or Ti is a better-performing card than a 2080 Ti because of how W1zzard chooses to present his charts.
Those people are fools (wrt value if they don't mind taking time to overclock). In some respects that's okay (because that might fit their use case)...I'm sure nVIDIA will sell them something with >8GB ram soon.
But, you know, it's just not true. The absolute performance of a 2080 Ti is similar to a stock 3080; and 3080 doesn't OC very well so in-fact they are very close; I'd take a 2080ti over a 3070 Ti 10/10 times....
...and the value of a 2080 Ti (at least for a long time, haven't checked lately) over a 3080 10/10 times.
The Samsung 8nm process was a POS (that nVIDIA paid very little to fab their enormous chips on). The TSMC 16nm (sorry, "12nm") process was not. In some ways nVIDIA made money off of cheap logic...
...but in other ways they made it off of how they presented their newer series of chips.
nVIDIA can stock segment their cards however they want to have people perceive something as an improvement, but that doesn't make it accurate. Many times...wrt nVIDIA...and sometimes AMD...it's not.
I apologize, but I just don't feel it's likely we're going to see eye-to-eye on this. It's fine to think the way you do...but man, as an OG enthusiast...it's depressing AF that so many people think similarly.
I wish I had the time and energy to explain so many things to so many people...but it gets tiring. Too often nVIDIA wins the market not only through marketing, but by surviving the battle of attrition.
It is simply that nVIDIA works harder and better than AMD, which doesn't seem to care about graphics cards. Now I am not even sure that "4nm" exists in the first place. WikiChip doesn't state that it exists.
Look:
en.wikichip.org/wiki/5_nm_lithography_process
en.wikipedia.org/wiki/5_nm_process
All processes are N5 derivatives. Even if it is around 300mm², that doesn't mean in the slightest that Navi 48 will reach the desired clock ranges. Yeah :rolleyes:
4nm does exist.
I can't guarantee anything wrt Navi 4's ability, I was only giving a hypothesis.
I don't know if the 'yeah' is sarcastic, but it's true. The people that used to address nVIDIA's crap have largely moved on from covering it, become jaded, or have been bought off to work within the industry.