Friday, August 14th 2020
Micron Confirms Next-Gen NVIDIA Ampere Memory Specifications - 12 GB GDDR6X, 1 TB/s Bandwidth
Micron have spilled the beans on at least some specifications for NVIDIA's next-gen Ampere graphics cards. In a new tech brief posted by the company earlier this week, hidden away behind Micron's market outlook, strategy and positioning, lie some secrets NVIDIA might not be too keen to see divulged before their #theultimatecountdown event.
Under a comparison of ultra-bandwidth solutions, in the GDDR6X column, Micron lists a next-gen NVIDIA card under the "RTX 3090" product name. According to the spec sheet, this card features a total memory capacity of 12 GB GDDR6X, achieved through 12 memory chips on a 384-bit wide memory bus. As we saw today, only 11 of these seem to be populated on the RTX 3090, which, when paired with the GDDR6X memory chips' rated 19-21 Gbps speeds, brings total memory subsystem bandwidth into the 912 - 1008 GB/s range (using 12 chips; 11 chips results in 836 GB/s minimum). It's possible the RTX 3090 product name isn't an official NVIDIA product, but rather a Micron guess, so don't take it as a factual representation of an upcoming graphics card. One other interesting aspect of the tech brief is that Micron expects their GDDR6X technology to enable 16 Gb (2 GB) density chips with 24 Gbps speeds as early as 2021. You can read the tech brief - which mentions NVIDIA by name as a development partner for GDDR6X - by following the source link and clicking on the "The Demand for Ultra-Bandwidth Solutions" document.
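Doing the math on those figures is straightforward: bandwidth is per-pin speed times bus width. A quick sketch (chip count and speeds taken from the tech brief, everything else plain arithmetic) reproduces the numbers above:

# GDDR6X bandwidth: per-pin speed (Gbps) x bus width (bits) / 8 = GB/s
def bandwidth_gbs(pin_speed_gbps, bus_width_bits):
    return pin_speed_gbps * bus_width_bits / 8

for chips in (12, 11):
    bus_width = chips * 32  # each GDDR6X chip has a 32-bit interface
    low, high = bandwidth_gbs(19, bus_width), bandwidth_gbs(21, bus_width)
    print(f"{chips} chips ({bus_width}-bit): {low:.0f} - {high:.0f} GB/s")
# 12 chips (384-bit): 912 - 1008 GB/s
# 11 chips (352-bit): 836 - 924 GB/s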
Source:
Micron
53 Comments on Micron Confirms Next-Gen NVIDIA Ampere Memory Specifications - 12 GB GDDR6X, 1 TB/s Bandwidth
On a dedicated graphics card, VRAM is a separate address space, but with a modified driver you should be able to have full access to it.
The reason why swapping VRAM to RAM is slow is not bandwidth, but latency. Data has to travel across both the PCIe bus and the system memory bus, with a lot of synchronization on top.
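To illustrate that point with a toy model (all figures below are rough assumptions, not measurements): total fetch time is access latency plus size divided by bandwidth, and for the small, scattered reads a GPU makes mid-frame, the latency term dominates.

# Toy model: fetch time = access latency + size / bandwidth (figures are illustrative guesses)
def fetch_time_us(size_bytes, latency_us, bandwidth_gbs):
    return latency_us + size_bytes / (bandwidth_gbs * 1e3)  # 1 GB/s ~ 1e3 bytes per microsecond

block = 64 * 1024  # a 64 KB read, typical of a scattered texture access
print(f"on-card VRAM:      {fetch_time_us(block, 0.5, 900):.2f} us")
print(f"system RAM (PCIe): {fetch_time_us(block, 10.0, 16):.2f} us")  # latency dominates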
Why would a company make a game now that requires a DirectX 14 that no hardware supports? Nobody would be able to buy it.
www.accton.com/Technology-Brief/the-challenge-of-pam4-signal-integrity/
Edit: Higher tier, higher VRAM capacity SKUs will use slower GDDR6 with higher density modules.
I hope Samsung is making the 3090's VRAM chips.
And that's assuming the memory issues are sorted.
Time will tell though.
It's indeed way worse if you don't have enough system RAM, as you have to swap, but it's still bad on a GPU because the latency from RAM (even if you have enough) is too high for smooth gameplay.
You want your GPU to have everything it needs for the next 5 or so seconds.
Still today, one of the low-hanging fruits for better visual quality without much extra computation is better and more textures. RAM can also be used to store temporary data that will need to be reused later in the rendering or in future frames.
CGI can use way more than 12 GB of textures.
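For a rough sense of scale (ignoring the block compression real games apply, which cuts this down considerably): a single uncompressed 4K RGBA texture with mipmaps is around 85 MB, so a few hundred of them already blow past 12 GB.

# Back-of-envelope texture memory: width x height x bytes-per-pixel, plus ~1/3 for mipmaps.
# Ignores block compression (BC1/BC7 etc.), which real games use to shrink this a lot.
def texture_mb(width, height, bytes_per_pixel=4, mipmaps=True):
    base = width * height * bytes_per_pixel
    return base * (4 / 3 if mipmaps else 1) / (1024 ** 2)

per_tex = texture_mb(4096, 4096)
print(f"{per_tex:.0f} MB per 4K texture")               # ~85 MB
print(f"{per_tex * 300 / 1024:.1f} GB for 300 of them") # ~25 GB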
But quantity is not the only thing to consider. Speed is just as important. There's no point in having a 2 TB SSD as GPU memory.
But in the future, GPUs might have an SSD slot on board to be used as an asset cache (like some Radeon Pro cards already have).
The key here is the memory bus width. I think 16 GB would have been perfect, but without a hack like the 970's that would be impossible, and I'm not sure they want to do that on a flagship GPU.
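The arithmetic behind that: a 384-bit bus divided by 32-bit chips means 12 chips, so with 8 Gb (1 GB) or 16 Gb (2 GB) densities you land on 12 GB or 24 GB; 16 GB doesn't divide evenly, hence the need for a 970-style asymmetric setup or a narrower bus.

# Capacity options for a 384-bit bus populated with 32-bit GDDR6X chips
bus_width, chip_width = 384, 32
chips = bus_width // chip_width  # 12
for density_gb in (1, 2):        # 8 Gb and 16 Gb chip densities
    print(f"{chips} x {density_gb} GB = {chips * density_gb} GB")
# 12 x 1 GB = 12 GB
# 12 x 2 GB = 24 GB -> no even split gives 16 GB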
Most implementations of resource streaming so far have relied on fetching textures based on GPU feedback, which results in "texture popping" and stutter. This approach is a dead end and will never work well. The proper way to solve it is to prefetch resources into a cache, which is not a hard task for most games, but still something that needs to be tailored to each game. The problem here is that most game developers use off-the-shelf universal game engines and write no low-level engine code at all.

It may, but I'm very skeptical about the usefulness of this. It will be yet another special feature which practically no game engine will implement well, yet another "gimmick" for game developers to develop for, and a QA nightmare.
But most importantly, it's hardware which serves no need. Resource streaming is not hard to implement at the render engine level. As long as you prefetch well, it's no problem to stream even from an HDD, decompress, and then load into VRAM.
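A minimal sketch of the prefetch approach described above, assuming a background loader thread and a hypothetical load_asset_from_disk function (illustrative only, not any engine's actual API): game logic predicts what will be needed, a worker streams and decompresses it into a cache, and the render path only ever reads from that cache.

import threading, queue

class StreamingCache:
    # Prefetch-based resource streaming sketch: the render loop never waits on disk.
    def __init__(self, loader):
        self.loader = loader              # e.g. load_asset_from_disk (hypothetical)
        self.cache = {}
        self.requests = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        while True:
            asset_id = self.requests.get()
            if asset_id not in self.cache:
                self.cache[asset_id] = self.loader(asset_id)  # disk read + decompress

    def prefetch(self, asset_ids):
        # Called by game logic ahead of time, e.g. "player is approaching area X".
        for asset_id in asset_ids:
            self.requests.put(asset_id)

    def get(self, asset_id):
        # Render path: ideally always a hit; fall back to a low-res placeholder on a miss.
        return self.cache.get(asset_id, "low_res_placeholder")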
As for an SSD texture cache, time will tell. It's already a game changer on pro cards for offline rendering. SSDs are becoming cheap, and adding 1 TB to a $600 video card might end up being a minor portion of the total price.
There are advantages to having it on the GPU: lower latency, and the GPU could handle the I/O instead of the CPU, freeing the bandwidth between the CPU and the GPU for something else. The GPU could also use it to store things like more mipmaps, more levels of detail for geometry, etc. The Unreal Engine 5 demo showed what you can do with a more detailed working set.
And I think the fact that most gaming studios use third-party engines is a better thing for technology adoption than the opposite. The big engine makers have to work on getting the technology into the engine to stay competitive, and the studios only have to make use of it while they create their games.
Time will tell.
While SSDs of varying quality are cheap these days*, having one on the graphics card introduces a whole host of new problems. Firstly, if this is going to be some kind of "cache" for games, then it needs to be managed somehow, and the user would probably have to go in and delete or prioritize it for various games. Secondly, having it on a normal SSD and going through RAM has a huge benefit: you can apply lossy compression (probably ~10-20x), decompress it on the CPU, and send the decompressed data to the GPU (some rough numbers below). This way, each large game wouldn't take up >200 GB. To do the same on the graphics card would require even more unnecessary dedicated hardware. A graphics card should do graphics, not everything else.
*) Be aware that most "fast" SSDs only have a tiny SLC cache that's actually fast, plus a large TLC/QLC portion which is much slower.

These big universal engines leave very little room to utilize hardware well. There are barely any games properly written for DirectX 12 yet, so how do you imagine more exotic gimmicks will end up?
Every year which passes by, these engines get more bloated. More and more generalized rendering code means less efficient rendering. The software is really lagging behind the hardware.
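Putting rough numbers on the compression point above (the decode throughput is an assumed figure, purely for illustration): at the ~10-20x lossy ratios mentioned, a game whose assets would occupy over 200 GB uncompressed needs only 10-20 GB on disk, and streaming speed is then bounded by CPU decompression rather than the SSD.

# Rough numbers for the store-compressed / decompress-on-CPU pipeline discussed above
uncompressed_gb = 200
for ratio in (10, 20):                     # lossy compression ratios mentioned above
    print(f"{ratio}x compression: ~{uncompressed_gb / ratio:.0f} GB on disk")

decode_gbs = 2.0                           # assumed per-core decode throughput (illustrative)
print(f"Refilling a 1 GB working set: ~{1 / decode_gbs:.1f} s per core")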
Micron does tend to stack up launch issues with their new GDDR stuff. Pascal also had a mandatory update due to instability on GDDR5X. Additionally, it's well known that Samsung chips are better clockers for the past gen(s).

I don't recall issues with GDDR5X... link me so I can see? :)
Overclocking is also beyond my purview, but that is true. My point was simply to clarify that both Micron- and Samsung-equipped cards had the same issue.
EDIT: Correction; the title says GDDR5X, but the issue was on GDDR5. My bad.