Tuesday, November 29th 2022

Samsung Develops GDDR6W Memory Standard: Double the Bandwidth and Density of GDDR6 Through Packaging Innovations

As advanced graphics and display technologies develop, they are blurring the lines between the metaverse and our everyday experience. Much of this important shift is being made possible by the advancement of memory solutions designed for graphics products. One of the biggest challenges for improving virtual reality is taking the complexities of real-world objects and environments and recreating them in a virtual space. Doing so requires massive memory and increased computing power. At the same time, the benefits of creating a more true-to-life metaverse will be far-reaching, including real-life simulations of complicated scenarios, sparking innovation across a number of industries.

This is the central idea behind one of the most popular concepts in virtual reality: the digital twin. A digital twin is a virtual representation of an object or space. Updated in real time in accordance with the actual environment, a digital twin spans the lifecycle of its source and uses simulation, machine learning and reasoning to help decision-making. While until recently this was not a feasible proposition due to limitations on data processing and transfer, digital twins are now gaining traction thanks to the availability of high-bandwidth technologies.

Like other areas of tech, the gaming industry thrives on constant innovation, with new updates in speed and performance driving the market forward year after year. Thanks to the development of technologies like ray tracing in 3D rendering, which traces the reflection of light in a given scene, graphics in high-end AAA gaming are becoming hyper-realistic and increasingly immersive.

Ray tracing enables the collection of light information to determine the color of each pixel through real-time calculation. This kind of calculation requires near-simultaneous computation of substantial amounts of data, between 60 and 140 pages' worth for one second of an in-game scene. What's more, display quality is rising fast, with resolutions rapidly transitioning from 4K to an 8K standard, while frame buffers are growing to twice the size of existing ones in response. That's why high capacity and high bandwidth are essential to meeting the growing memory demand as games continue to develop.

Developing 'GDDR6W' Graphics Memory, with Doubled Capacity and Performance Based on the Cutting-edge Fan-Out Wafer-Level Packaging (FOWLP) Technology
High performance, high capacity and high bandwidth memory solutions are helping bring the virtual realm to a closer match with reality. To meet this growing market demand, Samsung Electronics has developed GDDR6W (x64): the industry's first next-generation graphics DRAM technology.

GDDR6W builds on Samsung's GDDR6 (x32) products by introducing a Fan-Out Wafer-Level Packaging (FOWLP) technology, drastically increasing memory bandwidth and capacity.

Since its launch, GDDR6 has already seen significant improvements. Last July, Samsung developed a 24 Gbps GDDR6 memory, the industry's fastest graphics DRAM. GDDR6W doubles that bandwidth (performance) and capacity while remaining identical in size to GDDR6. Thanks to the unchanged footprint and the use of FOWLP construction and stacking technology, new memory chips can easily be put into the same production processes customers have used for GDDR6, cutting manufacturing time and costs.

As shown in the picture below, since the package can be equipped with twice as many memory chips at an identical size, the graphics DRAM capacity has increased from 16Gb to 32Gb, while bandwidth and the number of I/Os have doubled from 32 to 64. In other words, the area required for memory has been reduced by 50% compared to previous models.

Generally, the size of a package increases as more chips are stacked. But there are physical factors that limit the maximum height of a package. What's more, though stacking chips increases capacity, there is a trade-off in heat dissipation and performance. In order to overcome these trade-offs, we've applied our FOWLP technology to GDDR6W.

FOWLP technology directly mounts memory die on a silicon wafer, instead of a PCB. In doing so, RDL (Re-distribution layer) technology is applied, enabling much finer wiring patterns. Additionally, as there's no PCB involved, it reduces the thickness of the package and improves heat dissipation.

The height of the FOWLP-based GDDR6W is 0.7 mm - 36% slimmer than the previous package with a height of 1.1 mm. And despite the chip being multi-layered, it still offers the same thermal properties and performance as the existing GDDR6. Unlike GDDR6, however, the bandwidth of the FOWLP-based GDDR6W can be doubled thanks to the expanded I/O per single package.

Packaging refers to the process of cutting fabricated wafers into semiconductor shapes or connecting wires. In the industry, this is known as a 'back-end process.' While the semiconductor industry has continuously developed towards scaling circuits as much as possible during the front-end process, packaging technology is becoming more and more important as the industry approaches the physical limits of chip sizes. That's why Samsung is using its 3D IC package technology in GDDR6W, creating a single package by stacking a variety of chips in a wafer state. This is one of many innovations planned to make advanced packaging for GDDR6W faster and more efficient.

The newly developed GDDR6W technology can support HBM-level bandwidth at a system level. HBM2E has a system-level bandwidth of 1.6 TB/s based on 4K system-level I/O and a 3.2 Gbps transmission rate per pin. GDDR6W, on the other hand, can produce a bandwidth of 1.4 TB/s based on 512 system-level I/O and a transmission rate of 22 Gbps per pin. Furthermore, since GDDR6W reduces the number of I/O to about 1/8 of that used with HBM2E, it removes the necessity of using microbumps. That makes it more cost-effective, without the need for an interposer layer.
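The system-level figures quoted here follow from a simple product of I/O count and per-pin rate. The short Python sketch below reproduces that arithmetic; it is an illustration rather than anything from Samsung, the function name is made up for this example, and it assumes that "4K" I/O means 4096 pins and that 1 TB/s is 1000 GB/s.

```python
def system_bandwidth_gb_per_s(io_pins: int, gbps_per_pin: float) -> float:
    """Peak bandwidth in GB/s: pin count x per-pin rate (Gbit/s) / 8 bits per byte."""
    return io_pins * gbps_per_pin / 8


# Assumption: "4K system-level I/O" is read as 4096 pins.
hbm2e = system_bandwidth_gb_per_s(4096, 3.2)
gddr6w = system_bandwidth_gb_per_s(512, 22)

print(f"HBM2E : {hbm2e:.1f} GB/s (~{hbm2e / 1000:.1f} TB/s)")
print(f"GDDR6W: {gddr6w:.1f} GB/s (~{gddr6w / 1000:.1f} TB/s)")
print(f"GDDR6W uses 1/{4096 // 512} of the HBM2E I/O count")
```

Under these assumptions the results round to the 1.6 TB/s and 1.4 TB/s stated in the text, and 512 I/O is one eighth of 4096.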

"By applying an advanced packaging technology to GDDR6, GDDR6W delivers twice the memory capacity and performance of similar-sized packages," said CheolMin Park, Vice President of New Business Planning, Samsung Electronics Memory Business. "With GDDR6W, we're able to foster differentiated memory products that can satisfy various customer needs - a major step towards securing our leadership in the market."

Samsung Electronics completed the JEDEC standardization for GDDR6W products in the second quarter of this year. It has also announced that it will expand the application of GDDR6W to small form factor devices such as notebooks as well as new high-performance accelerators used for AI and HPC applications, through cooperation with its GPU partners.

30 Comments on Samsung Develops GDDR6W Memory Standard: Double the Bandwidth and Density of GDDR6 Through Packaging Innovations

#1
agent_x007
So... they basically increased bit bus width per chip by doubling dies used per chip (stacked), without increasing the physical size of it and actually making it slimmer (height). I guess number of BGAs at the bottom increased though, since I don't think you can increase bus width without increasing connections to chip itself.

512-bit bus is fine, but wouldn't 768-bit be possible with this?
#2
wolf
Better Than Native
I wonder if this will be leveraged for double the speed and capacity, or just leveraged for the currently planned speeds and capacities by using half the amount of chips, saving money but likely not passing that saving on to consumers...
#3
agent_x007
Yup. The big questions are, how much is Samsung asking for this vs. usual G6 memory, and how much manufacturing capacity it has.
#4
WhoDecidedThat
AMD 7900XTX has 24 GB 384-bit @ 20 GHz = 960 GB/sec and needs 12 memory chips

Theoretical 32 GB 512-bit @ 22 GHz = 1408 GB/sec and needs 8 (which is 4 less) memory chips
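The chip counts and capacities in this comparison can be checked with a small Python sketch. It is only an illustration: the helper name is invented here, and it assumes 16Gb x32 GDDR6 chips for the 384-bit card and 32Gb x64 GDDR6W packages for the theoretical 512-bit card, matching the per-package figures in the article.

```python
def memory_config(bus_width_bits, io_per_chip, gbit_per_chip, gbps_per_pin):
    """Return (chip count, capacity in GB, bandwidth in GB/s) for a memory bus."""
    chips = bus_width_bits // io_per_chip        # one chip per io_per_chip-wide slice of the bus
    capacity_gb = chips * gbit_per_chip / 8      # Gbit -> GB
    bandwidth = bus_width_bits * gbps_per_pin / 8  # Gbit/s -> GB/s
    return chips, capacity_gb, bandwidth


# Assumed parts: GDDR6 = 16 Gb, x32; GDDR6W = 32 Gb, x64.
gddr6_card = memory_config(384, 32, 16, 20)
gddr6w_card = memory_config(512, 64, 32, 22)

print("GDDR6  384-bit:", gddr6_card)
print("GDDR6W 512-bit:", gddr6w_card)
```

With those assumptions the sketch gives 12 chips, 24 GB and 960 GB/s for the GDDR6 case, and 8 packages, 32 GB and 1408 GB/s for the GDDR6W case.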
#5
Denver
That's the 7950 XTX² memory :P
#6
Wirko
I don't understand. Are silicon dies *not* usually stacked in a GDDR package? DDR makers have employed stacking for decades.
#7
thegnome
This feels like HBM, but instead of more separate HBM chips it's stacking GDDR6 to achieve a similar effect...?
#8
natr0n
Looks like RAM trains.
#9
P4-630
How about extra heat with that stacking...
#10
AusWolf
Why W? I think they need an X to make it more marketable... oh wait...
#11
bonehead123
agent_x007 said: Yup. The big questions are, how much is Samsung asking for this
As the saying goes: "If you have to ask, then you probably can't afford it", hehehe :D /s

Perhaps they will be able to translate this tech to their m.2's, so we can FINALLY get some seriously higher capacity drives instead of the puny 1, 2, 4, & 8TB ones that we have to settle for ATM
#12
Punkenjoy
agent_x007 said: So... they basically increased bit bus width per chip by doubling dies used per chip (stacked), without increasing the physical size of it and actually making it slimmer (height). I guess number of BGAs at the bottom increased though, since I don't think you can increase bus width without increasing connections to chip itself.

512-bit bus is fine, but wouldn't 768-bit be possible with this?
They are already possible with current technology; it's just not worth it cost-wise. There were a few gens that used a 512-bit bus, but the cost of routing it on the PCB was not worth it. (Also, faster memory and increased cache removed the need. You still need to have a use for all that bandwidth.)

This tech is more about packaging than pure speed. A desktop card will be able to use half as many memory chips for the same bus size. They will be able to use a smaller PCB and a smaller heatsink, and simplify the packaging. I don't think that it will be a huge benefit on that side, except maybe for low-profile cards and similar stuff.

I think the real benefits will be on mobile GPUs. They will be able to do much denser packaging, leaving more room for cooling or other stuff.
WhoDecidedThat said: AMD 7900XTX has 24 GB 384-bit @ 20 GHz = 960 GB/sec and needs 12 memory chips

Theoretical 32 GB 512-bit @ 22 GHz = 1408 GB/sec and needs 8 (which is 4 less) memory chips
You would also need 8 MCDs, and there is no room to connect that many to Navi31; it would require a larger chip. And you still have to face the increased cost of routing a 512-bit bus. (Plus, on RDNA3, the added cost of routing 2 more MCDs to the GCD.)
#13
defaultluser
So, is this intended to replace Micron GDDR6X?
#14
ADB1979
bonehead123 said: As the saying goes: "If you have to ask, then you probably can't afford it", hehehe :D /s

Perhaps they will be able to translate this tech to their m.2's, so we can FINALLY get some seriously higher capacity drives instead of the puny 1, 2, 4, & 8TB ones that we have to settle for ATM
Totally different technology, RAM and NAND are simply not the same.

Also, NAND is already stacked in a MASSIVE way. If you buy an 8TB M.2 SSD, it will likely be using NAND that is 144 layers!
#15
dragontamer5788
What the hell is this "metaverse" crap being added to the start of the Samsung Press release?

I recognize you're just copy/pasting from Samsung, but... wtf is going on here? Did Facebook / Meta / Mark Zuckerberg just pay a ton of ad money to Samsung (and Fidelity for that matter: institutional.fidelity.com/app/funds-and-products/etf/snapshot/FIIS_ETF_FMET/fidelity-metaverse-etf.html) ??

There's clearly a media blitz going on trying to make Metaverse a thing. It feels very astroturfed / fake to me though. Anyone else getting this feeling?

--------

Anyway, GDDR6W, faster and better than GDDR6X (EDIT: Whoops, better than GDDR6. Seems like GDDR6X might be a "different branch"?? Maybe incompatible with this new packaging format??). Got it. Good news for GPU makers I guess, but... it's a stretch to paint this as part of the "metaverse" (whatever that is...). I'm cool with the overall information being presented, but I'm just... curious... where this metaverse social-media blitz is coming from.
#16
R-T-B
Wirko said: I don't understand. Are silicon dies *not* usually stacked in a GDDR package? DDR makers have employed stacking for decades.
Have they? I think you are confusing it with NAND, where stacking has been used for around a decade or so now. It is possible I am wrong. This is not my strongest field.
#17
bonehead123
dragontamer5788 said: What the hell is this "metaverse" crap
It is YOU, and YOU are it, and it is everything, and everything is it, or at least existing within IT. And if you are not, it will consume/surround/engulf you soon enough :) /s
ADB1979 said: Also, NAND is already stacked in a MASSIVE way. If you buy an 8TB M.2 SSD, it will likely be using NAND that is 144 layers!
Yea I know, but I was merely suggesting that any advances/improvements in the overall tech that come from this work could possibly be applied to other types of things that have memory in them. And BTW, high-end SSDs are currently at 176 layers, and 192 will be here before you can say w.T.f.... :D
#18
Wirko
R-T-B said: Have they? I think you are confusing it with NAND, where stacking has been used for around a decade or so now. It is possible I am wrong. This is not my strongest field.
Modern NAND uses both forms of stacking: monolithic stacking of 176 (or so) layers on each single die, and stacking of 4 or 8 (or so) dies on a BGA package.

Modern and less modern DRAM uses the latter. I found these sources from 2007 that mention through-silicon vias as the new tech versus wire bonding as the old tech used to connect the dies:
phys.org/news/2007-04-samsung-highly-efficient-stacking-dram.html
bit-tech.net/reviews/tech/memory/the_secrets_of_pc_memory_part_2/6/

Monolithic stacking in DRAM is still on the drawing board, I'm sure a lot of money is being poured into research, and it will be a breakthrough when it arrives, but it's not about to arrive soon.
#19
R-T-B
Wirko said: Modern NAND uses both forms of stacking: monolithic stacking of 176 (or so) layers on each single die, and stacking of 4 or 8 (or so) dies on a BGA package.

Modern and less modern DRAM uses the latter. I found these sources from 2007 that mention through-silicon vias as the new tech versus wire bonding as the old tech used to connect the dies:
phys.org/news/2007-04-samsung-highly-efficient-stacking-dram.html
bit-tech.net/reviews/tech/memory/the_secrets_of_pc_memory_part_2/6/

Monolithic stacking in DRAM is still on the drawing board, I'm sure a lot of money is being poured into research, and it will be a breakthrough when it arrives, but it's not about to arrive soon.
Thanks. Stuff like this helps us all stay sharp.
#20
Wirko
dragontamer5788 said: What the hell is this "metaverse" crap being added to the start of the Samsung Press release?
What Samsung is probably thinking: We don't know what that crap is but hey, we still understand that it will need teradollars' worth of RAM to run, so we're all for it!
#21
AusWolf
Wirko said: What Samsung is probably thinking: We don't know what that crap is but hey, we still understand that it will need teradollars' worth of RAM to run, so we're all for it!
I don't think anyone cares about it, to be honest. They probably just copy-pasted some marketing BS from Facebook because they got paid to do so.
#22
R-T-B
AusWolf said: I don't think anyone cares about it, to be honest. They probably just copy-pasted some marketing BS from Facebook because they got paid to do so.
OT but: Personally, it will never fail to amuse me how Zuckerberg's avatar seems more human than real pictures of the man.
#23
ADB1979
R-T-B said: OT but: Personally, it will never fail to amuse me how Zuckerberg's avatar seems more human than real pictures of the man.
Even further off topic, but he did have a malfunction in public where by his own words he created the question of whether or not he is Human :laugh:
#24
WhoDecidedThat
Punkenjoy said: You would also need 8 MCDs, and there is no room to connect that many to Navi31; it would require a larger chip.
I wasn't talking about RDNA3 using GDDR6W. I'm sorry if I gave that impression. I was just talking bandwidth in general.

I agree that each RDNA3 MCD has a 64-bit wide memory bus and it's difficult to fit more than 6 on a single package. However, a future RDNA4 MCD could have a 96-bit/128-bit wide memory bus. That's what GDDR6W is targeting anyway.
Punkenjoy said: And you still have to face the increased cost of routing a 512-bit bus.
Moar bandwidth comes at an increased cost anyway. That's expected.

On another note, Nvidia would have benefitted a lot from something like this in 2020 with Ampere (in terms of VRAM size).

RTX 3080 came with 320-bit 19 GHz = 760 GB/sec bandwidth but only had 10 GB capacity.

Considering GDDR6 was at 14 GHz in Ampere, 448-bit 14 GHz = 784 GB/sec bandwidth with 14 GB VRAM size.
#25
chrcoluk
wolf said: I wonder if this will be leveraged for double the speed and capacity, or just leveraged for the currently planned speeds and capacities by using half the amount of chips, saving money but likely not passing that saving on to consumers...
The latter, of course, as SMR and QLC have proven.