During the prestigious IEDM 2024 conference, NVIDIA presented its vision for future AI accelerator design, which the company plans to pursue in coming accelerator generations. The limits of chip packaging and silicon innovation are already being stretched, so future AI accelerators may need additional verticals to deliver the required performance improvements. The design proposed at IEDM 2024 puts silicon photonics (SiPh) at center stage. NVIDIA's architecture calls for 12 SiPh connections for intra-chip and inter-chip links, with three connections per GPU tile across four GPU tiles per tier. This marks a significant departure from traditional interconnect technologies, which have historically been limited by the physical properties of copper.
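For clarity, the per-tier link count implied by those figures can be tallied in a quick sketch (the variable names are ours, not NVIDIA's; the numbers are the ones stated above):

```python
# Back-of-the-envelope tally of the SiPh link count in NVIDIA's IEDM 2024 concept.
siph_links_per_tile = 3   # SiPh connections per GPU tile (per the presentation)
tiles_per_tier = 4        # GPU tiles per tier (per the presentation)

siph_links_per_tier = siph_links_per_tile * tiles_per_tier
print(f"SiPh links per tier: {siph_links_per_tier}")  # 3 x 4 = 12, matching the stated total
```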
Perhaps the most striking aspect of NVIDIA's vision is the introduction of so-called "GPU tiers": a novel approach that stacks GPU components vertically. This is complemented by an advanced 3D-stacked DRAM configuration featuring six memory units per tile, enabling fine-grained memory access and substantially improved bandwidth. This stacked DRAM would have a direct electrical connection to the GPU tiles, mimicking AMD's 3D V-Cache on a larger scale. However, the implementation timeline reflects the significant technological hurdles that must be overcome. Scaling up silicon photonics manufacturing presents a particular challenge, with NVIDIA requiring the capacity to produce over one million SiPh connections monthly to make the design commercially viable. NVIDIA has invested in Lightmatter, which builds photonic packages for scaling compute, so some form of its technology could end up in future NVIDIA accelerators.
Thermal management emerges as another critical consideration. The multi-tier GPU design introduces complex cooling challenges that current technology cannot adequately address. NVIDIA acknowledges that significant advances in materials science will be necessary before stacking DRAM on logic on logic can become a reality. The company is exploring innovative solutions, including intra-chip cooling systems such as module-level cooling with dedicated cold plates. It will take some time before this design is commercialized, and analysts such as Dr. Ian Cutress predict that a product utilizing this technology could go live sometime in 2028-2030.
View at TechPowerUp Main Site | Source