Monday, December 9th 2024
NVIDIA Shows Future AI Accelerator Design: Silicon Photonics and DRAM on Top of Compute
During the prestigious IEDM 2024 conference, NVIDIA presented its vision for future AI accelerator design, which the company plans to pursue in upcoming accelerator iterations. The limits of chip packaging and silicon innovation are already being stretched, and future AI accelerators may need additional vertical integration to achieve the required performance gains. The design proposed at IEDM 2024 puts silicon photonics (SiPh) at center stage: NVIDIA's architecture calls for 12 SiPh connections for intra-chip and inter-chip communication, with three connections per GPU tile across four GPU tiles per tier. This marks a significant departure from traditional interconnect technologies, which have historically been limited by the physical properties of copper.
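As a rough way to picture the described topology, here is a minimal Python sketch; only the tile and link counts come from the presentation, and every name in the code is an illustrative assumption:

```python
# Illustrative model of the interconnect topology NVIDIA described at IEDM 2024:
# four GPU tiles per tier, three silicon-photonics (SiPh) links per tile.
# Class and field names are hypothetical; only the counts come from the talk.

from dataclasses import dataclass

@dataclass
class GpuTier:
    tiles: int = 4                 # GPU tiles per tier, per the presentation
    siph_links_per_tile: int = 3   # SiPh connections per tile, per the presentation

    @property
    def siph_links(self) -> int:
        # 4 tiles x 3 links = the 12 SiPh connections cited above
        return self.tiles * self.siph_links_per_tile

tier = GpuTier()
print(f"SiPh connections per tier: {tier.siph_links}")  # -> 12
```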
Perhaps the most striking aspect of NVIDIA's vision is the introduction of so-called "GPU tiers," a novel approach that stacks GPU components vertically. This is complemented by an advanced 3D-stacked DRAM configuration featuring six memory units per tile, enabling fine-grained memory access and substantially improved bandwidth. The stacked DRAM would have a direct electrical connection to the GPU tiles, mimicking AMD's 3D V-Cache on a larger scale.

However, the timeline for implementation reflects the significant technological hurdles that must be overcome. Scaling up silicon photonics manufacturing presents a particular challenge, with NVIDIA requiring the capacity to produce over one million SiPh connections monthly to make the design commercially viable. NVIDIA has invested in Lightmatter, which builds photonic packages for scaling compute, so some form of its technology could end up in future NVIDIA accelerators.

Thermal management emerges as another critical consideration. The multi-tier GPU design introduces complex cooling challenges that current technology cannot adequately address. NVIDIA acknowledges that significant advances in materials science will be necessary before stacking DRAM on logic on logic can become a reality. The company is exploring innovative solutions, including intra-chip cooling systems such as module-level cooling with dedicated cold plates. It will take some time before this design is commercialized; analysts like Dr. Ian Cutress predict a product utilizing this technology could go live sometime in 2028-2030.
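For a sense of scale, here is a back-of-the-envelope sketch of the stacked-DRAM arrangement described above; the per-stack bandwidth figure is a purely illustrative assumption, as NVIDIA has published no such number:

```python
# Back-of-the-envelope sketch of the stacked-DRAM configuration described above:
# six DRAM units per tile, four tiles per tier.
DRAM_UNITS_PER_TILE = 6       # from the description above
TILES_PER_TIER = 4            # from the description above
ASSUMED_TBPS_PER_STACK = 1.0  # hypothetical 1 TB/s per stack, for illustration only

stacks_per_tier = DRAM_UNITS_PER_TILE * TILES_PER_TIER  # 24 stacks per tier
aggregate_tbps = stacks_per_tier * ASSUMED_TBPS_PER_STACK

print(f"DRAM stacks per tier: {stacks_per_tier}")
print(f"Illustrative aggregate bandwidth: {aggregate_tbps:.0f} TB/s per tier")
```

Even under this conservative assumed per-stack figure, 24 stacks per tier suggests why NVIDIA emphasizes fine-grained access and bandwidth as the payoff for taking on the stacking and cooling challenges.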
Source:
@IanCutress on X
13 Comments on NVIDIA Shows Future AI Accelerator Design: Silicon Photonics and DRAM on Top of Compute
So far, Nvidia chips have been single, planar, monolithic designs, while others (e.g., Apple, Intel, AMD) have played around with chiplets, fused chips, stacked chips, and even photonic chip research. Even their ‘super chips' are separate monolithic CPU and GPU dies in separate ‘sockets'.
Novel hardware design might be Nvidia’s Achilles’ heel.
Blackwell for the AI market (B100/B200) is two dies bonded together.
Another aspect is the track record of their monolithic designs. They've been able to dominate the consumer, professional, and data center markets with them so far. Intel tried a complex multi-chiplet design with Ponte Vecchio and failed. AMD's "chiplet" GPUs (in quotes because the main die is still monolithic) aren't doing so hot either, but their Instinct parts look great.
What's used here is silicon photonics, meaning the integration of optical communication directly onto the chip instead of having to go off-chip over copper to dedicated network cards, which increases latency and power usage. Unless we hear directly from NVIDIA, we can't know for certain.
The speed of light in fiber is around 70% of the speed in vacuum, while electromagnetic propagation in copper is up to 98% of the speed of light in vacuum.
The difference is the capacitive and inductive coupling that causes interference and electrical loading in copper. The "speed" of copper is actually higher until you reach the multi-spectral capability of fiber optics, which requires diffraction splitting for each "channel," and we are 50 years from making that fit under or between a GPU die and cache/memory. This is PR spin and perhaps an attempt at patent trolling.
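For what it's worth, a quick back-of-the-envelope check of the propagation numbers in the two comments above; the 5 cm link length is an assumption for illustration:

```python
# Propagation delay over an assumed on-package distance, using the velocity
# factors quoted in the comments above (~0.70c for fiber, ~0.98c for copper).
C = 299_792_458    # speed of light in vacuum, m/s
DISTANCE_M = 0.05  # assumed 5 cm on-package link; purely illustrative

for medium, velocity_factor in [("fiber (~0.70c)", 0.70), ("copper (~0.98c)", 0.98)]:
    delay_ps = DISTANCE_M / (C * velocity_factor) * 1e12
    print(f"{medium}: {delay_ps:.0f} ps over {DISTANCE_M * 100:.0f} cm")
# -> fiber: ~238 ps, copper: ~170 ps
```

The gap works out to roughly 70 picoseconds at that distance, which is small next to serialization and switching latency; the case for photonics rests on bandwidth density and power rather than raw propagation speed.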
whoops, they're doing it again, hahaha (sung to the melody of the Britney Spears jingle of yesteryear) :D
But as already stated, if they believe they can make a buck (or a gazillion or two) from it, they will bring it to market in a friggin' heartbeat!