NVIDIA Shows Future AI Accelerator Design: Silicon Photonics and DRAM on Top of Compute
During the prestigious IEDM 2024 conference, NVIDIA presented its vision for the future AI accelerator design, which the company plans to chase after in future accelerator iterations. Currently, the limits of chip packaging and silicon innovation are being stretched. However, future AI accelerators might need some additional verticals to gain the required performance improvement. The proposed design at IEDM 24 introduces silicon photonics (SiPh) at the center stage. NVIDIA's architecture calls for 12 SiPh connections for intrachip and interchip connections, with three connections per GPU tile across four GPU tiles per tier. This marks a significant departure from traditional interconnect technologies, which in the past have been limited by the natural properties of copper.
Perhaps the most striking aspect of NVIDIA's vision is the introduction of so-called "GPU tiers"—a novel approach that appears to stack GPU components vertically. This is complemented by an advanced 3D stacked DRAM configuration featuring six memory units per tile, enabling fine-grained memory access and substantially improved bandwidth. This stacked DRAM would have a direct electrical connection to the GPU tiles, mimicking the AMD 3D V-Cache on a larger scale. However, the timeline for implementation reflects the significant technological hurdles that must be overcome. The scale-up of silicon photonics manufacturing presents a particular challenge, with NVIDIA requiring the capacity to produce over one million SiPh connections monthly to make the design commercially viable. NVIDIA has invested in Lightmatter, which builds photonic packages for scaling the compute, so some form of its technology could end up in future NVIDIA accelerators
Perhaps the most striking aspect of NVIDIA's vision is the introduction of so-called "GPU tiers"—a novel approach that appears to stack GPU components vertically. This is complemented by an advanced 3D stacked DRAM configuration featuring six memory units per tile, enabling fine-grained memory access and substantially improved bandwidth. This stacked DRAM would have a direct electrical connection to the GPU tiles, mimicking the AMD 3D V-Cache on a larger scale. However, the timeline for implementation reflects the significant technological hurdles that must be overcome. The scale-up of silicon photonics manufacturing presents a particular challenge, with NVIDIA requiring the capacity to produce over one million SiPh connections monthly to make the design commercially viable. NVIDIA has invested in Lightmatter, which builds photonic packages for scaling the compute, so some form of its technology could end up in future NVIDIA accelerators