Tuesday, May 5th 2009
GT300 to Boast Around 256 GB/s Memory Bandwidth
Recently, early information on NVIDIA's next-generation GT300 graphics processor surfaced, suggesting that it packs 512 shader processors and an enhanced processing model. A fresh report from Hardware-Infos sheds some light on its memory interface, revealing it to be stronger than that of any production GPU. According to a piece of information that has been bouncing back and forth between Hardware-Infos and Bright Side of News, GT300 might feature a 512-bit wide GDDR5 memory interface.
This memory interface, paired with the lowest-latency GDDR5 memory available, at a theoretical 1000 MHz (2000 MHz DDR), would churn out 256 GB/s of bandwidth, the highest for a GPU so far. Although Hardware-Infos puts the lowest-latency figure at 0.5 ns, the math doesn't work out: at 0.5 ns, memory with an actual clock rate of 1000 MHz would churn out 512 GB/s, so there is a slight inaccuracy there. Qimonda's IDGV1G-05A1F1C-40X leads production today with its "40X" rating. With these chips across a 512-bit interface, the 256 GB/s bandwidth equation is satisfied. The memory clock speeds aren't known just yet; the above is only an example that uses a commonly available high-performance GDDR5 memory chip. The new GPU, at least from these little information leaks, is shaping up to be another silicon monstrosity in the making by NVIDIA.
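As a rough check on the arithmetic above, peak bandwidth is the bus width multiplied by the per-pin data rate, divided by eight to convert bits to bytes. The sketch below assumes that GDDR5 moves four bits per pin per command-clock cycle (so a 1000 MHz clock gives 4 Gbit/s per pin) and that the "40X" rating corresponds to that 4.0 Gbit/s figure. The function names are made up for illustration.

```python
def gddr5_data_rate_gbps(command_clock_mhz: float) -> float:
    """Per-pin data rate in Gbit/s.

    Assumption: GDDR5 transfers 4 bits per pin per command-clock cycle
    (the data clock runs at twice the command clock and is double data rate).
    """
    return command_clock_mhz * 4 / 1000


def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak theoretical bandwidth in GB/s (1 GB = 10^9 bytes)."""
    return bus_width_bits * data_rate_gbps / 8


if __name__ == "__main__":
    rate = gddr5_data_rate_gbps(1000)       # 4.0 Gbit/s per pin
    print(memory_bandwidth_gbs(512, rate))  # 512-bit bus -> 256.0 GB/s
    print(memory_bandwidth_gbs(256, rate))  # 256-bit bus -> 128.0 GB/s
```

Halving the bus width at the same data rate halves the result, which is why the 512-bit interface is the interesting part of the rumor.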
Source:
Hardware-Infos
106 Comments on GT300 to Boast Around 256 GB/s Memory Bandwidth
I've been seeing 512 MB 4870s and GTX 260s (not Core 216s) priced as low as reference-design 4850/9800 GTX+ cards. That says a lot about how far hardware is ahead of software, to get that kind of performance/price ratio. When AMD strikes back it will be very, very good news for everyone's pockets :laugh:
At 65 GB/s you can do 1920x1200; at 120 GB/s you can do 2560x1600.
So where do we need twice that? I think AMD proved that this is not needed when they made the 4770 with a 128-bit bus.
I smell a false rumour, or a new Radeon 2900 XT, just from NVIDIA this time; maybe it will be the champ in 3DMark like the 2900 XT was.
I suspect ATI to be further ahead in what I like to call the Lego strategy; I think that name originally comes from AMD. We have seen the start of this strategy with HD 2xxx -> 3xxx -> 4xxx.
Scalable architecture.
The 3870 X2 was the first step, the 4870 X2 the second, the 4850 X2 the third; scaling and issues are being narrowed down, and we might see lower and lower-end cards with setups like this.
They need a shared memory system to make this good. Nowadays a 4870 X2 or a GTX 295 has ~1 GB of video memory per GPU, and the total video memory usable in games is still ~1 GB, no more than on a single-GPU card.
With a 512-bit interface you're looking at (bare minimum) a 400mm2+ (20x20) die.
Knowing NVIDIA, this part will be made to be shrunk to 32nm without losing its bus, which would mean at least a 500mm2 die.
Minus the bus (which is 2x), this is 4x g92 (which is 754M transistors) + whatever changes they made for MIMD (dual-issue MADD?) + DX11, which should clock in at ~3 billion(+?) transistors, in my guesstimate.
Compared to rv740 (826M, 136mm2) and rv870 (1.25ishB?, 205mm2), we'd be talking a ~23x23 die, or 529mm2, which could realistically shrink to around 400mm2 @ 32nm.
IOW, this mother is gonna be big, and 40nm is not a good process for a big die. I wouldn't expect this to see the light of day until 32nm personally, although TSMC might get their problems worked out later this year, allowing it to happen. Still, it will not be a good-yielding part, nor do I expect high clocks. I figure 700c/1750s sounds doable, with 800/2000 on 32nm.
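The die-size guesstimate above can be roughly reproduced by scaling from transistor density. This is only a sketch using the figures quoted in the comment (754M transistors for g92, 826M and 136mm2 for rv740). It assumes that density carries over unchanged to a much larger chip and that a 40nm to 32nm shrink scales area by the square of the feature-size ratio, which real shrinks rarely achieve. All names are illustrative.

```python
import math

G92_TRANSISTORS_M = 754    # million transistors, as quoted in the comment
RV740_TRANSISTORS_M = 826  # million transistors, as quoted in the comment
RV740_AREA_MM2 = 136       # mm^2, as quoted in the comment


def estimated_area_mm2(transistors_m: float) -> float:
    """Die area if the chip matched rv740's transistor density (assumption)."""
    density = RV740_TRANSISTORS_M / RV740_AREA_MM2  # million transistors per mm^2
    return transistors_m / density


def ideal_shrink_mm2(area_mm2: float, old_nm: float, new_nm: float) -> float:
    """Area after an ideal shrink: scales with the square of the node ratio."""
    return area_mm2 * (new_nm / old_nm) ** 2


if __name__ == "__main__":
    logic = 4 * G92_TRANSISTORS_M      # "4x g92", before bus and MIMD/DX11 extras
    area = estimated_area_mm2(logic)
    print(f"{logic}M transistors -> ~{area:.0f} mm^2, ~{math.sqrt(area):.1f} mm per side")
    print(f"529 mm^2 ideally shrunk to 32nm: ~{ideal_shrink_mm2(529, 40, 32):.0f} mm^2")
```

The density estimate lands a little under 500mm2 before the bus and extra logic are counted, which is in the same range as the 529mm2 guess. The ideal shrink comes out smaller than the ~400mm2 quoted, consistent with that figure being the more conservative, realistic one.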
I believe the r800 gen will be 400sp/16tmu (low-end, 32nm), 800sp (mid-range, 32nm), 1200sp/48tmu (rv870, 40nm) and 1600/64 (rv870 replacement on 32nm). That really makes the most sense, as 'rv890' could replace rv870, with rv870 essentially becoming the 3/4 product of yore after its release. This would be 4-16 arrays, with 100 shaders (or 20 if you like) and 4 tmus per array. 32nm should allow for roughly a 1/3 shrink over 40nm, which would allow these die sizes to stay comparable to the parts preceding them (rv740, rv870).
That's just an informed guess, but I think a realistic one.
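The lineup guessed at above follows directly from the array counts: each array contributes 100 shaders and 4 tmus, and the parts range from 4 to 16 arrays. A minimal sketch of that scaling, using only the per-array figures from the comment (the function and constant names are illustrative):

```python
SHADERS_PER_ARRAY = 100  # per-array figure from the comment
TMUS_PER_ARRAY = 4       # per-array figure from the comment


def part_config(arrays: int) -> tuple[int, int]:
    """Return (shader count, tmu count) for a part built from `arrays` arrays."""
    return arrays * SHADERS_PER_ARRAY, arrays * TMUS_PER_ARRAY


if __name__ == "__main__":
    for arrays in (4, 8, 12, 16):
        shaders, tmus = part_config(arrays)
        print(f"{arrays:2d} arrays: {shaders}sp/{tmus}tmu")
```

The 4-, 12- and 16-array results match the 400sp/16tmu, 1200sp/48tmu and 1600/64 parts named in the comment. The comment gives no tmu count for the 800sp part; 32 is simply what the same per-array rule would imply.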