Thursday, June 19th 2014
NVIDIA GeForce GTX 880 and GTX 870 to Launch This Q4
NVIDIA is planning to launch its next high-performance single-GPU graphics cards, the GeForce GTX 880 and GTX 870, no later than Q4-2014, in the neighborhood of October and November, according to a SweClockers report. The two will be based on the brand new "GM204" silicon, which, most reports suggest, is built on the existing 28 nm silicon fab process. Delays by NVIDIA's principal foundry partner TSMC in implementing its next-generation 20 nm process have reportedly forced the company to design a new breed of "Maxwell" based GPUs on the existing 28 nm process. The architecture's good showing with efficiency on the GeForce GTX 750 series probably gave NVIDIA hope. When 20 nm is finally running smoothly, it wouldn't surprise us if NVIDIA optically shrinks these chips to the new process, as it did with the G92 (from 65 nm to 55 nm). The GM204 chip is rumored to feature 3,200 CUDA cores, 200 TMUs, 32 ROPs, and a 256-bit wide GDDR5 memory interface. It succeeds the company's current workhorse chip, the GK104.
Source:
SweClockers
72 Comments on NVIDIA GeForce GTX 880 and GTX 870 to Launch This Q4
20nm will only be for the extreme high end this generation and will only be used in cases where it's impossible to manufacture a larger 28nm chip (e.g. you can't make a 28nm, 15 billion transistor, 1100mm^2 GM100). 20nm won't become mainstream until NVidia (or anyone else) can't achieve their performance targets on 28nm, which likely will not happen until the generation after this.
But that 28nm vs. 20nm thing makes it feel like an in-between product you don't want, imo.
___
I don't understand why people think a 256-bit/32 ROP chip is going to have something like 3200sp. That makes absolutely no sense. Half that (according to nvidia-speak), at most, is feasible.
One of those components, at least, is wrong. It could be 256-bit/32 ROPs/1536(1920), or, given that we know it is 8GB (and sixteen 4Gb chips is a lot for a mid-range part), 512-bit/64/3200, or some combo of more cache/256-bit/64 ROPs/3200, because the design will probably be shrunk to 20nm, where die size will prohibit a larger bus. You gotta remember 3200sp, or 25 SMM, is essentially similar to 4000sp from AMD. That's a lot of chip, more than actually needed for 64 ROPs on avg (whereas Hawaii would be optimal for 48, if the design allowed it)...and again, if true, we can probably more realistically expect 23-24 unit parts (up to 3072sp), as that makes the most efficient sense. Not unlike Titan, for instance, and the full design is probably a safety net.
I agree it will be shrunk, but I think a more suitable comparison would be G80->G92b...because if accurate we're talking about a huge chip (~4x GM107) transitioning to a process that's supposed to allow somewhere around 1.9x density, granted around 1.2-1.3x performance/power savings. That means going from behemoth size (GT200 was 576mm^2) to large 256-bit size (like GK104, which is 294mm^2 and probably the largest really feasible before having to switch to a wider controller with slower RAM). I certainly see how it could be conceivable to have such a large design on 28nm, and then scale size down and clockspeed up as we move to newer processes. That doesn't necessarily mean its market will change...a small(ish) chip on 20nm/16nm (20nmFF) will likely be very expensive, but the clock improvement/power savings could, at least in the latter case, make the change worth it.
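To put rough numbers on the shrink described above, here is a minimal sketch. It assumes a GM107 die size of 148 mm^2 (a commonly cited figure that the comment itself does not state) and that every block on the die scales perfectly with the quoted 1.9x density gain, which real shrinks rarely achieve.

```python
# Back-of-the-envelope die area before and after an optical shrink.
GM107_AREA_MM2 = 148.0   # assumption: commonly cited GM107 die size, not from the thread
SCALE_VS_GM107 = 4.0     # "~4x GM107" from the comment
DENSITY_GAIN = 1.9       # "around 1.9x density" from the comment


def shrunk_area(area_mm2: float, density_gain: float) -> float:
    """Area after a shrink, assuming the whole die scales with density_gain."""
    return area_mm2 / density_gain


area_28nm = GM107_AREA_MM2 * SCALE_VS_GM107
area_20nm = shrunk_area(area_28nm, DENSITY_GAIN)

print(f"28nm: ~{area_28nm:.0f} mm^2, after shrink: ~{area_20nm:.0f} mm^2")
```

Under those assumptions the 28nm die lands a bit above GT200's size and the shrunk die a bit above GK104's, which is the comparison the comment is making.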
I'm really curious how they could get a 3072sp (equivalent to 3840sp from amd) with 8GB of ram within a decent power envelope, especially in a feasible manner (meaning at least .9v and around 876mhz, the minimum voltage for the process and avg clocks at that voltage). I don't doubt the design is 'possible', especially with low-speed/voltage and higher density ram on a smaller bus (cache is probably more power efficient), but damn....that's pushing it to the edge of feasibility on pretty much all counts.
Call me cynical, but I don't see the 780ti lowering in price and the 880 taking its place. The 880 is going to hit a higher price point. Then there's the simple fact that the 860 is probably going to just be a rebranded 780ti and everything else below will likely be a rebrand too. Ugh... new GPU releases are so disappointing these days... nothing to get excited about, especially when you know the price gouging is imminent.
Remember, GTX 750 Ti outperforms the GTX 650 Ti by 20% and yet it has 20% fewer shaders, so assuming the same scaling, a 1600SP GTX 880 would have almost 50% more performance than GTX 770/680, completely in line with a generational improvement.
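The scaling argument above can be checked with a few lines of arithmetic. This is only a sketch: it assumes performance scales linearly with shader count, uses the comment's rounded 20% figures, and takes 1536 as the GK104 (GTX 770/680) shader count, which the comment does not state.

```python
# Per-shader gain implied by the comment: ~20% faster with ~20% fewer shaders.
PERF_GAIN = 1.20
SHADER_RATIO_MAXWELL_VS_KEPLER = 0.80
perf_per_shader_gain = PERF_GAIN / SHADER_RATIO_MAXWELL_VS_KEPLER  # about 1.5x

GK104_SHADERS = 1536        # assumption: GTX 770/680 shader count
HYPOTHETICAL_GM204 = 1600   # the 1600SP figure floated in the comment


def relative_performance(new_shaders: int, old_shaders: int, per_shader_gain: float) -> float:
    """Estimated speedup, assuming perfectly linear scaling with shader count."""
    return (new_shaders / old_shaders) * per_shader_gain


estimate = relative_performance(HYPOTHETICAL_GM204, GK104_SHADERS, perf_per_shader_gain)
print(f"Estimated speedup over GTX 770/680: {estimate:.2f}x")
```

With these inputs the estimate comes out slightly above 1.5x, so the "almost 50%" figure is, if anything, a little conservative under linear scaling.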
Edit: updated correct card names
Transistor density for 20nm (16nm FEOL + 20nm BEOL) is estimated at 1.9-2.0x that of 28nm.
Wafer costs: 28nm: $4500-5000 per wafer. 20nm: $6000 per wafer, about 1.3x that of 28nm.
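Combining the two estimates above gives a rough cost per transistor on 20nm relative to 28nm. The sketch below deliberately ignores yield, mask costs, and capacity, so it is a best case rather than a real-world figure; the function name is just illustrative.

```python
# Relative cost per transistor = wafer cost ratio / density ratio (yield ignored).
WAFER_COST_20NM = 6000.0
WAFER_COSTS_28NM = (4500.0, 5000.0)   # quoted range for 28nm
DENSITY_GAINS = (1.9, 2.0)            # quoted range for 20nm vs 28nm


def relative_cost_per_transistor(wafer_cost_ratio: float, density_gain: float) -> float:
    """Cost per transistor on the new node as a fraction of the old node."""
    return wafer_cost_ratio / density_gain


ratios = [
    relative_cost_per_transistor(WAFER_COST_20NM / cost_28, gain)
    for cost_28 in WAFER_COSTS_28NM
    for gain in DENSITY_GAINS
]

print(f"20nm cost per transistor: {min(ratios):.2f}x to {max(ratios):.2f}x of 28nm")
```

On paper that is roughly 0.6x to 0.7x, but any yield deficit on the newer process eats directly into that margin.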
Reasons to go with 28nm?
Available capacity
Yields
Would the GPU design benefit from, or require, increased transistor density enough to justify the increased silicon cost at the given price points of the product being sold? The GTX 870/880 (and presumably followed by the GTX 860 Ti) would still likely reside in the $350/$500 segment brackets. Why add to the manufacturing cost when you're under no pressure to do so (since AMD will also go with 28nm for their next iteration of GPUs)?
My guess is that neither Nvidia nor AMD trust TSMC to deliver a large IC in commercial quantity based on TSMC's projections. Given the woes of 32nm and the slow and problematic ramp of 28nm who could blame them?
Until both video cards and 4K monitors support BOTH of these standards I won't be dumping my GTX590 any time soon.
These two standards are a must for 4K IMO.
BTW, I think the 3200 CUDA cores is impossible. If GM107 can pack 640 CUDA cores onto a ~160mm^2 chip, then a 450mm^2 chip can't pack more than ~2000 cores.
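The density argument above is a straight linear extrapolation, sketched here with the comment's own numbers. It assumes cores per mm^2 stay constant as the die grows, which ignores uncore area (memory controllers, cache, display logic) that does not scale with core count.

```python
# Linear extrapolation of CUDA core count from die area.
GM107_CORES = 640
GM107_AREA_MM2 = 160.0    # the comment's approximate figure
TARGET_AREA_MM2 = 450.0   # the comment's hypothetical large 28nm die


def cores_for_area(ref_cores: int, ref_area_mm2: float, target_area_mm2: float) -> float:
    """Core count at target_area_mm2, assuming constant cores per mm^2."""
    return ref_cores / ref_area_mm2 * target_area_mm2


estimate = cores_for_area(GM107_CORES, GM107_AREA_MM2, TARGET_AREA_MM2)
print(f"~{estimate:.0f} cores at {TARGET_AREA_MM2:.0f} mm^2")
```

That works out to about 1800 cores, consistent with the "not more than ~2000" ceiling.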
I expect 15%-20% better performance than the 780 Ti at lower prices, which is great nevertheless!
The difference in cost per transistor between 20nm and 28nm is minimal, making me question whether it's worth putting engineering effort toward shrinking GPUs for a marginal cost savings per GPU (that may never make up the capital expenditure to make new masks and troubleshoot issues) rather than concentrating engineering on completely new GPUs at that smaller process. Unlike in the past, there's a lot more to be gained from a newer, more efficient architecture than from a die shrink.
You can say that the Maxwell is half the size for slightly better than half the performance, although the comparison is somewhat flawed since the Maxwell chip is hampered by a constrained bus width, and the Maxwell chip devotes a larger percentage of its die area in comparison to GK104 to its uncore (the L2 cache is a significant increase, but not particularly relevant to gaming at this time).
As you say, I'd be very sceptical over the 3200 core claim. The GM204 is obviously designed to supplant GK104, not GK110.
And btw Crysis is a turd that can't be polished.