Wednesday, October 17th 2012
NVIDIA Kepler Refresh GPU Family Detailed
A 3DCenter.org report shed light on what NVIDIA's GPU lineup for 2013 could look like. According to the report, NVIDIA's next-generation GPUs could follow a path similar to the previous-generation "Fermi Refresh" (GF11x), which turned the performance-per-Watt equation back in NVIDIA's favor, though this time the company's current GeForce Kepler family already holds an established energy-efficiency lead. The "Kepler Refresh" family of GPUs (GK11x), according to the report, could bring significant increases in cost-performance through a bit of clever re-shuffling of the GPU lineup.
NVIDIA's GK104 GPU exceeded performance expectations, which allowed it to drive this generation's flagship single-GPU graphics card, the GTX 680, giving the company time to perfect the largest chip of this generation, and giving its foundry partners time to refine the 28 nm manufacturing process. By the time Kepler Refresh goes to market, TSMC will have refined its process enough for mass-production of GK110, a 7.1 billion-transistor chip on which NVIDIA's low-volume Tesla K20 GPU compute accelerator is currently based.
The GK110 will take back the reins of powering NVIDIA's flagship single-GPU product, the GeForce GTX 780. This product could offer a massive 40-55% performance increase over the GeForce GTX 680, with a price ranging anywhere between US $499 and $599. The same chip could even power the second-fastest single-GPU SKU, the GTX 770. The GK110 physically packs 2,880 CUDA cores and a 384-bit wide GDDR5 memory interface.
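To put the rumored numbers in perspective, here is a minimal cost-performance sketch in Python; it assumes the GTX 680's $499 launch price as the baseline, normalizes GTX 680 performance to 1.00, and simply takes the end-points of the reported range for the GTX 780, none of which is confirmed:

```python
# Rough cost-performance comparison: GTX 680 at its $499 launch price vs. the
# rumored GTX 780 at the end-points of the reported performance/price range.
gtx_680 = {"perf": 1.00, "price": 499}

scenarios = {
    "GTX 780 (+40%, $599)": {"perf": 1.40, "price": 599},
    "GTX 780 (+55%, $499)": {"perf": 1.55, "price": 499},
}

baseline = gtx_680["perf"] / gtx_680["price"]

for name, card in scenarios.items():
    ratio = (card["perf"] / card["price"]) / baseline
    print(f"{name}: {ratio:.2f}x the performance per dollar of the GTX 680")
```

Even in the worst case of the rumored range (+40% at $599), performance per dollar would still come out ahead of the GTX 680.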
Moving on, the real successor to the GK104, the GK114, could form the foundation for high-performance SKUs such as the GTX 760 Ti and GTX 760. The chip has exactly the same specifications as the GK104, leaving NVIDIA to tinker with clock speeds to increase performance. The GK114 will be relegated to performance-segment SKUs from the high-end segment it currently powers, so even with minimal increases in clock speed, the chip will deliver sizable performance gains over the current GTX 660 Ti and GTX 660.
Lastly, the GK106 could be refreshed into the GK116, likewise retaining its specifications and leaving room for clock-speed increases, much in the same way as the GK114. It, too, gets a demotion, to the GTX 750 Ti and GTX 750, so with minimal R&D the GTX 750 series gains a sizable performance advantage over the previous generation.
Source: 3DCenter.org
127 Comments on NVIDIA Kepler Refresh GPU Family Detailed
8970 is expected to be 40% faster than the 7970
GTX 780 is expected to be 40-55% faster than the 680
Add in overclocking on both and we end up with the exact same situation as this generation. So in reality it just plain doesn't matter, lol. Performance is all I care about, and who gets product onto store shelves and from there into my hands. Doesn't matter who's fastest if it takes 6 months for stock to catch up.
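Spelling that comment out with toy numbers (the 0.95 relative standing assumed for the current AMD flagship is purely illustrative):

```python
# If both next-generation flagships land roughly 40% above their predecessors,
# the relative gap between the two vendors stays exactly where it is today.
gtx_680, hd_7970 = 1.00, 0.95            # assumed current relative standing
gtx_780, hd_8970 = gtx_680 * 1.40, hd_7970 * 1.40

print(f"current gap:  {gtx_680 / hd_7970:.2f}x")
print(f"next-gen gap: {gtx_780 / hd_8970:.2f}x")  # unchanged by construction
```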
If you go back to the original linked article, the performance gains for the GK114 and GK116 will only be 5-15%. That seems quite low considering the improvements to memory bandwidth, shaders, ROPs, etc. That would suggest NVIDIA may be focusing more on lowering TDP than on pure performance increases. And prices will be increasing too.
I think people may be disappointed by the time these are released. I suspect AMD will show similar improvements next year as well with more focus on TDP.
AMD did this to themselves because they released their 79xx cards fairly horridly underclocked (especially the 7950), and at price points that were too high. They didn't make a move on either front soon enough, and so when Kepler finally hit, reviewers were left looking at a situation where the 7970 was outperformed by a cheaper card. Then the 670 came in, trashed the 7950, and competed with AMD's previously $550 card at $150 less.
Those things defined the impressions most people have of this round. AMD then made the mistake of releasing their GHz edition as a reference card for reviewers, and most reviewers then dismissed it as too loud/etc.
You have to do a decent amount of homework before you start realizing that both companies at this point in time are pretty much dead even, and most people don't like to think that hard.
If AMD had released their 7970 clocked around 1050/1500 MHz for $500 at launch, and their 7950 at maybe 950/1400 for $400, I can guarantee you that the impressions would be different. Pretty much every single 7970/7950 will hit those clocks without messing with voltages, so I have no idea why they got so conservative. But they didn't make those moves, and so here we are.
Regardless, the refresh will probably see Nvidia take the lead, but not by a whole lot; they have more room to play with when it comes to TDP than AMD does right now.
Look back at the 5850 and 5870: clock both to the same speed and the 5850, with fewer shaders but the same ROP count, was within 1-2% of the 5870, so the increased shader count didn't do a whole hell of a lot.
With GCN, shaders scale a bit better, yes, but notice:
the 7870 with 1280 GCN stream processors and 32 ROPs can take on the 7950, which has 32 ROPs and 1792 shaders.
Looking at previous GPUs (scaling worked out in the sketch below):
7770 = 640 shaders, 16 ROPs, 10 Compute Units, 40 TMUs - 3DMark 11 P3500
7870 = 1280 shaders, 32 ROPs, 20 Compute Units, 80 TMUs - 3DMark 11 P6600
7970 = 2048 shaders, 32 ROPs, 32 Compute Units, 128 TMUs - 3DMark 11 P8000
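A quick sketch of what those scores imply for shader scaling; it ignores the clock-speed differences between the three cards, so treat it as a rough indication only:

```python
# 3DMark 11 Performance score per shader for the cards listed above.
cards = {
    "HD 7770": {"shaders": 640,  "score": 3500},
    "HD 7870": {"shaders": 1280, "score": 6600},
    "HD 7970": {"shaders": 2048, "score": 8000},
}

for name, c in cards.items():
    print(f"{name}: {c['score'] / c['shaders']:.2f} points per shader")

# Going from the 7870 to the 7970: +60% shaders buys only ~21% more score.
shader_gain = cards["HD 7970"]["shaders"] / cards["HD 7870"]["shaders"] - 1
score_gain = cards["HD 7970"]["score"] / cards["HD 7870"]["score"] - 1
print(f"7870 -> 7970: +{shader_gain:.0%} shaders, +{score_gain:.0%} score")
```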
What the 7970 would probably have looked like had AMD followed its previous design philosophy:
1920 shaders, 48 ROPs, 30 Compute Units, 120 TMUs, plus a higher GPU clock.
For the 8970, staying on the same 28 nm process, it's looking like AMD will push for 2500-2600 shaders; many are saying 2560, but no one knows for sure yet.
That's a 25% increase in shaders. However, as we can see from the 7870 to the 7950, a 20-30% increase in shaders didn't do much for performance.
AMD needs more ROPs and higher clocks for GCN to scale well with a large number of stream processors.
So with just more shaders AMD won't get far; they will need to up the number of Compute Units as well as TMUs, and with that the ROP count needs to be bumped up to maintain a balanced GPU design. Tweaks to the architecture will help, but a simple bump in shaders would mean a heavily clocked 7970 could possibly catch the 8970: if the basis of that 40% is the 925 MHz stock card, then a fully overclocked 7970, which today averages as much as 20% faster, would leave a stock 8970 only about 17% ahead of it (worked out below). A better balanced and more optimized design is necessary.
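The arithmetic behind that last point, as a small sketch; the +40% and +20% figures are the assumptions from the post above, not measurements:

```python
# If the rumored +40% for the 8970 is measured against a stock 925 MHz 7970,
# and a heavily overclocked 7970 already gains roughly 20% over stock,
# the stock 8970's lead over that overclocked card shrinks considerably.
stock_7970 = 1.00
oc_7970 = stock_7970 * 1.20     # assumed gain from a heavy overclock
stock_8970 = stock_7970 * 1.40  # rumored gain over the stock 7970

lead = stock_8970 / oc_7970 - 1
print(f"stock 8970 vs. overclocked 7970: about {lead:.0%} faster")
```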
NVIDIA already has their design finished; AMD, on the other hand, we can only hope didn't screw the pooch.
Look up any Nvidia transcript this year and 28 nm yield issues, along with margins, will be the dominant fallback.
Nvidia is currently in talks with Samsung to use its 28 nm fabs, but Samsung is more expensive, and Nvidia only uses Samsung for initial fabbing of designs, looking to GlobalFoundries and TSMC for production.
Samsung will have an open slot given their recent litigation with Apple, and companies like Qualcomm, Nvidia and others will be looking to fill that slot; Samsung will charge a premium, I'm sure.
Also, a useful post from OCN and my reply:
----- Exactly... and we may see further optimizations a la the GF104 vs. GF114. I doubt it'll come in at "just" 700 MHz, but if it does, it's still not outside the realm of possibility that it could be 50% faster out of the box.
In fact, had nVidia done this, it would to a degree amount to price fixing, which of course is illegal.
Of course, now that both cards are here, and we can see the physical size of each chip, we can easily tell that this is certainly NOT the case, at all, so whatever, it's all just marketing drivel.
In fact, it wouldn't really be any different than AMD talking about Steamroller. :p "Man, we got this chip coming...";)
Wound up being a big win for them on the business side of things (because it IS a midrange card from a manufacturing point of view, with a high end price) and a loss for consumers (who lost out on potentially much greater performance).
It's better to release a product when it's truly ready than to release early with massive issues. My guess is that with Kepler, Nvidia learned from their mistakes with Fermi, and to great effect.
That theory doesn't really reflect Nvidia's own stance, and it makes even less sense given that AMD has gained market share in the discrete graphics sector for two straight quarters.
I think that's more of a forum myth driven by fanboyism.
Think about it. As a company you're losing market share, and sales are down a million units from quarter to quarter. You'd think it would be the opposite if you were selling a mid-range chip at great profit into the high-end market.
If for some weird reason that were true, then it's horrible design and execution.
-All the rumors and leaked info until late Jan/Feb of this year which had the GTX 680 being based on the GK110. That wasn't one or two isolated rumors... there was tons of info floating around indicating that to be the case. Almost NOTHING indicated GK104 to be the high end chip, not until GK110 completely disappeared and rumors of yield problems started cropping up all over.
-The limited memory bus (256 bit) on the GK104, which is typically reserved for mid level cards and not high-end
-The PCB design itself, most notably as it appears on the 670 (which is close to being a half-length PCB in the reference designs).
If you assume that GK110 was originally supposed to be the 680 and GK104 was to be the 660ti, as I do, it makes sense of the above information quite well. As for Nvidia not "making out like [a thief]", the explanation for that is readily apparent in their yield problems, which affected GK104 as well (remember - the GTX 680 was basically a paper launch for 2+ months). Also, aren't desktop GPUs a relatively low-profit/revenue area anyways from a business perspective?
We'll never know with 100% certainty, but I think that it makes better sense of the available data that the original GTX 6xx lineup was to include both Gk110 (680/670?) and GK104 (660ti/660).
Period.
Die sizes say GK100 or whatever was never possible.
[die-size comparison: HD 7970 vs. GTX 680]
Note how the AMD chip has nearly 33% more transistors, but is barely physically larger than GTX 680.
If nVidia could have fit more functionality into the same space, they would have.
They could have planned to release something different all they wanted, but if they had, that chip would have to have been quite a bit larger than HD 7970 is.
Since nVidia is selling a chip that is much the same size as the 7970, per wafer they aren't getting that many more chips.
If Nvidia is selling a mid-range chip as high-end, they either have HUGE HUGE HUGE design issues,
OR AMD is doing the exact same thing.
:roll:
Fact is, GTX 680 ain't no mid-range chip, unless you believe that most of that there chip is deactivated.
Not at all.
But the fact of the matter is that what nVidia can do with TSMC's 28nm, AMD can as well.
And AMD's already 33% more efficient in used die space.
If you believe the 7.1 billion transistor thing, then it must be twice as big as the current GTX 680 silicon (3078 million transistors, BTW), or the current GTX 680 really is a horrible, horrible design and it's a feat of wonder that nVidia managed to get it stable.
And how does a doubling of transistors only equal a 55% increase in performance?
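A rough way to put that question in numbers, taking the article's 7.1 billion transistor figure for GK110, NVIDIA's published ~3.54 billion for GK104 (the post above cites ~3.08 billion; the ratio is roughly 2x either way), and the top of the rumored 55% gain:

```python
# Performance per transistor: rumored GK110 gaming performance vs. GK104.
gk104 = {"transistors_b": 3.54, "perf": 1.00}
gk110 = {"transistors_b": 7.10, "perf": 1.55}   # upper end of the rumored gain

transistor_ratio = gk110["transistors_b"] / gk104["transistors_b"]
perf_per_transistor = (gk110["perf"] / gk110["transistors_b"]) / \
                      (gk104["perf"] / gk104["transistors_b"])

print(f"GK110 carries {transistor_ratio:.1f}x the transistors")
print(f"...at roughly {perf_per_transistor:.0%} of GK104's gaming performance per transistor")
```

One plausible answer is that a sizable share of GK110's extra transistor budget goes to compute features the Tesla K20 needs (double-precision units, ECC support, a much larger L2 cache) that do little for gaming benchmarks.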
Oh, I read it just fine. :p
Argue that it's bogus... :roll: