Monday, May 19th 2014
AMD Readies 28 nm "Tonga" to Take on GM107
NVIDIA's energy-efficiency leap, achieved on the existing 28 nanometer process with the "Maxwell" based GM107, appears to have rattled AMD. The company is reportedly attempting a super-efficient, 28 nm, mid-range chip of its own, codenamed "Tonga." The chip could power graphics cards that compete with the GeForce GTX 750 Ti and GTX 750. The chip is likely to be based on the Graphics CoreNext 2.0 micro-architecture, the same one that drives "Hawaii," which means AMD isn't counting on the micro-architecture for efficiency gains. It could feature an evolution of PowerTune, which works closer to the metal than its existing implementation on "Hawaii." Other features could include Mantle, TrueAudio, and perhaps even XDMA CrossFire (no cables needed). The chip could be wired to up to 2 GB of memory.
Another equally plausible theory doing the rounds is that "Tonga" could be a replacement for "Tahiti Pro," designed to compete with the GK104 at a much lower power footprint (than "Tahiti"), so AMD could more effectively compete with the GeForce GTX 760. The chip could be similar in feature-set to "Tahiti," with a narrower memory bus (256-bit wide), but higher clock speeds to make up for it. If this theory holds true, then "Tonga" could disrupt both "Tahiti Pro" and "Curacao XT." Curacao XT (R9 270X) is designed to offer a value-conscious alternative to the $250 GTX 760. The R9 280 is competitive in performance, but takes a beating on the energy-efficiency front, and is also costlier to manufacture, due to the higher transistor count and four additional memory chips. We could hear more at Computex 2014.
Source:
VideoCardz
27 Comments on AMD Readies 28 nm "Tonga" to Take on GM107
Tonga is replacing Tahiti Pro as you can tell;
FirePro W8100(Tonga XT) from FirePro W8000(Tahiti Pro)
R9 M295X(Amethyst XT) is above Pitcairn by a large margin.
Iceland is a rebranded Oland.
Maui has yet to be revealed.
It would make sense for AMD to use a mainstream 28nm GPU to introduce some new architectural changes since
1.) The 20nm process is not ready yet
2.) Mainstream GPUs are going to be on 28nm for the foreseeable future (since it is less than half the price of 20nm), which means that this 28nm GPU will not be obsolete once 20nm GPUs come out
That's two different architectures; the OP states that this new GPU will use the same architecture as the old one. A better comparison is the 3870 to 4870, which were both on the 55nm process and both used VLIW5, although the 4870 did have architectural tweaks compared to the 3870. The OP indicates that the new GPU doesn't even have that opportunity to improve performance.
Considering all the upcoming games are the same old **** recycled, visuals and content wise, I'm not sure there's much to get excited about here...
I've been saying since the launch of Tahiti that this part should exist (as a refresh part...launching more than a year ago). I totally understand why Tahiti is what it is. Everything from the 'extra' units on the early process for yields; it really only needs 28-30 CUs on the over-under (similar to GK110 only needing 12-13 SMX, whereas GK104 'needs' all 8), to having extra compute as a differentiator, to having sufficient bandwidth up to the process's max clock for those units at a 300w TDP that really only made sense with a 384-bit bus (or to help Tahiti Pro's performance per clock be similar to a part with up to the ideal count of ~1880sp shaders for 32 ROPs), to having the extra RAM for higher resolution (at the time vs 2GB). It made sense from a certain point of view, for that certain point in time, but what it was never going to be was under 225w (in a non-wasteful config) or a shorter-length card because of that...and currently Tahiti is really weirdly shoe-horned in as a 'max 250w' part (because of Hawaii).
Nvidia went the opposite route with GK104. It hurt them in the beginning for yields because of the tight design and faster controller, but people paid the absurd price when they branded it a high-end card because of that 'efficiency' (in power vs cost). As time went on and yields got better, that design made more and more sense for its current market, especially when you look at where the 225w max TDP puts their clocks versus the average low-voltage (power-efficient) clock on the process...it's fairly genius and perfect. To this day, a shorter-length 680 (equivalent to 1792sp but requiring less bandwidth) would be an ideal card for a lot of people, as compared to the over-reach they did with the 770. I imagine we'll see something similar with Maxwell (10-12 SMM)...getting as close to ~1880sp (an SMM is similar to 160sp but needing bandwidth for 128 because of the cache/unit structure of sp/sfu) with the best power usage through more efficient RAM, more cache, etc., and a more ideal clock/voltage while staying somewhere around 225w.
I wouldn't begrudge AMD getting GPU SKUs out the door that were (over?/)under 225w and the best mix of units/clock/bandwidth per that TDP/die size/cost with a 256-bit bus, as this showdown was inevitable (be it earlier versus GK104 or later versus an ~11-12 unit Maxwell)...but damn, they missed that boat by a long, long way...even if only now does newer RAM (i.e. 1.5-1.55v 7gbps, perhaps 1.6v 8gbps, certainly 4Gb if not 2Gb because of HBM) make total efficiency and a more perfect balance of units/clocks/bandwidth fit within that TDP.
Have a look at this chart to understand what I'm talking about:
They have, for now, some issues understanding how exactly to implement improvements addressing this problem. :)
They had the problem with the GDDR5 memory controller on the Radeon HD 4890, which was fixed in the Radeon HD 6870.
Either way, I think it's a fact that Nvidia have been better for the past two generations in performance/watt in games. I think Nvidia are ahead of AMD's newly designed architecture in fine-tuning theirs; also, AMD has no excuse next gen.
Edit: I agree with OP that AMD has been rattled by the GTX 750 Ti, and is scrambling for a response. AMD is always chasing the leader and always a step behind Intel and Nvidia, it seems.
Looking at the two, they'll offer a very similar "visual experience" in the fairly mundane i3 OEM boxes they mostly find their way into, while, depending on the game, they spar fairly evenly. Power draw for the GTX 750 Ti at "peak gaming" is 57 W vs. 93 W for the 260X, so yes, like 50%. The question is how much gaming, say, 2 hrs a day would change your monthly bill if you then factor in that 80% of the month your computer is in sleep. At that point AMD has ZeroCore, meaning the 260X almost shuts down (~1-2 W), while the 750 Ti continues drawing its idle power of 7 W. I'd like to see the total power used by such identical OEM i3 systems, running the identical average person's work/gaming/sleep pattern, and see the difference in total power over a monthly time-frame.
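The monthly comparison asked for above can be roughed out on paper. What follows is only a sketch built on the figures quoted in the comment (57 W vs. 93 W while gaming, 7 W vs. roughly 1-2 W while the machine is asleep, 2 hours of gaming a day, 80% of the month asleep). The 30-day month, the 1.5 W midpoint for ZeroCore, and the decision to ignore the remaining awake-but-idle hours (no 260X desktop-idle figure is given) are assumptions, and the function name `monthly_kwh` is made up for illustration. It counts card power only, not whole-system power.

```python
def monthly_kwh(gaming_w, sleep_w, gaming_h_per_day=2.0, sleep_share=0.80, days=30):
    """Card-only energy per month in kWh for a gaming + sleep usage pattern.

    Assumption: hours that are neither gaming nor sleep are ignored, because
    the comment gives no desktop-idle figure for the 260X.
    """
    gaming_hours = gaming_h_per_day * days      # 2 h/day over the month
    sleep_hours = sleep_share * days * 24       # 80% of all hours in the month
    return (gaming_w * gaming_hours + sleep_w * sleep_hours) / 1000.0


# Figures from the comment; 1.5 W is an assumed midpoint of "~1-2 W".
gtx_750_ti_kwh = monthly_kwh(gaming_w=57, sleep_w=7)
r7_260x_kwh = monthly_kwh(gaming_w=93, sleep_w=1.5)

print(f"GTX 750 Ti: {gtx_750_ti_kwh:.3f} kWh/month")
print(f"R7 260X:    {r7_260x_kwh:.3f} kWh/month")
```

Under these particular assumptions the two cards land within about 1 kWh of each other per month, with the gap in sleep power offsetting the gap in gaming power. A different split of hours, or a measured whole-system figure, could easily change the picture.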
Kepler and GCN were designed with different goals in mind. I'm not gonna delve into that now, but a Kepler processing cluster is simpler decode- and schedule-wise than a GCN one. Maxwell is more of a move in the direction of GCN and Fermi than in the direction of Kepler.
The GM107 chip is more than 0.5x the size of GK104, yet it contains only 0.4x the number of ALUs of GK104.
Lastly, from the reviews I've seen, Maxwell's efficiency, which is more like 20% more efficient than GCN, could be mostly due to optimization on the fabrication level, something AMD did with their new lower power Beema chips.
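The die-size remark in this comment can be sanity-checked with a couple of divisions. The numbers below are not from the comment itself; they are the commonly published figures for the two chips (GM107: about 148 mm² and 640 CUDA cores; GK104: about 294 mm² and 1536 CUDA cores) and should be treated as assumptions.

```python
# Commonly published figures (assumptions, not taken from the comment).
GM107 = {"die_mm2": 148.0, "alus": 640}
GK104 = {"die_mm2": 294.0, "alus": 1536}

area_ratio = GM107["die_mm2"] / GK104["die_mm2"]   # relative die size
alu_ratio = GM107["alus"] / GK104["alus"]          # relative ALU count

# ALUs per square millimetre, a crude density measure.
gm107_density = GM107["alus"] / GM107["die_mm2"]
gk104_density = GK104["alus"] / GK104["die_mm2"]

print(f"area ratio: {area_ratio:.2f}, ALU ratio: {alu_ratio:.2f}")
print(f"ALUs/mm^2: GM107 {gm107_density:.2f}, GK104 {gk104_density:.2f}")
```

With these inputs the area ratio comes out just above 0.5 and the ALU ratio a little above 0.4, in line with the comment: GM107 spends more die area per ALU than GK104 does.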
In his reviews, he talks about card design (through disassembling it), performance in games, power consumption, overclocking, noise, and temperatures. Nothing is mentioned about software bundles or GPGPU, and accessories and price are only a minor part. To sum it up, he favors game performance, power consumption, and overclocking, and NVidia usually beats AMD in those categories. If he put a focus on software bundles, accessories, and price, AMD would have much better review scores. You could call it bias that he favors categories in which NVidia usually does well, but he does applaud AMD when the company does excel in those categories (e.g. HD 5970), and he hits NVidia hard when the company does not produce a card that does well in those categories (e.g. GTX 590). In my opinion it's just a different approach to reviewing, and it's the reason why there are many review sites all with different opinions.
Now I don't see super great things from Tonga, as both AMD and Nvidia want to prop up their pricing structure so as not to dilute it too much once 20nm does come along. They have to work with what they've got, while not showing so much of their architectures' prowess that they can't offer substantial progress once mainstream 20nm makes real financial sense.
If AMD gets R9 280+ performance with a perf/W similar to that of GM107, that would be fine. The bigger upside is if AMD got better pricing on the wafers from GloFo, which should make for even better card pricing, while the best part: the geldings for $200 or less?
I don't see TSMC giving Nvidia any break on wafer pricing, at least at this point. If Nvidia ends up with a part of similar size to GK104, and given that at current pricing they apparently can't move GK104 chips/cards for much under $230, AMD will have the BfB on their side. Heck, the Egg has had a Sapphire 280 (Tahiti) for $200... now down to $190 for a week or better. Consider why AMD would move to GloFo if they can sell Tahitis for that; they'd have to see a significant difference in price to justify the risk.
In our April 28th AAPL Update, we noted that 20nm production levels at Samsung Austin were in the 3000-4000 wpm range. These volumes were sufficient to debug/improve their yields as they vied for second source position for the AAPL designs. But our latest checks indicate a surprising twist to the 20nm development story. We are getting indications that Samsung Austin is planning to ramp their 20nm technology designs to 12,000 wpm by July, but the upside is for QCOM, not AAPL. It is our understanding that QCOM is not happy with the 20nm development/yield progress at TSM and thus have been qualifying their latest technology node designs at Samsung. Obviously, the potential loss of business from AAPL and QCOM would be bad news for TSM after recently losing the AMD (AMD) GPU business. And while 20nm demand will continue to be strong for TSM, we expect Samsung to be a viable threat to TSM for the advanced process nodes going forward.
blogs.barrons.com/techtraderdaily/2014/05/08/taiwan-semi-increased-risk-of-losses-to-samsung-says-bluefin/
Nvidia has their own issues with their low power cards and new "performance" drivers, and other drivers it seems. I would rather burn off a whole extra 10 watts to keep my screen from going black or flickering.