Thursday, April 23rd 2015
AMD to Skip 20 nm, Jump Straight to 14 nm with "Arctic Islands" GPU Family
AMD's next-generation GPU family, codenamed "Arctic Islands" and planned for launch some time in 2016, will see the company skip the 20 nanometer silicon fab process altogether, jumping straight from 28 nm to 14 nm FinFET. Whether the company will stick with TSMC, which is facing crippling hurdles in implementing its 20 nm node for GPU vendors, or hire a new fab, remains to be seen. Intel and Samsung are currently the only fabs with 14 nm nodes that have attained production capacity: Intel is manufacturing its Core "Broadwell" CPUs, while Samsung is manufacturing its Exynos 7 (refresh) SoCs. Intel's joint venture with Micron Technology, IM Flash, is manufacturing NAND flash chips on 14 nm.
Named after islands in the Arctic Circle, and a possible hint at the low TDP of the chips benefiting from 14 nm, "Arctic Islands" will be led by "Greenland," a large GPU that will implement the company's most advanced stream processor design and feature HBM2 memory, which offers 57% higher memory bandwidth at just 48% the power consumption of GDDR5. Korean memory manufacturer SK Hynix is ready with its HBM2 chip designs.
Source: Expreview
71 Comments on AMD to Skip 20 nm, Jump Straight to 14 nm with "Arctic Islands" GPU Family
I'd take the implied constraint on HBM memory supply at face value. I mean, was it even possible for them to make enough, and not in just one plant? That shit's gonna be hot potatoes for a few years yet, and pricing will confirm it.
AFAIK the main customers will be AMD (GPUs, APUs) and NVIDIA (at least GPUs). We know NVIDIA isn't jumping on until HBM2 (Pascal), and judging by the approximate dates on the roadmaps, APUs will also use HBM2. We know Arctic Islands will use HBM2.
There may be others, but AFAICT HBM1 is more or less a trial product... a risk version of the technology... developed not only by Hynix but also by AMD for a very specific purpose: AMD needed bandwidth while keeping die size and power consumption in check for a 28 nm GPU product. The realistic advantages over GDDR5 for a GPU on a smaller process that can accommodate it (say, 8 GHz GDDR5 on 14 nm) aren't gigantic for HBM1, but it truly blooms with HBM2. The fact is they needed that high level of efficient bandwidth now to be competitive given their core technology... hence it seems HBM1 is essentially stacking 2 Gb DDR3-class dies, while the mass commercial product will stack the more relevant (and by then cheaper) 4-8 Gb DDR4-class dies.
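Rough peak-bandwidth numbers for that comparison, assuming the published per-pin rates (roughly 1 Gbps for HBM1, 2 Gbps for HBM2, a 1024-bit bus per stack) and ideal peaks rather than sustained figures:

```python
# Back-of-envelope peak bandwidth: HBM1 vs. fast GDDR5 vs. HBM2.
# Assumes published per-pin rates and ideal peaks -- sustained numbers will be lower.

def peak_gbs(bus_width_bits, pin_rate_gbps):
    """Peak bandwidth in GB/s = bus width (bits) * per-pin rate (Gbps) / 8."""
    return bus_width_bits * pin_rate_gbps / 8

gddr5_256bit_8gbps = peak_gbs(256, 8)        # 256 GB/s -- GM204-style bus
gddr5_512bit_8gbps = peak_gbs(512, 8)        # 512 GB/s -- Hawaii-style bus
hbm1_4_stacks      = 4 * peak_gbs(1024, 1)   # 512 GB/s -- four HBM1 stacks
hbm2_4_stacks      = 4 * peak_gbs(1024, 2)   # 1024 GB/s -- four HBM2 stacks

print(gddr5_256bit_8gbps, gddr5_512bit_8gbps, hbm1_4_stacks, hbm2_4_stacks)
```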
I personally don't think AMD and Hynix co-operating on this tech precludes its use in other markets for Hynix, with imaging sensors, FPGAs, and some lesser-known networking and instrumentation chips being candidates for its use (while not hindering AMD's use of it).
Is NVIDIA going to use this on Pascal? Or could that be some other variant, like Micron/Intel's? The point is, with 3D/HBM/3DS we're going to be seeing the same high-bandwidth memory standards (JEDEC) used in various different products over the next few years, so I don't think any co-op tie-ins are going to last that long, if they happen at all, and exclusivity won't last more than a year at best.
The article got it wrong, as there is no such thing as 14nm flash memory.
www.google.co.uk/search?q=14+nm+flash+memory.&sa=X&biw=921&bih=525&tbm=isch&tbo=u&source=univ&ei=Yk05Vev6DtDUaveTgYgC&ved=0CEsQsAQ
It's my understanding NVIDIA will use HBM2 in Pascal. Their latest roadmap essentially gave their plan away: the biggest chip will use 12 GB at 768 GB/s IIRC. That means 3x HBM2 4 GB stacks.
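A quick sanity check on that reading of the roadmap, assuming roughly 4 GB of capacity and ~256 GB/s per HBM2 stack (1024-bit at ~2 Gbps per pin):

```python
# Sanity check of the "12 GB / 768 GB/s = 3x HBM2" reading of the roadmap.
# Assumes 4 GB capacity and ~256 GB/s per HBM2 stack (1024-bit @ ~2 Gbps/pin).
stacks = 3
capacity_gb   = stacks * 4                 # 12 GB
bandwidth_gbs = stacks * 1024 * 2 / 8      # 768 GB/s
print(capacity_gb, bandwidth_gbs)
```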
I think an interesting way for NVIDIA to prove a point about HBM1 would be simply to do the following:
GM204 shrunk to ~1/2 its size on 14/16nm (so essentially 200-some mm2), with 4/8 GB (4-8 Gb chips) of 8 GHz GDDR5 running at something like 1850/8000
vs
FijiXT
Hypothetically...who wins?
As to the issue of 4 GB not being enough or needing 8 GB... isn't the amount of memory almost meaningless if you don't have the processing power to support it? I thought I read 8 GB of HBM will offer up to 1 terabyte per second of bandwidth, so wouldn't it be a waste for AMD to add extra memory if GPU designs on 28 nm physically prevent a die size that could exploit all that? Would Fiji, with 4096 SPs, not lack the oomph and watch 50% of that 1 TB/s of bandwidth go unused?
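A crude bytes-per-FLOP sketch of that question, using the 4096-SP figure floated in this thread and an assumed ~1.05 GHz clock (both speculative at this point):

```python
# Crude bytes-per-FLOP check for the "would 1 TB/s go unused?" question.
# Assumes a hypothetical Fiji-like chip: 4096 SPs, ~1.05 GHz, 2 FLOPs/SP/clock.
sps, clock_ghz = 4096, 1.05
gflops = sps * 2 * clock_ghz               # ~8600 GFLOPS single precision

for bw_gbs in (512, 1024):                 # HBM at 512 GB/s vs. a full 1 TB/s
    print(bw_gbs, "GB/s ->", round(bw_gbs / gflops, 3), "bytes per FLOP")

# For reference, a 290X sits around 320 GB/s / ~5600 GFLOPS ~= 0.057 B/FLOP.
```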
You made a good point when saying, "This can be noticed by the frantic 'dx12 can combine ram from multi gpus into a single pool' coming across the AMD PR bow." But isn't that a good thing, since a single 390X is not going to offer excellent 4K, while a Crossfire setup, with all 8 GB (2x 4 GB) acting as one pool, would? Also, can any of Tonga's color (memory) compression be factored into what Fiji might be able to exploit? I mean, Tonga was made for Apple's 5K Retina display; could that provide an advantage for 4K panels?
I haven't a clue. The maths is easy, but IMHO it's too hypothetical, too clean, too easy, and chips don't bin that way. Not many nodes have panned out exactly how they were scripted to, and that's what makes this cat-and-mouse chip game so worthy of debate.
Part of the reason I believe AMD is competitive with NVIDIA is the higher memory bandwidth that keeps their GPUs there. AMD likely fears the day NVIDIA switches to HBM.
AMD fulfils the launch-customer requirement, but I suspect that many other vendors are waiting to see how 2.5D pricing aligns with product maturity, and how 3D pricing/licensing and standards shake out. AFAIA, the HBM spec, while ratified by JEDEC, is still part of an ongoing (and not currently resolved) test/validation and spec-finalization process, such as the IEEE P1838 spec that (I assume) will provide a common test/validation platform for 3D heterogeneous die stacking across HBM, HMC, Wide I/O 2, etc. Would seem logical. AMD's R&D is probably stretched pretty thin considering the number of projects they have on their books. I'm also guessing that a huge GPU (by AMD's standards) incorporating a new memory technology that needs a much more sophisticated assembly process than just slapping BGA chips onto a PCB presents its own problems.
980 needs 7ghz/256-bit at 1620mhz (yes, it's that over-specced). At 8ghz it could support up to 1850mhz. Samsung's tech *should* give around a ~30% performance boost (my brain seems to think it'll be 29.7%). Currently, Maxwell clocks around 22.5% better than other 28nm designs... which land at around (or slightly less than) 1v = 1ghz. The extra bandwidth on the 980 gives it roughly a 4% performance boost going on a typical clock of 1291.5mhz (according to W1zzard's review), if you wish to do scaling that way. Since I matched them, we don't need that.
1850*(2048sp+512sfu)/4096 = 1156.25 Fiji at matched bw/clock....
...but Fiji has 33% more bandwidth than it needs (or should get ~5.33_% performance from extra bw) so...
1050*1.05333_ = 1106mhz 'real' performance
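The same arithmetic, spelled out with all of its speculative inputs (a hypothetical 14/16 nm GM204 shrink at 1850 MHz, the 4096-SP Fiji rumor, and the ~5.33% bandwidth credit derived above):

```python
# The scaling arithmetic above, with its (entirely speculative) inputs:
# a 14/16 nm GM204 shrink running 1850 MHz with 8 Gbps GDDR5, a 4096-SP Fiji,
# and the ~5.33% credit for Fiji's surplus bandwidth.

gm204_units = 2048 + 512          # SPs + SFUs, as counted in the post
fiji_units  = 4096
gm204_clock = 1850.0              # MHz, hypothetical shrink

# Fiji clock needed to match the shrink at equal per-unit throughput:
fiji_matched_clock = gm204_clock * gm204_units / fiji_units     # 1156.25 MHz

# Or take a 1050 MHz Fiji and credit it ~5.33% for the extra bandwidth:
fiji_effective_clock = 1050 * 1.0533                            # ~1106 MHz

print(fiji_matched_clock, round(fiji_effective_clock))
```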
If you want to get SUPER technical, Fiji's voltage could be 1.14v, matching the lowest voltage of HBM (which operates at 1.14-1.26v) and should overclock some. Theoretically that gm214 would need to be around 1.164v (1850/1.297/1.225= 1.164)....which just so happens to be the accepted best scaling voltage/power consumption on 28nm. Weird...that.
You could go even further, assuming Fiji could take up to 1.26v, as could the HBM... and that HBM is going to be at least proportional to 1600mhz DDR3 at 1.35v... squaring those averages all away (and assuming Fiji scales like most chips, not Hawaii), you could end up with something like a 1240mhz/1493mhz Fiji comparing to a ~2100/9074 (yes, that could actually happen) GM214. It wouldn't be much different than how the 770 was set up and clocked, proportionally (similar to GK104 at around 1300mhz; a small design at high voltage, if not pipeline-adjusted to do so at a lower voltage). Given that nvidia clearly took their pipeline/clockspeed cues from ARM designs (which are 2ghz+ on 14nm), and their current memory controllers are over-volted (1.6 vs 1.5v spec)... it's possible (if totally unlikely)!
TLDR: Depending on how you look at it, they would be really really damn close...and would be interesting to see just for kicks. That's not to say they won't just go straight to Pascal...which I have got to assume will be something like 32/64/96 rop designs scaled to 1/2/3 HBM similar to the setup of maxwell (.5/1/1.5).
Yeah yeah...It's all just speculation...but I find the similarities in the possibilities of design scaling (versus previous gen) quite uncanny. There are really only so many ways to correlatively skin a cat (between units, clockspeeds, and bw) and these companies plan their way forward years ahead of time (hoping nodes will somewhat fit what they designed)...and one such as that makes a lot of sense.
I'm getting into the crazy talk and writing a sentence every ten minutes between doing other stuff....must be time to sleep. :)
I am in no disagreement that putting 4/8 (16? How does 2x1GB work?) stacked RAM dies + 1/2 GPUs on a (probably) 832 or 1214mm2 interposer is likely a huge pain in the ass... It just seemed that was at least *part* of the issue. Lots of Q's there.
Buffer size and bandwidth are two different things. Sure, they could swap things out of buffer with faster bandwidth, but that's generally impractical (and why extra bandwidth doesn't give much more performance). A larger tangible buffer for higher-rez textures is absolutely necessary if you have the processing power to support it, which I think Fiji does (greater than 60fps at 1440p requiring ~4GB.)
I do not believe AMD's (single-card) 8GB setup will be 1280 GB/s; I think that is the distinction made by '2x1GB'. I believe it will be 640 GB/s, just like the 4GB model. I would love to be wrong, as that would provide a fairly healthy boost to performance just based on the scale.
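To make the units behind those 640 vs. 1280 GB/s figures explicit: peak HBM bandwidth is just stacks x bus width x per-pin rate, and the numbers in this post imply roughly 1.25 Gbps per pin, which is an assumption rather than anything confirmed:

```python
# Peak HBM bandwidth as a function of stack count and per-pin rate.
# The 640 / 1280 GB/s figures in this post imply ~1.25 Gbps per pin,
# which is an assumption, not a confirmed spec.

def hbm_peak_gbs(stacks, pin_rate_gbps, bus_bits_per_stack=1024):
    return stacks * bus_bits_per_stack * pin_rate_gbps / 8

print(hbm_peak_gbs(4, 1.0))    # 512 GB/s  -- baseline HBM1 rate
print(hbm_peak_gbs(4, 1.25))   # 640 GB/s  -- the rate assumed above
print(hbm_peak_gbs(8, 1.25))   # 1280 GB/s -- only if the stack count doubles too
```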
I personally believe scaling between 1440p and 2160p is where the processing power of the 390X will lie. Surely some games will run great at 30-60fps at 4K, but on the whole I think we're just starting to nudge over 30 at 4K... it's generally a correlation to the consoles (720p Xbox, 900p PS4). I personally don't think 4K60 will be a consistent ultra-setting reality until 14nm and dual GPUs... hopefully totalling 16GB in DX12. The buffer requirement could go even higher, and if there's room, PC versions can always use more effects to offset whatever scaling differences remain.
I'm not at all saying the improvements in DX12 don't matter, they absolutely do, only that for the lifespan of this card they cannot be depended upon (yet)... and in the future, worst-case, they may still not be. How many DX9 (ports) titles do we still see?
When you smush everything together into a box, I personally believe these cards average out to making sense at around ~3200x1800 and 6GB. Obviously RAM amount will play a larger factor later on, as it becomes feasible to scale textures from consoles making the most of their capabilities. That means more Xbox games will be 720p and more PS4 games slightly higher, rather than the current 1080p. Currently the most important scaling factor is raw performance (given those inflated resolutions on the consoles).
There are certainly a lot of factors to consider, and obviously even more unknowns. I can only go on the patterns we've seen.
For instance, I use a 60fps metric. Just like performance dx12 may bring, perhaps we will all quickly adopt some form of adaptive sync making that moot. As it currently sits though, I personally can only draw from the worst-case/lowest common denominator, as nothing else is currently widely applicable.
So it's a risky move; if 14nm production isn't going well, you're in deep shit.
"Arctic Islands" is a fun name, but why exactly does everyone think the cards will be so much cooler? Heat transfer from a surface is a function of the area, when looking at a simplistic model of a chip. When you decrease the manufacturing size by half, you lose 75% of the surface area. Yes, you'll also have to decrease voltage inside the chip, but if you look at a transistor as a very poor resistor you'll see that power = amperage * voltage = amperage^2 * resistance. To decrease the power flowing through the transistor, just to match the same thermal limits of the old design, you need to either half the amperage or quarter the resistance. While this is possible, AMD has had the tendency to not do this.
HBM is interesting as a concept, but we're still more than 8 months from seeing anything using it. Whether AMD or Nvidia will use the technology better, I cannot say. I'm willing to simply remain silent until actual numbers come out. Any speculation about a completely unproven technology is just foolish.
TL;DR:
All of this discussion is random speculation. People are arguing about things that they've got no business arguing about. Perhaps, just once, we can wait and see the actual performance, rather than being disappointed when our wild speculations don't match what we actually get. I'm looking forward to whatever AMD offers, because it generally competes with Nvidia on some level and makes sure GPU prices aren't ridiculous.
At this point, skipping it was inevitable, as 20nm was not good for high-end performance. Let's just hope 14nm is a great success in the not-too-distant future.
AMD knows that they currently have a reputation for designing GPUs that run too hot and use too many watts for the same performance as an Nvidia GPU. I'm not saying they deserve that reputation, but it does exist; over and over I see people citing those two reasons for why they won't buy an AMD card. As for the extra watts, it doesn't amount to much on an electricity bill for an average gamer playing 15-20 hours a week, unless you live in an area where electricity is ridiculously expensive or you're running your card at max 24/7 for Folding or Mining. For me, the difference would be about 8 cents a month on my power bill between a reference GTX 780 Ti (peak 269 watts) and a reference R9 290X (peak 282 watts), going by W1zzard's reviews of last generation's flagship cards. Even if AMD used 100 watts more than Nvidia it still wouldn't amount to much: 65 cents a month difference at 10 cents per kWh.
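The power-bill arithmetic above, spelled out (assuming roughly 15 gaming hours a week, about 65 hours a month, and 10 cents per kWh, as in the post):

```python
# The power-bill arithmetic above. Assumes ~15 gaming hours per week
# (about 65 hours a month) and $0.10 per kWh, as in the post.

def monthly_cost_usd(extra_watts, hours_per_month=65, usd_per_kwh=0.10):
    return extra_watts / 1000 * hours_per_month * usd_per_kwh

print(round(monthly_cost_usd(282 - 269), 2))   # ~$0.08 -- 290X vs. 780 Ti peak
print(round(monthly_cost_usd(100), 2))         # ~$0.65 -- a 100 W gap
```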
AMD is already the brunt of many jokes about heat/power issues. I don't think they would add fuel to the fire by releasing a hot inefficient GPU and calling it Arctic Islands.
Maxwell is good, and saving power while gaming is commendable, but the "vampire" load during sleep compared to AMD ZeroCore is noteworthy over a month's time.
I ask why Apple went with AMD's Tonga for their iMac with 5K Retina display. Sure, it could be that Apple and Nvidia just didn't care to, or need to, "partner up". It might have been a timing thing, or more that the specs for GM206 didn't provide the oomph, while a GTX 970M (GM204) wasn't the right fit spec- and price-wise for Apple.
Still, business is business, and keeping the competition from any win enhances one's "cred". Interestingly, we don't see that Nvidia has an MXM version of the GM206.
For the R9 285 being a "gelding" from such a design-constrained process, it came away fairly respectable.