Wednesday, March 21st 2012
GK110 Specifications Approximated
Even as the launch of the GK104-based GeForce GTX 680 nears, it is emerging that it is not the fastest graphics processor in the GeForce Kepler family, if you sift through the specifications of the GK110 (yes, 110, not 100). Apparently, since the GK104 meets or even exceeds NVIDIA's performance expectations, the large monolithic chip planned for this generation is likely codenamed GK110, and it could end up with a GeForce GTX 700 series label.
3DCenter.org approximated the die size of the GK110 to be around 550 mm², about 87% larger than that of the GK104. Since the chip is built on the same 28 nm fab process, this also translates to a large increase in transistor count, up to roughly 6 billion. Shader compute power, however, is estimated to rise by only around 30%, because the CUDA core count doesn't grow by much (2,000~2,500 cores). The SMX (next-generation streaming multiprocessor) design could also see some changes. NVIDIA could prioritize beefing up components other than the CUDA cores, which could result in things such as a 512-bit wide GDDR5 memory interface. Maximum power consumption is estimated at around 250~300 W. Its launch cannot be expected before August 2012.
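For context, a quick back-of-the-envelope sketch shows how the 87% and ~6 billion figures fit together. The GK104 numbers used as a baseline (roughly 294 mm² and 3.54 billion transistors) are the commonly quoted ones and are an assumption here, not part of the 3DCenter.org estimate:

```python
# Rough sanity check of the 3DCenter.org estimates, assuming commonly
# quoted GK104 figures. Purely back-of-the-envelope, not sourced from NVIDIA.

GK104_AREA_MM2 = 294.0          # assumed GK104 die size
GK104_TRANSISTORS = 3.54e9      # assumed GK104 transistor count
GK110_AREA_MM2 = 550.0          # 3DCenter.org approximation

area_ratio = GK110_AREA_MM2 / GK104_AREA_MM2
print(f"Die area increase: {area_ratio - 1:.0%}")  # ~87%

# On the same 28 nm process, transistor density stays roughly constant,
# so the count scales with area:
estimated_transistors = GK104_TRANSISTORS * area_ratio
print(f"Estimated transistors: {estimated_transistors / 1e9:.1f} billion")  # ~6.6 billion
```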
Source:
3DCenter.org
34 Comments on GK110 Specifications Approximated
AMD, please kick Nvidia's ass so they will be forced to release this soon! :nutkick:
400-450 mm² is the max size that's worth making.
Or if NV wants GPUs as huge as 550 mm², a single GPU will cost 500-1000 USD (not the card; the card will be around 1500 USD), because only a few GPUs per wafer will come out good.
By the time they want to release this card, 28 nm yields will have improved a lot. They won't make the same mistake they did with 40 nm: releasing the monolithic GPU first, with very bad yields, poor power/performance and high costs.
As for GK110, I'd go out on a limb (not really) and say that a 384-bit memory bus is the minimum, with 448- or 512-bit not out of the question. GK104 probably isn't anywhere close to where Nvidia wants to be regarding bandwidth and double precision (especially as Tesla/Quadro won't be clocked anywhere close to GeForce, if history is any indication) in the HPC and workstation/pro graphics areas. Adding a larger memory controller and compute functionality in addition to an increased shader count is definitely going to balloon the size of the die. Anyone know TSMC's max reticle size? What was GT200 - 576 mm²?

GT200, GT200b, GF110, GF100 and G80 would beg to differ. From 90 nm to 40 nm, Nvidia's big die increased from 484 mm² to 520 mm². If Nvidia have a proven track record in anything, it's that they have no qualms about using as much silicon real estate as is needed to include the functionality that they want.

Unlikely IMO. Can't see GK110 beating an HD 7990 or a GTX 680 duallie in performance, and I could well see both those cards at a $899-999 price tag unless single-GPU card pricing takes a nosedive.

As for GPU pricing: even if a 28 nm wafer was $10,000 (and it's probably a lot less), you are estimating that TSMC and Nvidia could squeeze out only 10-20 functional GPUs? And your job title at TSMC is? Do TSMC have a VP of Trolling?
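Neither side's numbers are sourced, but the per-die cost arithmetic being argued over is easy to sketch. This assumes a 300 mm wafer, the ~550 mm² die estimate, the hypothetical $10,000 wafer price from the post above, and guessed yields; none of these figures come from the article:

```python
import math

def dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300.0) -> float:
    """Classic first-order estimate: usable wafer area minus an edge-loss term."""
    r = wafer_diameter_mm / 2.0
    return (math.pi * r**2) / die_area_mm2 - \
           (math.pi * wafer_diameter_mm) / math.sqrt(2.0 * die_area_mm2)

DIE_AREA = 550.0        # mm^2, the GK110 estimate
WAFER_COST = 10_000.0   # USD, the hypothetical figure from the thread

candidates = dies_per_wafer(DIE_AREA)        # ~100 gross dies per wafer
for yield_rate in (0.2, 0.3, 0.5):           # guessed yields, not sourced
    good = candidates * yield_rate
    print(f"yield {yield_rate:.0%}: ~{good:.0f} good dies, "
          f"~${WAFER_COST / good:.0f} per good die")
```

Even at a pessimistic 20% yield this works out to roughly $500 per good die, and reaching $1,000 would need yields closer to 10%, which is where the two posters part ways.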
Since the SP count per SMX is probably going to be lower than on GK104, following Fermi's tradition, I personally have two possibilities in mind:
1) 8 GPC, 2 SMX per GPC, 160 SPs per SMX (10 SIMD lanes), 2560 SPs total, 128 TMU, 48 ROP, 384 bit.
2) 6 GPC, 3 SMX per GPC, 128 SPs per SMX (8 SIMD), 2304 SPs total, 144 TMU, 48 ROP, 384 bit.
I like the first one more than the second, because apart from the higher number of SPs that would justify the much bigger die and transistor count, it also ensures better geometry performance for the Quadro line, and it mimics Fermi in that it doubles the GPCs while keeping the number of SMs per GPC the same, which would help in compute tasks too.
Of course there are countless combinations that would be possible, but those are the ones that make the most sense to me, all things considered (the arithmetic is sketched below).
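A minimal sketch of how those two speculative layouts add up against GK104's 1,536 SPs; all of the per-unit figures are the poster's guesses, not confirmed GK110 specs:

```python
# Both layouts are speculation from the thread, not confirmed GK110 specs.
GK104_SPS = 1536  # 4 GPCs x 2 SMX x 192 SPs

configs = {
    "1) 8 GPC x 2 SMX x 160 SP": (8, 2, 160),
    "2) 6 GPC x 3 SMX x 128 SP": (6, 3, 128),
}

for name, (gpcs, smx_per_gpc, sp_per_smx) in configs.items():
    total_sps = gpcs * smx_per_gpc * sp_per_smx
    print(f"{name}: {total_sps} SPs total, "
          f"{total_sps / GK104_SPS - 1:+.0%} vs GK104")  # +67% and +50%
```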
It's all the other things, which the 104 seems to have little of.
Damn, I must've ripped off the bastards big time !:rolleyes:
Basically along the lines of what Benetanegia is saying, which generally makes a lot of sense, at least in my mind.
According to die shots it looks like ~50% of the die is shaders. Hard to tell though, and I have not measured it "scientifically".
In any case, less shader space only gives strength to my point. Remember that GK110 is more GPGPU oriented, so I'd say the number-crunching shaders are the relevant part, more so when die area is not going to increase a lot by adding them. What would be the purpose of increasing ROPs, TMUs and other units beyond the increase in shader units? My example #1 is a 66% increase in shaders, a 50% increase in ROPs/MC and 0% in TMUs (Fermi did well with 64), for a total increase in transistors and die size of 80%. So that leaves ample die and transistor budget for the increased registers and caches that GK104 is lacking. The second example is a net 50% increase for everything, with even more space for cache and RF, but IMO a less likely scenario. Just speculating anyway.
I fail to see why they would increase shaders by only 30% and memory/ROPs by 100%, when that is not going to increase GPGPU or gaming performance; it's a waste. Fermi had only a 30% increase in shaders, 50% in ROPs/memory and 43% in die size. The shaders only increased by 30% because they didn't have many options: SMs could only go from 48 SPs down to 32 SPs and they had a single dual-issue dispatcher, so that really limited the number of total SPs IMO. Do your own calculations about which other options they had; I'll tell you right now: not many. With Kepler's SMX I see no reason for not going with a larger amount. Given how many dispatchers they included in the SMXs, I think 5 SIMD pairs is more than reasonable, as per my example number one. At least 4 pairs is a given, as per example 2.
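The Fermi precedent here appears to be the GF114 → GF110 step; a small sketch using the commonly quoted figures for those two chips (which chips the poster means is an assumption) reproduces roughly the quoted percentages:

```python
# Commonly quoted GF114 / GF110 figures; which chips are meant is an assumption.
gf114 = {"shaders": 384, "rops": 32, "bus_bits": 256, "die_mm2": 360}
gf110 = {"shaders": 512, "rops": 48, "bus_bits": 384, "die_mm2": 520}

for key in gf114:
    increase = gf110[key] / gf114[key] - 1
    print(f"{key}: {increase:+.0%}")  # shaders +33%, rops +50%, bus +50%, die +44%
```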
384-bit.. hmm.. going by the presumption that GK110 will also be the 'base' for the next-gen Quadro cards (high-end of course), I believe the next-gen Quadros/GeForces will feature even more memory and higher bandwidth, which might mean 512-bit.. but I could be wrong.. only time will tell
With almost 2x the die size, and looking at the die shot of GK104, there most probably wouldn't be any problem fitting a 512-bit MC along the edges, and if they did include a 512-bit interface, IMO it would be clocked at 5000 MHz or a little higher, but far from 6000 MHz. Reasons: 1) That should make the controller itself smaller, easily allowing for 512-bit. 2) Professional cards will not feature extra-high clocks, so as to increase reliability; ECC memory IS slower too. 3) High-density, fast memory chips would cost a fortune, an unnecessary fortune.
Running at 5000 MHz with a 512-bit interface, memory bandwidth would be 80% higher than on GK104 cards, and I think that is more than enough. It would also offer a little more bandwidth than 384-bit @ 6 GHz, for almost no price increase, so you ended up convincing me. :)
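The bandwidth math behind this exchange is simple enough to sketch; the GK104 memory clock used as the baseline here is an assumption, and the exact percentage gain depends entirely on it:

```python
def bandwidth_gbps(bus_width_bits: int, data_rate_gtps: float) -> float:
    """Peak GDDR5 bandwidth in GB/s: bus width in bytes times effective data rate."""
    return bus_width_bits / 8 * data_rate_gtps

gk110_512 = bandwidth_gbps(512, 5.0)   # proposed 512-bit @ 5 GT/s -> 320 GB/s
gk110_384 = bandwidth_gbps(384, 6.0)   # proposed 384-bit @ 6 GT/s -> 288 GB/s
gk104     = bandwidth_gbps(256, 5.5)   # assumed GK104 baseline, not from the article

print(f"512-bit @ 5 GT/s: {gk110_512:.0f} GB/s ({gk110_512 / gk104 - 1:+.0%} vs baseline)")
print(f"384-bit @ 6 GT/s: {gk110_384:.0f} GB/s ({gk110_384 / gk104 - 1:+.0%} vs baseline)")
```

With a 6 GT/s GK104 baseline instead, the gain works out closer to 67%, so the 80% figure hinges on the memory clock assumed for GK104.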
The top chip is no longer for enthusiasts but for professional use (all of Intel's 8-core chips go to server parts). I'm sure they'll release this baby for consumers, but most chips will go to workstation cards where $2k-$5k isn't out of the question. A 512-bit bus makes more sense in that case.
I think AMD will still go mainstream-ish with their 8000 series. We'll get a performance increase, but they'll be under 400 mm².
I also think that nVidia had the GK100 in the development cycle but didn't want another GF100: better to work longer on the development cycle so that the initial release comes off without a hitch.
If they had released the GK100, it would have been worse than the GTX 480 was at launch. But I'm sure the GK110 (GTX 780?) will be problem-free.