Friday, February 10th 2012

NVIDIA GeForce Kepler Packs Radically Different Number Crunching Machinery

Feb 10th, 2012 00:43

NVIDIA is set to kick off its graphics processor lineup competing with AMD's Southern Islands Radeon HD 7000 series with GeForce Kepler 104 (GK104). We are learning through reliable sources that NVIDIA will implement a radically different design (by NVIDIA's standards anyway) for its CUDA core machinery, while retaining a basic hierarchy of GPU components similar to Fermi's. The new design is meant to ensure greater parallelism. The latest version of GK104's specifications looks like this:

SIMD Hierarchy

4 Graphics Processing Clusters (GPC)
4 Streaming Multiprocessors (SM) per GPC = 16 SM
96 Stream Processors (SP) per SM = 1536 CUDA cores

TMU / Geometry Domain

8 Texture Units (TMU) per SM = 128 TMUs
32 Raster OPeration Units (ROPs)

Memory

256-bit wide GDDR5 memory interface
2048 MB (2 GB) of memory as standard

Clocks/Other

950 MHz core/CUDA core (no hot-clocks)
1250 MHz actual (5.00 GHz effective) memory, 160 GB/s memory bandwidth
2.9 TFLOP/s single-precision floating point compute power
486 GFLOP/s double-precision floating point compute power
Estimated die-area 340mm²
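
The totals in the list above follow from the per-unit counts and clocks. Below is a minimal Python sketch of that arithmetic. It assumes 2 FLOPs per CUDA core per clock (one fused multiply-add) and a 1/6 double-precision rate. Neither is stated in the source; the 1/6 ratio is only inferred from the two compute figures listed. All names are illustrative.

```python
# Back-of-the-envelope check of the rumored GK104 figures.
# Inputs are taken from the spec list; the two marked constants are assumptions.

GPCS = 4                  # Graphics Processing Clusters
SMS_PER_GPC = 4           # Streaming Multiprocessors per GPC
SPS_PER_SM = 96           # Stream Processors (CUDA cores) per SM
TMUS_PER_SM = 8           # Texture units per SM
CORE_CLOCK_MHZ = 950      # core and CUDA core clock (no hot-clocks)
MEM_BUS_BITS = 256        # GDDR5 interface width
MEM_EFFECTIVE_GHZ = 5.0   # effective memory data rate

FLOPS_PER_CORE_PER_CLOCK = 2  # ASSUMPTION: one fused multiply-add per clock
DP_RATIO = 1 / 6              # ASSUMPTION: inferred from 486 / 2900, not sourced


def derive_specs():
    """Return the derived totals implied by the per-unit figures."""
    sms = GPCS * SMS_PER_GPC
    cores = sms * SPS_PER_SM
    tmus = sms * TMUS_PER_SM
    # bytes per transfer * billions of transfers per second = GB/s
    bandwidth_gbs = MEM_BUS_BITS / 8 * MEM_EFFECTIVE_GHZ
    sp_gflops = cores * FLOPS_PER_CORE_PER_CLOCK * CORE_CLOCK_MHZ / 1000
    dp_gflops = sp_gflops * DP_RATIO
    return {
        "sms": sms,
        "cores": cores,
        "tmus": tmus,
        "bandwidth_gbs": bandwidth_gbs,
        "sp_gflops": sp_gflops,
        "dp_gflops": dp_gflops,
    }


if __name__ == "__main__":
    for name, value in derive_specs().items():
        print(f"{name}: {value}")
```

Under these assumptions the sketch gives 16 SMs, 1536 CUDA cores, 128 TMUs, 160 GB/s, and roughly 2.9 TFLOP/s single-precision and 486 GFLOP/s double-precision, matching the listed figures.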

Source: 3DCenter.org

139 Comments on NVIDIA GeForce Kepler Packs Radically Different Number Crunching Machinery

phanbuey

wow... that is definitely different...

LiveOrDie

I bet your mommy always told you to eat your greens ;)

ViperXTR

its looking like an AMD specification now hehe (wait 32 ROPs? D: )

Shou Miko

i just hope they are serious about that 2048 MB of memory; if not it will be a shame.

EpicShweetness

These specs are definitely strange for an Nvidia chip. 1536 CUDA cores is triple that of the GTX 580, yet with only a 30% reduction in the size of the fabrication process, and GK104 is smaller than GF110 as well. This can only indicate a few things: a "nerf" of the CUDA core itself, or an architecture that is much more "cluster based". Very interesting, I'll be following this closely.

LAN_deRf_HA

It's a lot more shaders but they're running much slower too. Seems it'd even out on the heat front.

ViperXTR

just like what the HD 2000 and the present 7000 cards are doing, moar shaders but lower clocks (or rather clocks are tied with the TMU/ROP clocks)

radrok

My massive loop is waiting for the heat :rockout:

hardcore_gamer

Die size is very close to that of the 7970 (365 mm²). Interesting :cool:

#10

radarblade

Seems like Nvidia's pretty prepped up to wipe AMD off the slate! But what would be the TDP on these things? Preferably lesser than the earlier 480 and 580 heaters. :)

#11

TheoneandonlyMrK

Interested in how this is going to be 50% faster than a 7970; they seem similar in shader layout.

#12

NC37

The end of NV's monolithic GPU era is at hand...was about to say...Bout freaken time! ATI was slower at first when they switched but I knew eventually NV would have to change too.

Very interested to see how well NV does at ATI's own game.

#13

gaximodo

this isn't supposed to be NV's flagship anywayz.

#14

Xaser04

gaximodo: this isn't supposed to be NV's flagship anywayz.

GK104 so GTX560Ti replacement (ish).

Considering this is 1536 shaders it would be logical to assume that the full fat model would have 2048 shaders, after all the GTX560TI was - in simplistic terms - roughly 75% of a GTX580.

The shader count itself is very interesting.

The increase in shaders (384 to 1536, if we assume a GTX560TI replacement) would suggest that each Kepler shader is less complex than its Fermi counterpart.

If we also assume similar performance to the HD7950 (doesn't seem too unrealistic) then clock for clock GCN and Kepler could be quite evenly matched (HD7950 has more shaders but a lower core clock).

Should be very interesting.

#15

Crap Daddy

theoneandonlymrk: Interested in how this is going to be 50% faster than a 7970; they seem similar in shader layout.

This is not going to be 50% faster than 7970. Judging by the specs it should fall between 7950 and 7970 at a rumored 300$.
GK110 will probably be the Tahiti killer. At a price...

#16

Red_Machine

At this rate, I will feel compelled to replace my 580. GK110 will likely be 70-80% faster...

#17

pantherx12

Red_Machine: At this rate, I will feel compelled to replace my 580. GK110 will likely be 70-80% faster...

I reckon it will be half that, at best. :p

#18

Benetanegia

I assume these specs have been judged legit, since Btarunr did post them, unlike most others.

Ah crap, they are too different; it's impossible to guesstimate the performance based on them (don't know how other people are so sure). I'll try to make my analysis anyway.

At a first glance it looks like they doubled GF104's shader domain (128 TMU, 4 GPCs, etc.) and then doubled the shader amount per SM because abandoning hot clocks allows for that. Performance wise the end result should be similar.

Based on die size this chip must contain twice the amount of transistors on GF104, while retaining the 256 bit bus, so there's no compelling reason to assume the shaders are any less capable than they were in Fermi. They could have just as easily gone with 768 SPs and hot-clocks within the same die size.

And finally efficiency. That's the key to knowing the performance. We don't know how well they will be able to use all those SP. I'd assume they are using 6x16 SP wide superscalar shader multiprocessors, but with how many schedulers? GF104 had 2. So now they have 4? Or since shaders run at half the speed the schedulers are just issuing the same amount of ops-per-cycle? (in reality cycles-per-op)

So many questions but I had fun. Based on raw specs this chip has the potential to crush any other card on the market, think 2x GTX560 Ti, at least at 1080/1200p. But efficiency/scaling is the key factor and that's completely unknown to us.

EDIT: As you can see, I changed my mind completely as I was writing this post. I first thought they were very different and came to realize that they are pretty much the same. If you think about Fermi based GF104/114 as a 768 SP chip with no hot-clocks, they just doubled the amount of GPCs.

#19

Filiprino

NVIDIA seems to have come up with something very similar to AMD's GCN. But after all it's NVIDIA and the successor to Fermi, so we'll have to wait and see performance numbers.

#20

General Lee

I wouldn't take them without a big grain of salt, but it's always fun to do some what iffing.

The specs look similar to what AMD has now, so given the estimated die size and unit counts, I'd say it would reach 580/7950 level performance. I doubt they'll price it at 300$ if 7950 is at 470$. More likely it's at best 50$ cheaper, that's enough to get the ball rolling. It's not really difficult to undercut the 7900 series in price, so regardless of performance it shouldn't be hard for Nvidia to claim a perf/$ crown simply because 7900 is sold at a premium currently. Of course AMD should respond to that, and I think this is the scenario we all hope for.

#21

xenocide

General Lee: I wouldn't take them without a big grain of salt, but it's always fun to do some what iffing.

The specs look similar to what AMD has now, so given the estimated die size and unit counts, I'd say it would reach 580/7950 level performance. I doubt they'll price it at 300$ if 7950 is at 470$. More likely it's at best 50$ cheaper, that's enough to get the ball rolling. It's not really difficult to undercut the 7900 series in price, so regardless of performance it shouldn't be hard for Nvidia to claim a perf/$ crown simply because 7900 is sold at a premium currently. Of course AMD should respond to that, and I think this is the scenario we all hope for.

A lot of people are holding out for Nvidia just to see prices level out. If they sell a card on par with the 7950 for $100 less, they'll make up the difference in volume. I guarantee they would sell twice as many cards as if they priced it around $450.

#22

jamsbong

Confirmed Nvidia is doing an ATI!
The specs look so identical that if I rename these specs as say....

HD7870:
256bit GDDR5 2GB memory
1536 CU, 128TMU, 32ROP, small 340mm^2 die size, no hot clocks.

It looks totally believable! Has Nvidia been hiring lots of ATI engineers? Or have they reverse engineered ATI's Cayman?

Jokes aside, some rational observations:
The specs themselves look like a mid-to-high-end card, which will be very competitive price-wise as it uses 256-bit memory and a small die. I won't be surprised if it is only faster than Cayman by 10-20%. It will be on par with GTX580 at best.
I believe Nvidia is working on a high end card which has yet to show itself.

#23

Crap Daddy

Charlie seems to be very into Kepler these days. He says the ball is rolling:

"Reports coming in from the far east say that those high up in the priority list started getting Kepler cards in various guises early this week, possibly late last. The number of sightings from sources that SemiAccurate trusts has been going up almost exponentially over the past few days, and will probably keep doing so for a bit."

He concludes:

"If things go as normal, it takes 4-6 weeks from AIB sampling to cards on the shelves. This would mean late March or early April, just like we have been saying for weeks."

#24

arnoo1

seriously 1536 shaders? that's 3 times more than Fermi

#25

1c3d0g

I have a feeling that NVIDIA will kill the competition this time around...Kepler sounds like a new Voodoo2, if y'all still remember that...
