
NVIDIA GeForce GF100 Architecture

Impressive architecture, but above everything, it has me wondering one thing: if GeForce and Tesla use exactly the same GPU, HOW in hell have they disabled parts of the chip on the Tesla card in order to end up with 448 SPs???

Me too. From what I understand, the smallest group they can disable is 4 x 32 shaders = 128 shaders.
 
I just think: why should we believe what NVIDIA says? Just because they say the GF100 does 43 fps, we won't actually know until launch. It wouldn't be the first time NVIDIA claimed something and we found it wasn't golden. No one has seen or used this card, so I'm not going to believe some babble they come out with.

If you are very skeptical about this release, as we all are, I'd like to share with you something I wrote.

This post has been a long time formulating, and I welcome any criticisms.

How many of us have gotten at least three different claims about the performance or release of this card? Cynically, I've decided that I'm not going to bat my eyelashes at any claims that come out of CES. There's bound to be a little more truth circling the bowl, but most people will excuse me if I assume the cycle of bullsh!t has yet to flush. I'm not sure the majority of posters/readers will excuse my overall indifference, because that isn't very exciting. Likewise, it's not hard to speculate that NV may have a true performer to take a crown in 2010, but ATI has a firm place in this generation's line-up, which could mean good or bad things in the future.

With the downturn of the global economy, there is enough of a depressant force on a number of software companies to make them recycle old engines or adopt some sort of broad design utility. The mainstream GPUs will see more action than chopsticks during Chinese New Year. I think it's wise to assume we're getting dangerously close to a point where GPUs must offer stellar performance in a new API, because Microsoft not only authors the DX runtimes but is also a console competitor. Realistically (and correct me if I'm wrong), they're going to merge development of their runtimes with console development. The paradigm shift will come when enough of the software industry is willing to move.

If you accept any of these ideas then I offer a summary of my thoughts.

-the GT300/GF100 series cards are going to take a crown in performance, but this generation will offer little more than a spitting contest between ATI and NV.
-3D environment software development will become further compartmentalized, and game developers will buy into a smart, economical standard before leaning head-on into a new API that is not yet mature or affordable in terms of hardware support.
-Microsoft (3v!L3) will most likely decide which generation of GPU holds the standard, for a lifespan determined by their next console.

I'm a bit off topic and a little on topic. It's pretty obvious, but I figured this is a nice mix of topics, all rooted in the importance of the GT300.

The thoughts behind the post were more or less what's on my mind about GPUs as a whole. Note that none of us are too impressed with the benchmarks, but the hardware they've laid out has some serious kick. If DX11 takes off as the next API for big consoles, then we're going to see some serious moves by software devs to make their software DX11.
 
Thanks bta, that was a very well written and interesting article. :)

It's a shame they've had to cut the chip down to a 384-bit bus from 512. This reduces its performance, makes it lopsided (odd memory and bus sizes), which is never optimal for a computer, and gives it fewer processing units. I guess it's likely due to die size and power/heat constraints, though.

I'll likely get the 256-bit variant when it comes out, because of this. If this chip really is as good as they claim here, it will still be a great performer.
 
Thanks bta, that was a very well written and interesting article. :)

It's a shame they've had to cut the chip down to a 384-bit bus from 512. This reduces its performance, makes it lopsided (odd memory and bus sizes), which is never optimal for a computer, and gives it fewer processing units. I guess it's likely due to die size and power/heat constraints, though.

I'll likely get the 256-bit variant when it comes out, because of this. If this chip really is as good as they claim here, it will still be a great performer.

Remember this one uses GDDR5; memory performance has been increased a lot. We don't know final clocks, but if it uses the same as the HD5xxx it will be a nice boost. After seeing the rest of the architecture, I can see memory being the most limiting factor (it's been improved a lot, but not in the "ZOMG! Overkill!" way the rest of the architecture has been improved), but it will in no way cripple the card, as in making it uncompetitive. Even when comparing it to the HD5970, IMO.
 
GDDR5 has gotten a heap better since the 4870 days; overclocking GDDR5 on a 384-bit bus will be a whole bunch of fun, I reckon :)
 
Thanks bta, that was a very well written and interesting article. :)

It's a shame they've had to cut the chip down to a 384-bit bus from 512. This reduces its performance, makes it lopsided (odd memory and bus sizes), which is never optimal for a computer, and gives it fewer processing units. I guess it's likely due to die size and power/heat constraints, though.

I'll likely get the 256-bit variant when it comes out, because of this. If this chip really is as good as they claim here, it will still be a great performer.

The card was always supposed to have a 384-bit bus? That spec is OLD.
 
Remember this one uses GDDR5; memory performance has been increased a lot. We don't know final clocks, but if it uses the same as the HD5xxx it will be a nice boost. After seeing the rest of the architecture, I can see memory being the most limiting factor (it's been improved a lot, but not in the "ZOMG! Overkill!" way the rest of the architecture has been improved), but it will in no way cripple the card, as in making it uncompetitive. Even when comparing it to the HD5970, IMO.

GDDR5 has gotten a heap better since the 4870 days; overclocking GDDR5 on a 384-bit bus will be a whole bunch of fun, I reckon :)

Oh yeah, guys, it's gonna work fine; I'm sure the memory won't bottleneck and it'll take the performance crown. But if you're a perfectionist like me, everything has to fit into the proper power of 2, as that fully utilises the design potential of a digital device. :D NVIDIA aren't stupid and would have done this too, had the physical constraints allowed it.

Also, it would be interesting to know how much faster a full 512-bit version of the chip (with the extra compute clusters too) would perform. I'd hazard 20% off the top of my head.
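For what it's worth, here's a rough back-of-the-envelope way to frame that guess (pure speculation on my part, assuming a hypothetical 512-bit variant at the same memory clocks):

```python
# Rough scaling sketch for a hypothetical 512-bit GF100 variant.
bus_real, bus_hypothetical = 384, 512          # bits
bandwidth_gain = bus_hypothetical / bus_real - 1
print(f"Raw bandwidth gain: {bandwidth_gain:.0%}")   # -> Raw bandwidth gain: 33%
# Games are only partly bandwidth-limited, so the net performance gain would sit
# somewhere below that ~33% ceiling -- the ~20% guess above is in that range.
```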
 
Oh yeah, guys, it's gonna work fine; I'm sure the memory won't bottleneck and it'll take the performance crown. But if you're a perfectionist like me, everything has to fit into the proper power of 2, as that fully utilises the design potential of a digital device. :D NVIDIA aren't stupid and would have done this too, had the physical constraints allowed it.

So triple-channel memory, triple-core CPUs, and so on are all "imperfect"?
 
So triple-channel memory, triple-core CPUs, and so on are all "imperfect"?

Yes. Anyone that's done any sort of digital design will understand what I mean.

Manufacturers only ever move away from a power of 2 design when they have physical and/or cost constraints.

Another example is the 6-core high-end CPUs just coming out. They should really be 8-core, but that will probably have to wait until another process shrink is perfected.
 
Yes. Anyone that's done any sort of digital design will understand what I mean.

Manufacturers only ever move away from a power of 2 design when they have physical and/or cost constraints.

Or when the end result does not justify doing just that. Both physical and cost constraints are an integral part of any product design process; you cannot just wish them away and call any other result imperfect.
 
Why do you think memory bus width has to work in powers of two? A single chip is 32-bit; add as many chips as you wish.

Compared to GT200, the memory interface is effectively 768-bit, because it uses GDDR5, which has twice the bandwidth of GDDR3.
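To put rough numbers on that (a minimal sketch; the GTX 285 figures are its actual specs, while the GF100 memory clock is an assumption, since final clocks aren't known — I'm borrowing the HD5870-like 4.8 Gbps effective rate mentioned earlier in the thread):

```python
def bandwidth_gb_s(bus_width_bits: int, per_pin_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s = bus width (bits) * per-pin data rate (Gbps) / 8."""
    return bus_width_bits * per_pin_rate_gbps / 8

# GT200 (GTX 285): 512-bit GDDR3 at ~2.48 Gbps effective per pin
print(bandwidth_gb_s(512, 2.484))   # ~159 GB/s

# GF100: 384-bit GDDR5 -- clock is an ASSUMPTION (HD5870-like 4.8 Gbps per pin)
print(bandwidth_gb_s(384, 4.8))     # ~230 GB/s, roughly what a 768-bit GDDR3 bus would give
```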
 
The funny part about those 6-core processors is that they scale perfectly in multi-threaded benchmarks. Funny. Very funny.
 
Completely agree with your post further up, binge; it's what I wanted to say but couldn't formulate with the annoyances of NVIDIA's crap circulating around in my brain.
 
Or when the end result does not justify doing just that. Both physical and cost constraints are an integral part of any product design process; you cannot just wish them away and call any other result imperfect.

It depends on what you mean by "imperfect". The physical and cost parameters indeed always have a strong influence on what is achievable in practice.

I have actually done some chip design when studying for my qualifications. There, I learned that the optimum design for a binary (i.e. base 2) computer is always to base everything on powers of 2, using the full address range for the number of bits used. (This includes having the number of bits itself be a power of 2, W1zzard.)

There are lots of other subtleties in efficiency when doing this, but I haven't done this for years and can't think of them off the top of my head. In essence, everything just dovetails nicely together when you scale up by powers of 2. That's why memory chip sizes always go up in powers of 2, for example.

Another example of where a power of 2 cannot be realised due to physical constraints is hard discs. Because they are based on rotating (and crucially, round) media of a fixed physical size, you cannot just scale them up in powers of 2, so we have a physical limitation there. Hence we are left with odd sizes, such as 80GB instead of 128GB, for example.
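To illustrate the power-of-2 point in concrete terms, here's a toy sketch (not how any particular GPU actually does it): with a power-of-2 channel count, picking a channel is a simple bit mask, whereas with a non-power-of-2 count the hardware needs a modulo.

```python
# Toy illustration of the power-of-2 argument: selecting a memory channel.
def channel_pow2(addr: int, channels: int = 4) -> int:
    # With a power-of-2 channel count the index is just the low address bits:
    return addr & (channels - 1)            # a trivial bit mask in hardware

def channel_non_pow2(addr: int, channels: int = 3) -> int:
    # With a non-power-of-2 count you need a modulo (a divider or lookup trick):
    return addr % channels

print([channel_pow2(a) for a in range(8)])      # [0, 1, 2, 3, 0, 1, 2, 3]
print([channel_non_pow2(a) for a in range(8)])  # [0, 1, 2, 0, 1, 2, 0, 1]
```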
 
Impressive architecture, but above everything, it has me wondering one thing: if GeForce and Tesla use exactly the same GPU, HOW in hell have they disabled parts of the chip on the Tesla card in order to end up with 448 SPs???

By disabling two SMs.
 
Yeah, Mr. Obvious, but once again that leaves you with an asymmetric chip, which I find odd and unlikely.

That asymmetry doesn't affect the chip in any way. Every SM has access to all the memory on the card; the GigaThread engine dispatches workloads to the SMs, not the GPCs.

Disabling an SM is exactly the way I see NVIDIA is going to create the GT part. It will most likely have 480 or 448 SPs, with 320-bit memory interface.
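The arithmetic behind those numbers, for anyone following along (unit counts are as described in the article; which cut-down configurations actually ship is of course speculation):

```python
# GF100 building blocks as described in the article:
SP_PER_SM = 32     # shader processors (CUDA cores) per Streaming Multiprocessor
MC_WIDTH  = 64     # bits per memory controller; 6 controllers = 384-bit on the full chip

def config(active_sms: int, active_mcs: int) -> str:
    return f"{active_sms * SP_PER_SM} SPs, {active_mcs * MC_WIDTH}-bit bus"

print(config(16, 6))   # 512 SPs, 384-bit  -- full GF100
print(config(14, 6))   # 448 SPs, 384-bit  -- two SMs disabled, as on the Tesla part
print(config(15, 5))   # 480 SPs, 320-bit  -- one GeForce GT config speculated above
print(config(14, 5))   # 448 SPs, 320-bit  -- the other speculated GT config
```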
 
I have actually done some chip design when studying for my qualifications. There, I learned that the optimum design for a binary (i.e. base 2) computer is always to base everything on powers of 2, using the full address range for the number of bits used. (This includes having the number of bits itself be a power of 2, W1zzard.)

you should go work at nvidia then.

bring forward your evidence why 384 or even 352 bits is less perfect than 256 bits for a memory interface design. there are a couple of valid (but not important) points you can make, go show us your qualifications
 
you should go work at nvidia then.

bring forward your evidence why 384 or even 352 bits is less perfect than 256 bits for a memory interface design. there are a couple of valid (but not important) points you can make, go show us your qualifications

Instead of just challenging me because you don't know about this, why don't you do some research yourself?

In my previous post and various others (eg the monitor aspect ratio discussion) I gave a very nice and complete answer. It would be nice to be appreciated for teaching people instead of getting attacked all the time. :rolleyes:
 
Instead of just challenging me because you don't know about this, why don't you do some research yourself?

In my previous post and various others (eg the monitor aspect ratio discussion) I gave a very nice and complete answer. It would be nice to be appreciated for teaching people instead of getting attacked all the time. :rolleyes:

and that's how people get banned at [H]
If you post things that are factually correct, I'm not going to attack you. Claiming that a 384-bit-wide memory interface is an imperfect design sounds like something I would read at certain rumor sites.
 
@qubit - I want to know what it is they can/will learn from your observations. Even if we learned whatever it is you're trying to teach, it is still mostly lost on the ignorance of the readers. On the subject of motor vehicles, I'd argue there are certainly a number of oddities in engine design, but I don't make a point of damning the tri-cylinder engine because it looks funny on paper.
 
Instead of just challenging me because you don't know about this, why don't you do some research yourself?

In my previous post and various others (eg the monitor aspect ratio discussion) I gave a very nice and complete answer. It would be nice to be appreciated for teaching people instead of getting attacked all the time. :rolleyes:

It is your assertion that 384-bit isn't nice, for the reasons you stated. So the onus lies on you to back it up with references; he doesn't need to do research into assertions of yours that he has never come across. So go find us some, and don't make confrontational statements.
 
That asymmetry doesn't affect the chip in any way. Every SM has access to all the memory on the card; the GigaThread engine dispatches workloads to the SMs, not the GPCs.

Disabling an SM is exactly the way I see NVIDIA is going to create the GT part. It will most likely have 480 or 448 SPs, with 320-bit memory interface.

But the balance of texturing+geometry against shaders would be changed, wouldn't it? Disabling an SM wouldn't disable the texture units and the (what do they call it?) PolyMorph unit, or would it?

I'm not saying that would break the card or its performance, but you would have silicon sitting there unused, which is not optimal. Well, not unused, but overkill, since they now have to work with fewer units.

And you have to agree it would be the first time we saw something like that being done. <-- That's the only reason why I'm not convinced tbh. :laugh:

EDIT: Meh! Forget about the subject and forgive me. I had just not paid attention to the slides. After taking a second look at this:

[Image: GF100_16.jpg]
It's obvious that the PolyMorph engine is attached to the SM; I had just thought it wasn't because it's zoomed in (next to the raster engine) and I didn't pay attention to the arrows. :banghead: Only the raster unit seems to be independent.
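For reference, here's how the fixed-function units scale with the SM count according to the block diagram (a sketch based on the article's unit counts; the cut-down configurations are hypothetical):

```python
# Per-SM units (lost when an SM is disabled) vs. the per-GPC raster engines:
TMU_PER_SM       = 4    # texture units per SM (64 on the full chip)
POLYMORPH_PER_SM = 1    # one PolyMorph (geometry) engine per SM (16 total)
GPCS             = 4    # one raster engine per GPC, independent of SM count

for active_sms in (16, 15, 14):    # full chip plus two hypothetical cut-downs
    print(f"{active_sms} SMs -> {active_sms * TMU_PER_SM} TMUs, "
          f"{active_sms * POLYMORPH_PER_SM} PolyMorph engines, {GPCS} raster engines")
# 16 SMs -> 64 TMUs, 16 PolyMorph engines, 4 raster engines
# 15 SMs -> 60 TMUs, 15 PolyMorph engines, 4 raster engines
# 14 SMs -> 56 TMUs, 14 PolyMorph engines, 4 raster engines
```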
 
All I can say is that all these numbers look rather impressive, and I'm sure it will kick my ATI card's butt when it comes out. "6.44 times the tessellation performance of the HD5870" - O rly?

If that is true, then the GF100 will shine in several areas more than in anything else. Looking forward to it tbh, even though I probably won't buy one.
 