
NVIDIA GeForce GF100 Architecture

Impressive architecture, but above everything, it has me wondering one thing: if GeForce and Tesla use exactly the same GPU, HOW in hell have they disabled parts of the chip on the Tesla card in order to end up with 448 SPs???

Me too. From what I understand, the smallest group they can disable is 4 x 32 shaders = 128 shaders.
 
I just think: why should we believe what NVIDIA says? Just because they say the GF100 does 43 fps, we won't actually know until launch. It wouldn't be the first time NVIDIA claimed something and we found it wasn't golden. No one has seen or used this card, so I'm not going to believe some babble they come out with.

If you are very skeptical about this release, as we all are, I'd like to share with you something I wrote.

This post has been a long time formulating, and I welcome any criticisms.

How many of us have gotten at least three different claims about the performance or release of this card? Cynically, I've decided that I'm not going to bat my eyelashes at any claims that come out of CES. There's bound to be a little more truth circling the bowl, but most people will excuse me if I assume the cycle of bullsh!t has yet to flush. I'm not sure the majority of posters/readers will excuse my overall indifference, because that isn't very exciting. Likewise, it's not hard to speculate that NV may have a true performer to take a crown in 2010, but ATI has a firm place in this generation's line-up, which could mean good or bad things in the future.

With the downturn of the global economy, there is enough of a depressant force on a number of software companies to make them recycle old engines or adopt some sort of broad design utility. The mainstream GPUs will see more action than chopsticks during Chinese New Year. I think it's wise to assume we're getting dangerously close to a point where GPUs must offer stellar performance in a new API, because Microsoft not only authors the DX runtimes but is also a console competitor. Realistically (and correct me if I'm wrong), they're going to merge development of their runtimes with console development. The paradigm shift will come when enough of the software industry is willing to move.

If you accept any of these ideas then I offer a summary of my thoughts.

-the GT300/GF100 series cards are going to take a crown in performance, but this generation will offer little more than a spitting contest between ATI and NV.
-3D environment software development will become further compartmentalized, and game developers will buy into a smart, economical standard before leaning head-on into a new API that is not yet mature or affordable in terms of hardware support.
-Microsoft (3v!L3) will most likely decide which generation of GPU holds the standard, for a lifespan determined by their next console.

I'm a bit off topic and a little on topic. It's pretty obvious, but I figured this is a nice mix of topics, all rooted in the importance of the GT300.

The thoughts behind the post were more or less what's on my mind about GPUs as a whole. Note that none of us are too impressed with the benchmarks, but the hardware they've laid out has some serious kick. If DX11 takes off as the next API for big consoles, then we're going to see some serious moves by software devs to make their software DX11.
 
Thanks bta, that was a very well written and interesting article. :)

It's a shame they've had to cut the chip down to a 384-bit bus from 512. This reduces its performance, makes it lopsided (odd memory and bus sizes), which is never optimal for a computer, and gives it fewer processing units. I guess it's likely due to die size and power/heat constraints, though.

I'll likely get the 256-bit variant when it comes out, because of this. If this chip really is as good as they claim here, it will still be a great performer.
 
Thanks bta, that was a very well written and interesting article. :)

It's a shame they've had to cut the chip down to a 384-bit bus from 512. This reduces its performance, makes it lopsided (odd memory and bus sizes), which is never optimal for a computer, and gives it fewer processing units. I guess it's likely due to die size and power/heat constraints, though.

I'll likely get the 256-bit variant when it comes out, because of this. If this chip really is as good as they claim here, it will still be a great performer.

Remember this one uses GDDR5; memory performance has been increased a lot. We don't know final clocks, but if it uses the same as the HD5xxx it will be a nice boost. After seeing the rest of the architecture, I can see memory being the most limiting factor (it's been improved a lot, but not in the "ZOMG! Overkill!" way the rest of the architecture has been improved), but it will in no way cripple the card, as in making it uncompetitive. Even when comparing it to the HD5970, IMO.
 
GDDR5 has gotten a heap better since the 4870 days; overclocking GDDR5 on a 384-bit bus will be a whole bunch of fun, I reckon :)
 
Thanks bta, that was a very well written and interesting article. :)

It's a shame they've had to cut the chip down to a 384-bit bus from 512. This reduces its performance, makes it lopsided (odd memory and bus sizes), which is never optimal for a computer, and gives it fewer processing units. I guess it's likely due to die size and power/heat constraints, though.

I'll likely get the 256-bit variant when it comes out, because of this. If this chip really is as good as they claim here, it will still be a great performer.

The card was always supposed to have a 384-bit bus? That spec is OLD.
 
Remember this one uses GDDR5; memory performance has been increased a lot. We don't know final clocks, but if it uses the same as the HD5xxx it will be a nice boost. After seeing the rest of the architecture, I can see memory being the most limiting factor (it's been improved a lot, but not in the "ZOMG! Overkill!" way the rest of the architecture has been improved), but it will in no way cripple the card, as in making it uncompetitive. Even when comparing it to the HD5970, IMO.

GDDR5 has gotten a heap better since the 4870 days; overclocking GDDR5 on a 384-bit bus will be a whole bunch of fun, I reckon :)

Oh yeah, guys, it's gonna work fine; I'm sure the memory won't bottleneck and it'll take the performance crown. But if you're a perfectionist like me, everything has to fit into the proper power of 2, as that fully utilises the design potential of a digital device. :D NVIDIA aren't stupid and would have done this too, had the physical constraints allowed it.

Also, it would be interesting to know how much faster a full 512-bit version of the chip (with the extra compute clusters too) would perform. I'd hazard 20% off the top of my head.
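For what it's worth, here's a rough back-of-the-envelope way to frame that guess (pure speculation on my part, assuming a hypothetical 512-bit variant at the same memory clocks):

```python
# Rough scaling sketch for a hypothetical 512-bit GF100 variant.
bus_real, bus_hypothetical = 384, 512          # bits
bandwidth_gain = bus_hypothetical / bus_real - 1
print(f"Raw bandwidth gain: {bandwidth_gain:.0%}")   # -> Raw bandwidth gain: 33%
# Games are only partly bandwidth-limited, so the net performance gain would sit
# somewhere below that ~33% ceiling -- the ~20% guess above is in that range.
```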
 
Oh yeah, guys, it's gonna work fine; I'm sure the memory won't bottleneck and it'll take the performance crown. But if you're a perfectionist like me, everything has to fit into the proper power of 2, as that fully utilises the design potential of a digital device. :D NVIDIA aren't stupid and would have done this too, had the physical constraints allowed it.

So triple-channel memory, triple-core CPUs, and so on are all "imperfect"?
 
So triple-channel memory, triple-core CPUs, and so on are all "imperfect"?

Yes. Anyone that's done any sort of digital design will understand what I mean.

Manufacturers only ever move away from a power of 2 design when they have physical and/or cost constraints.

Another example is the 6-core high-end CPUs just coming out. They should really be 8-core, but that will probably have to wait until another process shrink is perfected.
 
Yes. Anyone that's done any sort of digital design will understand what I mean.

Manufacturers only ever move away from a power of 2 design when they have physical and/or cost constraints.

Or when the end result does not justify doing just that. Both physical and cost constraints are an integral part of any product design process; you cannot just wish them away and call any other result imperfect.
 
Why do you think memory bus width has to work in powers of two? A single chip is 32-bit; add as many chips as you wish.

Compared to GT200, the memory interface is effectively 768-bit, because it uses GDDR5, which has twice the bandwidth of GDDR3.
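To put rough numbers on that (a minimal sketch; the GTX 285 figures are its actual specs, while the GF100 memory clock is an assumption, since final clocks aren't known — I'm borrowing the HD5870-like 4.8 Gbps effective rate mentioned earlier in the thread):

```python
def bandwidth_gb_s(bus_width_bits: int, per_pin_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s = bus width (bits) * per-pin data rate (Gbps) / 8."""
    return bus_width_bits * per_pin_rate_gbps / 8

# GT200 (GTX 285): 512-bit GDDR3 at ~2.48 Gbps effective per pin
print(bandwidth_gb_s(512, 2.484))   # ~159 GB/s

# GF100: 384-bit GDDR5 -- clock is an ASSUMPTION (HD5870-like 4.8 Gbps per pin)
print(bandwidth_gb_s(384, 4.8))     # ~230 GB/s, roughly what a 768-bit GDDR3 bus would give
```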
 
The funny part about those 6-core processors is that they scale perfectly in multi-threaded benchmarks. Funny. Very funny.
 
Completely agree with your post further up, binge; it's what I wanted to say but couldn't formulate with the annoyances of NVIDIA's crap circulating around in my brain.
 
Or when the end result does not justify doing just that. Both physical and cost constraints are an integral part of any product design process; you cannot just wish them away and call any other result imperfect.

It depends on what you mean by "imperfect". The physical and cost parameters indeed always have a strong influence on what is achievable in practice.

I have actually done some chip design when studying for my qualifications. There, I learned that the optimum design for a binary (i.e. base 2) computer is always to base everything on powers of 2, using the full address range for the number of bits used. (This includes having the number of bits itself be a power of 2, W1zzard.)

There are lots of other subtleties in efficiency when doing this, but I haven't done this for years and can't think of them off the top of my head. In essence, everything just dovetails nicely together when you scale up by powers of 2. That's why memory chip sizes always go up in powers of 2, for example.

Another example of where a power of 2 cannot be realised due to physical constraints is hard discs. Because they are based on rotating (and crucially, round) media of a fixed physical size, you cannot just scale them up in powers of 2, so we have a physical limitation there. Hence we are left with odd sizes, such as 80GB instead of 128GB, for example.
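To illustrate the power-of-2 point in concrete terms, here's a toy sketch (not how any particular GPU actually does it): with a power-of-2 channel count, picking a channel is a simple bit mask, whereas with a non-power-of-2 count the hardware needs a modulo.

```python
# Toy illustration of the power-of-2 argument: selecting a memory channel.
def channel_pow2(addr: int, channels: int = 4) -> int:
    # With a power-of-2 channel count the index is just the low address bits:
    return addr & (channels - 1)            # a trivial bit mask in hardware

def channel_non_pow2(addr: int, channels: int = 3) -> int:
    # With a non-power-of-2 count you need a modulo (a divider or lookup trick):
    return addr % channels

print([channel_pow2(a) for a in range(8)])      # [0, 1, 2, 3, 0, 1, 2, 3]
print([channel_non_pow2(a) for a in range(8)])  # [0, 1, 2, 0, 1, 2, 0, 1]
```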
 
Impressive architecture, but above everything, it has me wondering one thing: if GeForce and Tesla use exactly the same GPU, HOW in hell have they disabled parts of the chip on the Tesla card in order to end up with 448 SPs???

By disabling two SMs.
 
Yeah, Mr. Obvious, but once again that leaves you with an asymmetric chip, which I find odd and unlikely.

That asymmetry doesn't affect the chip in any way. Every SM has access to all the memory on the card; the GigaThread engine dispatches workloads to the SMs, not the GPCs.

Disabling an SM is exactly the way I see NVIDIA is going to create the GT part. It will most likely have 480 or 448 SPs, with 320-bit memory interface.
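The arithmetic behind those numbers, for anyone following along (unit counts are as described in the article; which cut-down configurations actually ship is of course speculation):

```python
# GF100 building blocks as described in the article:
SP_PER_SM = 32     # shader processors (CUDA cores) per Streaming Multiprocessor
MC_WIDTH  = 64     # bits per memory controller; 6 controllers = 384-bit on the full chip

def config(active_sms: int, active_mcs: int) -> str:
    return f"{active_sms * SP_PER_SM} SPs, {active_mcs * MC_WIDTH}-bit bus"

print(config(16, 6))   # 512 SPs, 384-bit  -- full GF100
print(config(14, 6))   # 448 SPs, 384-bit  -- two SMs disabled, as on the Tesla part
print(config(15, 5))   # 480 SPs, 320-bit  -- one GeForce GT config speculated above
print(config(14, 5))   # 448 SPs, 320-bit  -- the other speculated GT config
```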
 
I have actually done some chip design when studying for my qualifications. There, I learned that the optimum design for a binary (i.e. base 2) computer is always to base everything on powers of 2, using the full address range for the number of bits used. (This includes having the number of bits itself be a power of 2, W1zzard.)

you should go work at nvidia then.

bring forward your evidence why 384 or even 352 bits is less perfect than 256 bits for a memory interface design. there are a couple of valid (but not important) points you can make, go show us your qualifications
 
you should go work at nvidia then.

bring forward your evidence why 384 or even 352 bits is less perfect than 256 bits for a memory interface design. there are a couple of valid (but not important) points you can make, go show us your qualifications

Instead of just challenging me because you don't know about this, why don't you do some research yourself?

In my previous post and various others (eg the monitor aspect ratio discussion) I gave a very nice and complete answer. It would be nice to be appreciated for teaching people instead of getting attacked all the time. :rolleyes:
 
Instead of just challenging me because you don't know about this, why don't you do some research yourself?

In my previous post and various others (eg the monitor aspect ratio discussion) I gave a very nice and complete answer. It would be nice to be appreciated for teaching people instead of getting attacked all the time. :rolleyes:

and that's how people get banned at [H]
If you post things that are factually correct, I'm not going to attack you. Claiming that a 384-bit-wide memory interface is an imperfect design sounds like something I would read at certain rumor sites.
 
@qubit - I want to know what it is they can/will learn from your observations. Even if we learned whatever it is you're trying to teach, it is still mostly lost on the ignorance of the readers. On the subject of motor vehicles, I'd argue there are certainly a number of oddities in engine design, but I don't make a point of damning the tri-cylinder engine because it looks funny on paper.
 
Instead of just challenging me because you don't know about this, why don't you do some research yourself?

In my previous post and various others (eg the monitor aspect ratio discussion) I gave a very nice and complete answer. It would be nice to be appreciated for teaching people instead of getting attacked all the time. :rolleyes:

It is your assertion that 384-bit isn't nice, for the reasons you stated. So the onus lies on you to back it up with references; he doesn't need to do research into assertions of yours that he has never come across. So go find us some, and don't make confrontational statements.
 
That asymmetry doesn't affect the chip in any way. Every SM has access to all the memory on the card; the GigaThread engine dispatches workloads to the SMs, not the GPCs.

Disabling an SM is exactly the way I see NVIDIA is going to create the GT part. It will most likely have 480 or 448 SPs, with 320-bit memory interface.

But the balance of texturing+geometry against shaders would be changed, wouldn't it? Disabling an SM wouldn't disable the texture units and the (what do they call it?) PolyMorph unit, or would it?

I'm not saying that would break the card or its performance, but you would have silicon sitting there unused, which is not optimal. Well, not unused, but overkill, since they now have to work with fewer units.

And you have to agree it would be the first time we saw something like that being done. <-- That's the only reason why I'm not convinced tbh. :laugh:

EDIT: Meh! Forget about the subject and forgive me. I had just not paid attention to the slides. After taking a second look at this:

[Image: GF100_16.jpg]
It's obvious that the PolyMorph engine is attached to the SM; I had just thought it wasn't because it's zoomed in (next to the raster engine) and I didn't pay attention to the arrows. :banghead: Only the raster unit seems to be independent.
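For reference, here's how the fixed-function units scale with the SM count according to the block diagram (a sketch based on the article's unit counts; the cut-down configurations are hypothetical):

```python
# Per-SM units (lost when an SM is disabled) vs. the per-GPC raster engines:
TMU_PER_SM       = 4    # texture units per SM (64 on the full chip)
POLYMORPH_PER_SM = 1    # one PolyMorph (geometry) engine per SM (16 total)
GPCS             = 4    # one raster engine per GPC, independent of SM count

for active_sms in (16, 15, 14):    # full chip plus two hypothetical cut-downs
    print(f"{active_sms} SMs -> {active_sms * TMU_PER_SM} TMUs, "
          f"{active_sms * POLYMORPH_PER_SM} PolyMorph engines, {GPCS} raster engines")
# 16 SMs -> 64 TMUs, 16 PolyMorph engines, 4 raster engines
# 15 SMs -> 60 TMUs, 15 PolyMorph engines, 4 raster engines
# 14 SMs -> 56 TMUs, 14 PolyMorph engines, 4 raster engines
```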
 
All I can say is that all these numbers look rather impressive, and I'm sure it will kick my ATI card's butt when it comes out. "6.44 times the tessellation performance of the HD5870" - O rly?

If that is true, then the GF100 will shine in several areas more than in anything else. Looking forward to it tbh, even though I probably won't buy one.
 