
NVIDIA GeForce 4XX Series Discussion

Will the 300 series be better than the 5xxx series?
 
Yes, it will be better, but we don't know what dual-GPU solutions will bring, and we also don't know what prices will do. I think we may very well see a < $300 5870 at GT300 launch.

Also, GT300 ain't coming anytime soon.
 
[attached image: rumored GT300 spec sheet]


Not sure if it's legit though... because if that's the case, there is not much to look forward to in terms of performance.
 
Not sure if it's legit though... because if that's the case, there is not much to look forward to in terms of performance.

Are you seriously saying 512 cores, 384-bit GDDR5, added internal cache, and a pretty much new architecture is not much in terms of performance? :)
 
Will the 300 series be better than the 5xxx series?

No one knows yet...

Nvidia aren't really in the game at the moment. They don't even have a release date for it, and no one has seen it (apart from an empty shell of a graphics card). By the time they are ready to release, ATI may have released a 5870 X2 or 5890; those will be some pretty damn hefty cards to be competing with.
 
No one knows yet...

Nvidia aren't really in the game at the moment. They don't even have a release date for it, and no one has seen it (apart from an empty shell of a graphics card). By the time they are ready to release, ATI may have released a 5870 X2 or 5890; those will be some pretty damn hefty cards to be competing with.

And heck, will our CPUs be fast enough to max out the new generation of cards?!

Perhaps it'll take something like Crysis at 2560x1600 with maxed AA & AF to make them sweat? That's not a configuration that most of us have.
 
:laugh:

I have this mental image of AMD researchers hanging around the lab, working with AC/DC's "Caught With Your Pants Down" blaring in the background.
 
So guys... I've been going to http://forum.beyond3d.com a lot, because those guys actually talk the tech. They are familiar with what these cards do and the differences between the architectures each company uses. Since the release of nVidia's Fermi white papers (yes, there are many), these fine folks have been really getting into the nitty-gritty of what the chip is and isn't. In layman's terms, here is what is mostly different from AMD's design plan.

  • The GF100 (Fermi) chip uses a different kind of shader processor called a 'CUDA core'.
  • The 'CUDA core' shader unit is much like their previous generation's cores in that it is very programmable. This means that when people talk about specialized tessellators, they are talking about the exact opposite of a 'CUDA core', which can be programmed to emulate functions for which other GPU manufacturers would design specific hardware (see the rough sketch after this post for what 'programmable' means in practice).
  • The GF100 is the first GPU since the start of DX9 GPUs to have read and write access to its L1 cache. Read and write access gives the 'CUDA cores' an edge over specialized hardware that has to send some data off-chip into GPU memory before completing its task.
  • The amount of cache available to the 'CUDA cores' is significantly larger than on ATI's current generation of cards.
  • The number of program threads and how they are executed is handled by a new warp scheduler. The specifics of the scheduler are not exactly clear at this time, but it does not use true multi-threading. ATI does not use true threads either, but Intel's Larrabee will.
  • Because so much of the GPU computing can be done on the card, with proper drivers, CPU bottlenecking can be decreased (mid-range systems) or eliminated (high-end systems).
  • The existence of double-precision computation and ECC may affect performance for mainstream enthusiasts in a negative way.
  • There is speculation that the GF100 does not actually use any memory on the GPU when in a mode where it would use ECC, but instead uses ECC modules on the PC's mainboard, if installed.
That's what I've gathered, and I am still hoping for an '09 release.
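Since that post leans on "programmable" vs. "fixed function", here is a minimal, hypothetical CUDA sketch (not from the white papers; the kernel name and sizes are made up) of what running generic code on those shader ALUs looks like. The same cores that shade pixels simply execute whatever arithmetic you write, issued in warps of 32 threads:

[CODE]
// Minimal CUDA sketch: a 'CUDA core' just executes generic user code.
// Each thread evaluates a small polynomial on one array element; the hardware
// scheduler issues the threads to the cores in warps of 32.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void evalPoly(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = (2.0f * in[i] + 1.0f) * in[i];   // plain multiply-add math
}

int main()
{
    const int n = 1 << 20;
    float *in = 0, *out = 0;
    cudaMalloc((void **)&in,  n * sizeof(float));
    cudaMalloc((void **)&out, n * sizeof(float));
    cudaMemset(in, 0, n * sizeof(float));             // give the input defined values

    evalPoly<<<(n + 255) / 256, 256>>>(in, out, n);   // 256 threads = 8 warps per block
    cudaDeviceSynchronize();
    printf("kernel finished: %s\n", cudaGetErrorString(cudaGetLastError()));

    cudaFree(in);
    cudaFree(out);
    return 0;
}
[/CODE]

A fixed-function tessellator can't be repurposed like that; that flexibility-versus-dedicated-silicon trade-off is what the bullet list above is really about.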
 
...
  • Because so much of the GPU computing can be done on the card, with proper drivers, CPU bottlenecking can be decreased (mid-range systems) or eliminated (high-end systems).

This is what I was thinking also... If used in the same way, it would pretty much perform on par with the GTX 295, but when utilized correctly at the application's code level, it will pretty much outdo the 295.
 
This is what I was thinking also... If used in the same way, it would pretty much perform on par with the GTX 295, but when utilized correctly at the application's code level, it will pretty much outdo the 295.

Exactly, it seems like these cards will be as good as their drivers :toast:
 
So guys... I've been going to http://forum.beyond3d.com a lot, because those guys actually talk the tech. ... That's what I've gathered, and I am still hoping for an '09 release.

This needs to be added to post 1. Thanks binge for the investigation, work, and delivery.



I would like to know more about the 300 series, specifically details regarding the initial card line-up (hopefully Nov/Dec '09).
 
... The existence of double-precision computation and ECC may affect performance for mainstream enthusiasts in a negative way. ...

I would not expect the mere existence of double-precision units to affect performance in a negative way. Some games might actually make good use of double-precision buffers, given the flexibility of the units.
 
Fermi is Fusion in the reverse direction

More Fermi news, courtesy of Fudzilla once again.
 
AMD: Fermi a "Paper Dragon" & nails Nvidia with a PowerPoint slide

Rather than beating it in a PowerPoint slide, wouldn't it be better to do it with real-world trusted benchmarks when the competition is on sale?

Fudzilla


[attached image: AMD's "paper dragon" presentation slide]
 
That's why it's on Fudzilla and should be ignored. I foresee AMD releasing more slides stating more facts*

Speculative future AMD/ATi slide that doesn't exist yet said:
-Fermi will take your job
-Fermi is bad for the environment
-You will need to buy a separate PC case just for Fermi
-Fermi will give you swine flu
-Fermi hates you
-Fermi hates America
-Hitler chooses Fermi
-etc.
 
I foresee AMD releasing more slides stating more facts*

I foresee Fermi being way overpriced and way too late! I also foresee AMD releasing a 5870 X2 or 5890 (which will be an extremely high-powered card) around the same time as Fermi is being released.
 
That's why it's on Fudzilla and should be ignored. I foresee AMD releasing more slides stating more facts*

Fudzilla is actually quoting two other sites, so it's not coming from Fudzilla. It wouldn't surprise me if AMD is spreading FUD to help further their own sales... sorta like nvidia and Intel and everyone else do. :laugh:
 
Of course it's FUD :laugh:. Especially the ones referring to performance. If we made similar slides about RV770 and GT200, the latter would look much worse than Fermi does in this comparison, but we all know how things really are, don't we? I would have liked AMD to put their real FP efficiency on the slides and speculate about what Fermi's might be. :rolleyes:

TBH I don't know what they want those slides for. I'd think that partners know the differences between the architectures well enough to know that Nvidia is focused on operation efficiency, while AMD is focused on area efficiency at the cost of operation efficiency (curiously, both end up with around the same performance per die area; that's why I love GPUs so much). So partners (customers) and various key people in the industry know how useless those bold comparisons are. IMO comparing both like that is kinda lame on AMD's part, desperate if you ask me:

"I have 2.7 Tflops and they have only 1.5, wohoo! Let's just forget that the 622 GFlops* GTX280 is still faster than our 1.36 TFlops HD4890, not to mention the 708 Gflops* GTX285. That the 470 Gflops* GTS250 has a slight lead over our 1000 Gflop HD4850. Or that we need the 2 Tflop HD5850 in order to be slightly faster than the GTX285."

^^ By last generation's comparisons we can say that Nvidia's shader efficiency is roughly 2x that of AMD's. Peak figures never meant much anyway.


And don't make me swap the HD4850 and HD4890 for the HD5750 and HD5770 (respectively), which have exactly the same FLOPS but are significantly slower.
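For anyone who wants to check those peak figures, they all fall out of the same simple formula: peak GFLOPS = ALUs x shader clock (MHz) x 2 flops per clock (one MAD), divided by 1000. A quick host-side sketch using the public reference clocks (no GPU needed to run it):

[CODE]
// Sketch: reproduce the peak-FLOPS figures quoted above from public specs.
// peak GFLOPS = ALUs * shader clock (MHz) * flops per clock / 1000
#include <cstdio>

static double peak_gflops(int alus, double clock_mhz, double flops_per_clock)
{
    return alus * clock_mhz * flops_per_clock / 1000.0;
}

int main()
{
    printf("GTX280 : %4.0f GFLOPS\n", peak_gflops(240, 1296, 2));   // ~622
    printf("GTX285 : %4.0f GFLOPS\n", peak_gflops(240, 1476, 2));   // ~708
    printf("GTS250 : %4.0f GFLOPS\n", peak_gflops(128, 1836, 2));   // ~470
    printf("HD4850 : %4.0f GFLOPS\n", peak_gflops(800,  625, 2));   // ~1000
    printf("HD4890 : %4.0f GFLOPS\n", peak_gflops(800,  850, 2));   // ~1360
    printf("HD5850 : %4.0f GFLOPS\n", peak_gflops(1440, 725, 2));   // ~2090
    printf("HD5870 : %4.0f GFLOPS\n", peak_gflops(1600, 850, 2));   // ~2720
    return 0;
}
[/CODE]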

* By now we know that G80/G92/GT200 couldn't really dual-issue in any realistic scenario. Fermi lost the MUL, but that's not a big loss considering it wasn't used for the most part (definitely not in graphics). But AMD, based on speculation from the white papers, apparently didn't mind:

1- Comparing dual issue with single issue.
2- On the comparison with the previous generation they compare Fermi to the GTX285, while at the same time boosting their own number by comparing the HD5870 to the HD4870 instead of the HD4890, which has the same clock (the GTX285 does have the same clocks as the speculated Fermi clocks). Either the GTX280 or the HD4890 had to be used.

HD5870 vs HD4890 = 2x.
Fermi vs GTX285 = 2x single issue.
Fermi vs GTX280 = 1.6x dual issue, 2.4x single issue (not a comfortable number, AMD? The arithmetic is in the sketch after this list.).
3- And finally they show a slide with different ops-per-clock figures (INT MUL, INT ADD...), conveniently forgetting that Fermi's shaders run almost twice as fast.
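The ratios in point 2 work out the same way if you plug in the rumored Fermi configuration (512 SPs at GTX285-like shader clocks, which is speculation, not a confirmed spec) and count the old MUL as a third flop for the 'dual issue' numbers:

[CODE]
// Sketch: the 2x / 1.6x / 2.4x ratios above. Fermi figures are speculative
// (512 SPs at ~1476 MHz, FMA = 2 flops/clock); 'dual issue' adds the extra MUL.
#include <cstdio>

int main()
{
    double fermi   =  512 * 1476.0 * 2 / 1000.0;   // ~1511 GFLOPS (speculative)
    double gtx285  =  240 * 1476.0 * 2 / 1000.0;   // ~708 GFLOPS, single issue
    double gtx280s =  240 * 1296.0 * 2 / 1000.0;   // ~622 GFLOPS, single issue
    double gtx280d =  240 * 1296.0 * 3 / 1000.0;   // ~933 GFLOPS, dual issue
    double hd5870  = 1600 *  850.0 * 2 / 1000.0;   // ~2720 GFLOPS
    double hd4890  =  800 *  850.0 * 2 / 1000.0;   // ~1360 GFLOPS

    printf("HD5870 vs HD4890          : %.1fx\n", hd5870 / hd4890);    // 2.0x
    printf("Fermi  vs GTX285 (single) : %.1fx\n", fermi  / gtx285);    // ~2.1x
    printf("Fermi  vs GTX280 (dual)   : %.1fx\n", fermi  / gtx280d);   // ~1.6x
    printf("Fermi  vs GTX280 (single) : %.1fx\n", fermi  / gtx280s);   // ~2.4x
    return 0;
}
[/CODE]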
 
^^ I agree with all that... but there is a slight problem: where is this mythical unicorn we call GF100/GT300/whatever?
 
Never mind the people on EVGA's community; they are liars and spread BS.


Also, no idea where this other news is coming from, so consider it a rumor:

NVIDIA representatives prefer not to speak about the announcement timing of the first gaming video card with the Fermi architecture, but they acknowledge that in time this architecture will be extended to all price segments of the consumer market. According to independent sources, the lowest-priced Fermi product this year will cost $299, and more affordable graphics solutions of this generation will appear next year.

Sources assert that the more affordable Fermi cards will appear in the first quarter, closer to March. Here we are dealing with graphics solutions that cost less than $200.

Sources also report that demand for the Fermi video chip from NVIDIA's partners is quite high.
Source: http://xtreview.com/addcomment-id-10285-view-NVIDIA-fermi-card-expected-release-date.html
 
^^ I agree with all that... but there is a slight problem: where is this mythical unicorn we call GF100/GT300/whatever?

The official word is still late November for the release, with availability at the end of Q4. There's no sense doubting that for the time being, IMO. No info means they want to keep it secret as much as they can, so that AMD has to finalize the HD5890 (or whatever card) on their own and not based on it being faster than any of Nvidia's cards.

So IMO they will not show it until review time, or until AMD shows what the HD5890 will be, if the latter is going to be released this year. Nvidia still has to decide the final clocks, and probably wants the GTX380 to be faster than the supposed HD5890 while the GTX360 is faster than the HD5870, to repeat what happened in the last two generations. AMD, on the other hand, is probably waiting until Nvidia reveals Fermi's performance or final specs before they finish the HD5890's specs.

Right now Nvidia has a very big advantage, because if Fermi has the same OCing headroom as any previous Nvidia chip (15-20% OC), Nvidia can play a lot with where their cards sit on the stack. The same could have happened with GT200 if it had been released after the HD4xxx series. GT200's OC headroom was about 20% on stock cooling, while HD48xx was around 10%. If Nvidia had released GT200 after the RV770 launch, they could have decided to release the cards with 10% higher clocks, making the GTX260 10% faster than the HD4870 while retaining the same OC potential. The story would have been very different than it was. Remember that ATI hid the fact that RV770 had 800 SPs instead of the rumored 640 until the last minute, and that's what caught Nvidia off guard.

Now, HD58xx doesn't overclock particularly well (no better than previous ATI generations), so if Fermi retains Nvidia's track record of OC potential*, that's a very powerful weapon they have, if they need it.

* And that's a very long track record: since the GF6800 days there has always been around 20% of OC headroom, something that IMO is not coincidental and is designed so that their partners can release 10% OC cards that still have a bit of OCing potential left. That makes partners happy, because it's a marketing weapon they can use.
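As a back-of-the-envelope illustration of that headroom point (the 20% and 10% figures are just the rough historical numbers from this post, nothing official):

[CODE]
// Sketch: with ~20% total OC headroom, a 10% factory OC still leaves the
// buyer roughly 9% of headroom on top.
#include <cstdio>

int main()
{
    double total_headroom = 1.20;   // ~20% over stock, the historical figure above
    double factory_oc     = 1.10;   // a typical partner "OC edition" bump
    printf("headroom left after a 10%% factory OC: ~%.0f%%\n",
           (total_headroom / factory_oc - 1.0) * 100.0);   // ~9%
    return 0;
}
[/CODE]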

EDIT: I've made this chart for myself and decided to share it. It's all speculation, but I think it's very realistic.

[attached image: Comparison_table.jpg, a speculative spec/performance comparison chart]


My conclusions based on the chart:

1. Nvidia can relax their harvesting strategy from disabling 2 clusters (GTX360) to 3 (GTX350), greatly improving yields, and still be competitive with the HD5870 and HD5850 (GTX320).

2. That's assuming that Nvidia's Fermi architecture suffers the same efficiency hit as the HD5xxx cards. Based on past generations (G92 to GT200), Nvidia managed to "double" performance to the same degree that AMD did, but while AMD used 2.5x the number of SPs (and TMUs), Nvidia used only 1.87x the SPs and 1.5x the TMUs. The HD5870, with twice the units, only manages a 40-50% improvement over the equally clocked HD4890, so the efficiency is ~75%. Now we are talking about a 2.16x increase in SPs for Fermi; add to that a possibly higher efficiency of... let's say 80-90% (less than G92-to-GT200), and we would have a much, much faster architecture.

3. That's assuming 1.5 GHz SPs on all cards, which IMHO is a little bit conservative. Especially the smaller cards shouldn't have a problem reaching 1.6-1.8 GHz on 40 nm. I'd expect the same kind of improvement in clocks as AMD got when moving to 40 nm. If we compare the HD5870 to the HD4870 and assume that they have a 1000 MHz HD5890 up their sleeve, that's a 15% jump. The GTX285's shaders run at ~1500 MHz, and that could mean Nvidia could reach 1725 MHz easily (see the arithmetic sketch below).
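Putting points 2 and 3 into plain arithmetic (every input here is this post's speculation: the 2.16x SP count, the 80-90% scaling efficiency, the ~1.5 GHz baseline shader clock):

[CODE]
// Sketch: the scaling argument from points 2 and 3 above, as arithmetic.
// All inputs are speculation from this post, not confirmed specs.
#include <cstdio>

int main()
{
    // HD5870 vs an equally clocked HD4890: 2x the units, ~1.5x the speed
    double amd_efficiency = 1.5 / 2.0;
    printf("HD5xxx scaling efficiency: ~%.0f%%\n", amd_efficiency * 100.0);   // ~75%

    // Fermi vs GTX285 at the same shader clock, assuming 2.16x the SPs
    double sp_ratio = 2.16;
    for (int eff = 80; eff <= 90; eff += 5)
        printf("Fermi vs GTX285 at %d%% scaling: ~%.2fx\n",
               eff, sp_ratio * eff / 100.0);   // 1.73x, 1.84x, 1.94x

    // Clock headroom: ~1500 MHz GTX285 shaders plus a 15% 40nm-style bump
    printf("1500 MHz * 1.15 = %.0f MHz\n", 1500.0 * 1.15);   // 1725 MHz
    return 0;
}
[/CODE]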
 
Last edited:
Great discussion here guys. I'm very impressed with the research and updating you guys are providing. Thanks!

Is it too early to speculate on the 300 series lineup, i.e. 360/380/395? Any guesses on detailed specs?
 