Damn, I'm sure the rest of you think it's strange that ATI, with their newer-gen cards, always increases SPUs and performance along with them, yet with Fermi performance per shader has decreased drastically. The GTS 450 is most likely going to compete with the HD 5770, even though it has 240/256 shaders. I realize they're clocked a little lower, but the GTX 460 just keeps up with the GTX 280 and their clocks are somewhat similar, so that's 336 shaders vs 240, and for those 336 shaders to surpass previous-gen performance it had to be overclocked. Huge letdown for me :/
I've been thinking about that too, and my conclusion is that G80/G92/GT200 could actually do 3 FP ops per cycle (MADD+MUL, 2+1) as Nvidia always said, despite the fact that 99% of people always insisted they couldn't, at least not in games. Fermi only does FMADD, and as such it only does 2 ops/clock. To make things even regarding actual peak FP power, you can see a GTX280/285 as having 360 SPs, and then things truly start to make more sense.
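The arithmetic behind that "360 SPs" figure can be sketched like this. It's only a rough illustration: it assumes the 3 ops/clock (MADD+MUL) figure really holds for GT200, ignores clock speeds entirely, and the function name is made up for the example.

```python
# Peak FP ops per clock = shader count * ops each shader can issue per clock.
# Assumption from the post: GT200 does MADD + MUL (2 + 1 = 3), Fermi does FMADD only (2).
def peak_ops_per_clock(shaders, ops_per_shader):
    return shaders * ops_per_shader

gt200 = peak_ops_per_clock(240, 3)    # GTX280/285
gf104 = peak_ops_per_clock(336, 2)    # GTX460

# How many 2-op (FMADD-only) SPs would give the same peak as GT200's 240 3-op SPs?
fmadd_equivalent_sps = gt200 // 2

print(gt200, fmadd_equivalent_sps, gf104)  # 720 360 672
```

Seen that way, the GTX460's 336 SPs are up against the equivalent of 360, not 240, which is why it only just keeps up at similar clocks.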
In any case, comparing GF10x to GT200 on a pure SP count basis is not fair. It is as unfair as it was to compare the 320 SPs in the HD2900 to the 56 "SPs" (48 pixel shaders + 8 vertex shaders) in the X1900. It is a new architecture, a completely different way of doing things.
And that pertains not only to the architecture itself and its drivers, but also to how games are programmed and to the load balance of the content (textures, shaders, filters...). Game performance for cards such as the GTX285 and the HD5000 is what it is now thanks to the accumulated optimizations, programming efforts and content customization made by game developers since the introduction of G80 and R600. That's almost 5 years of optimizations. You are not going to see a similar level of optimization until games that have been programmed with Fermi in mind from the beginning start to pop up, and considering game development cycles nowadays, that means at the very least 1-2 years in the future.
EDIT: Regarding competition against the HD5770, I did some math. I think that GF106 is going to be literally half a GF100 (but with the problems fixed like in GF104) and GF108 is going to be half a GF104. That's what the PCB suggests IMO. With 256 SPs this card would perfectly match the HD5770 if clocked the same as the GTX460, because as can be seen in Wizzard's reviews, the HD5770's performance is about 75% of a GTX460's and, coincidentally, 256 is about 75% of 336. And that's if it's clocked at 675 MHz. IMO it's going to be clocked higher than that, for various reasons.
1- Historically the fastest SKU of a mainstream chip has always been clocked 25-50 MHz higher than the high-end chip.
2- Smaller chips reach higher clocks, and the manufacturing process has improved further.
3- GTX460 hints at much higher clocks. There's absolutely no technical reason* for clocking the GTX460 as low as 675 MHz, when every reviewer has been able to clock every sample they got past 800 MHz on stock voltage and past 900 MHz with slightly higher volts. Technically, it just doesn't make sense to leave 40% OC headroom on your chip.
*I'm 100% confident that the reason they clocked it low, and the reason the fully enabled SKU has not been released, is that it would destroy the GTX470 and maybe even the GTX480 (this last one not on pure performance, but in the same way the 512MB 8800GTS made the 8800GTX obsolete). They will just wait until GF100 inventories have been depleted. They did exactly the same with G92. This way they also obtain higher yields for the time being, although they are probably binning the best chips for a future release too. Win-win.
Well, there's absolutely no reason to do this with GF106, since Nvidia has no competing products of its own there, so I'm pretty sure it will be clocked in the 725-750 MHz range, and that will make the GTS450 some 10% faster than the HD5770.
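For anyone who wants to check the math, here's a quick sketch of the estimate. It assumes performance scales linearly with both SP count and core clock relative to the GTX460 (a big simplification that ignores memory bandwidth, ROPs and so on), and it takes the HD5770 at 75% of a GTX460 as stated above. The function and constant names are just for illustration.

```python
GTX460_SPS = 336
GTX460_CLOCK_MHZ = 675
HD5770_REL = 0.75  # HD5770 performance relative to a GTX460, per the reviews cited above

def relative_perf(sps, clock_mhz):
    """Performance relative to a GTX460, assuming linear scaling with SPs and clock."""
    return (sps / GTX460_SPS) * (clock_mhz / GTX460_CLOCK_MHZ)

# A hypothetical 256 SP part at the GTX460's clock and at the guessed 725-750 MHz range.
for clock in (675, 725, 750):
    rel = relative_perf(256, clock)
    print(f"{clock} MHz: {rel:.2f}x GTX460, {rel / HD5770_REL:.2f}x HD5770")
```

Under these assumptions a 256 SP part at 675 MHz roughly ties the HD5770, and at 725-750 MHz it comes out somewhere around 9% to 13% ahead.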