Not everything. Quoting
myself, nVidia did decrease some things in order to increase the overall chip efficiency.
One of them would be decreasing the memory bus from 512 to 384bit, which is compensated by using GDDR5.
OK, the memory bus decreased, but bandwidth increased. When you talk about specs, you have to take the practical ones: bandwidth is a practical spec and bus width is not. And what's more important, bandwidth is much higher in GT300 than in RV870, so it doesn't matter whether RV870 is bottlenecked, which I don't think it is.
GT200 -> 512bit/8 x 2500 = 160 GB/s
GT300 -> 384bit/8 x 4800 = 230 GB/s
RV770 -> 256bit/8 x 3600 = 115 GB/s
RV790 -> 256bit/8 x 3900 = 124 GB/s
RV870 -> 256bit/8 x 4800 = 153 GB/s
As you can see, even the GTX285 (GT200) had higher memory bandwidth than RV870, but that doesn't mean the GTX was bottlenecked, not at all.
Also, the jump from RV770 to RV870 is 153/115 = 1.33, or 33% more bandwidth. From RV790 to RV870 it is 153/124 = 1.23, or 23%.
Meanwhile the jump on Nvidia's side is 230/160 = 1.44, or a 44% improvement. Nothing points to GT300 being memory bottlenecked, far from it.
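For anyone who wants to check the arithmetic, here is a minimal Python sketch of the same calculation. It assumes the second number in each line above is the effective memory data rate in MT/s, so the result comes out in GB/s after dividing by 1000. The function and table names are mine, purely for illustration.

```python
# Peak memory bandwidth: (bus width in bits / 8) bytes per transfer
# times the effective data rate in MT/s, divided by 1000 to get GB/s.
# The (bus width, data rate) pairs are the ones quoted in the post.
CARDS = {
    "GT200": (512, 2500),
    "GT300": (384, 4800),
    "RV770": (256, 3600),
    "RV790": (256, 3900),
    "RV870": (256, 4800),
}

def bandwidth_gbs(bus_bits, rate_mts):
    """Theoretical peak bandwidth in GB/s."""
    return bus_bits / 8 * rate_mts / 1000

def gain_pct(old, new):
    """Percentage bandwidth increase going from chip `old` to chip `new`."""
    ratio = bandwidth_gbs(*CARDS[new]) / bandwidth_gbs(*CARDS[old])
    return (ratio - 1) * 100

if __name__ == "__main__":
    for name, spec in CARDS.items():
        print(f"{name}: {bandwidth_gbs(*spec):.1f} GB/s")
    for old, new in [("RV770", "RV870"), ("RV790", "RV870"), ("GT200", "GT300")]:
        print(f"{old} -> {new}: +{gain_pct(old, new):.0f}%")
```

These are theoretical peaks only; they say nothing about how much of that bandwidth each architecture actually needs.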
And another example would be the limit of simultaneous threads, which was
30720 for the GT200 and now is down to 24576 for the GF100.
Yeah, I knew about that decrease, but it's important to note that GT200 never ever got anywhere close to that number (and I mean 24576), while GT300 can. Peak <insert spec> means very little unless you are comparing the exact same architecture. It's like how RV770 had 1.2 TFlops and GT200 only had 622 GFlops, or 933 with dual issue (which almost never happens), yet GT200 is able to use them much better and is faster.
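The peak figures above come from a simple formula: units x flops per clock x shader clock. Here is a rough Python sketch. The unit counts and clocks are my assumptions, not stated in this post: 800 ALUs at 750 MHz for RV770 (HD4870) and 240 SPs at 1296 MHz for GT200 (GTX 280), with 2 flops per clock for a MAD and 3 for the MAD+MUL dual-issue case.

```python
def peak_gflops(units, flops_per_clock, clock_mhz):
    """Theoretical peak throughput in GFLOPS (units * flops/clock * MHz / 1000)."""
    return units * flops_per_clock * clock_mhz / 1000

# Assumed configurations (not given in the post itself).
rv770 = peak_gflops(800, 2, 750)        # MAD only
gt200_mad = peak_gflops(240, 2, 1296)   # MAD only
gt200_dual = peak_gflops(240, 3, 1296)  # MAD + MUL dual issue

if __name__ == "__main__":
    print(f"RV770: {rv770:.0f} GFLOPS")
    print(f"GT200 (MAD): {gt200_mad:.0f} GFLOPS")
    print(f"GT200 (MAD+MUL): {gt200_dual:.0f} GFLOPS")
```

The point stands either way: these numbers are upper bounds, and how close a chip gets to them depends on the architecture.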
Scheduling and threading are a lot less efficient with ATI's DX10/11 approach, yes. However, that's compensated by the sheer amount of ALUs.
And although this sounds like a less elegant solution, the performance/transistor ratio has proven to be a lot better on the red team.
Not true.
http://forums.techpowerup.com/showpost.php?p=1575651&postcount=188
We don't know the clocks. The limitations they ran into when trying to clock GT200 higher may not apply now. GT200 was 65nm and RV770 was 55nm. Now both are 40nm, so they can achieve similar clocks. They might not, but the possibility is higher than in the previous generation.
The truth is that, at least since the DX10 cards, the performance/transistor ratio has been roughly constant in almost every chip. If you add G92 and RV670 to the equations I made in the link, it becomes even more apparent: G92, with 756 million transistors, more than competes with the low end of RV770, and the 667 million transistor RV670 does the same with G92.
The memory bandwidth bottleneck in the HD5870 is easy to prove. Take the comparison results with the HD4870X2, for example:
Even if we imagine a 100% scalability with the two RV770s, the HD5870 is higher clocked, with the same functional units as the HD4870X2 (ROPs, TMUs, shader processors, etc).
However, the HD4870X2 beats the HD5870 in many situations.
Specs-wise, the HD4870X2 only has higher theoretical bandwidth, so that's the only possible cause.
The X2 also has two schedulers, one per chip, so that can be the problem as I said.
The only way a memory bottleneck can be proven in RV870 is to take the HD5870 and downclock the chip while leaving the memory as is. If you downclock and performance is maintained, then the card was memory bottlenecked; if it performs worse, then it's not.
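To make that test concrete, here is a small Python sketch of how the result could be read. Everything in it is hypothetical: the function name, the clocks and the frame rates are made-up illustrations, not measurements. It just compares the relative performance drop with the relative core clock drop.

```python
def core_sensitivity(fps_stock, fps_down, core_stock_mhz, core_down_mhz):
    """Fraction of the core clock drop that shows up as a performance drop.

    Near 0.0: performance held despite the lower core clock, so the core was
              not the limit (consistent with a memory bottleneck).
    Near 1.0: performance fell in step with the core clock, so the card is
              core-bound and memory is not the bottleneck.
    """
    clock_drop = 1 - core_down_mhz / core_stock_mhz
    fps_drop = 1 - fps_down / fps_stock
    return fps_drop / clock_drop

if __name__ == "__main__":
    # Hypothetical numbers: core 850 -> 765 MHz (10% lower), memory untouched.
    print(core_sensitivity(60.0, 59.4, 850, 765))  # fps barely moves
    print(core_sensitivity(60.0, 54.0, 850, 765))  # fps drops with the clock
```

In practice you would want several games and resolutions, since the balance between core and memory load changes from one workload to the next.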
Anyway, specs don't tell the whole story. My latest assumptions are not based on specs (only); they are based on all the aspects of the chip covered in the white paper and in architecture previews like the one at Real World Technologies. GT300 has improved in almost every practical aspect, while RV870 is basically the same chip with twice the units and DX11 support. Just because the latter has not scaled well doesn't mean the former will not scale well. For instance, Nvidia has scaled much better in the past: they went from 128 SPs to 240 SPs, 1.875x the amount, while AMD went from 320 SPs to 800 SPs, 2.5x. Neither of them reached that amount of improvement, but Nvidia got much closer. Looking at it that way, it's no surprise that RV870 didn't scale that well; RV770 didn't either, after all. And now Nvidia is doing a roughly 2.13x increase (240 to 512 SPs), so chances are good that they do much better.
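The unit-count ratios in that paragraph are easy to reproduce. A quick Python sketch, assuming 512 SPs for GT300 and 1600 for RV870 ("twice the units" of RV770's 800); under that assumption the Nvidia step works out to about 2.13x. The labels are just mine.

```python
# (old unit count, new unit count) for each generational step.
STEPS = {
    "Nvidia, previous step": (128, 240),
    "AMD, previous step": (320, 800),
    "AMD, RV770 -> RV870": (800, 1600),   # "twice the units"
    "Nvidia, GT200 -> GT300": (240, 512), # assumption: 512 SPs
}

def scale_factor(old, new):
    """How many times more shader units the new chip has."""
    return new / old

if __name__ == "__main__":
    for label, (old, new) in STEPS.items():
        print(f"{label}: {scale_factor(old, new):.3f}x")
```

This only gives the theoretical ceiling for each step; how much of it turns into real performance is exactly what's being argued about.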