So I got my 4870 running at 850 and 975 to see how the 4890 would perform. It didn't improve the frames much in FC2 (the gain was about 2-3 fps), but it sure made the whole game smoother. There were few or no dips in fps.
Is this a fair representation of how the 4890 would perform?
It very well could be. While it would be a 13% increase in TMU/ROP throughput, it's very possible that they could make better use of the FLOP power ATi already has. The price tag combined with the name seems to be telling us not to expect TOO much.
Shaders really are not important at this juncture, as ATi can already match Nvidia's GTX285 in computational power with the 4850 (~1TF). That's great for GPGPU, but not so much for gaming when the shaders aren't matched with enough TMUs/ROPs. That leaves TMUs and ROPs as the likely limiting factor. When we look at the match-up between the 280/285 and the 4870, VERY often the GTX leads by about the same percentage it leads in ROP or TMU throughput, meaning that is the bottleneck in almost every case, rather than FLoating point OPerationS (with a couple of notable exceptions). Since this product, regardless of whether shader arrays are added, will probably have 16 ROPs, it will come down to a core speed increase. To further support this theory, the 4850 trails the 4870 in proportion to its core clock speed, not its memory bandwidth. The performance you're paying for is the 750+ MHz that the extra voltage fed to the core of those parts allows, more so than the extra bandwidth from the GDDR5. Thank your local TPU for those BIOS mods...
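For anyone who wants to check the ~1TF claim, here is a rough sketch of the peak-FLOPS arithmetic. The GTX285's 240 shaders and 1476 MHz shader clock, and the flops-per-clock figures (2 for an ATi MAD, 3 for Nvidia's MAD+MUL), are spec-sheet numbers brought in as assumptions rather than anything stated above, and peak_gflops is just a name made up for the example.

```python
def peak_gflops(shaders, clock_mhz, flops_per_clock):
    """Theoretical peak: shaders x clock x flops issued per shader per clock."""
    return shaders * clock_mhz * flops_per_clock / 1000.0

# Assumed spec-sheet numbers (not taken from this thread).
hd4850 = peak_gflops(800, 625, 2)    # 800 SPs at core clock, MAD = 2 flops
gtx285 = peak_gflops(240, 1476, 3)   # 240 SPs at shader clock, MAD+MUL = 3 flops

print(f"4850:   {hd4850:.0f} GFLOPS")
print(f"GTX285: {gtx285:.0f} GFLOPS")
print(f"GTX285 lead on paper: {(gtx285 / hd4850 - 1) * 100:.1f}%")
```

On paper the two land within a few percent of each other, which is the point: raw shader throughput is not where the gap between these cards comes from.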
For reference:
4850: 625 MHz @ ~1.1V
4870: 750 MHz @ ~1.2V
4890: 850 MHz @ ~1.3V
See how that works?
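To put numbers on those steps, a quick sketch of the percentage gain in core clock from one part to the next. The clocks are the ones listed above (the 4890 figure is still the expected one, not confirmed), and pct_gain is just a helper name invented for the example.

```python
# Stock core clocks in MHz, as listed above; the 4890 value is the expected one.
clocks = {"4850": 625, "4870": 750, "4890": 850}

def pct_gain(new_mhz, old_mhz):
    """Percent increase going from old_mhz to new_mhz."""
    return (new_mhz / old_mhz - 1.0) * 100.0

step_4870 = pct_gain(clocks["4870"], clocks["4850"])
step_4890 = pct_gain(clocks["4890"], clocks["4870"])

print(f"4850 -> 4870: +{step_4870:.1f}% core clock")
print(f"4870 -> 4890: +{step_4890:.1f}% core clock")
```

So the 4870-to-4890 step is noticeably smaller than the 4850-to-4870 step, for roughly the same bump in voltage.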
Personally, I think we're looking at overclocks in the ~900 MHz range. Anything more than that, and I believe AMD would've capitalized on it at stock to take on the 280, as those MHz really could be the difference.
When you break down the ROP comparison (ROPs x core MHz, in Gpixels/s) you see this:
4850: 16x625: 10.0
4870: 16x750: 12.0
4890: 16x850: 13.6
GTX260: 28x576: 16.1
GTX280: 32x602: 19.3
GTX285: 32x648: 20.7
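If you'd rather not do the multiplication by hand, here is a minimal sketch of the same ROP math, with each card's lead over the 4870 printed alongside. The ROP counts and clocks are the ones from the list; pixel_fill_gpix is a made-up helper name, and this is theoretical peak fill rate only, not measured performance.

```python
def pixel_fill_gpix(rops, core_mhz):
    """Peak pixel fill rate in Gpixels/s: ROPs x core clock."""
    return rops * core_mhz / 1000.0

# name: (ROPs, core MHz), as listed above
cards = {
    "4850": (16, 625), "4870": (16, 750), "4890": (16, 850),
    "GTX260": (28, 576), "GTX280": (32, 602), "GTX285": (32, 648),
}

fill = {name: pixel_fill_gpix(*spec) for name, spec in cards.items()}
base = fill["4870"]
for name, rate in fill.items():
    print(f"{name:7s} {rate:5.1f} Gpix/s  ({rate / base:.2f}x the 4870)")
```

Even at 850 MHz, a 16-ROP part stays well short of the GTX260 on this metric, which is why the core clock matters so much here.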
and TMUs (TMUs x core MHz, in Gtexels/s):
4850: 40x625: 25.0
4870: 40x750: 30.0
4890: 40x850: 34.0 (if the 4890 is still 800sp/40tmu; at 960sp/48tmu it would be 40.8 and sit right on top of the GTX260 216)
GTX260 (192): 64x576: 36.9
GTX260 (216): 72x576: 41.5
GTX280: 80x602: 48.2
GTX285: 80x648: 51.8
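Same idea for the texture side, as a rough sketch that also checks the "what if it's 960sp/48tmu" case against the GTX260 216. The 48-TMU configuration is purely hypothetical, and texel_rate_gtex is a name invented for the example.

```python
def texel_rate_gtex(tmus, core_mhz):
    """Peak texel rate in Gtexels/s: TMUs x core clock."""
    return tmus * core_mhz / 1000.0

hd4890_40tmu = texel_rate_gtex(40, 850)   # if it stays 800sp/40tmu
hd4890_48tmu = texel_rate_gtex(48, 850)   # hypothetical 960sp/48tmu part
gtx260_216 = texel_rate_gtex(72, 576)

gap_pct = (gtx260_216 / hd4890_48tmu - 1.0) * 100.0

print(f"4890 (40 TMU): {hd4890_40tmu:.1f} Gtex/s")
print(f"4890 (48 TMU): {hd4890_48tmu:.1f} Gtex/s")
print(f"GTX260 (216):  {gtx260_216:.1f} Gtex/s")
print(f"GTX260 (216) lead over a 48-TMU 4890: {gap_pct:.1f}%")
```

With 48 TMUs the gap to the GTX260 216 shrinks to under 2% on paper; with 40 it is still over 20%.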
Really, what it comes down to is that Nvidia requires more flops (more/higher clocked shaders) and ATi needs more/higher clocked TMUs and ROPs...but mostly ROPs.
I have this theory that everyone will arrive in the center with 3200sp: Nvidia with 32x5 multi-directional shaders per array (2 flops per shader) in 10 arrays on a 40nm 512-bit GT300, and ATi producing an Rv970 by its normal routine, 10 arrays of 320 (4+1)x2 flops, or 2(160x10) with R800, using its 256-bit bus. Of course then we'd be talking 160 TMUs, more ROPs, and for ATi, 28nm. You'd basically be seeing the 28nm die shrink of GT300 matching up with Rv970, the difference being ATi having 6400 flops per clock and Nvidia 3200, but Nvidia with twice the shader clock.
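Why those two would meet in the middle is easy to sketch: half the flops per clock at double the clock is the same throughput. The flops-per-clock figures are the ones from the theory above; the 800 MHz clock is a placeholder picked only to make the arithmetic concrete, not a prediction.

```python
def throughput_gflops(flops_per_clock, clock_mhz):
    """Peak throughput in GFLOPS: flops issued per clock x clock."""
    return flops_per_clock * clock_mhz / 1000.0

ATI_CLOCK_MHZ = 800  # placeholder clock, chosen only for illustration

ati = throughput_gflops(6400, ATI_CLOCK_MHZ)         # 6400 flops/clock at core clock
nvidia = throughput_gflops(3200, 2 * ATI_CLOCK_MHZ)  # 3200 flops/clock at twice the clock

print(f"ATi:    {ati:.0f} GFLOPS")
print(f"Nvidia: {nvidia:.0f} GFLOPS")
```

Whatever clock you plug in, the two come out equal as long as Nvidia's shader clock stays at twice ATi's.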