# Article: Just How Important is GPU Memory Bandwidth?



## RCoon (Jan 20, 2015)

*****
Holy crap I am tired and this is all probably totally wrong
You can see all my original data here:
https://www.dropbox.com/sh/v3vqnglktagj8tr/AADvMQeqR-nxETkn4PwJKlZBa?dl=0
*****

*Introduction*

The main reason for writing this article was the recent outcry over the GTX 960’s 128-bit-wide memory interface. The GPU offers 112GB/s of memory bandwidth, and many believe that this narrow interface will not provide enough memory bandwidth for games. The card is aimed primarily at the midrange crowd, wanting to run modern titles (both AAA and independent) at a native resolution of 1080p.

Memory bandwidth usage is actually incredibly difficult to measure, but measuring it is the only way to establish once and for all what the real 1080p memory bandwidth requirement is. Using GPU-Z, what we have available to us is “Memory Controller Load”. This is a percentage figure that does not accurately measure the total bandwidth (GB/s) being used. The easiest way to explain it is that it acts like the CPU utilisation percentage Task Manager shows. Another example would be GPU Load, where various types of load can produce the same percentage figure but very different power usage readings, leading us to conclude that one 97% load can be much more intensive than another. Something else that only NVidia cards allow is measurement of PCIe Bus usage; AMD has yet to expose such a measurement. Thanks to @W1zzard for throwing me a test build of GPU-Z so I could run some Bus usage benchmarks. I had a fair few expectations for the figures, but the results I got were a little less than I expected.

Something I need to make clear before you read on: my memory bandwidth usage figures (GB/s) *are not 100% accurate*. They have been estimated and extrapolated using performance percentages from the benchmark figures I’ve got; as such, most of this article relies largely on those estimations. *Only a fool would consider it fact.* NVidia has said itself that Bus usage is wholly inaccurate, and most of us are aware that Memory Controller Load (%) cannot represent the exact bandwidth usage (GB/s) with total precision. All loads are different.
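For anyone who wants to follow the arithmetic, here is a minimal sketch of that extrapolation in Python. The linear mapping from Memory Controller Load (%) to GB/s is exactly the assumption flagged above, not an established fact:

```python
# Estimated bandwidth = theoretical peak scaled by Memory Controller Load.
# The linear mapping here is an assumption; GPU-Z/NVidia make no such guarantee.

PEAK_BANDWIDTH_GBPS = 224.0  # GTX 970 theoretical peak (GB/s)

def estimate_bandwidth(mcl_percent, peak_gbps=PEAK_BANDWIDTH_GBPS):
    """Rough GB/s estimate from a Memory Controller Load reading (%)."""
    return peak_gbps * (mcl_percent / 100.0)

# A 50% Memory Controller Load reading on the 970 extrapolates to 112 GB/s,
# coincidentally a GTX 960's entire theoretical bandwidth.
```

This is the “100% = 224GB/s rule” used throughout the charts below.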

*All of the following benchmarks were run 4 times for each game on each resolution for accuracy. Every preset is set to High where Very High is unavailable. The only graphical alteration to my video settings was turning off VSync and Motion Blur.*

*Choices of Games*

I’ve chosen to run with 4 games which I felt represented a fair array of game types. For CPU orientated, I’ve run with Insurgency. This is Source engine based, highly CPU intensive, and should cover most games running that sort of requirement. It has a reasonable VRAM requirement, but is overall quite light on general GPU usage, so it should stress the memory somewhat.

To represent the independent games, while also holding a high VRAM requirement, I’ve run with Starpoint Gemini II. This game has massive VRAM requirements, and is quite a GPU heavy game.

I’ve chosen two other games for the AAA area, one very generalised game, and one that boasted massive 4GB VRAM requirements for general high res play. Far Cry 4 felt like a good representative for the AAA genre that has balance in both general performance of the CPU, GPU, and moderate VRAM requirements. Middle Earth: Shadow of Mordor was my choice for the AAA genre to slaughter my VRAM and hopefully put my GPU memory controller and VRAM to the test.
*******​
*1440p* – Overall Correlations

I’ve started off with benchmarks running on 1440p to clearly identify what kind of GPU power is required for this resolution. I understand that the 112GB/s bandwidth we’re aiming for is designed to cope with 1080p, but hopefully you’ll see just what you need.

First off, we’ll take a look at all four games, and the performance of the GPU Core(%), Memory Controller Load(%), and VRAM Usage(MB). (*The following data has been sorted by “Largest to Smallest” PCIe Bus Usage).*

What I expected to see was the Memory Controller Load to be in direct correlation with VRAM usage. What we can clearly see here is that Memory Controller Load is in absolute correlation with the GPU Load. VRAM usage seems to make little difference to the way either performs except in edge cases.

Next up, we’ll look directly at the correlation between PCIe-Bus Usage(%) and VRAM usage(MB).

Besides the Insurgency graph, it appears that there is no direct correlation between the PCIe Bus and VRAM. I had to run these benchmarks multiple times, as I was a little confused that the PCIe Bus usage was always so low, or in some cases, idle.

Next let’s look at the overall correlation between Memory Controller Load (%) and the PCIe Bus usage (%)

You can see there’s virtually no change in PCIe Bus usage overall. When the Memory Controller Load peaks, the PCIe Bus data shows no reaction to the change.

Finally let’s take a look at the individual Memory Bandwidth Usage (GB/s) figures overall. *Note, these figures are not 100% accurate, and follow the 100% = 224GB/s rule.*

We can see in most cases the Memory Bandwidth usage (GB/s) is actually extremely erratic over the period. Shadow of Mordor showed the only real case where the usage was relatively persistent throughout the benchmark. You’ll also probably notice that it hits a rather high figure at peak load.

Let’s look at what these figures equate to overall. For this I’ve used the 95th percentile rule to remove freak results from both the low and high end of the scale. *Note, these figures indicate bandwidth with Maxwell compression methods (~30%) in mind*.
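As a rough illustration of that trimming, here is my guess at how such a percentile rule might be implemented (the exact low/high cutoffs in this sketch are an assumption):

```python
def trim_outliers(samples, low_pct=5, high_pct=95):
    """Discard freak readings outside the low/high percentile bounds.
    The 5th/95th cutoffs are an assumed reading of the
    '95th percentile rule' described in the text."""
    s = sorted(samples)
    lo = s[int(len(s) * low_pct / 100)]
    hi = s[min(int(len(s) * high_pct / 100), len(s) - 1)]
    return [x for x in s if lo <= x <= hi]
```

Averages and peaks are then taken from the trimmed list rather than the raw log.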

We’ll see most of these figures are relatively high, though none ever manage to reach the limit of my 970’s available 224GB/s of bandwidth. The only exception is Starpoint Gemini II, which, despite eating VRAM when available, didn’t appear to put much load on the Memory Controller. If we took the Memory Controller Load figure as a good representation of actual bandwidth usage, the 970 is never really in danger of being overwhelmed. We can clearly see, however, that the peak figures would be too much for a 960’s 112GB/s of available bandwidth. If we went by the average figures instead, the 960 could cope with a couple of the games, but it would still choke on the big titles during average gameplay. We can’t discount the peak figures though, so you’d certainly see issues at 1440p.

For the sake of estimation and sheer curiosity, here is what the estimated Memory Bandwidth Usage would be without the compression, *assuming Maxwell is exactly 30% efficient at compression*.

The 970 would still cope, except in peak cases during Shadow of Mordor, where the required bandwidth exceeds the available 224GB/s. *Obviously all these figures are mere estimates, so actual cases may vary in real-world examples.*
*******​
*1080p* – Overall Correlations

These are the main benchmarks we’ll be looking at for our 112GB/s bandwidth limit on the 960. The card is aimed at this resolution, so hopefully we’ll see some post-Maxwell compression figures dropping us in that area.

Let’s take a look at the overall figures for this, and look for similarities with the 1440p correlations (or lack thereof). The previous charts showed Memory Controller Load linked with GPU Load and not VRAM Usage.

This surprised me a little. If you look relatively closely at the peaks and drops, all three measurements appear to correlate rather well at this resolution. The VRAM drops actually appear to coincide with the drops in Memory Controller Load as well as GPU Usage. Certainly an interesting turn of events.

Next let’s take a look at the PCIe Bus usage and VRAM. There were no direct correlations in the 1440p benchmarks.

This time things look a little more interesting, but unexplained. Far Cry 4 shows no real correlation at all. The rest of the games, however, seem to show a drop in PCIe Bus usage every time there’s a drop in VRAM usage, after which the VRAM usage steadily rises before dropping again.

Next up is the Bus and Memory Controller figures.

This time again, no real correlation. A similar result to the 1440p benchmark. No unexpected surprises there.

Here are the figures you’re more interested in, however. Let’s take a look at the overall Memory Controller Usage over the benchmarks. This should show us approximately (*again, inaccurately*) how much bandwidth 1080p seems to scream for.

This time Shadow of Mordor follows suit and starts to become a little more erratic along with the rest. We can see some interesting peaks in usage, as well as a general idea of what the average is overall. The plateau at the beginning of Far Cry 4 is particularly interesting.

Next, here are those overall figures in a more pleasant representation, where we can see exactly what the numbers are. Again, the 95th percentile rule has been used to remove the serious spikes, and *these results are not 100% accurate*.

Shadow of Mordor slaughters all, even in the average benchmark. Far Cry 4 scrapes the barrel in the average figures, but again, the peak proves to be above the 112GB/s mark. The Source engine game as well as SPG2, however, prove to be completely viable.

Here’s what the results would look like without the *estimated ~30% Maxwell compression.*

Shadow of Mordor peaks within percentile points of the available bandwidth on a 770 (224GB/s), but all other games remain below the 200GB/s mark.

*Conclusion*

Something you have to bear in mind when looking at these figures (besides the fact that they are most certainly not 100% accurate) is that it’s plausible memory bandwidth behaves like VRAM. There are many occasions where people see VRAM usage in an average game hit a certain mark, let’s say 1800MB on a 2GB card, while other people running the same settings on a 4GB card see usage above and beyond 2GB, almost as though the game is using the available VRAM simply because it can. Is it possible that games utilise memory bandwidth in a similar fashion? Possibly, but we don’t really know. The same benchmark, run on a 770, which shares identical bandwidth with the 970 (224GB/s), might produce higher results due to the lack of compression, yet prove the 30% assumption too generous. Maybe the card wouldn’t “stretch its legs” and would be more conservative with bandwidth usage if it had less available. It’d be an interesting benchmark to see.

If we treated these bandwidth figures as a reference (*which you most certainly should not*), we could then assume that the GTX 960’s 128bit wide memory interface simply does not provide enough bandwidth to play AAA titles at Very High (or High where not available) and Ultra Presets on 1080p. If we went by average figures, it would get by OK, but struggle at peak loads. In terms of Independent titles, along with Source engine games, it’d do just fine. It may be the case that at 1080p turning off a little eye candy would put the game within the 112GB/s limit and remove that bottleneck in AAA titles.

The main issue is that more and more AAA titles may follow the example of games like Shadow of Mordor and require more and more VRAM and eat up more bandwidth. If things plateau at that sort of figure, perhaps the 112GB/s would cope. In the event AAA titles became more advanced in their fidelity, the 960 might find itself quickly outpaced by rivals offering a more sensible bandwidth ceiling.

*Finally, I’ll leave you again with the same bold statement, that the (GB/s) figures in these benchmarks are merely estimates of a largely inaccurate form of extrapolating memory bandwidth usage figures. By no means should you base a purchase on these, as the percentage representation of memory bandwidth is open to extremely broad interpretation.*

*If anyone would be so kind as to run a benchmark of these games on a 770 and send the log over to me, I can more accurately show bandwidth usage BEFORE Maxwell compression. I’d also be delighted to see user’s benchmarks on GTX 960’s to prove these estimates horribly wrong.*


----------



## FordGT90Concept (Jan 20, 2015)

Score one for HBM?  Maybe that's why AMD is biding its time waiting for HBM to become marketable.


----------



## Mussels (Jan 20, 2015)

this needs to be a proper front page article.


in my personal experience, too low a memory bus can cripple a card for sure - i've been hit with models in the past that had double the ram but half the bandwidth and their performance was miserable.


----------



## rtwjunkie (Jan 20, 2015)

Wow, just wow!  You outdid yourself.  That's quite a lot of work, and some interesting results.

It is just estimates, which you reiterate numerous times, of the 960's abilities.  I tend to think NVIDIA's engineers knew what they were doing when they implemented a 128-bit bus, and thusly, that it probably will perform a bit better than your estimates at 1080p.

Probably it will only be able to do mostly High settings though, with Ultra out of the question, and Very High in a lot of games only with AA and tessellation turned down.  For the vast majority of average gamers out there who just buy a mid-range card every year, I bet it will be good enough.


----------



## Mussels (Jan 20, 2015)

a couple of the images seem confusing - text is the same, but different results.


----------



## RCoon (Jan 20, 2015)

FordGT90Concept said:


> Score one for HBM?  Maybe that's why AMD is biding its time waiting for HBM to become marketable.


If my results are correct (which they aren't), I think NVidia has put too much hope in Maxwell compression. There are occasions when Maxwell compression goes beyond 30%, but equally, there are occasions when it is less than 30%.


Mussels said:


> this needs to be a proper front page article.


Not my call, and it's not 100% accurate information, merely educated extrapolation. W1zzard could have done this himself quite easily, but he'd have to give up a few days to get it done (probably far prettier than I have done too)


rtwjunkie said:


> It is just estimates, which you reiterate numerous times, of the 960's abilities.  I tend to think NVIDIA's engineers knew what they were doing when they implemented a 128-bit bus, and thusly, that it probably will perform a bit better than your estimates at 1080p.


Yeah, I wanted to reiterate that, because they are not accurate. NVidia said the Bus monitoring was not accurate, and W1zzard explained how memory controller load was by proxy memory bandwidth usage, but not a 1:1 representation.


Mussels said:


> a couple of the images seem confusing - text is the same, but different results.


Those are the results for each game in order. I forgot to Title each graph to each game.
All graphs in order are
Far Cry 4
Insurgency
Shadow of Mordor
SPG2

Let me eat and I'll reup the images with titles in each case I've missed a game title.


----------



## Mussels (Jan 20, 2015)

yeah all it needs is the titles to make sense.


----------



## Ed_1 (Jan 20, 2015)

FordGT90Concept said:


> Score one for HBM?  Maybe that's why AMD is bidding its time waiting for HBM to become marketable.


Huh? This shows the opposite of what I would expect.
While all manufacturers are going to move to 3D RAM eventually, it seems that for midrange right now you don't need gobs of bandwidth yet.
At least on current NVidia cards, as they don't go above a 384-bit bus (GM2xx).
That said, I was thinking a 192-bit bus for the 960 would have been better; maybe the 960 Ti will be that.


----------



## RCoon (Jan 20, 2015)

Mussels said:


> yeah all it needs is the titles to make sense.



Fix'd


----------



## xfia (Jan 20, 2015)


amd powers the game consoles and proves the worth of a years-old architecture scaling from entry level to high end, while nvidia can't boast any real performance improvement on a brand new architecture outside of efficiency


----------



## Tatty_One (Jan 20, 2015)

Mussels said:


> this needs to be a proper front page article.



Agreed. It also deserves a sticky, which I have done.


----------



## GhostRyder (Jan 20, 2015)

Very nice article!!!  It's good to have figures like this to at least help clear up a lot of the theoreticals and "ifs" surrounding memory bandwidth, memory usage, etc.  Though I am a bit shocked by some of the results, as I did not expect it to be so demanding at 1080p; I guess it's safe to say this is thanks to new games and the ever-increasing push for higher graphics and fidelity.

Nice article!


----------



## rruff (Jan 20, 2015)

Wow, thanks for doing that! Lots of work!

I have a question about the protocol.... this is a 970, correct? And you are measuring memory controller load with the 970 running full tilt at 1440p and 1080p? 

The first thing that occurs to me is that the 960 will run at slower framerates than the 970, and not because the bus is limiting it... all the specs are reduced. What you've shown is that the *970 would be memory bus limited if it was cut in half*, but since the 960 will be running slower fps anyway, it might not have this issue. As a rough guess we could scale it by shaders and say we'd expect the 960 to run ~1024/1664 or 62% of the 970. I'd expect the memory bandwidth requirement to scale similarly.
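A back-of-envelope version of that scaling, using the shader counts quoted above (purely an assumption that bandwidth demand scales linearly with shader count):

```python
# Hypothetical linear scaling of bandwidth demand by shader count.
GTX_970_SHADERS = 1664
GTX_960_SHADERS = 1024

def scale_to_960(bandwidth_used_on_970_gbps):
    """Guess the 960's bandwidth demand from a 970 measurement
    (assumes demand scales 1:1 with shader count)."""
    return bandwidth_used_on_970_gbps * GTX_960_SHADERS / GTX_970_SHADERS

# e.g. if the 970 used 100 GB/s in a scene, this guesses ~62 GB/s for the 960.
```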


----------



## RCoon (Jan 20, 2015)

rruff said:


> Wow, thanks for doing that! Lots of work!
> 
> I have a question about the protocol.... this is a 970, correct? And you are measuring memory controller load with the 970 running full tilt at 1440p and 1080p?
> 
> The first thing that occurs to me is that the 960 will run at slower framerates than the 970, and not because the bus is limiting it... all the specs are reduced. What you've shown is that the *970 would be memory bus limited if it was cut in half*, but since the 960 will be running slower fps anyway, it might not have this issue. As a rough guess we could scale it by shaders and say we'd expect the 960 to run ~1024/1664 or 62% of the 970. I'd expect the memory bandwidth requirement to scale similarly.



You are wholly correct. It's all done on a 970, and judging by the fact I discovered that memory controller load is directly correlated with GPU load, we can assume that the lower the maximum GPU load, the lower the memory bandwidth will be. That's a wild guess on my part, and in reality could be hugely wrong.
It's one of the many reasons I wanted to test a 770, as it shares a 970's 224GB/s bandwidth, but obviously has less horsepower for a backbone. It would not only show the true difference between Maxwell compression, but also the effect a lower powered GPU load has on bandwidth.


----------



## newconroer (Jan 20, 2015)

Lovely write-up, though I didn't find the conclusion very... conclusive, other than that too little bandwidth = problematic for performance.
Was that ever in question?

What I find more difficult to grasp is how important GPU memory speed is. Often I find little real-world gain from even significant overclocks except in acute situations.


----------



## EarthDog (Jan 20, 2015)

This was a lot of work, I am sure. Thanks for bringing it up.

It's nice to see something, and I use this term loosely as you essentially do, 'concrete' on the issue. Like newconroer, though, I find this 'proves' what people already knew (but could never put their finger on). I just wish we had concrete numbers to base the data off. It's a logical leap, and lord knows whether, without actual factual data to start with, it extrapolates out to fact.

People just need to know that, regardless of the bandwidth, what the FPS says is what you will get. Another way to put it: I have the same four cars with different motors and they all run 12s in the 1/4 mile... one does it N/A, one boosted with a snail, another a screw, and the other a rotary. It doesn't matter how it gets there, just that it does.


----------



## xfia (Jan 21, 2015)

so how does the compression work anyway?  is it hardware limited to 30 percent or could they improve it with drivers?


----------



## rruff (Jan 21, 2015)

RCoon said:


> I discovered that memory controller load is directly correlated with GPU load



That's a key finding right there. In that case you seem to have shown that the 960's 128bit bus will be fine at 1080p, and nearly always at 1440p. That doesn't mean it is a great card or anything, just that the processor, rather than the 128bit bus, will be what slows it down.

The big question I have, is can you say the same for the 2GB of vram? Would that scale with GPU load as well? And is there any way to tell how much vram is really needed (vs allocated) without testing identical cards with different amounts of vram? 

You'll want to see this. Says the 960 sucks because of its 128bit bus, and at 4k it gets creamed by an R9 280. http://wccftech.com/nvidia-geforce-gtx-960-radeon-r9-280-4k-benchmarks/

Their conclusion that it will also suffer at 1080p doesn't make sense to me.


----------



## xorbe (Jan 21, 2015)

RCoon said:


> What I expected to see was the Memory Controller Load to be in direct correlation with VRAM usage.



I would expect MCL to be in direct correlation with cache eviction rate regardless of vram usage.

Also, why is it surprising that MCL increases with GPU load for typical usage?


----------



## xfia (Jan 21, 2015)

that is a ridiculous article. nothing at all is valid about it. no test setup listed. no multiple graphs at different settings and resolutions. not to mention 1 gpu is not enough for 4k and neither is 4gb depending on the game..


----------



## rruff (Jan 21, 2015)

xfia said:


> that is a ridiculous article.



Yep, shamefully weak. Even if the data is 100% real, conjuring an unrealistic situation where the 960 would suck just so you can knock it is... well, not very objective. 

How many people will be 4k gaming with a 960 or R9 280? Who cares which one *sucks a little less *at that res? The proof will be what happens at 1080p.


----------



## HumanSmoke (Jan 21, 2015)

xfia said:


> so how does the compression work anyway?  is it hardware limited to 30 percent or could they improve it with drivers?


Not all data is compressible by the same ratio, or at all in some cases. You can find out more info from the Maxwell white paper (PDF pages 10-11)
The salient points are:

@RCoon
Thanks for the time and effort. Having done a few articles myself, I can appreciate how a concept quickly morphs into leviathan proportions that you possibly didn't originally imagine.

EDIT:


rruff said:


> How many people will be 4k gaming with a 960 or R9 280?


Hey, you haven't lived (and nor will you) until you've played a FPS at 4K with a mainstream card.


----------



## vega22 (Jan 21, 2015)

xfia said:


> that is a ridiculous article. nothing at all is valid about it. no test setup listed. no multiple graphs at different settings and resolutions. not to mention 1 gpu is not enough for 4k and neither is 4gb depending on the game..



did you not read the article linked to?



> *Here is the test setup used:*
> 
> 
> Intel Core i7-3960X
> ...



great work rcoon!

if you ever get bored, or have a few nights of insomnia, i would love to know what kind of figures something like catzilla at high res uses, as i think it might be more in line with lotr than source.

but as others have said, the mc will work in tandem with the core more than with vram usage, as the vram is only really filled or emptied; past that, all the mc does is serve data from the vram to the core as it needs it (read: when it's under load).


----------



## xfia (Jan 21, 2015)

yes i read all of it and it barely even passes as a test setup list (to be honest i was just so baffled by the whole article when i typed that)

look at the chart itself..  a 960 is set as the 100 percent performance baseline at 4k

if anything they proved that you will certainly hit a vram wall with only 2gb at 4k, which is a no-brainer, so they should have put the 960 against the 285

im not really a fan of shrinking bus widths like this so far, but to just try and bash it in this way is silly

@HumanSmoke  thanks for sharing


----------



## Xzibit (Jan 21, 2015)

Nevermind I think I'm disoriented watching the SOTUA


----------



## rruff (Jan 21, 2015)

HumanSmoke said:


> Hey, you haven't lived (and nor will you) until you've played a FPS at 4K with a mainstream card.



NO AA?! I need AA... I'll take the fps hit...


----------



## RCoon (Jan 21, 2015)

EDIT: Adjusted first group of graphs. SPG2's overall data graph was mistakenly replaced with the PCIe Bus/Memory usage graph.



rruff said:


> The big question I have, is can you say the same for the 2GB of vram? Would that scale with GPU load as well?



Judging by the few games I've tested, modern AAA titles would cripple the 2GB of VRAM in a vacuum. As we know, VRAM tends to operate differently when the quantities are altered. One game will use 2.5GB of VRAM on a 4GB card, while the same game on the same settings will use 1.8GB on a 2GB card. Sometimes cards let games stretch their legs when there's excess VRAM available. That's not the case in every single game, but it often happens. I don't know whether the game holds itself back on low-end cards or stretches on high-end cards, or both.

If we took my results of VRAM usage (which ARE accurate), then the 2GB would certainly hold the card back on Very High (or High)/Ultra settings on 1080p. You'd have to start toning down the aliasing and textures.



rruff said:


> And is there any way to tell how much vram is really needed (vs allocated) without *testing identical cards with different amounts of vram*?



You've answered your own question. You'd have to keep an eye on FPS drops, and on system RAM usage too, to see exactly where it spills over and starts affecting performance.


----------



## HammerON (Jan 21, 2015)

Thanks @RCoon for taking the time to do this
I find your results very interesting to say the least.


----------



## The N (Jan 21, 2015)

The way the article was put together, @RCoon definitely put grand effort into it. I do appreciate your work; it's really informative for us.

Memory bandwidth does have a role in overall gaming. From your testing process and results I can say that less memory bandwidth does create a bottleneck even at 1080p, not by a huge amount but to some extent, specifically for the 960 at 112GB/s, which is already a little low. And when you apply higher AA, it does impact overall performance while gaming. MCL correlates with GPU load, so even if the correlation is not 100% but, say, 60%, it will still make a difference; not huge, but it will be there.

The 960 with 112GB/s and 2GB of VRAM should perform equal to or better than a 7950/R9 280, plus or minus a difference, but both elements would seem to create some bottleneck for Ultra or Very High gaming at 1080p. The VRAM especially is low, as these days games consume more than 2GB of VRAM; at least 3GB is good to go.


----------



## Mussels (Jan 21, 2015)

EarthDog said:


> This was a lot of work, I am sure. Thanks for bringing it up.
> 
> It's nice to see something, and I use this term loosely as you essentially do, 'concrete' on the issue. Like newconroer, though, I find this 'proves' what people already knew (but could never put their finger on). I just wish we had concrete numbers to base the data off. It's a logical leap, and lord knows whether, without actual factual data to start with, it extrapolates out to fact.
> 
> People just need to know that, regardless of the bandwidth, what the FPS says is what you will get. Another way to put it: I have the same four cars with different motors and they all run 12s in the 1/4 mile... one does it N/A, one boosted with a snail, another a screw, and the other a rotary. It doesn't matter how it gets there, just that it does.




i think thats due to hardware design limitations.

The ram bus always has to come in preset widths, such as 128 bit, 256, 384, 512 etc.

*Purely theoretical with made up numbers*
What if they design a card that works awesome with 256 bit and say 2GHz ram - but the 2GHz ram has supply issues, so they move to 384 bit 1.5GHz ram - suddenly they have more ram bandwidth than the card needs, but it's the only financially viable redesign option at that point. The ram could OC really well, but provide no gains at all. 
This kinda thing could also happen in reverse, when they halve the memory bandwidth on a mid-range or entry level card, but due to budget they can't use fast enough ram, and suddenly a ram OC gives large benefits to that model.


----------



## The N (Jan 21, 2015)

Mussels said:


> This kinda thing could also happen in reverse, when they halve the memory bandwidth on a mid-range or entrey level card, but due to budget they cant use fast enough ram, and suddenly ram OC gives large benefits to that model.



Such as Kepler's 660 Ti: a 192-bit bus, yet immense overclocking potential on the RAM side. Midrange cards do have great overclocking potential that lets you get performance near or equal to the next card up in the series.


----------



## rruff (Jan 21, 2015)

RCoon said:


> You are wholly correct. It's all done on a 970, and judging by the fact I discovered that memory controller load is directly correlated with GPU load, we can assume that the lower the maximum GPU load, the lower the memory bandwidth will be.



Since I have a 750, I figured I'd see if I could work this from the other end. The 960 is exactly 2x a 750 in everything but bandwidth. It has the same 128bit bus, but 7Ghz vs 5Ghz vram, so the 960 has about 40% higher bandwidth. 

I ran a couple of games and Heaven on high settings (1080p) and got the same max MCU load in each case... 78%. GPU load was at the max, as well as the allocation of 1GB of vram. Since I got the same number every time, I'm guessing that might really be as high as it goes. 

My card is overclocked. Compared to reference it's 1310/1085 or 21% on the GPU and 5900/5010 or 18% on the vram. The memory bandwidth is 94.4 GB/s. 

If I understand your method correctly, you multiply the MCU load by 70% in an attempt to account for Maxwell compression. I assume this is because the MCU reading from the card is in error by this factor? I don't see where you mentioned that, but might have missed it. If I apply that factor I get a max MCU load of only 55%, or 51.5 GB/s. 

So if the 960 processor is 2x as fast as the 750, then* I'd expect it to need a bandwidth of 103 GB/s before it hits the processor limit*. Does that make sense?


----------



## RCoon (Jan 21, 2015)

rruff said:


> The memory bandwidth is 94.4 GB/s.
> 
> If I understand your method correctly, you multiply the MCU load by 70% in an attempt to account for Maxwell compression. I assume this is because the MCU reading from the card is in error by this factor? I don't see where you mentioned that, but might have missed it. If I apply that factor I get a max MCU load of only 55%, or 51.5 GB/s.



I took the total memory bandwidth (in your case 94.4GB/s), divided it by 100 (100%), then multiplied it by the MCU load (in your case 78%). I then divided that figure by 70 and multiplied it by 100 to get my "pre-compression figure".
(94.4 / 100) * 78 = X  (X = bandwidth usage *with* Maxwell compression)
(X / 70) * 100 = Y  (Y = bandwidth usage *without* Maxwell compression, assuming it is exactly 30%)
So by those accounts, your normal bandwidth usage would be 73.6 GB/s on the 750.
Without Maxwell compression, the figure would be 105.1GB/s.
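In code form, those two steps look like this (just the arithmetic as described; the flat 30% compression saving is the stated assumption):

```python
def with_compression(peak_gbps, mcu_percent):
    # (peak / 100) * MCU%  ->  estimated usage with Maxwell compression
    return (peak_gbps / 100.0) * mcu_percent

def without_compression(compressed_gbps):
    # (X / 70) * 100  ->  back out an assumed flat 30% compression saving
    return (compressed_gbps / 70.0) * 100.0

# For the 750 figures above: with_compression(94.4, 78) gives ~73.6 GB/s,
# and backing out the compression gives ~105 GB/s.
```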


----------



## rruff (Jan 21, 2015)

Oh, I see... it's just .78x94.4 or 73.6 GB/s. That would mean the 960 would need 147.2 GB/s for double the speed. I don't think the vram will overclock that much (>30%).


----------



## RCoon (Jan 21, 2015)

rruff said:


> Oh, I see... it's just .78x94.4 or 73.6 GB/s. That would mean the 960 would need 147.2 GB/s for double the speed. I don't think the vram will overclock that much (>30%).



Not necessarily. The correlation between GPU speed and memory bandwidth usage probably isn't a linear 1:1 ratio; if it were, people would probably have worked all that out by now.


----------



## EarthDog (Feb 2, 2015)

Mussels said:


> i think thats due to hardware design limitations.
> 
> The ram always has to be in preset amounts, such as 128 bit, 256, 384, 512 etc.
> 
> ...


RAM speed and its bus width ('bits') don't really have anything to do with each other. 

I see the point of this post, however. It is what is going on right now, for all intents and purposes. AMD pairs a massive 512-bit bus with slow RAM, while NVIDIA uses a narrower bus with faster RAM ICs.


----------



## rruff (Feb 3, 2015)

EarthDog said:


> AMD put a massive 512bit bus and slow ram. While NVIDIA is using a slower bus and faster ram IC's.



My GTX 750 isn't bandwidth limited, but increasing the vram clock 20% resulted in a 7% speed increase. I'm thinking that faster vram is a better solution.


----------



## EarthDog (Feb 13, 2015)

rruff said:


> My GTX 750 isn't bandwidth limited, but increasing the vram clock 20% resulted in a 7% speed increase. I'm thinking that faster vram is a better solution.


Missed this reply... 

It depends on what you are playing and at what resolution. At 1080p with AA on a 128-bit bus, it could be a factor. It also only has 1GB of vRAM. I wouldn't even call it a gaming card in the first place...


----------



## xfia (Feb 13, 2015)

The 750 is OK for gaming... it will hurt in some games, especially with the 1GB of vRAM at 1080p, but it's pretty good at lower resolutions.


----------



## andrewsmc (Feb 14, 2015)

What program do you guys use to make all the graphs?


----------



## rruff (Feb 14, 2015)

EarthDog said:


> It depends on what you are playing and its resolution.



My point is that it is better to achieve a given bandwidth with faster RAM rather than a wider bus.


----------



## kn00tcn (Feb 14, 2015)

i like charts, great investigation

i still wonder how important the size of vram is... i put hardline beta on ULTRA (except AA & AO) on my 570m, the game quickly goes to 1.5gb vram usage, no problems!

my new 660, also no problems other than 4gb system ram is unplayable in the beta, plus the same 1.5gb vram usage due to a similar situation as the 970 hysteria

yet the requirements recommend 3gb & people say 'oh you have to turn down settings for 2gb cards' ... i am on an overclocked MOBILE FERMI with only 1.5gb getting 30fps & beyond at 1080p

(while yes i switched to FXAA & SSAO, it's not like the vram usage lowered, MSAA still had max gpu usage, simply lower fps from ~30 to ~20 as if it was entirely gpu processing power related, not an issue of stuttering from lack of vram or bandwidth... the 660 was similar, from ~60fps to ~50fps when MSAA is turned on, same 1.5gb vram usage)

what kind of crappy game engine needs to preallocate everything? it should be STREAMED so that you can have pseudo unlimited worlds & so that consoles can load the data as needed

now that i think about it some more, i have seen what happens when you're out of vram, but only on my 4870x2 with 1gb when GTA4 has its view distance too high or in crysis2 with the hq textures enabled, both cases resulted in stuttering & also the crysis textures lost their filtering so they were pixelated (actually i had the same pixelation when using texture packs in rfactor on a 128mb 9800pro)

so maybe i should rephrase... what kind of crappy modern AAA game engine with GBs of assets still ends up preallocating so much vram (over 2gb) that a very large amount of mainstream customers will fail?




andrewsmc said:


> What program do you guys use to make all the graphs?


office (excel)

any spreadsheet app should let you, i'm sure there are also standalone charting tools out there as the raw logged data is the same (time + some value, one on each line)


----------



## RCoon (Feb 14, 2015)

andrewsmc said:


> What program do you guys use to make all the graphs?



Excel, except the default graphs are horribly ugly. I probably spent the best part of an hour or two making them look less faceless and boring.

I also occasionally use JSFiddle and Google's JavaScript charting tools to make pretty interactive charts.


----------



## EarthDog (Feb 23, 2015)

rruff said:


> My point is that it is better to achieve a given bandwidth with faster RAM rather than a wider bus.


How did you come to that conclusion? I see the graphs between the two cards, and the performance is what it is... so what in the results led you there?



> 750 is ok for gaming.. will hurt in some games especially with the 1 gb of vram at 1080p but its pretty good at lower resolutions.


I can make that argument for any card if I play at 640x480 with no AA... 

But we are talking about the majority here, who are at 1080p or even 1440x900. 1GB of RAM with almost any MSAA is going to cripple a 128-bit card, and more specifically a 1GB card.


----------



## rruff (Feb 24, 2015)

Because higher RAM speed increases FPS even when bandwidth isn't an issue.

The GTX 750 has enough vRAM for its processing power at 1080p with AA. If you are experiencing low framerates, it won't be because your 1GB of vRAM is maxed out.


----------



## EarthDog (Feb 24, 2015)

But when you run out of ram, it doesn't matter how fast it is. Which is part of my point. 1GB isn't enough in most titles at 1080p with 4xMSAA or greater.

I need AT LEAST 4xMSAA at 1080p to get rid of the jaggies...

This was a test at my home site done in 2012.. Tell me 1GB is enough again after seeing this: http://www.overclockers.com/forums/showthread.php/718118-How-much-GDDR-do-I-need-to-run-my-game

To me, the 750 is a budget card for putting an image on the screen. It can play most games well at lower than 1080p with some AA. Otherwise, it struggles, and TPU's results also show that for any half-modern game (not getting over 30 FPS): https://www.techpowerup.com/reviews/ASUS/GTX_750_OC/7.html

If you are not into AAA titles or can handle less-than-optimal image quality, it's OK.


----------



## rruff (Feb 24, 2015)

EarthDog said:


> This was a test at my home site done in 2012.. Tell me 1GB is enough again after seeing this: http://www.overclockers.com/forums/showthread.php/718118-How-much-GDDR-do-I-need-to-run-my-game



"Unless otherwise noted *all tests were run with the maximum settings allowed by the game or benchmark*. Gpu memory usage was logged with MSI Afterburner. The maximum value from the log file excluding the last measurement of the run is what's posted. Testbed was the 3dvision gaming pc in my sig."


The GTX 750 isn't made to run recent games *at max settings *and still get good framerates. Is that surprising for a <$100 card? It will even have trouble getting good framerates in some titles with MSAA... also not surprising. But in most games you can get decent framerates with a little AA enabled and it will look good. Not stellar, but good. *At no point will the 1GB of ram be your limiting factor. *

Most benchmarks use standard (usually max) settings to test cards. This is appropriate for high-end cards. Reviewers keep settings the same across all cards for consistency and comparability, but it isn't realistic for the low-end cards: the FPS will be unrealistically low, and the vRAM usage unrealistically high.


----------



## EarthDog (Feb 24, 2015)

I don't see much of a point in playing a PC game if I can't use AA and it looks worse than a console. To me, it's not remotely an adequate gaming card because of those points.

rruff said:


> At no point will the 1GB of ram be your limiting factor.


*BOLOGNA.* You have to make it NOT be a factor with that card by sacrificing IQ in modern titles. Nobody with a half-decent budget would get a 1GB card these days.


----------



## rruff (Feb 24, 2015)

EarthDog said:


> *BOLOGNA. *You have to make it NOT be a factor with that card by sacrificing IQ in modern titles.



No I don't. The IQ is already sacrificed by the card's low processing power. *Adding vram wouldn't help in the slightest. *You do realize that the shaders, ROPs, and TMUs are basically half that of a 960 and 1/4th that of a 980...? Amazingly the 960 has twice the vram and the 980 has 4x as much... hmmm... could there be a connection?


----------



## EarthDog (Feb 24, 2015)

I get rruff...more so than you realize.


----------



## rancur3p1c (Mar 15, 2015)

Would love to see this across screen resolutions! I.e., draw a correlation between pixels and bandwidth. We'd need an estimate per game of how much bandwidth goes to the actual rendering; then I could make a purchasing decision on a 1600p monitor! I'm playing a couple of older games. Render capacity is fine, I'm just worried about bandwidth.


----------



## rruff (Mar 16, 2015)

What do you mean by "render capacity is fine"? What card and system do you have, and what games are you playing? 

Just scale by the number of pixels; that should be a decent guess. What does that give you? If it's too much, you can probably just reduce the AA or AF settings.
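The pixel-count scaling suggested above is one line of arithmetic. Hypothetical numbers: the 73.6 GB/s figure is just the earlier GTX 750 estimate reused for illustration.

```python
# Scale a 1080p bandwidth estimate to another resolution by pixel count.
# Purely a rough guess; real demand won't scale exactly linearly with pixels.

def scale_bandwidth(gbps_at_1080p: float, width: int, height: int) -> float:
    """Scale a 1920x1080 bandwidth estimate to width x height by pixel ratio."""
    return gbps_at_1080p * (width * height) / (1920 * 1080)

# e.g. a 73.6 GB/s estimate at 1080p, scaled to 2560x1600 ("1600p"):
print(round(scale_bandwidth(73.6, 2560, 1600), 1))  # → 145.4
```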


----------



## OneMoar (Mar 16, 2015)

rruff said:


> What do you mean by "render capacity is fine"? What is your card and system and what games are you playing?
> 
> Just scale by number of pixels. Should be a decent guess. What does that give you? And if that is too much you can probably just reduce the AA or AF settings.


pro-tip: don't question earth-dog
1. he's been doing this wayyyy longer than you
2. he's right. I wouldn't buy anything with less than 3GB at this stage; very few titles will live happily under 1GB of vRAM at high-ish settings at >=1080p, and any card that comes with 2GB is very likely to be useless at 1080p anyway
as for raw bandwidth, it matters A LOT, especially once you start piling on the AA and cranking the res beyond 1440p


----------



## rruff (Mar 16, 2015)

OneMoar said:


> pro-tip don't question earth-dog



Can you answer the question? 

If the 980's specs are adequate and the 960 has half the shaders, ROPs, and TMUs, then why would it need more than half the vRAM and bandwidth? It only makes sense to me if your desire is to run high eye candy at low frame rates. Maybe the 960 isn't good enough for you "once you start piling on the AA and cranking the res beyond 1440p", but it isn't a problem that more vRAM would solve. It doesn't have the processing power to give decent fps regardless.


----------



## Steevo (Mar 16, 2015)

rruff said:


> Can you answer the question?
> 
> If the 980's specs are cool and the 960 has half the shaders, ROPs, and TMUs, then why would it need more than half the Vram and bandwidth? It only makes sense to me if your desire is to run high eye candy and low frame rates. Maybe the 960 isn't good enough for you "once you start piling on the AA and cranking the res beyond 1440p", but it isn't a problem that more Vram would solve. It doesn't have the processing power to give decent fps regardless.




You have to understand how a GPU works and actually processes data. A texture is loaded into memory and takes, let's say, 4K of memory; regardless of what resolution you are running, that texture HAS to be there, or else there will be a stall as it's fetched from either system memory or disk. Add in thousands of textures, or in modern games a large compressed file, holding the textures for an area, that is 2GB when decompressed. A frame may only need 2K of a texture, but the whole thing has to be in memory due to how DX currently works; since only 2K is needed, the memory controller can sort out where the requested data actually sits and fetch it using only the bandwidth required. 
It's called the knee of the curve: at a certain memory size it is a waste to add more, because performance doesn't usually increase linearly with added memory. But there are always exceptions, and when they happen, people will get pissed that a card is using 60% of its GPU power while waiting for paged data.


----------



## EarthDog (Mar 16, 2015)

> but it isn't a problem that more Vram would solve. It doesn't have the processing power to give decent fps regardless.


I don't think anyone said 2560x1440 (at least I didn't). The 750 only has enough horsepower for 1080p on down. Here is my first reply...



EarthDog said:


> It depends on what you are playing and its resolution. 1080p + AA on 128 bit, it could be a factor. It also only has 1GB of vRAM. I wouldn't even call it a gaming card in the first place...



Rruff, you may want to read the chronology of posts again to get a better handle on the context. I was specifically talking about 1080p or less, as I mentioned.

Point is, for a card marketed as a 1080p gamer, the 750 doesn't have enough RAM capacity in the first place to hold the textures and AA data. Before you worry about moving the water (data) in and out of the bucket (frame buffer/vRAM), you need a big enough bucket in the first place. Moving data faster doesn't matter if it's paging out to system memory.


----------



## rruff (Mar 16, 2015)

Steevo said:


> You have to understand how a GPU works and actually processes data, a texture is loaded into memory, and it takes lets say 4K of memory, now irregardless of what resolution you are running that texture HAS to be there or else there will be a stall as its fetched from either system memory or disk. Add in thousands of textures, or in modern games a large compressed file with the textures for an area already present that is 2Gb when decompressed. The frame may only need 2K of the texture, but the whole thing has to be in memory due to how DX works currently, and since it only needs 2K the memory controller can then sort out the location of where the actual data requested is and fetch it using only the bandwidth required.



I understand this. And if I have a GTX 960 and select textures that are half the size of what works in a GTX 980, then what would the problem be? I should be able to run the same FPS and not have an issue with vram or bandwidth, right? The reason I wouldn't mind doing that is because the 960 doesn't have the *processor* to run that game at high settings anyway and still get decent FPS. 



> Its called the bend of the knee, at a certain memory size it is a waste to add more as performance doesn't usually increase linearly with added memory, but there are always exceptions, and as exceptions go it will happen and people will get pissed when a card is using 60% of the GPU power and waiting for paged data.



But in practice the 960 does have enough, and so does the 750 with only 1GB. In the rare instance where vRAM capacity is the limiting factor, you can reduce the textures; Nvidia's software will probably do this automatically. 

I looked at a bunch of reviews of the GTX 750 and 750 Ti. In addition to 1GB more vRAM, the 750 Ti has 20% more shaders and TMUs and an 8% faster vRAM clock. I was curious to see if there was any evidence that the 750 was hobbled by its lack of vRAM. Most of the tests used the highest settings, which aren't realistic for these cards, and there were a couple of games where the 750 dropped behind more than you'd expect, but on average it scored only 13% slower than the 750 Ti (averaged over many tests and many reviews). I bought a 750, and I monitor vRAM usage and a bunch of other things; I haven't been limited by a lack of vRAM yet. 

There are tons of user reviews out on the 750, and quite a few on the 960 as well. People aren't "pissed". They might even be the most highly reviewed cards you can get.


----------



## EarthDog (Mar 16, 2015)

It doesn't always manifest as an FPS issue, though... 



> I bought a 750 and I monitor vram usage and a bunch of other things, and I haven't been limited by a lack of vram yet.


Because, as you said, you reduce textures which puts less in vRAM. 

The bottom line is, 1GB is not enough to run games at their highest quality on 1080p.


----------



## bubbleawsome (Mar 16, 2015)

Psh, 2GB isn't enough any more to run 1080p maxed. I easily hit 2.5GB in Elite Dangerous maxed out with AA and everything, and still keep a solid 60fps except in the most unoptimized places. 

I saw a definite lack of bandwidth on the HD 7770. If I upped the memory clock (I don't remember the exact figure; it was really high, golden chip, 1300MHz core) I would see a 5fps increase in some games, and more in benchmarks. I realize NVIDIA has the compression tech, but there is only so much magic NVIDIA dust they can use before they have to increase the bus size.


----------



## rruff (Mar 16, 2015)

EarthDog said:


> The bottom line is, 1GB is not enough to run games at their highest quality on 1080p.



Right, I get that being true for most games. But the GTX 750 only has 1/4 the processor the 980 has, and 1/2 the processor the 960 has, so it can't do it anyway.


----------



## Steevo (Mar 17, 2015)

rruff said:


> Right, I get that being true for most games. But the GTX 750 only has 1/4 the processor the 980 has, and 1/2 the processor the 960 has, so it can't do it anyway.


But 1GB is before the knee of the curve, and even though the card may not be able to fully utilize the memory in a single pass, it can still use a great deal more to prevent bottlenecks and to act as a buffer against caching issues. 

Put it this way: you have to tow a 10,000lb trailer. Would you buy a truck that tows exactly 10,000 pounds and then cry about how slowly it goes on hills? Or would you buy the truck that tows 15,000 for 5% more cost and goes the same on hills (poorly optimized games) as it does on flat ground?

Vmem is the same. Sure, some cards will never find a use for 2GB of memory, but if the whole texture file for a level or area is 1.7GB and you had 1.5GB, the occasional hiccups from fetching data sure would suck, right?


----------



## rruff (Mar 17, 2015)

Steevo said:


> if the whole texture file for a level or area is 1.7GB and you had 1.5 the occasional hiccups from fetching data sure would suck right?



I'd reduce textures to prevent the problem. Which I'd want to do anyway because the card is weak in other ways. 

Sure, it's nice to have extra vram, if it's free. But everything costs money. On a budget card like the 750 adding 1GB would have cost >10% more. Not worth it. If I keep playing that game I might as well get a 980 and be done with it. There's always something a little better for a little more. Why settle for anything less than the best?


----------



## Steevo (Mar 17, 2015)

rruff said:


> I'd reduce textures to prevent the problem. Which I'd want to do anyway because the card is weak in other ways.
> 
> Sure, it's nice to have extra vram, if it's free. But everything costs money. On a budget card like the 750 adding 1GB would have cost >10% more. Not worth it. If I keep playing that game I might as well get a 980 and be done with it. There's always something a little better for a little more. Why settle for anything less than the best?




Wow, just reduce textures, let's all just reduce textures. Call up the ol' boys at Valve and say "You chaps know what, I want to reduce the textures in the game, never you mind sir that means a whole other texture pack to add to this game just for me, you louts owe me, now make it snappy". Then we will all go down to the speakeasy and have ourselves some devil juice and let the girlies dance for us, whatcha say?

Ever notice how they have "minimum specifications" for games? Most of the time those are for a slide show: no AA, no AF, 1024x768 laptops with integrated Intel for people who don't mind 15-ish FPS. Then they have "recommended", and that is for middle-of-the-road 1080p, 2xAA, 4-8xAF at 30-60FPS. Sure you do, ole boy, now let's quit playing this cat and mouse game and get on down to the speakeasy for them girlies, see.


----------



## rruff (Mar 17, 2015)

Steevo said:


> Wow, just reduce textures, lets all just reduce textures.... Ever notice how they have "minimum specifications" for games?



If the 750 isn't powerful enough to run it, then I can't run it. It isn't a vram issue. Why would I want to pay for more vram just so I can run at 10 fps without a page file limit? If I want to play a demanding game with a good experience, then I should have bought a faster card. 

And if you are interested take a look at the pro reviews, user reviews, and screen captures for the GTX 750, and see what people are actually getting in games.


----------



## EarthDog (Mar 17, 2015)

Steevo, you can lead the horse to water my man... we tried.


----------



## nl_bugsbunny (Jun 22, 2015)

RCoon said:


> *****
> Holy crap I am tired and this is all probably totally wrong
> You can see all my original data here:
> https://www.dropbox.com/sh/v3vqnglktagj8tr/AADvMQeqR-nxETkn4PwJKlZBa?dl=0
> ...


Hi, can you run some tests on DirectX 12? I'm just curious, because NVIDIA says this card was made with the new tech like DX12 and MFAA in mind, which they claim aren't heavy and won't be starved by memory fed through a 128-bit bus. But think with me: if I have one 970 with a 256-bit bus and I add a second in SLI, I can run 4K perfectly on an effective 512-bit bus, right? Now divide 4K by four, and 512-bit as well, and what do you get? 1080p on 128-bit? At least the arithmetic doesn't fail there... or are we just fine with a single 128-bit card at 1080p?


----------



## RCoon (Jun 22, 2015)

nl_bugsbunny said:


> if I have one 970 with 256bit. now if I sli I can run perfect 4k on 512bit right?



'Fraid memory bandwidth doesn't double in SLI; the data is just replicated on each card, or the cards take it in turns. You'll still only have a 256-bit bus.

~200GB/s is the kind of bandwidth you're going to want for 1080p currently.

As and when DX12 releases and actually gets a signed driver with it, I'll test DX12 games and publish the results. I'll probably run an update when I finally buy a 4K monitor too, which should be soon™


----------



## RejZoR (Jun 22, 2015)

Also be aware that the specs you see for the GTX 9xx and R9-285/R9 Fury are raw hardware bandwidth. The effective figure is a lot higher once you take framebuffer compression into account, but no one really knows how much higher, since it depends on the rendered image...


----------



## RCoon (Jun 22, 2015)

RejZoR said:


> Also be aware that what you see as specs for GTX 9xx and R9-285/R9 Fury is hardware raw bandwidth. It is a lot higher when you take framebuffer compression into account, but no one really knows how high it is after that since it depends on the rendered image...



+1, exactly this. While the compression is advertised as shaving 30% off the total figure, in some games the saving can be as low as 1%.


----------



## Mussels (Jun 22, 2015)

And DX12 doesn't technically add RAM bandwidth either. No one's totally sure yet, but how multi-GPU is handled may well vary between titles. It may allow more memory to be used (by assigning tasks to each GPU), but the tasks might not be split perfectly, meaning you gain some more RAM at the cost of GPU power... DX12 is going to require some testing


----------



## xfia (Jun 22, 2015)

Mussels said:


> and DX12 doesn't technically add the ram bandwidth either, no ones totally sure yet but it may well vary between titles how multi GPU is handled. It may allow more to be used (by assigning tasks to each GPU) but that could mean the tasks arent perfectly split, meaning you gain some more RAM at the cost of GPU power... DX12 is going to require some testing


Well, if you can cut latency in half (and we know the reduction is at least that much), then you can render twice as much in the same amount of time. Imagine if your GPU was commanded more like a CPU and was better at multitasking... preload? Sort of, but different. Or just more fps... DX12? No doubt.


----------



## ASOT (Oct 22, 2015)

So... in your opinion, is it as good as the 280X, or below?


----------



## RCoon (Oct 22, 2015)

ASOT said:


> So .. in your opinion is it good enough as the 280x or below ?



The 280X has 288GB/s of memory bandwidth, while the 960 has a mere 112.2GB/s. In terms of memory bandwidth, the 280X is the better card. In terms of raw performance, the 280X is only slightly ahead of the 960 (~10% faster).


----------



## ASOT (Oct 22, 2015)

RCoon said:


> The 280X has 288GB/s memory bandwidth, while the 960 has a mere 112.2GB/s. In terms of memory bandwidth, the 280X is the better card. In terms of raw performance, the 280X is only very slightly above the 960 (~10% faster)


Thank you. One more thing to ask you: is it worth upgrading my card to an R9 290?


----------



## RCoon (Oct 22, 2015)

ASOT said:


> to upgrade my card to R9 290 is it worth?



Eyes of the beholder. If you have money burning a hole in your pocket and you're not getting the FPS you want, then it's worth it.
If you're scraping together upgrade money, and are quite happy with your current performance, then only you can make that decision.


----------



## cdawall (Oct 22, 2015)

This shouldn't surprise anyone. 128-bit GDDR3 had these exact same issues with the midrange cards of the past. The midrange has been memory bandwidth limited for a while...


----------



## RejZoR (Oct 22, 2015)

Talking about bandwidth on low-end cards is a pointless waste of time. We all know bandwidth matters at high resolutions and high FSAA levels, and none of the mid and low-end cards can run those at usable framerates anyway. The only place the discussion even makes sense is the high-end and enthusiast tiers, the cards that will actually run games at high res and high settings at usable framerates, meaning the R9-290X/390X and GTX 980 and above.


----------



## cdawall (Oct 22, 2015)

RejZoR said:


> Talking about bandwidth on low-end cards is a pointless waste of time. We all know bandwidth matters at high resolutions and high FSAA levels, and none of the mid and low-end cards can run those at usable framerates anyway. The only place the discussion even makes sense is the high-end and enthusiast tiers, the cards that will actually run games at high res and high settings at usable framerates, meaning the R9-290X/390X and GTX 980 and above.



Even the 280X (7950) can run high res at high settings, especially in CrossFire/SLI. Those cards, you will notice, still have 384-bit memory buses. GPU-wise the GTX 960 competes with the 7950, yet in high-resolution environments the 7950 takes the lead. That is why threads like this exist: they show how memory bandwidth can cripple an otherwise good GPU.


----------



## Vayra86 (Oct 28, 2015)

RejZoR said:


> Talking about bandwidth on low-end cards is a pointless waste of time. We all know bandwidth matters at high resolutions and high FSAA levels, and none of the mid and low-end cards can run those at usable framerates anyway. The only place the discussion even makes sense is the high-end and enthusiast tiers, the cards that will actually run games at high res and high settings at usable framerates, meaning the R9-290X/390X and GTX 980 and above.



Wait, whut?!

Bandwidth matters on every single GPU. Back in the Kepler days there were several versions of the GTX 660 with different memory subsystems. Going lower down the price tiers, you have similarly named cards in 64-bit and 128-bit, DDR3 and GDDR5 versions. Tell me again bandwidth doesn't matter. Especially at the lower end of the spectrum, you can get totally burned if you don't investigate carefully what you buy. In comparison, all high-end cards perform far more similarly and are generally very well balanced. The only outlier here is the AMOUNT of memory, where 'bigger is better' is still a very popular marketing strategy.

Talking about lower end hardware may be pointless to you, or us, but especially when you are trying to maximize the benefit of that lower end hardware, memory bandwidth is an essential piece of the puzzle.


----------



## RejZoR (Oct 28, 2015)

Overclock the VRAM and you'll get a tiny fps bump. Overclock the GPU and the fps will jump like crazy. VRAM only really makes a difference in very specific conditions. The GPU makes a difference in every situation.


----------



## EarthDog (Oct 28, 2015)

A 280x to a 290? I wouldn't make that small jump, no. 290x, or Fury, or GTX 980/980Ti. Make the jump worth it from a performance standpoint.


----------



## cdawall (Oct 28, 2015)

You mean 7950 to 7970 doesn't make sense to you


----------



## xorbe (Oct 28, 2015)

RejZoR said:


> Overclock VRAM and you'll get a tiny fps bump. Overclock GPU and the fps will jump like crazy. VRAM only really makes difference in very specific conditions. GPU makes difference in every situation.



Doesn't that entirely depend on the bottleneck? For instance, isn't the 960 severely bottlenecked by its 128-bit memory bus?


----------



## xvi (Oct 28, 2015)

RejZoR said:


> Overclock VRAM and you'll get a tiny fps bump. Overclock GPU and the fps will jump like crazy. VRAM only really makes difference in very specific conditions. GPU makes difference in every situation.


There are exceptions (typically at the low end). I can near-double the FPS I get out of my work PC's GeForce 8400 GS by overclocking the VRAM alone. Overclocking the GPU actually nets almost nothing at all.

Now, the obvious reply here is "Who cares about FPS on lower end cards?", and I would agree. Still, from a purely academic perspective, it is possible.

Edit: screenshots attached for the stock, mem-only OC, and mem+GPU OC runs (shader clock in the combined run was lower than in the mem-only run; revised).


----------



## phanbuey (Oct 28, 2015)

RAM and RAM bandwidth only matter when there's not enough, and then they matter A LOT. When there is enough, going higher/more does nothing.


----------



## EarthDog (Oct 28, 2015)

cdawall said:


> You mean 7950 to 7970 doesn't make sense to you


280X = 7970. Did I miss something?


----------

