
AMD Works on At Least Three Radeon RX Vega SKUs, Slowest Faster than GTX 1070?

That's where you have to balance the length of the pipeline against the clock speed you want to reach; it's a balancing act. That's where good branch predictors come into play. If your branch predictor is good, or at least can learn along the way like Ryzen's can, you can have a long pipeline and not incur a performance penalty. However, if you have a bad branch predictor, like the old Intel Pentium 4 Prescott had, a long pipeline can result in a severe loss in performance.
 

I still love my Pentium 4 3.4 EE Gallatin :D
 
In fact, you don't even need branches or cache misses for more stalls to occur due to pipeline length. Code is full of data dependencies. Let's say you have a simple calculation like this:
Code:
d = a + b + c;
e = a + d;
f = e + b;
g = f + c;
You have multiple dependencies here, which have to be resolved sequentially. Each dependent instruction has to wait for the previous one to be completely executed, meaning the length of the pipeline affects the length of the stall. CPUs have tried to work around this since the 90s with out-of-order execution, and longer pipelines also mean the dependencies have to be executed even earlier to prevent stalls. But eventually this means that branching becomes an even larger problem, since each misprediction causes all speculative calculations to be discarded. So if there are dependencies after the branch, you'll not only get a larger stall because of the flush, but also because you then have to execute multiple dependent instructions without any benefit from out-of-order execution. This is why the penalties of long pipelines and mispredictions multiply.
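To put rough numbers on the dependency argument, here is a toy model in Python. It is only a sketch under loud assumptions: one instruction issues per cycle, a result becomes available a fixed `latency` cycles after issue (standing in for pipeline length, as in the simplification above), and `d = a + b + c` counts as a single operation. Real CPUs forward results between stages, so actual stalls are shorter than this. The function name and the `independent` instruction list are made up for illustration.

```python
def cycles_to_finish(instrs, latency):
    """Toy in-order model: at most one instruction issues per cycle, and an
    instruction cannot issue until every source value is ready.
    A result is ready `latency` cycles after its instruction issues."""
    ready = {}   # value name -> cycle at which it becomes available
    cycle = 0    # earliest cycle the next instruction may issue
    for dest, sources in instrs:
        # Inputs never written in this list (a, b, c) are ready at cycle 0.
        start = max([cycle] + [ready.get(s, 0) for s in sources])
        ready[dest] = start + latency
        cycle = start + 1
    return max(ready.values())

# The chain from the post: every result feeds the next instruction.
chain = [("d", ["a", "b", "c"]), ("e", ["a", "d"]),
         ("f", ["e", "b"]), ("g", ["f", "c"])]

# Hypothetical comparison: four additions that only read the inputs.
independent = [("d", ["a", "b"]), ("e", ["a", "c"]),
               ("f", ["b", "c"]), ("g", ["a", "a"])]

for latency in (1, 5, 20):
    print(latency,
          cycles_to_finish(chain, latency),
          cycles_to_finish(independent, latency))
```

In this model the dependent chain costs four times the latency, while the independent version costs the latency plus three cycles, so the gap grows as the pipeline gets longer.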

Skylake does in fact have better branch prediction than Ryzen; even old Sandy Bridge does it better. But branch prediction can only help a bit, since it's basically just statistics about which conditionals usually evaluate to true and which do not. If a conditional is 99% true and 1% false, it will start guessing 99% correct after a few iterations. But if a conditional is ~50% true and ~50% false with no pattern to it, the CPU will only guess about half of them correctly, and that is in fact the theoretical maximum. If you want to improve performance beyond this, you're left with trying to reduce the penalty costs, or rewriting the software :p
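The 99% versus 50% point can be tried out with a small simulation. This is a minimal sketch using a 2-bit saturating counter, a textbook scheme and not what Skylake or Ryzen actually implement. The branch outcomes are random with a fixed probability and no pattern, and the function name, iteration count and seed are arbitrary choices for illustration.

```python
import random

def predictor_accuracy(p_taken, n=100_000, seed=1):
    """Fraction of correct guesses by a 2-bit saturating counter on a branch
    that is taken with probability p_taken, independently each time."""
    rng = random.Random(seed)
    counter = 2          # 0-1 predict not taken, 2-3 predict taken
    correct = 0
    for _ in range(n):
        taken = rng.random() < p_taken
        if (counter >= 2) == taken:
            correct += 1
        # Nudge the counter toward the actual outcome, saturating at 0 and 3.
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return correct / n

for p in (0.99, 0.5):
    print(f"taken {p:.0%} of the time -> accuracy {predictor_accuracy(p):.3f}")
```

The heavily biased branch should come out close to 99% correct, while the patternless 50/50 branch should hover around a coin flip no matter how the counter is tuned.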

And one final note: the branch predictor (and the prefetcher in general) was much better in Prescott than in Athlon 64, but the severe penalties of the super-long pipeline outweighed the benefits of a better prefetcher. There are limits to what a good prefetcher can do, so even with the best prefetcher Intel was crushed by a much simpler design.
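A back-of-the-envelope version of that trade-off, as a sketch only: the average cost of a branch is modeled as a base cost plus the miss rate times the flush penalty. The miss rates here are invented for illustration, and the penalties of 31 and 12 cycles are assumptions loosely based on the commonly quoted stage counts, not measurements of either chip.

```python
def avg_cycles_per_branch(miss_rate, flush_penalty, base_cost=1.0):
    """Expected cycles per branch: base cost plus the average flush cost."""
    return base_cost + miss_rate * flush_penalty

# Hypothetical numbers: the long pipeline gets the better predictor
# (4% misses vs 6%) but pays far more per miss.
long_pipe = avg_cycles_per_branch(miss_rate=0.04, flush_penalty=31)
short_pipe = avg_cycles_per_branch(miss_rate=0.06, flush_penalty=12)

print(f"long pipeline:  {long_pipe:.2f} cycles per branch")
print(f"short pipeline: {short_pipe:.2f} cycles per branch")
```

With these made-up figures the long pipeline averages about 2.24 cycles per branch against about 1.72 for the short one, even though it mispredicts less often.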
 

It wasn't mainly the pipeline or the prefetcher that made Intel fail. It was Intel's underestimation of the heat and voltage that would come with the clock speeds needed to make up for Prescott's longer pipeline and match the Athlon's speed. The longer pipeline allowed higher clock speeds, but heat and voltage increased as well. They didn't realize it would be that much, and that is why Prescott would burst into flames without a good cooler, and coolers in those days were not as efficient as they are now.
After that, Intel dropped the NetBurst architecture.
 
Are they expensive? Hell yes!

Are they over-priced? Not at all.

So long as people keep buying them at their current prices the market will allow for Nvidia to charge more and more with each new generation.

If they were not selling and stores were left with stock on the shelves and in their warehouses, THEN they would be over-priced. That's just how free markets work. If you don't like it, speak with your money and hope others follow suit.
I'll have to respectfully disagree. Nvidia doesn't have competition in the high-end and enthusiast lines, yet they price these cards by picking rip-off prices out of the sky.

Intel has been doing the very same thing for many years, overpricing its CPUs because Bulldozer wasn't competitive.

Look what Ryzen did to those overpriced Intel processors. Intel is in damage control, and has been since Zen.

Nvidia has absolutely no measure for how to price its high-end GPUs. AMD's GPU lineup (RX 480/580) isn't competitive enough. Don't confuse a company ripping people off with inflated prices with actual economics.

If you disagree, then we will agree to disagree.
 
Pro Duo is not a consumer gaming card...
Nobody said that. The article even called it what it actually is: a halo product.
 
Nobody said it will be, either. Halo means a top-notch product, the best of whatever AMD offers, but it doesn't mean it must be a consumer card. It might as well be for business use, or both. Who knows.
 