# AMD Radeon HD 5870 PCI-Express Scaling



## W1zzard (Sep 16, 2009)

AMD's latest generation of graphics cards offers unprecedented single GPU performance levels. Such performance requires an optimum environment, and PCI-Express bandwidth is especially important. We investigate whether these cards can still deliver on a PCI-E x8, x4 or even x1 link.



----------



## newtekie1 (Sep 23, 2009)

Very interesting.  I wasn't surprised by the x8 performance, I expected next to no performance loss.  However, the x4 performance was surprising, I expected a much bigger drop there, but 5% wouldn't even be noticeable.


----------



## dir_d (Sep 23, 2009)

Gives me confidence in a P55 board with 2 5850s. 1 5850 now then when they drop in price after nvidia answers pick up another 

Edit.. Itchy trigger finger made me buy a 5870 lol


----------



## Athlon2K15 (Sep 23, 2009)

this review makes me more comfortable with my new build thanks w1zzard!


----------



## RaPiDo987 (Sep 23, 2009)

nice!!


----------



## human_error (Sep 23, 2009)

Wow i was expecting to see a bigger range between the different pci-e speeds - good info to know so thanks for the review w1zz


----------



## A Cheese Danish (Sep 23, 2009)

Very nice! Definitely putting the 5870 x2 on my to get list. If/when they release it.


----------



## Sihastru (Sep 23, 2009)

This means 2 x 58x0 CF setups are viable even on the old P45 motherboards (2 x 8x PCIe 2.0). I am very surprised about the 4x PCIe 2.0 performance... it's too good.


----------



## stupido (Sep 23, 2009)

nice review...
10x


----------



## Tuvok (Sep 23, 2009)

some assumptions are wrong, imho...
if single card 8x 2.0 is 99% of 16x 2.0 you can't say the same for dual cards, since you got much more overhead in crossfire


----------



## tzitzibp (Sep 23, 2009)

great stuff... thanks W1z!


----------



## tkpenalty (Sep 23, 2009)

hey w1zz mind running 1x  and 4x with an overclocked PCI-E bus speed of like 110?


----------



## jagd (Sep 23, 2009)

This is insane, how did you manage to publish both tests the same day? Thanks, any more tests on the way?


----------



## W1zzard (Sep 23, 2009)

yes 5870 crossfire, we'd have had that too but customs took the card and wanted to play crysis with it


----------



## newtekie1 (Sep 23, 2009)

I assume the HD4870 CrossFire review will be on the current testbed.  I'm wondering if you could do a follow-up review to this one that shows CrossFire PCI-E scaling, by taping off both cards and limiting them to x8, x4, and x1 speeds.  It would give a good indication of how much more important the PCI-E bus is in multi-card setups, and of what people should expect when running two of these cards in a P55 or P45 board.


----------



## btarunr (Sep 23, 2009)

Yes, that's part of the CFX review. Both cards will be taped off at PCI-E 2.0 x8 for one set of scores across all tests.


----------



## mdm-adph (Sep 23, 2009)

Interesting proof of concept!  Nice to see some scientific reviews in the community.


----------



## PVTCaboose1337 (Sep 23, 2009)

Very cool way to do it without cutting up the card like that idiot did with a 7800gs! (yeah it was photoshopped but still...)


----------



## newtekie1 (Sep 23, 2009)

btarunr said:


> Yes, that's part of the CFX review. Both cards will be taped off at PCI-E 2.0 x8 for one set of scores across all tests.



Excellent!  I can't wait for the review!


----------



## Sasqui (Sep 23, 2009)

W1z - you rock - what a great article with surprising results.


----------



## lemonadesoda (Sep 23, 2009)

Extremely useful article and analysis. Thanks. This will make good reference material for the future.

I was also surprised that you get 95% of the performance with x4 until I remembered that x4 v2.0 is the same as x8 v1.x...But good to know, and also good to know that the figures still hold true even on a top end i7 system.

(And that dual CPU server boards with "just" x8 connectors will still make good workstations)

***

Just to kick some dust, if an x4 is good enough (x8 v1.x) then so is AGP! Yes, PCIe is a better *general* scalable format, but, from a technical standpoint, AGP and its x8 speed equivalence and lower latency would still be in the running if high end products were still available. This thought really surprised me!
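The AGP comparison above can be checked with a little arithmetic. This is only a rough sketch of peak theoretical numbers, assuming the usual published figures (AGP: 32-bit bus at 66.67 MHz with an 8x multiplier; PCIe 2.0: 500 MB/s per lane per direction after encoding). The function name is made up for illustration, and nothing here says anything about latency or real-world throughput.

```python
# Peak theoretical bandwidth only; measured throughput is lower in practice.

def agp_bandwidth_mb_s(multiplier: int) -> float:
    """AGP: 4-byte (32-bit) bus, 66.67 MHz base clock, `multiplier` transfers per clock."""
    bus_bytes = 4
    base_clock_mhz = 200 / 3  # 66.67 MHz
    return bus_bytes * base_clock_mhz * multiplier

# PCIe 2.0 x4: 4 lanes at 500 MB/s each, available in each direction at once.
PCIE_X4_GEN2_MB_S = 4 * 500

agp_8x = agp_bandwidth_mb_s(8)
print(f"AGP 8x      : {agp_8x:.0f} MB/s (shared between both directions)")
print(f"PCIe 2.0 x4 : {PCIE_X4_GEN2_MB_S} MB/s (per direction)")
```

On paper the two land in the same ballpark, which is the point being made, although AGP's figure is a shared total rather than a per-direction one.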


----------



## 5ilvgearX (Sep 23, 2009)

Thank you for the awesome scaling review


----------



## mdm-adph (Sep 23, 2009)

lemonadesoda said:


> Just to kick some dust, if an x4 is good enough (x8 v1.x) then so is AGP! Yes, PCIe is a better *general* scalable format, but, from a technical standpoint, AGP and its x8 speed equivalence and lower latency would still be in the running if high end products were still available. This thought really surprised me!



As the people with Q6600's and 3850's on AGP getting 3dmark06 scores of 12000 have been able to prove.


----------



## e6600 (Sep 24, 2009)

but whats to say a future dx 11 game will not be able to thrash the x8 pcie bandwidth?  
is it possible that at a now common res of 1920x1080, with some aa/16af applied, the next gen of games may prove p55 with electrical x8 pcie a bottleneck?


----------



## lemonadesoda (Sep 24, 2009)

Remember what the AGP/PCIe interface is doing... transferring data.  I think PCIex8 is going to be fine for a while.

Then remember what kind of data it is transferring

1./ textures
2./ coordinates
3./ calculated renderings

So long as there is sufficient memory on the card, then textures are loaded once or pre-loaded.

So long as the number of coordinates is a reasonable order of magnitude (hundreds of thousands, not hundreds of millions) then we are OK

So long as the GPU does the graphics work, not the CPU, then calculated renderings don't need to cross the bus.

So actually, the more memory on the GPU, the less bandwidth you need, because assets are preloaded and remain on the card.

Why are x16 benchmarks better than x8, and x8 better than x4, even if the difference is only small? I bet MOST OF IT is actually uploading new textures etc.  If we looked at the MODE of the frames-per-second distribution http://en.wikipedia.org/wiki/Mode_(statistics) then I bet the results would be even closer.  Remember that the "average" gets hit by just a few of the worst frames (loading new textures), whereas the "mode", the framerate 95% of the time, is probably the same.

***

An analysis by w1z of http://en.wikipedia.org/wiki/Mode_(statistics) FPS would be extremely enlightening and would put the discussion to bed.
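To make the mean-versus-mode argument concrete, here is a small sketch. The FPS samples are invented purely for illustration (they are not from the review): both links sit at a steady 60 FPS most of the time, and the narrower link dips harder during a couple of hypothetical texture uploads.

```python
from statistics import mean, mode

def summarize(samples):
    """Return the average and the most common value of per-second FPS samples."""
    return {"mean": mean(samples), "mode": mode(samples)}

# Hypothetical per-second FPS readings: 18 steady seconds plus 2 texture-load dips.
x16_samples = [60] * 18 + [45, 40]
x4_samples = [60] * 18 + [30, 20]

for name, samples in (("x16", x16_samples), ("x4", x4_samples)):
    stats = summarize(samples)
    print(f"{name:>3}: mean = {stats['mean']:.2f} FPS, mode = {stats['mode']} FPS")
```

With these made-up numbers the averages differ while the mode is identical, which is the effect described above. Whether real benchmark runs behave this way would need actual per-frame data.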


----------



## Mussels (Sep 24, 2009)

i tried telling people i only took a 10% performance hit on a 4x slot when i ran a 16x/4x crossfire setup, and no one believed me.

that was with 4870s on a 1.1 motherboard, at least i finally have someone else with similar findings now.


----------



## W1zzard (Sep 24, 2009)

i'll have the cf x8 and x4 numbers today as well


----------



## tzitzibp (Sep 24, 2009)

W1zzard said:


> i'll have the cf x8 and x4 numbers today as well



*anxiously waiting*


----------



## qubit (Sep 24, 2009)

Really useful review, thanks. 

If the difference between 8x & 16x is so small on this new beast of a card, I'm sure that it practically disappears on older, slower cards (eg 8800 GTX, HD4850 etc) which most people will currently have. Therefore, we don't have to worry about losing performance in 8x mode.


----------



## W1zzard (Sep 24, 2009)

in x8 2.0 .. x8 1.1 is equivalent to x4 2.0
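For anyone who wants to check these equivalences, a minimal sketch follows. It assumes the standard per-lane figures of 250 MB/s (PCIe 1.x) and 500 MB/s (PCIe 2.0) per direction after 8b/10b encoding; the names `PER_LANE_MB_S` and `link_bandwidth` are just illustrative.

```python
# Per-lane, per-direction bandwidth in MB/s after 8b/10b encoding overhead.
PER_LANE_MB_S = {"1.x": 250, "2.0": 500}

def link_bandwidth(lanes: int, gen: str) -> int:
    """Peak theoretical one-way bandwidth of a PCIe link in MB/s."""
    return lanes * PER_LANE_MB_S[gen]

for gen in ("1.x", "2.0"):
    for lanes in (1, 4, 8, 16):
        print(f"PCIe {gen} x{lanes:<2}: {link_bandwidth(lanes, gen):>5} MB/s")
```

Each 1.x link matches the 2.0 link with half as many lanes, so x8 1.1 lines up with x4 2.0 and x16 1.x lines up with x8 2.0.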


----------



## T3kl0rd (Sep 24, 2009)

Great article.  Can't wait for the CF results.  Seeing as how I am choosing a X58 mobo over P55, it will be interesting to see the percentage decrease, to say the least.  10% or so is reason enough for me to go X58.


----------



## yogurt_21 (Sep 25, 2009)

very interesting. I find it comforting that my 16x 1.1 bus will not penalize me nearly at all should I end up getting one of these cards. based on current budgeting, it seems to be a hard push to get a new gpu and upgrade cpu/mobo/memory.

thanks!


----------



## sethk (Sep 25, 2009)

This is interesting because it gives you a preview of how well it could work in an ExpressCard docking station solution for a laptop, where you could conceivably game with a 5850/5870 connected to your laptop through ExpressCard (essentially PCIe 1x), with better framerates than almost anything available for most laptops.


----------



## the_pharaoh (Sep 26, 2009)

"Buy one of these accelerators now, and give them a neighborhood when they become more affordable, and you will have secured yourself future-proofing for quite long."

Wow, what on earth does that even mean? Proof-reading please.


----------



## Dan848 (Sep 26, 2009)

W1zzard said:


> in x8 2.0 .. x8 1.1 is equivalent to x4 2.0



I have a version 1.0 Gigabyte P35-DS3R that uses PCIe x16, so, that would be the same as x8 in version 2.0 PCIe?

Thank you,

Dan


----------



## Deleted member 67555 (Sep 27, 2009)

Dan848 said:


> I have a version 1.0 Gigabyte P35-DS3R that uses PCIe x16, so, that would be the same as x8 in version 2.0 PCIe?
> 
> Thank you,
> 
> Dan


Yes


----------



## [I.R.A]_FBi (Sep 27, 2009)

Sihastru said:


> This means 2 x 58x0 CF setups are viable even on the old P45 motherboards (2 x 8x PCIe 2.0). I am very surprised about the 4x PCIe 2.0 performance... it's too good.



My P45 has three .. no more worries ..


----------



## Bo_Fox (Sep 28, 2009)

Awesome article..  havent seen an article like that in years!  Kudos, W1z and btarunr!


----------



## stupido (Oct 1, 2009)

well, if possible, I would like to see the impact of  chipset too... I mean compared P45, P55 and X58?
this review was done on i7 machine, but what on i5 machine or Core2Quad?


----------



## Mussels (Oct 1, 2009)

stupido said:


> well, if possible, I would like to see the impact of  chipset too... I mean compared P45, P55 and X58?
> this review was done on i7 machine, but what on i5 machine or Core2Quad?



it would be irrelevant. CPU bottlenecks would be the only thing that changed the results.

Obviously, on a weaker system the maximum FPS would be lower, therefore the performance differential would decrease across the tests.


----------



## matrices (Oct 10, 2009)

I'm a little confused by the analysis. Does your x4 simulation accurately emulate the actual x4 on the P55 PCH? In a review of x16/x4 P55 boards (Asus and Gigabyte), you wrote:

"The top slot gets a full 16 lanes of PCIe 2.0 from the CPU, while the second slot harnesses four of the not-quite-second-gen PCI Express lanes built into the PCH. Intel labels the P55's eight PCI Express lanes as 2.0, but they only signal at 2.5GT/s—gen-one speed. That's still plenty of bandwidth for Gigabit Ethernet and most reasonable auxiliary storage configurations. However, it might be a hindrance to CrossFire configurations."

I can't decipher that. Four "not-quite-second-gen" PCI-E lanes? So these are x4 PCI-E 1.0 lanes, not x4 PCI-E 2.0 lanes? Are they the same type that you measured in this simulation? And since some P55 boards have 8/8/4, is the "4" from there the same speed as the "4" in the board you reviewed above?

Anandtech has already done the CF test with ATI 5870 in 16/16 versus 8/8, and we already know the difference is negligible there. What would be truly interesting is to test Tri-Fire and CF with Physx (with the hack) in 16/16/16 versus 8/8/4 (whatever "4" that may be).

Also, while the continuing usefulness of the x8 slot means that older 750i/680i SLI boards and Intel P35 boards will not be bottlenecked by PCI-E bandwidth, the C2D and C2Q are definitely bottlenecked to some degree by SLI and CF, period. Very few people have done the analysis but there's an article at Guru3D conclusively demonstrating that in some games, dual-card CF/SLI is 50% more effective on the i7 than the C2D/Q. And Tri-SLI (and therefore presumably Tri-CF) is almost completely wasted on non i7/i5 chipsets, according to the numbers.


----------



## sideeffect (Oct 12, 2009)

Nice article, it was very informative.  

Out of interest if you run PCI-E 2.0 8x at 125MHz Bus speed does that increase the bandwidth to 5 GB/s?
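A back-of-the-envelope version of that question, as a sketch only: it assumes link bandwidth scales linearly with the 100 MHz PCIe reference clock and that the card and board stay stable at the higher clock, neither of which the review tested. The function name is hypothetical.

```python
NOMINAL_REFCLK_MHZ = 100.0      # standard PCIe reference clock
PCIE2_PER_LANE_GB_S = 0.5       # PCIe 2.0, per lane, per direction

def overclocked_bandwidth_gb_s(lanes: int, refclk_mhz: float) -> float:
    """Theoretical PCIe 2.0 bandwidth, assuming linear scaling with the reference clock."""
    return lanes * PCIE2_PER_LANE_GB_S * (refclk_mhz / NOMINAL_REFCLK_MHZ)

for clock in (100, 110, 125):
    print(f"x8 @ {clock} MHz: {overclocked_bandwidth_gb_s(8, clock):.1f} GB/s")
```

Under those assumptions x8 goes from 4 GB/s at stock to 5 GB/s at 125 MHz, so the arithmetic in the question holds on paper.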


----------



## Bo_Fox (Oct 27, 2009)

Once again, an awesome article!


----------



## Calle2003 (Nov 14, 2009)

What's up with Dawn of War 2 1280x1024 2xAA 8xAF?
x4 and x8 fares *better* than x16!


----------



## Bo_Fox (Nov 16, 2009)

^^  Hmm, what do you mean?


----------



## Calle2003 (Nov 24, 2009)

Bo_Fox said:


> ^^  Hmm, what do you mean?



Check out Page 8: http://www.techpowerup.com/reviews/AMD/HD_5870_PCI-Express_Scaling/8.html


----------



## Geofrancis (Nov 25, 2009)

i am glad someone did an article like this, it proves that my hd 4850 runs fine in my J&W MITX 780G board that only has a 4x pci-e 2.0 slot, along with all the next generation graphics cards.

on another note why are none of the new ati graphics cards single slot? that was one of the great features of the card.


----------



## zed011 (Dec 11, 2009)

W1zzard said:


> i'll have the cf x8 and x4 numbers today as well



Nowhere to be found?


----------



## Bo_Fox (Dec 12, 2009)

Calle2003 said:


> Check out Page 8: http://www.techpowerup.com/reviews/AMD/HD_5870_PCI-Express_Scaling/8.html



Sorry I missed your post..  

This is interesting..  perhaps it's a benchmark variation?  Sometimes you have to consider a +/-3% margin of error.  Well, I guess!


----------



## TAViX (Dec 27, 2009)

Guys, one question.

I have an Asus P5Q Deluxe mobo with 1 PCI-E x16 and 2 PCI-E x8. I'm running now an 5770 on the main slot, but I'm thinking to buy another card and Crossfire-it. The question is, are those 2 cards going to run on x8 or only one of them? I'm also curious what's the performance drop between x16/x16, x16/x8 and x8/x8

thanks.


----------



## Mussels (Dec 27, 2009)

TAViX said:


> Guys, one question.
> 
> I have an Asus P5Q Deluxe mobo with 1 PCI-E x16 and 2 PCI-E x8. I'm running now an 5770 on the main slot, but I'm thinking to buy another card and Crossfire-it. The question is, are those 2 cards going to run on x8 or only one of them? I'm also curious what's the performance drop between x16/x16, x16/x8 and x8/x8
> 
> thanks.



both run on 8x. performance hit will be next to nothing.


----------



## TAViX (Dec 27, 2009)

Good to know. I was asking because in all the Crossfire reviews I've read they use mobos with 2xPCI-E x16, so I was kinda reluctant about this...


----------



## Roberto72 (Dec 28, 2009)

@w1zzard. I have some doubts with the conclusion of the review.

_Some motherboard manufacturers are offering a third PCI-Express x16 slot that is electrically x4. The results show that the performance drop isn't as bad as one would imagine, so we will green-signal installing a third accelerator for some 3-way ATI CrossfireX action, or 2-way CrossfireX on entry-level Intel P55 motherboards with the second x16 slot electrically x4 (running in 1.0 mode). If you're crazy enough to mod a PCI-Express x1 slot (by carefully cutting its end to let it seat a PCI-Express graphics card), then the scores should really dishearten you. Buy one of these accelerators now, add one later, and you will have secured yourself future-proofing. _

- Isn't it so that 2 cards in xfire need more PCI-e bandwidth in comparison to 2 separate cards? (cards in xfire mode exchange data between each other = more bandwidth needed?)
- The reason why the second (or third) PCI-e x16@x4 is PCI-e 1.1 is because these PCI-e lanes are connected to the P55 chipset (and not like the PCI-e x16@x16 (2.0) connected to the CPU). The bandwidth between chipset and CPU (DMI) is 2GB/s = equal to PCIe X4 (1.1)







So the bottleneck won't be the X16slot@X4 but the DMI, which has to share its bandwidth with SATA, USB, VGA, LAN etc.
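The sharing argument can be sketched roughly as below. It assumes the 2GB/s DMI figure counts both directions (so about 1000 MB/s each way, the same as an x4 1.1 link), and the "other traffic" values are hypothetical placeholders, not measurements.

```python
DMI_MB_S_PER_DIRECTION = 1000   # assumption: 2 GB/s total, split across two directions
SLOT_X4_GEN1_MB_S = 4 * 250     # x4 slot at PCIe 1.1 speed, per direction

def gpu_share_mb_s(other_traffic_mb_s: float) -> float:
    """One-way bandwidth left for the x4 slot after other PCH traffic uses the DMI link."""
    left_on_dmi = DMI_MB_S_PER_DIRECTION - other_traffic_mb_s
    return max(0, min(SLOT_X4_GEN1_MB_S, left_on_dmi))

# Hypothetical loads from SATA, USB, LAN and so on, in MB/s.
for other in (0, 150, 300):
    print(f"other traffic {other:>3} MB/s -> {gpu_share_mb_s(other)} MB/s left for the card")
```

With an idle chipset the slot and DMI are evenly matched; any other device traffic comes straight out of what the card could use.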


----------



## EarthDog (Dec 31, 2009)

First, GREAT write up! 

I have a question though. Doesnt the use of AA increase bandwidth needs? Be it between the cards or other interfaces? What would this test look like if you put varying AA levels on it?

Thank you in advance for your reply.


----------



## EarthDog (Jan 3, 2010)

BUMP... its been days since i posted this... 

Anyone... especially the person who wrote this article.......HELLO??


----------



## Mussels (Jan 3, 2010)

w1zzard is a busy fellow.

Even if he wanted to test this, it'd probably take him a few days to get the test system set up for it again.


Roberto72: you're forgetting the crossfire/SLI bridges between the cards, most of the data goes across there directly.


----------



## Steevo (Jan 3, 2010)

EarthDog said:


> First, GREAT write up!
> 
> I have a question though. Doesnt the use of AA increase bandwidth needs? Be it between the cards or other interfaces? What would this test look like if you put varying AA levels on it?
> 
> Thank you in advance for your reply.



I can reply.


First, use google.

Second, AA only uses more PCIe bandwidth if it has to offload texture data to main system memory to make more room for the AA cache. If you are doing that then you have other problems. With how fast the new vmem is, and the bandwidth available to the GPU die, and the implementation of the stream process for handling AA and other filters, the effect on performance is minimal.


C) AA is less of an issue at larger resolutions, implementing multiple types of AA on objects in a scene to remove that 1 pixel wide jaggie does nothing. So burn your bandwidth on something useful, like a game worth enjoying.


----------



## btarunr (Jan 4, 2010)

Mussels said:


> Roberto72: you're forgetting the crossfire/SLI bridges between the cards, most of the data goes across there directly.



CFX bridges pass 0.9 GB/s of data.


----------



## EarthDog (Jan 5, 2010)

Steevo said:


> I can reply.
> 
> 
> First, use google.
> ...


Thanks for that utterly BRILLIANT suggestion to use google, genius. I did try, however my search turned up nothing that directly answered my question. For pete's sake it's not like I asked "how to overclock" and had the answer plastered all over 2342423 stickies at 23423542 sites...

Aside from that, I sincerely appreciate the response. 



Mussels said:


> w1zzard is a busy fellow.
> 
> Even if he wanted to test this, it'd probably take him a few days to get the test system set up for it again.
> 
> ...


Yeah I wasnt looking for re testing or anything. A reply like Speedo (minus the BS "google" reply) was what I was looking for. Thanks!!!


----------



## SummerDays (Jan 5, 2010)

BTA, can you please provide a reference for the 0.9GB/s on the crossfire link?  Thank You


----------



## cadaveca (Jan 5, 2010)

here ya go:






Source: http://www.yougamers.com/forum/showthread.php?t=86879 (but image is watermarked by ChipHell)


----------



## SummerDays (Jan 5, 2010)

This is the PCI interconnect for a X2 card.


----------



## cadaveca (Jan 5, 2010)

Top orange link is Crossfire interconnect(CFBI), listed @ 0.9 GB/s....slide shows one shared between the cards on same pcb, the other the one to link to other cards.


----------



## Mussels (Jan 5, 2010)

thats a confusing graph since it mentions sideport, but it makes sense.

0.9GB/s per bridge.


----------



## cadaveca (Jan 5, 2010)

Yeah, not the best, but as soon as I saw the question, I thought of it, as I'm not too sure of anywhere else where AMD has released this information. I've seen it, but cannot remember where.

Important to me, as running 2560x1600, each frame avgs...hmm..probably like 16MB, so 100FPS of frames transferred is not possible with just one bridge. Max would be around 55 FPS or so...

1920x1080 should use half that, or allow for double the frames on one interconnect.
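Those estimates can be reproduced with a quick sketch. It assumes uncompressed 32-bit frames (4 bytes per pixel) and takes the 0.9 GB/s bridge figure quoted in this thread at face value; how the driver actually packs frames over the bridge is not something confirmed here.

```python
BRIDGE_BYTES_PER_S = 0.9e9   # 0.9 GB/s, the CFBI figure quoted in the thread
BYTES_PER_PIXEL = 4          # assumption: 32-bit colour, no compression

def frame_bytes(width: int, height: int) -> int:
    """Size of one uncompressed frame in bytes."""
    return width * height * BYTES_PER_PIXEL

def max_bridge_fps(width: int, height: int) -> float:
    """Upper bound on frames per second that fit through one bridge."""
    return BRIDGE_BYTES_PER_S / frame_bytes(width, height)

for w, h in ((2560, 1600), (1920, 1080)):
    mb = frame_bytes(w, h) / 1e6
    print(f"{w}x{h}: {mb:.1f} MB per frame, about {max_bridge_fps(w, h):.0f} FPS max")
```

Under these assumptions 2560x1600 works out to roughly 16.4 MB per frame and just under 55 FPS, and 1920x1080 to about half the frame size and roughly double the frame rate, which matches the estimate above.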


----------



## Mussels (Jan 5, 2010)

cadaveca said:


> Yeah, not the best, but as soon as I saw the question, I thought of it, as I'm not too sure of anywhere else where AMD has released this information. I've seen it, but cannot remember where.
> 
> Important to me, as running 2560x1600, each frame avgs...hmm..probably like 16MB, so 100FPS of frames transferred is not possible with just one bridge. Max would be around 55 FPS or so...
> 
> 1920x1080 should use half that, or allow for double the frames on one interconnect.



its quite likely related to why the lower PCI-E bandwidth isnt as killer as we thought - the most critical data goes over the bridges.


----------



## cadaveca (Jan 5, 2010)

As far as I know, only the rendered frames are sent via the bridge, and PCI-E is used for other communications...

(and how nV disables Crossfire via bios on nV chipsets...they simply disable the inter-gpu pci-e link, and the boards offer no other link{as Crossfire can use DMI as it did on P965})

...PCI-E is also used for R5xx-gen software Crossfire, which uses NO bridges, so PCI-E is MORE IMPORTANT depending on resolution used(and the app being able to run over 55FPS on a single gpu, as currently most OEMs are telling people only one bridge is required, which is not entirely true), and hardware(or lack of it, as in software Crossfire).

I tried explaining this to more than one XFX tech...they wouldn't budge on it though...still they recommend only using one bridge(seemingly as a way to curb microstutter caused by frames being sent over both connects, and causing sync issues).


----------



## Mussels (Jan 5, 2010)

cadaveca said:


> As far as I know, only the rendered frames are sent via the bridge, and PCI-E is used for other communications...
> 
> (and how nV disables Crossfire via bios on nV chipsets...they simply disable the inter-gpu pci-e link, and the boards offer no other link{as Crossfire can use DMI as it did on P965})
> 
> ...



i suffered from that microstutter issue, but get it regardless of one bridge or two. go figure.

When i tested on a 16x/4x platform (PCI-E 1.1 to boot, P35) - there was zero performance difference.

I honestly believe only one bridge can be used with two GPU's (and two bridges with three)


----------



## btarunr (Jan 5, 2010)

Mussels said:


> thats a confusing graph since it mentions sideport, but it makes sense.
> 
> 0.9GB/s per bridge.



Using two bridges between two cards doesn't up that to 1.8 GB/s. CFBI has a usable bandwidth of 0.9 GB/s between two GPUs.


----------



## Mussels (Jan 5, 2010)

btarunr said:


> Using two bridges between two cards doesn't up that to 1.8 GB/s. CFBI has a usable bandwidth of 0.9 GB/s between two GPUs.



i know. i should have specified - i was thinking triple cards at the time.


----------



## cadaveca (Jan 5, 2010)

Mussels said:


> I honestly believe only one bridge can be used with two GPU's (and two bridges with three)



If that was the case, AMD wouldn't recommend connecting both, but the Catalyst driver recommends connecting both bridges, unless this is meant for bridges on both cards...


----------



## cadaveca (Jan 5, 2010)

Ah, here's another place where they say it, in the manual:



> ATI CrossFireX Ready motherboard and one ATI CrossFireX Bridge Interconnect cable per graphics card (included) are required



Sry for double posting, but it's been some time since my last post, and new info is present.


----------



## Bernerosky (Jan 10, 2010)

Very nice review!      Cleared up some doubts for me.




Mussels said:


> both run on 8x. performance hit will be next to nothing.



1.1 or 2.0 ?????

Question: 

I have one 16x 1.1 slot and another 8x 1.1 slot (equivalent to 8x 2.0 / 4x 2.0). I'm thinking about installing two HD 5870s in CrossFire mode. What is the final speed with this configuration, 8x or 4x 1.1?

tnks for the help.


----------



## rck1984 (Mar 18, 2010)

Hi there, first of all great post.. i am really surprised seeing some numbers. 2nd, excuse me for digging up an old thread.

Now this.. atm i own an Asus p7p55d-le with 1x XFX 5770.
I wasnt planning on Crossfire, so didnt really care about the PCI-Express slots, being 16x or whatever.

Now after a while, i do wanna go Crossfire 5770.. But i am wondering how much of a performance hit i would get, because 1 slot will be PCI-Express 2.0 x16, the other one will be PCI-Express 2.0 x4 (as far as i understood, correct me if wrong)

How much of a hit would i get with 2x 5770 on a p55 board? will it be the ~5% stated in the graph, which is "nothing" or will it be (much) more?

Thanks in advance!


----------

