@ wizard,
"The most surprising find to me is the huge performance hit some of the latest games take when running on limited PCIe bandwidth. The real shocker here is certainly Ryse: Son of Rome, based on Crytek's latest CryEngine 4. The game seems to constantly stream large amounts of data between the CPU and GPU, taking a large 10% performance hit by switching to the second-fastest x16 3.0 configuration. At x4 1.1, the slowest setting we tested, performance is torn down to less than a third, while running lower resolutions! Shocking!
Another noteworthy title with large drops in performance is Wolfenstein: The New Order, based on id's id Tech 5 engine. Virtual textures certainly look great in-game, providing highly detailed, non-repeating textures, but they also put a significant load on the PCI-Express bus. One key challenge here is having texture data ready for display in time; when it arrives too late, the result is the dreaded texture pop-in some users have been reporting.
Last but not least, World of Warcraft has received a new rendering engine for its latest expansion, Warlords of Draenor. While the game doesn't look much different visually, Blizzard made large changes under the hood, switching to a deferred renderer, which not only rules out in-game MSAA but also demands considerably more PCI-Express bandwidth."
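For reference, the raw numbers behind those link configurations make the Ryse result less mysterious. Here's a quick back-of-the-envelope script of my own (nothing from the review itself), using the per-lane rates from the PCIe specs (Gen1/Gen2 links lose 20% of raw throughput to 8b/10b encoding, Gen3 only about 1.5% to 128b/130b):

```python
# Theoretical one-way PCIe bandwidth for the tested link configurations.
RATE_GT_S = {"1.1": 2.5, "2.0": 5.0, "3.0": 8.0}              # raw rate per lane
ENCODING  = {"1.1": 8 / 10, "2.0": 8 / 10, "3.0": 128 / 130}  # encoding efficiency

def bandwidth_gb_s(gen: str, lanes: int) -> float:
    """Usable one-direction bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return RATE_GT_S[gen] * ENCODING[gen] * lanes / 8  # Gbit/s -> GB/s

for gen, lanes in [("3.0", 16), ("2.0", 16), ("3.0", 4), ("1.1", 4)]:
    print(f"x{lanes} {gen}: {bandwidth_gb_s(gen, lanes):.2f} GB/s")
# prints: x16 3.0: 15.75, x16 2.0: 8.00, x4 3.0: 3.94, x4 1.1: 1.00
```

So x4 1.1 offers roughly a sixteenth of the bandwidth of x16 3.0, which lines up with a streaming-heavy title losing two thirds of its performance there.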
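The Wolfenstein paragraph deserves a number or two as well. Virtual texturing keeps only the currently visible texture pages in VRAM and streams the rest over the bus on demand, so upload throughput directly limits how quickly the renderer can react when the camera moves. Here's a deliberately crude toy model (all parameters are my own assumptions: 64 KB compressed pages, 60 fps, a quarter of the bus free for texture uploads; id's actual streaming code is far smarter) showing why a starved link shows up as pop-in rather than as a lower FPS counter:

```python
PAGE_KB = 64          # assumed compressed size of one virtual-texture page
FPS = 60
TEXTURE_SHARE = 0.25  # assume a quarter of the bus is free for page uploads

def popin_frames(pcie_gb_s: float, frames: int = 300, requests: int = 300) -> int:
    """Count frames where at least one visible page is still missing and
    has to be drawn from a blurry lower-mip fallback (i.e. pop-in)."""
    uploads = int(pcie_gb_s * 1e6 * TEXTURE_SHARE / FPS / PAGE_KB)  # pages/frame
    backlog = popin = 0
    for _ in range(frames):
        backlog += requests               # feedback pass requests new pages
        backlog -= min(uploads, backlog)  # the bus uploads what it can
        popin += backlog > 0              # whatever is left arrives late
    return popin

for name, gb_s in [("x16 3.0", 15.75), ("x16 2.0", 8.0), ("x4 1.1", 1.0)]:
    print(f"{name}: {popin_frames(gb_s)} of 300 frames show pop-in")
```

With these made-up numbers, the two fast links clear every frame's requests, while x4 1.1 never catches up (65 page uploads per frame against 300 requests); that's exactly the "data arrives too late" failure mode described above.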
All you've really proven is that there's a slight gain or loss in certain scenarios; you don't go into depth on why PCIe x16 2.0 performs equal to or better than 3.0. I doubt it really matters. I agree with some points of your write-up, but you aren't demonstrating more than an average drop or gain of, what, around 10% in any of the other scenarios. It seems informative, but also a waste of your own time. Beyond that, games that are either MMOs or graphically advanced titles like Crysis 3, BF4, Wolfenstein: The New Order, and others will make better use of 3.0 over 2.0. One good example of this will probably be Star Citizen in the not-too-distant future. I would highly suggest using PlanetSide 2 and EQN for upcoming benches. At resolutions above 1080p, you'll probably see a bigger benefit from 3.0 if you enable more AA at 4K.
Here's an idea: instead of sitting in the Shrine in World of Warcraft, why don't you conduct the test during a 25-man Garrosh fight at Ultra settings? Tell us what the PCIe lane saturation looks like after that. I'd think that would be more useful information than staring at a wall to see how high the in-game FPS meter can climb. Also, why don't you measure the same games with 2-way, 3-way, and 4-way SLI? It's not like NVIDIA has anything to hide, right...