Friday, March 3rd 2017

AMD Responds to Ryzen's Lower Than Expected 1080p Performance
The folks at PC Perspective have shared a statement from AMD in response to their question as to why AMD's Ryzen processors show lower than expected performance at 1080p resolution (despite posting good high-resolution, high-detail frame rates). Essentially, AMD is reinforcing the need for developers to optimize their games for AMD's CPUs (claiming that games have so far only been properly tuned for Intel's architecture). AMD also emphasizes that it has already sent out about 300 developer kits so that content creators can get accustomed to Ryzen, and expects this number to increase to about a thousand in 2017. AMD expects gaming performance to only increase from its launch-day level.
AMD's John Taylor had this to say:
"As we presented at Ryzen Tech Day, we are supporting 300+ developer kits with game development studios to optimize current and future game releases for the all-new Ryzen CPU. We are on track for 1000+ developer systems in 2017. For example, Bethesda at GDC yesterday announced its strategic relationship with AMD to optimize for Ryzen CPUs, primarily through Vulkan low-level API optimizations, for a new generation of games, DLC and VR experiences.
Oxide Games also provided a public statement today on the significant performance uplift observed when optimizing for the 8-core, 16-thread Ryzen 7 CPU design - optimizations not yet reflected in Ashes of the Singularity benchmarking. Creative Assembly, developers of the Total War series, made a similar statement today related to upcoming Ryzen optimizations.
CPU benchmarking deficits to the competition in certain games at 1080p resolution can be attributed to the development and optimization of the game uniquely to Intel platforms - until now. Even without optimizations in place, Ryzen delivers high, smooth frame rates on all "CPU-bound" games, as well as overall smooth frame rates and great experiences in GPU-bound gaming and VR. With developers taking advantage of Ryzen architecture and the extra cores and threads, we expect benchmarks to only get better, and enable Ryzen excel at next generation gaming experiences as well.
Game performance will be optimized for Ryzen and continue to improve from at-launch frame rate scores."
Two game developers also chimed in.
Oxide Games, creators of the Nitrous game engine that powers Ashes of the Singularity:
"Oxide games is incredibly excited with what we are seeing from the Ryzen CPU. Using our Nitrous game engine, we are working to scale our existing and future game title performance to take full advantage of Ryzen and its 8-core, 16-thread architecture, and the results thus far are impressive. These optimizations are not yet available for Ryzen benchmarking. However, expect updates soon to enhance the performance of games like Ashes of the Singularity on Ryzen CPUs, as well as our future game releases." - Brad Wardell, CEO Stardock and Oxide
And Creative Assembly, the creators of the Total War Series and, more recently, Halo Wars 2:
"Creative Assembly is committed to reviewing and optimizing its games on the all-new Ryzen CPU. While current third-party testing doesn't reflect this yet, our joint optimization program with AMD means that we are looking at options to deliver performance optimization updates in the future to provide better performance on Ryzen CPUs moving forward. "
Source:
PC Perspective
126 Comments on AMD Responds to Ryzen's Lower Than Expected 1080p Performance
FX didn't get special compiler treatment because that was putting lipstick on a pig.
I easily beat the benchmarks for an 8350 because all 8 cores are running at 4.2 GHz as a non-turbo-boosted speed, and I have none of the throttling issues you mentioned, even when running Linpack for hours.
Running at 5 GHz is not going to make DCS or Star Citizen or any number of other games that have issues with AMD CPUs run any better for me.
It's great that AMD kinda caught up to Intel with Ryzen.
But if it's going to be the same as the FX series, where certain applications like DCS World and CryEngine games perform worse specifically because of AMD CPUs, that's a major issue that can't just be ignored.
Also, you need to stop repeating that devs are all making games for 8 cores and using that as a reason your "8 core" CPU is still OK. It's a well-known fact that there are 4 full CPU cores and 4 limited CPU cores on the FX series, and this makes a HUGE difference in performance when comparing it with a true 8-core CPU.
Intel, on the other hand, had micro-ops and could, if wanted, use a whole core (2 Intel cores' worth) on one thread, leveraging a wider execution pipe, micro-ops and better cache, plus being two node swaps lower. But those are all old advantages, and it's clear AMD has the raw per-core and multicore performance, so a few tweaks here and there on this brand new uarch and I'm sure it will be fine.
Then I might buy one. But as I said, if I bought one, running two 480s at 4K, I apparently could not do better buying anything from Intel in any metric, so I could happily dodge 1080p my whole life. Alas, I'm skint, so I'm still dreaming.
Depending on how well Ryzen sells, I can see plenty of recentish games getting patches.
Trouble is, game devs will find little incentive to do so for past projects ... and for the new ones, compilers will get tuned as time goes by because of Zen in the console space.
Need to take into account that 8 core chip is actually 4x4 (OS would do that) with their own L3 is a conspiracy theory?
"lower than expected" is a fact nowadays? Expected by whom?
I have seen Starcraft 2 benchmarks with Ryzen doing min 16 average 31 fps (on 980), are you freaking kidding me?
This is plain and outright bullshit, there is no desktop CPU that is less than 4 years old that would score like that in that game.
There is an expected single thread advantage that Intel's 4 cores have, and AMD has voiced it actually.
AMD states that they are 6% behind Skylake in IPC; taking the higher clock into account, it's a flat 20% advantage for the 7700K in single-core tasks. Who "expected" something, pretty please?
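For anyone who wants to check where the 20% comes from, here is a rough sketch. It assumes boost clocks of 4.5 GHz for the 7700K and 4.0 GHz for the top Ryzen 7 part (the post only gives the 6% IPC gap, so the clocks are my assumption) and models single-thread performance as simply IPC times clock:

```python
def single_thread_advantage(ipc_deficit, fast_clock_ghz, slow_clock_ghz):
    """Relative single-thread lead of the faster chip.

    Models performance as IPC * clock, which ignores memory, turbo
    behaviour and everything else; it is only a back-of-envelope check.
    """
    ipc_ratio = 1.0 / (1.0 - ipc_deficit)          # 6% behind -> ~1.064x ahead
    clock_ratio = fast_clock_ghz / slow_clock_ghz  # 4.5 / 4.0 = 1.125x
    return ipc_ratio * clock_ratio - 1.0


# Assumed clocks: 4.5 GHz vs 4.0 GHz (not stated in the post).
advantage = single_thread_advantage(0.06, 4.5, 4.0)
print(f"{advantage:.1%}")
```

With those assumed clocks the two ratios multiply out to just under 1.2x, which lines up with the "flat 20%" figure.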
Haswell was an "unlikely but hopefully" target. It ended up on Broadwell levels, jeez.
/double facepalm
If a compiler were to eliminate some branching, the CPU has to have some new unique instructions allowing certain conditionals to be converted into branchless code. Otherwise, a compiler can't help here.
Data cache misses usually occur because of traversal of lists, and the only way to eliminate this would be to rewrite the whole codebase to align the data in a native array, no compiler can ever do this. This is largely a result of how the developer chose to do OOP.
Code cache misses are once again usually a result of the code structure; OOP and lists of arbitrary elements are the greatest challenge here. Once again the solution is to restructure the code, which is outside the realm of a compiler. One kind of optimization I can think of which would help here is inlining small functions, but compilers already do that, like GCC with -O2, which enables -finline-small-functions.
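To make the data-layout point concrete, here is a minimal sketch of the kind of restructuring meant above: the same values stored as a linked list of separately allocated nodes, then packed into one contiguous array. The `Node` class and helper names are made up for illustration. Python hides the actual memory behaviour, so this only shows the shape of the rewrite, not the speedup you would measure in C or C++:

```python
from array import array


class Node:
    """One separately allocated list element (hypothetical example type)."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def sum_linked(head):
    """Pointer-chasing traversal: each step follows a reference to the next node."""
    total = 0.0
    node = head
    while node is not None:
        total += node.value
        node = node.next
    return total


def to_contiguous(head):
    """Copy the values into one contiguous buffer of C doubles."""
    values = array("d")
    node = head
    while node is not None:
        values.append(node.value)
        node = node.next
    return values


# Build the list 1.0 -> 2.0 -> 3.0 by prepending.
head = None
for v in (3.0, 2.0, 1.0):
    head = Node(v, head)

packed = to_contiguous(head)
print(sum_linked(head), sum(packed))
```

The change touches the data structure itself and every piece of code that walks it, which is why it is a job for the developer and not something a compiler flag can do.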
Just the rendering of a single frame will process several hundred MBs, and at 60 FPS there is a lot of data flowing through.
With 512 kB of L2 cache and 8 MB of shared L3 cache, it's not like even 1% of the data is in there at any point. So since FPS is "stable", there are no branch mispredictions and cache misses? I'm sorry, but you clearly don't know at which scale these things happen. We are not talking about single stalls causing milliseconds of latency, known as stutter; we are talking about clock cycles, which are on the ns scale, and since there are so many thousands of them every second, they add up to a steady performance drop rather than noticeable stutter. A single local branch misprediction causes ~20 clocks of idle; a non-local one adds a cache miss as well (code cache miss), so +~250 clocks. A data cache miss is ~250 clocks on modern CPUs.
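To show how those small stalls add up, here is a rough sketch using the clock costs quoted above. The per-frame event counts and the 4 GHz clock are made-up numbers for illustration, not measurements from any game:

```python
MISPREDICT_CLOCKS = 20   # local branch misprediction, figure from the post
CACHE_MISS_CLOCKS = 250  # data or code cache miss, figure from the post


def stall_ms_per_frame(mispredicts, cache_misses, clock_ghz):
    """Time lost to stalls in one frame, in milliseconds.

    Assumes stalls are fully serialized on one thread, so it is an
    upper-bound style estimate; real CPUs overlap some of this latency.
    """
    stall_clocks = (mispredicts * MISPREDICT_CLOCKS
                    + cache_misses * CACHE_MISS_CLOCKS)
    clocks_per_ms = clock_ghz * 1e6  # 1 GHz = 1e6 clocks per millisecond
    return stall_clocks / clocks_per_ms


# Hypothetical frame: 50,000 mispredictions and 20,000 cache misses at 4 GHz.
lost = stall_ms_per_frame(50_000, 20_000, 4.0)
frame_budget_ms = 1000 / 60  # 60 FPS target
print(f"{lost:.2f} ms lost of a {frame_budget_ms:.2f} ms frame")
```

Under those assumed counts, each frame loses a similar slice of its budget, so the result is a lower but steady frame rate rather than visible stutter.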
When there is a falloff in framerate on the Intel side and the AMD side stays flat at higher res... that shows a CPU bottleneck plain and clear. If the gap does anything other than stay constant... the difference is more than the GPU.
Wondering how many you believe is 'quite a few'..
GTA5 and Sniper Elite are showcases of it...
idk, how many AAA titles does it take? I am sure there are more... and as DX12 and Vulkan become more prevalent, I am sure that is the trend.
Point stands... If the gap does anything other than stay constant when you change resolutions... the difference is more than the GPU.
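A toy model of that argument, assuming the delivered frame rate is simply whichever of the CPU or GPU limit is lower. All the FPS caps below are invented for illustration and do not come from any benchmark:

```python
def frame_rate(cpu_fps_cap, gpu_fps_cap):
    """Delivered FPS is limited by the slower of the two stages."""
    return min(cpu_fps_cap, gpu_fps_cap)


# Hypothetical limits: the GPU cap falls as resolution rises,
# the CPU cap does not depend on resolution.
GPU_CAPS = {"1080p": 150, "1440p": 100, "4K": 50}
CPU_CAPS = {"cpu_a": 120, "cpu_b": 90}


def gap_by_resolution():
    """FPS gap between the two CPUs at each resolution."""
    return {
        res: frame_rate(CPU_CAPS["cpu_a"], gpu) - frame_rate(CPU_CAPS["cpu_b"], gpu)
        for res, gpu in GPU_CAPS.items()
    }


gaps = gap_by_resolution()
print(gaps)
```

In this model the faster CPU's numbers fall off as resolution goes up while the slower CPU stays flat until the GPU cap drops below it, so a shrinking gap points at the CPU as the limit at the lower resolution.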
Even on the games that just hit 4-6 threads hard... having spare threads means that if anything hiccups in the background, it doesn't hurt you.
For example, because Ryzen won't OC much: clock them both at 3.9 GHz ~ 4.1 GHz, 4c/8t. I know we are gimping the i7 7700K, but I'm just curious to know what the result of an "almost the same" setup would be. Gaming and productivity benches needed.
Fluctuations of around 1-2 ms are very noticeable, and I would claim anything below ~0.2 ms is hard to notice.
For comparison 0.2 ms = 200 μs = 200,000 ns. For both of you;
Multithreading in games mainly comes down to freeing up the rendering thread to work undisturbed building a queue. Granted, Direct3D 12 allows you to use multiple threads to build a single queue, but there's really not any point to it. Having several threads querying the driver this way will create a number of synchronization issues, so the gains will be minimal. So the gains of multiple threads will mostly be limited to having one thread per queue, and since most games use 1-2 queues for most of the load, there will not be a huge potential here. It's not like we can just throw four threads at it and scale nicely.
If a game has a problem with a bottlenecked CPU, it's usually caused by the computations done between each API call. So e.g. precalculating animations in a different thread can help a bit, but of course it mostly comes down to the code structure in the game engine. This is why I started by mentioning "freeing up the rendering thread".
Too little, too late…
This is all about PR; sending out some dev kits is not going to make developers rewrite their games overnight. In ~99% of cases, reducing the bloat would require a major rewrite, which is not something that can be done in 10 hours or so.