That proves my point: different applications favour different architectures. How else would you explain that Zen 2 is faster in Cinebench in both single and multi core, while it loses in Photoshop? The same goes for decompression in 7-Zip.
What specifically proves your point?
If you think that a piece of software performing better on one CPU than another proves it's optimized for that CPU, that makes absolutely no sense whatsoever.
In order for software to be optimized for a piece of hardware, it needs to be intentionally designed to utilize either a specific feature or a specific characteristic of that hardware.
PC software relies on the same ISA whether it's running on AMD or Intel. And with the exception of AVX-512 and a few other features, Zen and Skylake have pretty much feature parity. All modern x86 designs rely on micro-operations which we can't target, so we can't write truly low-level code for either of these architectures. These CPUs are also highly superscalar, but it's not explicit, so software can't control or optimize for it directly.
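To make concrete what "utilizing a specific feature" looks like, here is a minimal sketch (GCC/Clang style; the function names are mine, purely for illustration) of runtime feature dispatch: the program checks whether the CPU reports AVX2 and picks a vectorized path if it does. This is the kind of thing that actually counts as hardware-specific optimization, and note that it keys on the feature flag, not the vendor.

```c
/* Minimal sketch (GCC/Clang on x86-64): runtime dispatch between a plain
 * scalar loop and an AVX2 path. Function names are illustrative only. */
#include <immintrin.h>
#include <stddef.h>

__attribute__((target("avx2")))
static void add_arrays_avx2(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {               /* 8 floats per 256-bit operation */
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(dst + i, _mm256_add_ps(va, vb));
    }
    for (; i < n; ++i)                         /* scalar tail */
        dst[i] = a[i] + b[i];
}

static void add_arrays_scalar(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}

void add_arrays(float *dst, const float *a, const float *b, size_t n)
{
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))        /* feature check, not a vendor check */
        add_arrays_avx2(dst, a, b, n);
    else
        add_arrays_scalar(dst, a, b, n);
}
```

Compiled this way, the same binary takes the AVX2 path on any CPU that reports the feature, Intel or AMD alike.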
Software scales differently on different CPUs because their architectures have different strengths in terms of resource balancing. Skylake has a stronger front-end with better branch prediction, a larger instruction window, lower memory-controller latency, and there are some differences in the caches. Zen/Zen 2 has a different configuration of execution ports which can give slightly higher peak combined int/vec throughput under the right conditions. Zen 2 is also on a more energy-efficient node, which helps a lot in heavily threaded benchmarks where Skylake throttles much more, but that has nothing to do with software optimization.

So the only thing software developers can do to "optimize" for one microarchitecture or the other is to shuffle the assembly around and see if they get a minor performance difference. Since these CPUs are not explicitly superscalar, execute out of order, and give us no way to control or debug the micro-operations, this is a pretty much pointless effort that probably yields <5% gains, and the gains will not be consistent. Pretty much no software, and certainly no games, uses hand-written assembly anyway. Software today is mostly high-level, bloated code, and such code generally performs a tiny bit better on Intel hardware, not due to optimizations (rather the lack thereof), but due to the stronger front-end.
There is no software out there "optimized for Intel" (unless you count custom software relying on features AMD have not implemented yet).
I have seen a case where a library intentionally runs slower code paths on AMD hardware at runtime, but that is not optimization, that is sabotage, and it is not playing fair.
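For illustration only, here is a rough sketch of the kind of vendor-gated dispatch I mean. This is not any particular library's real code and the function names are hypothetical; the point is that the branch keys on the CPUID vendor string instead of the feature flags, so a CPU that supports the extension perfectly well still gets the slow path if it doesn't report "GenuineIntel".

```c
/* Rough sketch of vendor-gated dispatch (hypothetical, not real library code). */
#include <cpuid.h>
#include <stddef.h>
#include <string.h>

static int vendor_is_intel(void)
{
    unsigned int eax, ebx, ecx, edx;
    char vendor[13] = {0};

    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return 0;
    memcpy(vendor + 0, &ebx, 4);   /* CPUID leaf 0 returns the vendor string */
    memcpy(vendor + 4, &edx, 4);   /* in EBX, EDX, ECX order                 */
    memcpy(vendor + 8, &ecx, 4);
    return strcmp(vendor, "GenuineIntel") == 0;
}

/* Hypothetical kernels, stubbed out so the sketch compiles. */
static void transform_avx2(float *data, size_t n)    { (void)data; (void)n; }
static void transform_generic(float *data, size_t n) { (void)data; (void)n; }

void transform(float *data, size_t n)
{
    __builtin_cpu_init();
    if (vendor_is_intel() && __builtin_cpu_supports("avx2"))
        transform_avx2(data, n);       /* fast path reserved for Intel        */
    else
        transform_generic(data, n);    /* everyone else, AVX2-capable or not  */
}
```

A fair dispatcher would drop the vendor check and branch on the feature flag alone, like the earlier sketch does.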
Thing is, do they need to beat Intel in gaming performance?
No, they need to be close enough, and with Zen 3 they might be within the margin of error in many cases.
I'm not going to say no to more gaming performance, but we do have to remember how unrealistic and unrepresentative of actual gaming the CPU game-testing methodologies are. Nobody, and I mean nobody, dropping $3000+ on a water-cooled, overclocked i9 with a 240 Hz+ monitor and a 2080 Ti is playing games at 720p.
I know 720p or 1080p at low or medium settings is pointless with a high-end card; that's only interesting for "academic discussions", not buying recommendations.
But then consider: if you're buying a gaming machine and there are two mostly "equal" options in your budget, but one has ~3% more gaming performance, would you say no to it?
Another argument most people ignore is that Zen 2 (for now) needs overclocked memory to become "competitive" in gaming, while Intel can run stock memory speeds and still perform better. I'll take the long-term stability, please.
Realistically, the more cores your CPU has, the better the chance of a stable framerate, since background OS tasks, and even background game-engine threads, are likely to finish sooner without interrupting or causing resource conflicts with the critical game-engine thread that is the current bottleneck for frame times. That's felt in the minimum or 99th-percentile numbers.
Sure, any time the OS scheduler kicks one of the game's threads off a core, it can cause stutter, on the scale of ~1-20 ms on Windows. But then again, a faster core finishes sooner, so other threads waiting on it get to work earlier and finish with a larger margin before the "deadline". So it's a complicated balancing act.
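To put rough numbers on that deadline (the 10 ms figure below is just the midpoint of the ~1-20 ms range above, not a measurement):

```c
/* Back-of-the-envelope: frame budget at a few refresh rates versus a
 * single ~10 ms scheduler preemption. Purely illustrative arithmetic. */
#include <stdio.h>

int main(void)
{
    const double preempt_ms = 10.0;
    const int refresh_hz[] = { 60, 144, 240 };

    for (size_t i = 0; i < sizeof(refresh_hz) / sizeof(refresh_hz[0]); ++i) {
        double budget_ms = 1000.0 / refresh_hz[i];
        printf("%3d Hz: %5.1f ms per frame -> a %.0f ms stall costs ~%.1f frames\n",
               refresh_hz[i], budget_ms, preempt_ms, preempt_ms / budget_ms);
    }
    return 0;
}
```

At 240 Hz the frame budget is barely 4 ms, so a hiccup a 60 Hz player would hardly notice shows up clearly in the 99th-percentile numbers at high refresh rates.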