It has been a busy week of testing, and while this article no doubt has interesting results, it also raises many more questions. Why is the SMT scaling of Zen 5 so different? AMD didn't mention anything about SMT in their press briefings, other than the fact that it exists. If they had made significant changes, wouldn't they have told us about them? Is this a bug or a feature? Is AMD able to look at multithreaded scaling and make improvements? I'm quite sure that is the case, especially since the gains in gaming performance are tempting. While those are somewhat game-specific, it shouldn't be too hard to come up with some logic to address this. In the briefings we heard some vague comments that AMD is using AI-trained networks inside the CPU to optimize operation. Maybe that mechanism could be used for better thread placement? Just to clarify, there is no training while the CPU is running in your system; the network is generated at design time in AMD's labs. However, I would expect it to be upgradeable through microcode somehow. In the past, AMD has changed the behavior of their processors through AGESA BIOS updates, and I think they will address the scheduling similarly.
I don't think that a software-based solution is the way to go. On the Ryzen 9 7950X3D, AMD put the 3D V-Cache on one CCD only, so a software driver hooks into Windows Game Mode detection to identify games and push them onto the big-cache cores, while applications end up on the high-frequency cores. The problem is that this solution is quite brittle: it has a lot of moving parts that can and will break, especially the ones designed by Microsoft.
On the topic of Microsoft, it is the Windows operating system that performs the actual thread placement; the processor merely hints at what it thinks is an optimal placement. Intel uses a hardware component called Thread Director for this. AMD uses a slightly different approach, but their BIOS does provide guidance to the OS, for example through the CPPC2 Preferred Cores mechanism. There are also a number of undocumented Windows Power Plan settings, which could be another way to improve the scheduling.
While it is certainly possible that application and game developers will further optimize their code for modern processors, I doubt that this will happen. Hyper-Threading has been around for over 20 years, and there are several ways for software to learn about the CPU core layout and its internals, yet the majority of games don't make optimal use of CPU resources. Why would they, when the OS is responsible for these scheduling decisions? We've seen terrible, bug-ridden game releases that were rushed out just to meet some publisher date. Optimization for CPU cores and architectures is just not going to happen with such schedules.
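To illustrate one of the ways software can learn about the core layout, here is a minimal sketch in Python. It is an assumption-laden example, not something taken from any game engine or from the testing in this article: it targets Linux, where the kernel exposes each logical CPU's SMT siblings in the sysfs file thread_siblings_list, and the helper names (parse_cpu_list, group_smt_siblings, read_sibling_lists) are made up for illustration. On Windows, the comparable information comes from the GetLogicalProcessorInformationEx API.

```python
import glob
import os


def parse_cpu_list(text):
    """Parse a kernel CPU list such as "0,16" or "0-3,8" into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def group_smt_siblings(sibling_lists):
    """Collapse per-CPU sibling lists into one tuple per physical core."""
    # Every logical CPU on the same core reports the same list, so a set
    # removes the duplicates.
    cores = {tuple(parse_cpu_list(s)) for s in sibling_lists}
    return sorted(cores)


def read_sibling_lists(root="/sys/devices/system/cpu"):
    """Read thread_siblings_list for every logical CPU (Linux only).

    Returns an empty list on systems without this sysfs layout.
    """
    pattern = os.path.join(root, "cpu[0-9]*", "topology", "thread_siblings_list")
    lists = []
    for path in glob.glob(pattern):
        with open(path) as f:
            lists.append(f.read())
    return lists


if __name__ == "__main__":
    cores = group_smt_siblings(read_sibling_lists())
    for siblings in cores:
        print("physical core -> logical CPUs", siblings)
    # A program that wants one worker per physical core could pick the
    # first logical CPU from each group.
    print("one logical CPU per core:", [s[0] for s in cores])
```

With a mapping like this, a program could size its worker pool to the number of physical cores instead of logical processors, which is the kind of decision most games leave entirely to the OS.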
Earlier this year, Intel made the surprising announcement that they are discontinuing Hyper-Threading in their upcoming Lunar Lake and Arrow Lake processors. The official reason for this shift is to simplify the core design and save space, which could potentially lead to improved IPC and higher frequencies. However, I believe recent security concerns may also have influenced this decision. The long-term effects remain to be seen, and it will be intriguing to find out how this change impacts performance in applications and games.
In conclusion, AMD's Zen 5 architecture offers promising advancements, with particularly impressive single-threaded performance, but the unexpected SMT scaling raises many questions. While we hope AMD can refine these aspects and improve performance, it will likely take time to plan, implement, and validate. As they continue to innovate, it will be interesting to see how these changes unfold in future updates and what new strategies AMD will introduce to push the boundaries of CPU technology even further.