Thursday, November 21st 2019
AMD Admits "Stars" in Ryzen Master Don't Correspond to CPPC2 Preferred Cores
AMD in a blog post earlier today explained that there is no 1:1 correlation between the "best core" grading system displayed in Ryzen Master, and the "preferred cores" addressed by the Windows 10 Scheduler using CPPC2 (Collaborative Power and Performance Control 2). Deployed through BIOS and AMD chipset drivers, CPPC2 forms a middleware between OS and processor, communicating the system's performance demands at a high frequency of 1 ms (Microsoft's default speed for reporting performance states to processors is 15 ms). Ryzen Master, on the other hand, has had the ability to reveal the "best" cores in a Ryzen processor by ranking them across the package, on a CCD (die), and within a CCX. The best core in a CCX is typically marked with a "star" symbol on the software's UI. The fastest core on the package gets a gold star. Dots denote second fastest cores in a CCX.
Over the past couple of months we've posted several investigative reports by our Ryzen memory overclocking guru Yuri "1usmus" Bubly, and a recurring theme with our articles has been to highlight the discrepancy between the highest performing cores as tested by us not corresponding to those highlighted in Ryzen Master. Our definition of "highest performing cores" has been one that's able to reach and sustain the highest boost states, and has the best electrical properties. AMD elaborates that the CPPC2 works independently from the SMU API Ryzen Master uses, and the best cores mapped by Ryzen Master shouldn't correspond with preferred cores reported by CPPC2 to the OS scheduler, so it could send more workload to these cores, benefiting from their higher boosting headroom.The "best cores" as defined by SMU and reported by Ryzen Master are hence decided on the basis of electrical properties, and hard-coded at the time of die binning in the factory. The "preferred cores" as defined by CPPC2 are those cores to which AMD wants the OS scheduler to send the most traffic to, not just on the basis of their superior physical or electrical properties, but also being optimal for Windows scheduler core rotation policy. Windows scheduler is programmed to not keep a long application work thread allocated to a particular core indefinitely, but to periodically rotate it between a pair of two cores. The rationale behind this is thermal management (spreading the heat across two cores that are spatially apart).
On monolithic multi-core chips such as the i9-9900 or i9-9980XE, in which all cores not only sit on the same die, but are also part of the same group (no CCX here), core rotation works as intended, as all cores share the L3 cache, and a relieving core can pick up work from where its rotation pair partner has left off, by pulling data from the L3 cache.
AMD's "Zen" multi-core topology complicates this, as not all cores share the same L3 cache; and in 12-core, 16-core, or Threadrippers, not all cores sit on the same die. This is where CPPC2 fits in, giving Windows the awareness of the topology it needs, so it can rotate threads among cores without hurting performance by forcing workloads onto a core that uses a separate instance of cache, which forces data reloads from RAM. So how does CPPC2-reported "favored cores" fit into the scheme of things? CPPC2 deliberately misreports "favored cores" to the Windows scheduler — to build core rotation pairs within localized groups of cores, rather than picking cores from different CCXs or CCDs to build rotation pairs.
"Ryzen Master, using firmware readings, selects the single best voltage/frequency curve in the entire processor from the perspective of overclocking. When you see the gold star, it means that is the one core with the best overclocking potential. As we explained during the launch of 2nd Gen Ryzen, we thought that this could be useful for people trying for frequency records on Ryzen," reads the AMD blog on the discrepancy between Ryzen Master "best cores" and CPPC2 Preferred Cores. "Overall, it's clear that the OS-Hardware relationship is getting more complex every day. In 2018, we imagined that the starred cores would be useful for extreme overclockers. In 2019, we see that this is simply being conflated with a much more sophisticated set of OS decisions, and there's not enough room for nuance and context to make that clear. That's why we're going to bring Ryzen Master inline with what the OS is doing so everything is visibly in agreement, and the system continues along as-designed with peak performance," it adds. "Best cores" and "preferred cores" are hence both "right." The former refers to a physically high-quality core, while the other is more "circumstantial", for better performance.
Sources:
Reddit, Anandtech
Over the past couple of months we've posted several investigative reports by our Ryzen memory overclocking guru Yuri "1usmus" Bubly, and a recurring theme with our articles has been to highlight the discrepancy between the highest performing cores as tested by us not corresponding to those highlighted in Ryzen Master. Our definition of "highest performing cores" has been one that's able to reach and sustain the highest boost states, and has the best electrical properties. AMD elaborates that the CPPC2 works independently from the SMU API Ryzen Master uses, and the best cores mapped by Ryzen Master shouldn't correspond with preferred cores reported by CPPC2 to the OS scheduler, so it could send more workload to these cores, benefiting from their higher boosting headroom.The "best cores" as defined by SMU and reported by Ryzen Master are hence decided on the basis of electrical properties, and hard-coded at the time of die binning in the factory. The "preferred cores" as defined by CPPC2 are those cores to which AMD wants the OS scheduler to send the most traffic to, not just on the basis of their superior physical or electrical properties, but also being optimal for Windows scheduler core rotation policy. Windows scheduler is programmed to not keep a long application work thread allocated to a particular core indefinitely, but to periodically rotate it between a pair of two cores. The rationale behind this is thermal management (spreading the heat across two cores that are spatially apart).
On monolithic multi-core chips such as the i9-9900 or i9-9980XE, in which all cores not only sit on the same die, but are also part of the same group (no CCX here), core rotation works as intended, as all cores share the L3 cache, and a relieving core can pick up work from where its rotation pair partner has left off, by pulling data from the L3 cache.
AMD's "Zen" multi-core topology complicates this, as not all cores share the same L3 cache; and in 12-core, 16-core, or Threadrippers, not all cores sit on the same die. This is where CPPC2 fits in, giving Windows the awareness of the topology it needs, so it can rotate threads among cores without hurting performance by forcing workloads onto a core that uses a separate instance of cache, which forces data reloads from RAM. So how does CPPC2-reported "favored cores" fit into the scheme of things? CPPC2 deliberately misreports "favored cores" to the Windows scheduler — to build core rotation pairs within localized groups of cores, rather than picking cores from different CCXs or CCDs to build rotation pairs.
"Ryzen Master, using firmware readings, selects the single best voltage/frequency curve in the entire processor from the perspective of overclocking. When you see the gold star, it means that is the one core with the best overclocking potential. As we explained during the launch of 2nd Gen Ryzen, we thought that this could be useful for people trying for frequency records on Ryzen," reads the AMD blog on the discrepancy between Ryzen Master "best cores" and CPPC2 Preferred Cores. "Overall, it's clear that the OS-Hardware relationship is getting more complex every day. In 2018, we imagined that the starred cores would be useful for extreme overclockers. In 2019, we see that this is simply being conflated with a much more sophisticated set of OS decisions, and there's not enough room for nuance and context to make that clear. That's why we're going to bring Ryzen Master inline with what the OS is doing so everything is visibly in agreement, and the system continues along as-designed with peak performance," it adds. "Best cores" and "preferred cores" are hence both "right." The former refers to a physically high-quality core, while the other is more "circumstantial", for better performance.
67 Comments on AMD Admits "Stars" in Ryzen Master Don't Correspond to CPPC2 Preferred Cores
Why AMD doesn't just come out with this info is beyond me, it's not like people won't understand that MS does a shitty job of making Windows run the best these days.
You dumb asshole, Axaion.
The difference, however, is that the Windows scheduler (being used on the Ryzen and Windows plans) likes to default to Core 0for light loads that don't require more cores. My best cores are 7, 5, 1, 3 in that order (the even numbered ones), indicated in Ryzen Master and verified in how they boost under load and chosen by the scheduler.
Problem is, Core 0 is by and far the worst core on die. It's ranked #8 in perf, and never exceeds 43.3x peak clock. And some applications, like CPU-Z in evaluating single-thread performance, aren't written to leverage any core except Core 0, regardless of power plan.
Because of the variations in production, many other Ryzen 3000 chips have Core 0 amongst their best 4 cores, but many others are unfortunately like mine. And when it comes to the Windows scheduler or simplistic programs, this results in a percentage of chips that either can't make use of maximum available single thread performance, or can't benchmark to reflect their true performance.
An example of how mediocre 7nm yields and binning combine with terrible Windows scheduling and core assignment to create one hell of a silicon lottery.
All 1909 means is that Windows will try to allocate load to the best cores. Keyword: try. A lot of the time, Windows doesn't even try. The custom plan tries to force Windows to use best cores, which is pretty visible.
There's also the Windows practice of shuffling load between cores all the time.
Whatever Microsoft claims to be doing, it doesn't change the fact that the Windows scheduler is trash. Which is why I find certain other sites' unfounded ridicule of the 1usmus plan hilarious; YOU as the reviewer may have gotten a golden 3900X with an absolutely shining Core 0, meaning you don't see much benefit from the plan, that doesn't mean EVERYONE got the same return on their investment.
Ryzen 3000 CPUs should be the first mainstream CPUs that have "best performing cores" in this way. This was part of the discussion/argument/controversy about Ryzen 3000 boost clock speeds. So far, all mainstream CPUs have had spec boost speeds than any core can reach, making need for scheduler optimization to keep work on specific high-performing cores secondary.
@others, eh i only read the post here so yeah, is it another thing like nvidia boxes being a requirement? :P
Because when you "prefer" something that is not "best", something's definitely amiss.
Also I never said that I would like to describe myself, the way you described yourself.