Tuesday, August 27th 2024

AMD Ryzen Branch Prediction Optimizations Now Available to Windows 11 23H2

AMD announced that its Ryzen processor branch prediction optimization, which provides gaming performance uplifts, is now available for Windows 11 23H2 through an optional update. This update applies to AMD Ryzen processors based on the "Zen 3," "Zen 4," and "Zen 5" microarchitectures, and essentially brings the kind of performance you get in the real Administrator account to regular Windows accounts, especially non-local (online) accounts. Users should look for "Cumulative Update Preview for Windows 11 Version 23H2 for x64-based Systems (KB5041587)" in Windows Update, which should begin showing up as an optional update. This update requires a system restart to apply.

With this update in place, gaming performance on Windows 11 23H2 and 24H2 should be very similar. "We wanted to let you know that the branch prediction optimization found in Windows 11 24H2 has now been backported to Windows 11 23H2. Users will need to look for KB5041587 under Windows update > Advanced options > Optional updates. We expect the performance uplift to be very similar between 24H2 and 23H2 with KB5041587 installed," AMD said in a statement to Wccftech.
Source: Wccftech

106 Comments on AMD Ryzen Branch Prediction Optimizations Now Available to Windows 11 23H2

#76
529th
How do you force update from 22H2 to 23H2? I did get the KB5041587 update but not sure if it offers the same benefits.
#77
DemonicRyzen666
atomsymbol: The above is misinformation, probably caused by not having access to a Zen5 CPU on which you could test your ideas before posting. A single thread on my Zen5 CPU can execute 2 *taken* branch instructions per cycle, if the outcomes of both branch instructions have been successfully predicted. This can be verified by writing a short program in assembly language. On the other hand, my Zen3 CPU can execute 2 branch instructions in the same clock cycle only in a subset of scenarios. A table comparing Zen3 vs Zen5 looks something like this:

Branch instruction 1    | Branch instruction 2 | Zen3           | Zen5
N (branch wasn't taken) | N                    | 1 clock cycle  | 1 clock cycle
N                       | T                    | 1 clock cycle  | 1 clock cycle
T (branch was taken)    | N                    | 2 clock cycles | 1 clock cycle
T                       | T                    | 2 clock cycles | 1 clock cycle
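
The table above can also be written down as a small lookup, which makes the pattern easier to see: in the quoted figures, Zen3 needs a second cycle only when the first branch of the pair is taken. This is a minimal Python sketch of the table as posted, not a model of the real hardware; the function name `cycles_for_branch_pair` is made up for illustration.

```python
def cycles_for_branch_pair(first_taken: bool, second_taken: bool, arch: str) -> int:
    """Clock cycles for two correctly predicted branches, per the table as posted."""
    if arch == "Zen5":
        # Every row of the table lists 1 clock cycle for Zen5.
        return 1
    if arch == "Zen3":
        # Zen3 rows list 2 clock cycles whenever the first branch is taken.
        return 2 if first_taken else 1
    raise ValueError(f"no data for {arch!r}")


# Print all four combinations (N = not taken, T = taken).
for first in (False, True):
    for second in (False, True):
        label = ("T" if first else "N") + ("T" if second else "N")
        zen3 = cycles_for_branch_pair(first, second, "Zen3")
        zen5 = cycles_for_branch_pair(first, second, "Zen5")
        print(f"{label}: Zen3={zen3} Zen5={zen5}")
```

Note that `second_taken` does not change the result in either column; it is kept as a parameter only so that the function mirrors the table's layout.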
DemonicRyzen666: chipsandcheese.com/2024/07/15/a-video-interview-with-mike-clark-chief-architect-of-zen-at-amd/

This is the transcript for the Chips and Cheese interview.

Mike Clark suggests that under certain events everything in the whole front end can be used by a single thread inside the Zen 5 core. Except he never says how, or what triggers it.

In the Chips and Cheese test on Zen 5, it never triggers with a single thread running through the core, which is causing the low single-thread gains. It doesn't even trigger when SMT is disabled, when it has no reason not to use it.

Either it's a design limitation, a hardware bug, a patch bug, or a microcode bug?
That was from the engineer who worked on Zen 5.

Most types of shared resources inside of a CPU have ended with bad performance.
#78
forman313
kondamin: That's not how that works.

If a Mercedes sucks out of the factory and is only a decent car after it's tuned by Brabus for an additional couple of hundred thousand dollars, the original car isn't a good car.
You even need to have the hundred thousand dollars' worth of modifications done again after filling the car up with a fresh tank of gas, which is what happens when you update to a new feature release.
Always mind your surroundings. This is TPU, not PCWorld or bleepingcomputer.com. If it were, I would totally agree with you. However, in here, with this bunch of tweakers and compulsive hardware torturers, I have to agree with lexluthermiester.

Picking things apart and improving them, or turning them into something completely different, is great fun and you learn a lot. You can even make a living out of it. Buying something and then just using it unmodified, in stock/original condition, is what we do with a toaster or a car for the wife.

That being said, you have a point. MS deserves to be punished. Office/365 software becoming PWAs when customers have paid for optimized stand-alone applications, using customers as beta testers, hiding products or making them impossible to buy or reinstall after "upgrades", etc. For work, all MS products are frustrating, with one exception. MS Visual Studio Code is just brilliant.
Windows 10 and 11 are a horrible intrusive mess people put up with because they have to
Or because they have failed to do very basic research.

www.microsoft.com/en-us/evalcenter/evaluate-windows-11-iot-enterprise-ltsc
#79
GoldenX
_roman_: In my personal opinion, since Windows 95, Windows has wasted more hardware than the GNU userspace and the Linux kernel. That means hardware has to be bought for an acceptable Windows experience. Nothing has changed.

Windows 10 is out of support range in my personal opinion. Nothing new. I doubt there will be any new features for W10.

Not really.

sys-devel/gcc hardly has optimizations for Ryzen 5000 / 7000 CPUs.
I also check what is new with each kernel version bump because I build my kernel myself. (Nothing fancy: same commands to build a kernel since 2006, still the same GNU Gentoo Linux installation from 2006.)
There are a few more options recently. A lot of work is being done on the scheduler and the security stuff.

Most newbies with their binary distros do not know what I am talking about. They use prebuilt software and a generic kernel in their binary distros, which hardly utilize or benefit the hardware at all.

I moved my hardware from Ryzen 5800X -> Ryzen 3 3100 -> Ryzen 7600X from January 2023 till April 2023. Therefore I had to recompile my whole box, and I saw the compile times. A performance indicator is how long a package takes to compile: gcc, libreoffice, and so on. I delete the log files from time to time, but I still see the difference across package versions over time with different CPUs.

I also recompiled with the corresponding new CPU flags for the Ryzen 7600X. Barely a difference for around 1400 to 1700 installed packages.

The file system also has a big impact on performance, e.g. tmpfs, a file system stored in DRAM.

Summary: gcc is, in my point of view, far behind in regards to optimizations for current hardware. As of now, that means Ryzen 7000 or newer. Binary distros with generic binary kernels hardly benefit.

--

Phoronix benchmarks: forget them. Just a pile of numbers for clickbait. Faster compile times are easy to measure and to see over several package compiles over time. Less backup time and so on, that is performance I see and can verify.
The kernel doesn't need specific per-arch fixes; it just has a better scheduler, period.
#80
atomsymbol
DemonicRyzen666: That was from the engineer who worked on zen 5.
  • You slightly misinterpreted what was said in that interview
  • It doesn't matter what you believe is supposedly true or false about Zen5 if I can disprove the claim by running actual code on my Zen5 CPU
    • For example, I can measure an IPC of ~8.84 on Zen5 in a synthetic single-threaded benchmark which is avoiding the integer renamer, fuses 1 CMP+Jcc pair into 1 µop and uses a combination of ALU+JUMP+AVX instructions, which clearly implies that single-thread dispatch is 8-wide. In terms of µops, it is ~7.84 (this number can be obtained by dividing the perf event 'de_src_op_disp.all' by the perf event 'cycles' on my Zen5 CPU), which means that the CPU has to be dispatching 8 µops per clock cycle when running the synthetic benchmark or otherwise I wouldn't be able to measure any number above 7.
DemonicRyzen666: Most types of shared resources inside of a cpu have ended with bad performance.
It is impossible for me to determine what kind of a shared resource you are referring to.
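
The per-cycle figures quoted in this post are plain ratios of two counters, and the width argument follows from them: an average above 7 per cycle is only possible if at least 8 can be dispatched in some cycles. Here is a minimal Python sketch of that arithmetic. The raw counts are hypothetical, chosen only so the ratios match the ~7.84 and ~8.84 in the post; real values would come from a profiler such as `perf`.

```python
import math


def per_cycle(event_count: int, cycles: int) -> float:
    """Average number of events per clock cycle."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return event_count / cycles


# Hypothetical raw counts (not measured); only the ratios mirror the post.
cycles = 1_000_000_000
dispatched_uops = 7_840_000_000       # stands in for 'de_src_op_disp.all'
retired_instructions = 8_840_000_000  # stands in for an instruction count

uops_per_cycle = per_cycle(dispatched_uops, cycles)
ipc = per_cycle(retired_instructions, cycles)

# The peak per-cycle width is an integer and cannot be below the average,
# so the average rounded up is a lower bound on the dispatch width.
min_dispatch_width = math.ceil(uops_per_cycle)

print(f"uops/cycle = {uops_per_cycle:.2f}, IPC = {ipc:.2f}, width >= {min_dispatch_width}")
```

IPC can exceed the µop rate here because a fused CMP+Jcc pair counts as two instructions but only one µop.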
#81
DemonicRyzen666
atomsymbol
  • You slightly misinterpreted what was said in that interview
  • It doesn't matter what you believe is supposedly true or false about Zen5 if I can disprove the claim by running actual code on my Zen5 CPU
    • For example, I can measure an IPC of ~8.84 on Zen5 in a synthetic single-threaded benchmark which is avoiding the integer renamer, fuses 1 CMP+Jcc pair into 1 µop and uses a combination of ALU+JUMP+AVX instructions, which clearly implies that single-thread dispatch is 8-wide. In terms of µops, it is ~7.84 (this number can be obtained by dividing the perf event 'de_src_op_disp.all' by the perf event 'cycles' on my Zen5 CPU), which means that the CPU has to be dispatching 8 µops per clock cycle when running the synthetic benchmark or otherwise I wouldn't be able to measure any number above 7.
It is impossible for me to determine what kind of a shared resource you are referring to.
No, I didn't. Even Chips and Cheese has shown it doesn't work in their testing, in their review.
#82
AnotherReader
DemonicRyzen666: No, I didn't. Even Chips and Cheese has shown it doesn't work in their testing, in their review.
You should reread the article. I'll just reproduce the diagram that shows instruction fetch bandwidth. As you can see, as long as the instructions can be found in the micro-op cache, more than 4 instructions per cycle can be fetched. In another article on Zen 5, they note:
To further speed up instruction delivery, Zen 5 fills decoded micro-ops into a 6K entry, 16-way set associative micro-op cache. This micro-op cache can service two 6-wide fetches per cycle. Evidently both 6-wide fetch pipes can be used for a single thread.
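
The numbers in that quote can be turned into two quick derived figures: how many sets the micro-op cache has, and the peak fetch rate if both pipes serve one thread. This is a rough Python sketch that assumes "6K" means 6 × 1024 entries and treats the quoted widths as per-cycle peaks; the constant names are made up for illustration.

```python
# Figures taken from the quoted passage; "6K" is assumed to mean 6 * 1024.
UOP_CACHE_ENTRIES = 6 * 1024
WAYS = 16          # 16-way set associative
FETCH_PIPES = 2    # two fetches per cycle
FETCH_WIDTH = 6    # each fetch is 6-wide

# A set-associative cache has entries / ways sets.
sets = UOP_CACHE_ENTRIES // WAYS

# Peak fetch rate when both pipes are used by a single thread.
peak_fetch_per_cycle = FETCH_PIPES * FETCH_WIDTH

print(f"sets = {sets}, peak fetch per cycle = {peak_fetch_per_cycle}")
```

Under those assumptions the cache has 384 sets and can supply up to 12 per cycle to a single thread, well above the 4 per cycle being argued about.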
#83
OkieDan
lexluthermiester: Actually that kinda proves you either never used it or knew nothing about how to make it stable. System Restore was responsible for many of the instabilities to begin with. Turn it off and you're mostly golden. But whatever, oh master of the snug-fit.


Sure it is..


Very possible.
Lol, I did phone support for a large OEM from '96 to '01, over 17k calls. I know what a steaming pile WinME is, with way more issues than its predecessors and successors. Being in the tiny group of people that think WinME was a good OS... well, let's just say that's an elite club only in your head.
#84
trparky
Dave Plummer of Dave's Garage on YouTube talked about the Ryzen performance issues a little bit in the above video. If you don't know who he is, he was one of the early architects of Windows from back in the day. He went on to explain that the Spectre and Meltdown mitigations were at fault for causing the performance issues and that they were incompatible with AMD's branch predictor.
#85
DemonicRyzen666
trparky
Dave Plummer of Dave's Garage on YouTube talked about the Ryzen performance issues a little bit in the above video. If you don't know who he is, he was one of the early architects of Windows from back in the day. He went on to explain that the Spectre and Meltdown mitigations were at fault for causing the performance issues and that they were incompatible with AMD's branch predictor.
Ironic, I mentioned that might have been the problem on Level1Techs' Zen 5 video.
Don't both of those patches also have BIOS mitigations?
#86
mechtech
All in all, it's just another branch in the wall................................
#87
529th
This is a job for a tree trimmer
#88
Shadowsnight
0.01% difference in Horizon Zero Dawn, no difference in CPU or Average FPS.
I ran the bench 3 times each, these were the best of each
#89
HD64G
Shadowsnight: 0.01% difference in Horizon Zero Dawn, no difference in CPU or Average FPS.
I ran the bench 3 times each, these were the best of each
The game says that you are using win 10 pro though.
#90
Tomorrow
HD64G: The game says that you are using win 10 pro though.
Could be misreading Win11 as Win10. I did a quick search and every single image of the benchmark results I came across said some variation of Win10, even when searching for new images. I'm guessing the developer has not updated the game to recognize Win11, and the game takes its information from the registry, which in several places says Win10.

For example, HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion says ProductName is Windows 10 Pro and ReleaseId is 2009, despite the fact that BuildLabEx says 22621.1.amd64fre.ni_release.220506-1250. Even that is wrong, as I'm on 23H2, which should read as 22631, not 22621. What a mess.
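
For anyone who wants to check this on their own machine, here is a minimal Python sketch of a more reliable test than trusting ProductName: treat any build number of 22000 or higher as Windows 11. The helper names `corrected_name` and `read_current_version` are made up for illustration, and this is only a guess at what a game could do, not how Horizon Zero Dawn actually detects the OS. The registry read only runs on Windows.

```python
import sys


def corrected_name(product_name: str, build_number: int) -> str:
    """Return the OS name, fixing a stale 'Windows 10' label on Windows 11 builds."""
    # Windows 11 builds start at 22000; ProductName can still say "Windows 10".
    if build_number >= 22000 and product_name.startswith("Windows 10"):
        return product_name.replace("Windows 10", "Windows 11", 1)
    return product_name


def read_current_version():
    """Read ProductName and CurrentBuildNumber from the registry (Windows only)."""
    import winreg  # standard library, available on Windows only

    key_path = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
        product_name, _ = winreg.QueryValueEx(key, "ProductName")
        build, _ = winreg.QueryValueEx(key, "CurrentBuildNumber")
    return product_name, int(build)


if sys.platform == "win32":
    name, build = read_current_version()
    print(f"{corrected_name(name, build)} (build {build})")
```

With the values from this post, `corrected_name("Windows 10 Pro", 22631)` gives "Windows 11 Pro", while a real Windows 10 build such as 19045 is left unchanged.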
#91
leezhiran
Tomorrow: My small test.
23H2. System admin account. VBS off. 1440p max settings. Not cpu limited.
Newer times are with KB5041587 installed. 2080 Ti. GoT had FSR3 FG enabled. Others were native res/no upscaling.

That's very helpful, I didn't find any reviews of Zen 3 CPUs with this patch elsewhere.

Those 0.1% percentile gains look really impressive...
22H2 support has ended. Upgrade to 23H2.
Got it.

However, 22H2 is also getting the patch. With that in mind, this problem may have existed for a loooooong time.
#92
Chrispy_
Imouto: Maybe because it's an immutable system? Is it because it's Linux?


What Arch desktop mode? Do you mean KDE? KDE the desktop environment?

Do you really have any difficulties with it? Because I think that totally disqualifies you for any serious discussion about the matter. Also the fact that you seem to be googling the stuff we are talking about with each post.
Typo; I meant Steam desktop mode, so uh- Plasma I guess, which is just stripped-down KDE, right?

As for "googling stuff", that was a famous quote from the same infamous Finnish university conference 12 years ago where Torvalds flipped the bird and said "f*ck you, Nvidia" right to the camera. Nobody needs to google it, it's a meme - and I've posted, quoted, or sourced excerpts from that day several times here on TPU over the years.

We're getting off topic, but the difficulty of Linux for most people is the fragmentation. There's almost infinite choice, but that also means almost infinite unfamiliarity: I've used Plasma, Cinnamon, MATE and GNOME over the years, plus more that I can't name because I didn't bother to check whether they were new DEs or just skins/themes of one I'd already used. Basic stuff like having an interface that's familiar to people isn't a feature of Linux. Once you've used a few distros you can get an idea of where things are going to be, but it's not like someone familiar with Mint can explain to someone new to Ubuntu how to do stuff. Package managers exist in Linux because there's this extra level of fragmentation that must be accounted for that doesn't really exist at the same scale in Windows or OSX.

We could discuss the nuances of Windows vs Linux all day, but the ultimate point of this branch of discussion is that Windows users use Windows because it's popular, cohesive, compatible, and familiar right out of the box. Stuff just works and the software you download from the web just installs (almost) every time without needing any extra effort. Perhaps that simple paradigm is falling apart with Proton successfully converting Windows DX12 gaming calls to Vulkan and emulating any leftovers whilst at the same time Apple is emulating x86 where apps aren't recompiled for their ARM-based silicon, and Microsoft have made a second, more serious effort at making an emulator to get x86 running on ARM this year. If Valve, Apple, and Microsoft can broaden their hardware horizons, maybe the Linux community can come together a bit from the other end of the spectrum?
#93
Laurijan
I tested before and after update with Cinebench R23.2 but there is no change in points.
#94
cerulliber
If you delete this comment too, delete my account as well.
I think at this point the AMD community on Zen 3 and above might not yet be getting 100% of the performance on the table. Warning below.
At this point, testing methods matter 100%. HUB used 24H2, which is beta. Most users also use the optional update, which is also beta. Either way, everybody is testing a beta, and it's a gimmick to push Windows 11 to the gaming community. Moreover, HUB stated that they tested with core isolation and memory integrity disabled and SVM set to manual. They also use perhaps the most advanced and expensive thermal pad on the market.
Everything else is already known to the fanbase: cooling, environment and subtimings.
hub1 hosted at ImgBB — ImgBB (ibb.co)
#95
Super XP
GoldenX: This is Microsoft's fault, not AMD's. Linux has benefited from this for years.

The crime here is the rewrites that 24H2 implemented taking so long to reach users.
A lot has to do with Microsoft's and Intel's relationship, now going on for YEARS. It seems as though AMD keeps getting the shaft most of the time. How about Microsoft, once and for all, "recognizes" AMD CPUs and ensures they run flawlessly on Windows OSs?
cerulliber: If you delete this comment too, delete my account as well.
I think at this point the AMD community on Zen 3 and above might not yet be getting 100% of the performance on the table. Warning below.
At this point, testing methods matter 100%. HUB used 24H2, which is beta. Most users also use the optional update, which is also beta. Either way, everybody is testing a beta, and it's a gimmick to push Windows 11 to the gaming community. Moreover, HUB stated that they tested with core isolation and memory integrity disabled and SVM set to manual. They also use perhaps the most advanced and expensive thermal pad on the market.
Everything else is already known to the fanbase: cooling, environment and subtimings.
hub1 hosted at ImgBB — ImgBB (ibb.co)
There's nothing wrong with ZEN3 or ZEN4 performance. As for ZEN5, it's a Microsoft issue with their OS, which has a monopoly on the industry. And ever since the 1st CPU, we've all been beta testers, for decades, because everybody's computer configuration is different. Therefore you will always run into issues every so often.
#96
billeman
529th: How do you force update from 22H2 to 23H2? I did get the KB5041587 update but not sure if it offers the same benefits.
Just download the 23H2 ISO from the Microsoft website.

www.microsoft.com/software-download/windows11

bottom of page.

And then click on the iso, <open> and run setup.exe

Windows update never offered me 23H2 on my desktop and laptop :-/
#97
Tomorrow
Laurijan: I tested before and after update with Cinebench R23.2 but there is no change in points.
CB is not the best test for Branch Prediction.
cerulliber: If you delete this comment too, delete my account as well.
I think at this point the AMD community on Zen 3 and above might not yet be getting 100% of the performance on the table. Warning below.
At this point, testing methods matter 100%. HUB used 24H2, which is beta. Most users also use the optional update, which is also beta. Either way, everybody is testing a beta, and it's a gimmick to push Windows 11 to the gaming community. Moreover, HUB stated that they tested with core isolation and memory integrity disabled and SVM set to manual. They also use perhaps the most advanced and expensive thermal pad on the market.
Everything else is already known to the fanbase: cooling, environment and subtimings.
hub1 hosted at ImgBB — ImgBB (ibb.co)
Gaming community has already embraced Win11 by looking at Steam numbers (as flawed as they are being opt-in).
SVM manual is an interesting wording. Must have been added because it used to be either Auto or Disabled. I assume Manual follows OS guidance then?
billeman: Just download the 23H2 ISO from the Microsoft website.

www.microsoft.com/software-download/windows11

bottom of page.

And then click on the iso, <open> and run setup.exe

Windows update never offered me 23H2 on my desktop and laptop :-/
If you have 22H2 and have not received 23H2, then there's no need to download the whole ISO. You just need to install the enablement package KB5027397. This is not available through Windows Update or manually through the Windows Update Catalog.

Also, 23H2 not being offered seems to happen on systems where users have disabled TPM and/or circumvented other requirements to install Win11. At least that's my impression so far, as my own system was one of those (it's perfectly capable of supporting TPM etc. but I chose to disable it).

www.elevenforum.com/t/kb5027397-enablement-package-for-windows-11-version-23h2-feature-update.19372/
Since this requires a login to download, I uploaded KB5027397 to our local file hosting site some time ago instead, so users could download it without creating a throwaway account: www.upload.ee/files/16852384/windows11.0-kb5027397-x64_3a9c368e239bb928c32a790cf1663338d2cad472.zip.html
#98
Laurijan
Tomorrow: CB is not the best test for Branch Prediction.
In Star Wars Outlaws it seems like performance is better now, by about 5 fps. But that game only uses 25% of CPU performance.
#99
529th
Does anyone know if KB5041587 offers the same benefits on 22H2 as 23H2? I got that small update but failed to get baseline bench results before updating for comparison.
#100
cerulliber
Tomorrow: Gaming community has already embraced Win11 by looking at Steam numbers (as flawed as they are being opt-in).
Steam Survey July 2024 Update: Windows 10 Usage Records Uptick, Windows 11 Drops | TechPowerUp
TPU wrote that Windows 10's share rose to 47.69%, marking a significant uptick that contrasts with Windows 11's decline to 45.73%.
Once they push KB5041587 as a mandatory update or release 24H2 Home/Pro RTM, the situation will change after the "Win11 is faster" YouTube videos.
KB5041587 is an optional update, so they are not pushing it to the public yet.
HUB's benchmarks weren't done by updating stable Windows; they benchmarked by downloading the beta 24H2, which is not stable yet. Moreover, for 24H2 IoT Enterprise RTM (factory debloated) there is no KB5041587. This one is the only 24H2 I've found that is not beta (RTM).
If you want to replicate HUB's methodology, the first step is to install a beta Windows, which I don't want to do.
Also, please note that in the hardwareluxx benchmarks, the 5800X3D benefits too.