Monday, February 12th 2024
AMD Zen 5 Details Emerge with GCC "Znver5" Patch: New AVX Instructions, Larger Pipelines
AMD's upcoming Ryzen 9000 series of processors on the AM5 platform will be built on a new core design: Zen 5. The latest revision of AMD's x86-64 microarchitecture will feature a few interesting improvements over the Zen 4 design it replaces, targeting the rumored 10-15% IPC improvement. Thanks to the latest set of patches for the GNU Compiler Collection (GCC), we can see the changes proposed with "znver5" enablement. One of the most interesting additions in Zen 5 over Zen 4 is the expanded instruction set support, mainly new AVX and AVX-512 instructions: AVX-VNNI, MOVDIRI, MOVDIR64B, AVX512VP2INTERSECT, and PREFETCHI.
AVX-VNNI is a 256-bit vector version of the AVX-512 VNNI instruction set that accelerates neural network inferencing workloads. AVX-VNNI delivers the same VNNI instructions for CPUs that support 256-bit vectors but lack full 512-bit AVX-512 capabilities, effectively extending VNNI-based AI acceleration down to 256-bit vectors and bringing the technology to more CPUs. While narrower in scope (it lacks the opmasking and the extra vector registers available to AVX-512 VNNI), AVX-VNNI is crucial in spreading VNNI inferencing speedups to real-world CPUs and applications. The AVX-512 VP2INTERSECT instruction is also making it into Zen 5, as noted above. Until now it has been present only in Intel's Tiger Lake processor generation, and it is now considered deprecated for Intel SKUs. We don't know the rationale behind this inclusion, but AMD presumably has a use case for it.
Next, we have a larger pipeline design. The Zen 5 integer unit has six ALUs compared to the four found in Zen 4. The Address Generation Unit (AGU) count is also higher, going from three to four. The number of floating point store pipelines is doubled, and they are 256-bit each, so a 512-bit floating point store can be handled in a single cycle. Some other instructions like cmov/setcc and floating point shuffles can now be handled by all ALUs in Zen 5, whereas in Zen 4 they were handled by only two ALUs. Apparently, the Zen 5 uArch now handles most AVX-512 operations in a single pipeline slot, rather than the old double pumping, which split each AVX-512 instruction into two 256-bit halves for processing on the 256-bit-wide execution units. Lastly, the patch notes that, once again, there will be no difference between Zen 5 and Zen 5c cores ISA-wise, just as with Zen 4 and Zen 4c cores, where the latter differed only in having smaller caches.
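To make the VNNI and double-pumping descriptions above more concrete, here is a minimal Python sketch that emulates the arithmetic of one VNNI instruction, VPDPBUSD, in plain scalar code. It is an illustration only and does not come from the GCC patch: the function names are hypothetical, and the "double pumped" helper merely models the idea of splitting a 16-lane (512-bit) operation into two 8-lane (256-bit) halves, not how the hardware actually schedules it.

```python
def vpdpbusd_lane(acc, a_bytes, b_bytes):
    """One 32-bit lane: acc + sum of four (unsigned byte * signed byte) products."""
    total = acc + sum(u * s for u, s in zip(a_bytes, b_bytes))
    # Wrap to signed 32-bit, as the non-saturating form of the instruction does.
    total &= 0xFFFFFFFF
    return total - (1 << 32) if total >= (1 << 31) else total


def vpdpbusd(acc, a, b):
    """Apply the lane operation across a whole vector.

    acc: signed 32-bit accumulators, one per lane
    a:   unsigned bytes (0..255), four per lane
    b:   signed bytes (-128..127), four per lane
    """
    assert len(a) == len(b) == 4 * len(acc)
    return [
        vpdpbusd_lane(acc[i], a[4 * i:4 * i + 4], b[4 * i:4 * i + 4])
        for i in range(len(acc))
    ]


def double_pumped_512(acc, a, b):
    """Model a 512-bit op (16 lanes) issued as two 256-bit halves (8 lanes each)."""
    assert len(acc) == 16
    low = vpdpbusd(acc[:8], a[:32], b[:32])    # lower 256 bits
    high = vpdpbusd(acc[8:], a[32:], b[32:])   # upper 256 bits
    return low + high


# Hypothetical input data: 16 lanes of accumulators, 64 bytes per source.
acc = [0] * 16
a = [i % 256 for i in range(64)]
b = [(i % 7) - 3 for i in range(64)]
full_width = vpdpbusd(acc, a, b)
two_halves = double_pumped_512(acc, a, b)
```

Both paths produce the same 16 results; the difference the article describes is purely in how many passes the hardware needs, which this sketch does not measure.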
Sources:
Phoronix, AnandTech Forums
29 Comments on AMD Zen 5 Details Emerge with GCC "Znver5" Patch: New AVX Instructions, Larger Pipelines
Nice to see AVX being added, though.
170W CPUs requiring 360mm AIO is prohibitive.
This means that 9000 will be slower than 7000 because of the lower TDP (even if 10-15% IPC increase).
How ridiculous: super hot CPUs, whiny, small and hot M.2 PCIe 5.0 SSDs, super gigantic graphics cards.
All of them - no-go.
Zen 5 will have a 10-15% IPC increase and clocks up to 6.0 GHz. No one is asking for lower TDP for enthusiast CPUs and most likely the full turbo clocks will use the full 220W of the AM5 socket.
www.techpowerup.com/review/amd-ryzen-9-7950x-cooling-requirements-thermal-throttling/
BTW the X3D parts are as efficient and low power as you can possibly ask for.
www.techpowerup.com/review/amd-ryzen-9-7950x3d/
7800X3D :p
Oh, and wait till Lisa Su decides to offer a normal IHS on top of these CPUs. :roll:
Anything else is an epic fail.
Uh, I was responding to this exact part of your statement. I have no idea where you got the “slower” part from. I assume, seeing how you brought up X3D yourself, that you are fully aware that there are ways of increasing performance without going overboard on power. Efficiency sweet spots are a thing. So… I have no idea what you are even arguing at this point.
So if it hits 90 degrees, it's good; it's doing what it should. A 90 degree core isn't going to melt down your room.
It's simply due to their performance target. When you let it, it will do whatever is within the scope of what it is capable of. That's why you buy a performance CPU in the first place.
IHS is there for compatibility with previous AM4 coolers. Now that it's established, I'd say it's basically guaranteed not to change until AM6. Efficiency sweet spots are a thing of the past. You're more than allowed to underclock your CPU to hit that sweet spot (AMD makes it easy with ECO mode), but Intel & AMD aren't going to sell a lower-power CPU capable of doing much more, for less money, like they did in the past.
But it requires of the reviews. This is in line with the pre-Zen4 rumor mill.
I sincerely apologize for derailing the thread, but: Lol. What are you talking about? The Zen4 CPU will consume as much as the user sets it to consume. From what I've seen and read, the difference between various "ECO" modes and uncapped is not that big. Heck, one can put a shitty Intel box cooler on it, and the CPU's power envelope will adjust exactly to the cooling capacity. The throttling is not a problem; it is just the result of the CPU being uncapped to its max performance. And it won't get damaged, unless the reckless and lame "A...STek" motherboard vendor puts 1.45V on by default, along with enormous current. Again, the CPU itself will just regulate its performance according to its cooling. Example. Another one.
Apply the same TDP limits as were used for Zen3, and it will perform the same, or even more efficiently.
And this isn't new stuff. There was an old video, might be Paul's Hardware, where the cooler on the editor's Threadripper had failed, and the clocks were just lowered. The CPU was completely fine. And the PC did everything that was thrown at it.
You can set a 65W, or even 35W, limit on any modern Ryzen CPU, and it will be cool and efficient. And it's just a couple of clicks. Some TPU users have shown that the Intel "counterparts" can be efficient as well, but it's more of a pain to do so. However, out of the box, all recent Intel CPUs since 9K are just too hot and thirsty. They all basically repeat that Pentium D moment, all over again.
But I completely agree that CPU and GPU vendors push their HW too far beyond its sweet spot. Just like it was with Vega, all HW is better off undervolted for the absolute majority of basic tasks most people do. And enthusiasts... well, they know what they're doing, and what they want to achieve.
However, again, with the Zen-based CPUs, the power envelope just follows the cooling capacity. There are no issues. They can be capped at e.g. 89W and still have 90-95%-ish (don't quote me on that) of total performance. You get the idea. With Intel... I'm not really sure.
As for the tiny M.2: yes, they've deliberately chopped the lanes, and that's unacceptable. But go on, show me M.2 5.0 on Z690, for example. ;) It's your preference, which I respect, and no one can judge you for that. That's one of the possible alternatives. The better one is the 7800X3D, since Zen4, despite being slightly more power-hungry, is more power-efficient as well, and the only major difference compared to Zen3 is that Zen4 is basically uncapped, due to a more rigid and powerful socket.
But if you consider the motherboard pricing and the limited feature sets, then yeah, the AM4 boards have hugely better pricing. But then DDR4 RAM comes into play, which was a more sensitive point for AM4 CPUs than DDR5 is for Zen4 ones. And considering its shrinking stock, the better RAM gets even pricier, and thus it's harder to find a suitable kit. At this point, 64GB of DDR5 6000 can be a cheaper and better option; considering the DRAM cartel plans to raise prices for all RAM by 20% each quarter, DDR4 is a no-go. Like the ones this?
I get that the IHS could be a lot thinner, and thus have better heat spreading and dissipation. Though it's less of an issue when buying into a new platform/PC.
But I can also get the reasoning behind this move: there's no need to change the cooler mount, so it is far less pain for the cooler manufacturers, who would otherwise have to supply all existing users with new mounting kits for basically the same TDP requirements. Contrast that with the pointless changes of socket specs, several times for basically the same CPU and socket (the 115x drama), instead of just using the same LGA1152/1156 from the beginning. Just because they could.
And everyone could see how this frequent socket change BS from the "blue" team ceased just when the competition appeared.
The average lift from the IPC curve is measured based on many practical and scientific tests.
If I remember correctly, the average Zen3 IPC increase is +19% with an increase curve of +1% to +46%. Including one specific load where Zen3 shows a +109% increase in IPC.
SunnyCove: average IPC increase +18%, with a curve from +3-4% to +40%.
GoldenCove: average IPC increase +19%, with a curve from +3% to +60%.
MLID will get information from someone that in some one test out of the entire field of tests, the Zen5 IPC growth curve shows a 30% IPC increase, and MLID will rudely announce that Zen5 has a +30% IPC increase.
And it won't be a lie because everyone will only misinterpret it or present it without more knowledge.