
We found the Missing Performance: Zen 5 Tested with SMT Disabled

Joined
Feb 1, 2019
Messages
3,052 (1.51/day)
Location
UK, Midlands
System Name Main PC
Processor 13700k
Motherboard Asrock Z690 Steel Legend D4 - Bios 13.02
Cooling Noctua NH-D15S
Memory 32 Gig 3200CL14
Video Card(s) 4080 RTX SUPER FE 16G
Storage 1TB 980 PRO, 2TB SN850X, 2TB DC P4600, 1TB 860 EVO, 2x 3TB WD Red, 2x 4TB WD Red
Display(s) LG 27GL850
Case Fractal Define R4
Audio Device(s) Soundblaster AE-9
Power Supply Antec HCG 750 Gold
Software Windows 10 21H2 LTSC
This basically just shows you how stupid the Windows scheduler actually is. It makes no sense to assign a heavy workload to a fully occupied physical core's virtual core. Microsoft should know and detect the difference between a physical core and a virtual one. They should at least make this an option in the power settings or something.

I can't help but wonder if all this anti-SMT stuff is a result of Intel's push to remove SMT from their CPUs, and Microsoft is deliberately nerfing performance to help make a case in the minds of consumers to get rid of it.

But another thing wouldn't surprise me, AMD knows their architecture is cache starved, and that enabling SMT also puts more pressure on the tiny L2 cache. 1MB is a joke.
I expect it's been like this for a while. You can probably find old posts on the net that observed something similar; someone even posted an old Zen 1/2 SMT-off TPU review pic a few posts up.

It can be reduced to a minor inconvenience: either add automated affinity for games that benefit from it, or set up soft parking to disable HTT/SMT in a custom power profile in Windows, and then use a tool like AutoPowerOptionsOk to switch quickly to and from the profile as and when needed.
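To make the affinity idea concrete, here is a minimal sketch that builds a mask covering only the first logical core of each physical core. It assumes SMT siblings are numbered adjacently (logical CPUs 0 and 1 share physical core 0, and so on), which is common but not guaranteed, so check your own topology first. The function names are made up for illustration.

```python
def first_sibling_cpus(physical_cores, threads_per_core=2):
    """Logical CPU indices of the first hardware thread on each physical core.

    Assumption: siblings are numbered adjacently (0,1 -> core 0; 2,3 -> core 1).
    """
    return [core * threads_per_core for core in range(physical_cores)]


def affinity_mask(cpus):
    """Bitmask with one bit set for each allowed logical CPU."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


cpus = first_sibling_cpus(physical_cores=8)
print(cpus)                      # [0, 2, 4, 6, 8, 10, 12, 14]
print(hex(affinity_mask(cpus)))  # 0x5555
```

On an 8-core/16-thread part that gives 0x5555, which could then be handed to something like `start /affinity 5555 game.exe` (with `game.exe` standing in for the real executable) or to an affinity tool of your choice.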

Could even be a bug since the order is different between unparking and thread allocation.

Unparking priority.

First logical core of favoured cores.
First logical core of rest of cores in same performance class.
Second logical core of favoured cores (so only after all physical cores have a logical core unparked)
Second logical core of rest of cores in same performance class.

Thread allocation priority.
*Asterisk means skipped over as an option if unparked core limit is set to 50%

First logical core of favoured cores.
*Second logical core of favoured cores
First logical core of rest of cores in same performance class.
*Second logical core of rest of cores in same performance class.
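The two orderings above can be put side by side in a toy model. This is only a sketch of the lists as written in this post, not of how the Windows scheduler is actually implemented; the function names and the `(physical_core, thread_index)` representation are assumptions, and it treats each list entry as a whole group of cores.

```python
def unpark_order(favoured, rest):
    """Unparking order: first threads of every core, then second threads."""
    cores = list(favoured) + list(rest)
    return [(c, 0) for c in cores] + [(c, 1) for c in cores]


def allocation_order(favoured, rest, skip_second=False):
    """Thread allocation order: both threads of favoured cores, then the rest.

    skip_second=True models the starred entries being skipped when the
    unparked core limit is set to 50%.
    """
    order = []
    for group in (favoured, rest):
        order += [(c, 0) for c in group]
        if not skip_second:
            order += [(c, 1) for c in group]
    return order


favoured, rest = [0, 1], [2, 3]
print(unpark_order(favoured, rest))
print(allocation_order(favoured, rest))
print(allocation_order(favoured, rest, skip_second=True))
```

With the 50% limit the allocation order lines up with the first half of the unparking order; without it, the second threads of the favoured cores get picked before the first threads of the remaining cores, which is the mismatch described above.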
 
Joined
Apr 30, 2020
Messages
919 (0.58/day)
System Name S.L.I + RTX research rig
Processor Ryzen 7 5800X 3D.
Motherboard MSI MEG ACE X570
Cooling Corsair H150i Cappellx
Memory Corsair Vengeance pro RGB 3200mhz 16Gbs
Video Card(s) 2x Dell RTX 2080 Ti in S.L.I
Storage Western digital Sata 6.0 SDD 500gb + fanxiang S660 4TB PCIe 4.0 NVMe M.2
Display(s) HP X24i
Case Corsair 7000D Airflow
Power Supply EVGA G+1600watts
Mouse Corsair Scimitar
Keyboard Cosair K55 Pro RGB
I suspect there is an issue with the dual predictor, maybe caused by the microcode updates in the BIOS, or the first batches are just junky spins of the design in which the dual predictor isn't working at all.
 
Joined
Jan 14, 2019
Messages
11,019 (5.39/day)
Location
Midlands, UK
System Name Nebulon B
Processor AMD Ryzen 7 7800X3D
Motherboard MSi PRO B650M-A WiFi
Cooling be quiet! Dark Rock 4
Memory 2x 24 GB Corsair Vengeance DDR5-4800
Video Card(s) AMD Radeon RX 6750 XT 12 GB
Storage 2 TB Corsair MP600 GS, 2 TB Corsair MP600 R2, 4 + 8 TB Seagate Barracuda 3.5"
Display(s) Dell S3422DWG, 7" Waveshare touchscreen
Case Kolink Citadel Mesh black
Audio Device(s) Logitech Z333 2.1 speakers, AKG Y50 headphones
Power Supply Seasonic Prime GX-750
Mouse Logitech MX Master 2S
Keyboard Logitech G413 SE
Software Windows 10 Pro
I feel like the low power usage of the X3D parts was explained some time ago, they use power differently to protect their stacked cache giving them lower power use.
That's not the point. The point is that playing with SMT gives you different results even on X3D that isn't restricted by power.
 
Joined
Apr 30, 2020
Messages
919 (0.58/day)
System Name S.L.I + RTX research rig
Processor Ryzen 7 5800X 3D.
Motherboard MSI MEG ACE X570
Cooling Corsair H150i Cappellx
Memory Corsair Vengeance pro RGB 3200mhz 16Gbs
Video Card(s) 2x Dell RTX 2080 Ti in S.L.I
Storage Western digital Sata 6.0 SDD 500gb + fanxiang S660 4TB PCIe 4.0 NVMe M.2
Display(s) HP X24i
Case Corsair 7000D Airflow
Power Supply EVGA G+1600watts
Mouse Corsair Scimitar
Keyboard Cosair K55 Pro RGB
If there is an issue an AGESA update would fix, IMO the 9000 series should've been delayed another few weeks.

But could an AGESA update fix a problem like this if it is a clock generator overlap for the predictors? The question is whether the clock generator is internal or external.
I mean, if both predictors are working fine but continually picking the same answers for the predictions, then one of them is just discarding its predictions as misses.
Staggering one predictor by a clock cycle or an instruction from the other could easily solve that problem, but then what if the other predictor was supposed to be for SMT?
You can't stagger the clock generator for it, as it's part of the same core. The only conclusion I can reach is that, since the architectural deep dive already mentions dual pipelines, it's not part of SMT, so the minimal gains you see are normal; otherwise they would be far higher.
 
Joined
Nov 21, 2010
Messages
2 (0.00/day)
Processor i5 750@3800
Motherboard MSI P55 GD65
Cooling Thermaltake Contac 29
Memory 4GB DDR3 CAS7 @1440
Video Card(s) Twin Palit 9800GTX+'s SLI
Storage Spinpoint F3, WD Caviar Blue 640
Display(s) LG Flatron W2252TQ
Case KINGWIN KT-436BK BLACK
Audio Device(s) X-FI Xtreme Gamer
Power Supply Antec Truepower New 650
Software Win 7 and XP
Be sure to check the minimums page; it might tell a different story for your processor.
 
Joined
Jun 20, 2022
Messages
29 (0.04/day)
Location
ACCESS DENIED
System Name Who tf is playing megalovania over the mic?
Processor Ryzen 7 5700x
Motherboard ASUS ROG STRIX X570-E GAMING
Cooling Noctua NH-U12S REDUX
Memory 32Gb Corsair vengeance LPX 3200MHz CL16
Video Card(s) MSI RTX 2070S GAMING X
Storage Samsung 980 PRO 2TB
Display(s) ASUS TUF Gaming VG27AQ
Case NZXT H710
Audio Device(s) Logitech G PRO
Power Supply Seasonic PRIME Ultra 650 platinum
Mouse Logitech G604 / G pro wireless (modded)
Keyboard Corsair K70 RGB MK.2 (cherry MX silent) (tape/foam mod)
Benchmark Scores The hell is a benchmark?
What's the point of knowing that difference if no one uses the hardware like that?

How about actually testing at settings that some (any?) people will use, so they can make an informed buying choice based on the review? Unless the actual purpose of the review is to mislead people, so they'll think it's better. But in the real world at their settings, it isn't.

People spending the money for a 4090 and Zen 5 are surely using at least 1440p, maybe in ultra wide, probably at high Hz, maybe 4K or even higher res than that once screens exist. That's what should be tested, obviously along with the CPU focussed stuff like rendering, encoding and the other stuff people would actually buy such kit for.


 

sfjuocekr

New Member
Joined
Jan 23, 2024
Messages
9 (0.04/day)
Wow, this article is the perfect example of misunderstanding what SMT is and what it can or can NOT do.

It doesn't just magically double your threads for double the performance; it fills execution resources that would otherwise sit idle with instructions from a second thread.

If your 8 core CPU with 16 threads is 50% utilized... that may well mean it is running at 100% capacity.
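A quick back-of-the-envelope version of that point, as a sketch: it assumes an idealised scheduler that puts one busy thread on every physical core before touching any SMT sibling, and that every busy thread is fully busy. Real workloads are messier; the function name is made up.

```python
def physical_core_occupancy(logical_utilisation, threads_per_core=2):
    """Fraction of physical cores that have at least one busy thread.

    Assumes busy threads are spread one per physical core first, so the
    occupancy grows threads_per_core times faster than the reported figure.
    """
    return min(1.0, logical_utilisation * threads_per_core)


print(physical_core_occupancy(0.25))  # 0.5
print(physical_core_occupancy(0.50))  # 1.0
```

So a task manager reading of 50% on an 8c/16t chip can already mean all eight physical cores are occupied; anything beyond that comes from sharing cores between two threads.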

Games are especially bad at SMT workloads because most game programming is procedural, you can run specific tasks on different threads but you are still running a single game loop of functions that are dependent on one another!

Video encoding is easy for SMT as it executes a lot of functions that don't change between data and thus are easily translated into instructions to fill utilization of the CPU.

30 years ago game programmers were still chasing performance optimizations, today there is only b4dc0d3.
 
Joined
Jan 6, 2013
Messages
62 (0.01/day)
dear @W1zzard , the av1 encoding test should specify what version of the encoder you are using and what preset. They have been updating it continuously to improve the encoding times and video quality. Last time I did an av1 transcode I used preset 2.

I am also sort of confused by the wording: "We use the SVT-AV1 encoder with a 4K video source to compress to AV1" are you transcoding from h.265 4K to AV1 4K, or AV1 4K to AV1 1080p ? Same question with the wording for the h.265 bench.

I also think it could be useful to know the overall length of the video file or see the results in FPS in addition to the time. Just seeing the time isn't that useful if you are trying to estimate how long it could take to re-encode a 2 hour long movie.
 
Joined
Mar 16, 2017
Messages
1,943 (0.72/day)
Location
Tanagra
System Name Budget Box
Processor Xeon E5-2667v2
Motherboard ASUS P9X79 Pro
Cooling Some cheap tower cooler, I dunno
Memory 32GB 1866-DDR3 ECC
Video Card(s) XFX RX 5600XT
Storage WD NVME 1GB
Display(s) ASUS Pro Art 27"
Case Antec P7 Neo
Well, they'll also have vastly more powerful E-cores this time around. If Intel don't fcuk up Arrow Lake and they actually deliver on reduced power consumption, then I would take an i5, let alone an i7, over a 9700X any day of the week. The i5 will have 14 real cores, so 14 threads vs the 9700X's 8 cores and 16 threads. It will be rare that those extra 2 threads net the 9700X a win. I know most people are hyped for the X3D models, but for me gaming is 25% max of what I do, so unless they have de-gimped productivity on the Zen V-Cache models, it's hard to see why one wouldn't choose Intel unless gaming is all you care about.

Still, it's Intel, and who knows if Arrow Lake will now even ship this year.


Yeah, I doubt TPU will have convinced Hardware Unboxed to call off the hounds LOL.


Steve actually alluded to other reviews and basically dismissed their results. He's likes to jab the needle into AMD whenever he gets the chance. Maybe Lisa touched him inappropriately when he was a child.


Well we can try that on Strix as it's two ccd's one with 4 Zen 5 cores and one with up to 8 Zen 5c cores and you can turn off SMT in the bios I presume. It's a shame we don't have a Strix model with 6 + 6 config to compare to 9600X at same power.
I dunno, my work PC with all those E cores sure doesn't seem to load them sometimes. I wish I could go back to the quad core that was all one type, TBH.
This basically just shows you how stupid the Windows scheduler actually is. It makes no sense to assign a heavy workload to a fully occupied physical core's virtual core. Microsoft should know and detect the difference between a physical core and a virtual one. They should at least make this an option in the power settings or something.

I can't help but wonder if all this anti-SMT stuff is a result of Intel's push to remove SMT from their CPUs, and Microsoft is deliberately nerfing performance to help make a case in the minds of consumers to get rid of it.

But another thing wouldn't surprise me, AMD knows their architecture is cache starved, and that enabling SMT also puts more pressure on the tiny L2 cache. 1MB is a joke.
I doubt it's intentional, but I don't doubt that Intel has offered a lot of assistance with getting P+E support into Windows. It's probably not as simple as "if Windows detects P+E, use this type of scheduler, if just P cores, use the classic one." Intel is still the marketshare leader, and they also offer more compiling tools and support than AMD. Head over to Linux, and I think we see a more even approach to support. This is only further reinforced by the 10-15% performance gains that Zen 5 is showing over Zen 4. No need to disable anything.

Isn't it interesting that when Qualcomm designed Snapdragon X, they made it with up to 12 P cores and no E cores? Qualcomm has been producing P+E Arm chips for about a decade now, yet they skipped that design choice entirely for their Windows entry.
 

ARF

Joined
Jan 28, 2020
Messages
4,568 (2.74/day)
Location
Ex-usa | slava the trolls
I don't think that a software-based solution is the way to go.

Why?
An appropriate BIOS version, AMD Ryzen chipset driver and Windows power plan driver should resolve the issues; these should take control over whether multi-threading is enabled or disabled. Even a simple list of applications that have been tested in the lab could be built into these software packages to help improve performance.
 
Joined
Aug 23, 2013
Messages
569 (0.14/day)
Zen5 was obviously built to improve handheld and console gaming. Everyone else should skip it if they have anything remotely modern.
 
Joined
Sep 4, 2022
Messages
247 (0.35/day)
In a CPU comparison you need to remove as many GPU-bound situations as possible so your data isn't ruined. And your opinion about what you think someone else will do with their system is irrelevant, considering there are 1080p 480 Hz OLED monitors out and you're still going to need fast hardware to push those numbers.
It's true, plus another thing we can extrapolate from those lower resolutions is the CPU's maximum and minimum performance. If the CPU peaks at 240 fps at 1080p and has a minimum of, let's say, 190, then you are not going to get a locked 240 fps at any resolution, even with a 5090 or greater GPU, across the average game stack (outside of a few outliers).
Usually for every one CPU upgrade there are multiple GPU upgrades, and nothing on the market will be able to give you a solid 480 Hz, or even 4K 240 Hz, experience in most titles, even with a better GPU. If a CPU benchmark is run only at 4K, those 4090 owners will not know what the potential uplift would be with a future GPU upgrade unless they look at the 1080p/720p data.
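The reasoning can be sketched with the usual simplification that the frame rate you see is capped by whichever of CPU or GPU is slower. The 240/190 fps CPU figures are the ones from the post; the GPU limits are hypothetical numbers picked for illustration, and the function name is made up.

```python
def delivered_fps(cpu_limit_fps, gpu_limit_fps):
    """Simplified bottleneck model: the slower component sets the frame rate."""
    return min(cpu_limit_fps, gpu_limit_fps)


cpu_peak, cpu_minimum = 240, 190    # CPU limits measured at low resolution
gpu_4k_now, gpu_4k_future = 120, 300  # hypothetical GPU limits at 4K

print(delivered_fps(cpu_peak, gpu_4k_now))         # 120, GPU-bound today
print(delivered_fps(cpu_peak, gpu_4k_future))      # 240, CPU-bound after a GPU upgrade
print(delivered_fps(cpu_minimum, gpu_4k_future))   # 190, dips set by the CPU minimum
```

A 4K-only test would report 120 fps for every fast CPU today and say nothing about the 240/190 ceiling that shows up once the GPU gets faster.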
 

W1zzard

Administrator
Staff member
Joined
May 14, 2004
Messages
27,426 (3.70/day)
Processor Ryzen 7 5700X
Memory 48 GB
Video Card(s) RTX 4080
Storage 2x HDD RAID 1, 3x M.2 NVMe
Display(s) 30" 2560x1600 + 19" 1280x1024
Software Windows 10 64-bit
dear @W1zzard , the av1 encoding test should specify what version of the encoder you are using and what preset. They have been updating it continuously to improve the encoding times and video quality. Last time I did an av1 transcode I used preset 2.

I am also sort of confused by the wording: "We use the SVT-AV1 encoder with a 4K video source to compress to AV1" are you transcoding from h.265 4K to AV1 4K, or AV1 4K to AV1 1080p ? Same question with the wording for the h.265 bench.

I also think it could be useful to know the overall length of the video file or see the results in FPS in addition to the time. Just seeing the time isn't that useful if you are trying to estimate how long it could take to re-encode a 2 hour long movie.
It's not the goal of this test to tell you how long it takes to encode a 2 hour movie. This is an in-house benchmark that's only used in my reviews and isn't comparable to other results, so I intentionally don't publish some information so that people don't start comparing.

but if you must know, v1.8.0-12-g4a154220 GCC 13.2.0 64 bit, 4K H.264 to 4K AV1, 60 seconds, 24 FPS, preset 6
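For anyone who still wants a rough time-to-FPS conversion from those parameters, a sketch follows. The 60 s clip at 24 FPS comes from the post above; the 90 s elapsed time is a made-up placeholder, not a TPU result, and the estimate only holds for the same encoder build, preset and similar content.

```python
def encode_fps(clip_duration_s, source_fps, elapsed_s):
    """Encoding speed in frames per second for a clip of known length."""
    total_frames = clip_duration_s * source_fps
    return total_frames / elapsed_s


def estimated_encode_time_s(video_duration_s, source_fps, fps):
    """Rough time to encode a longer video at the same speed."""
    return video_duration_s * source_fps / fps


fps = encode_fps(clip_duration_s=60, source_fps=24, elapsed_s=90.0)  # hypothetical 90 s run
print(fps)                                                # 16.0
print(estimated_encode_time_s(2 * 3600, 24, fps) / 3600)  # 3.0 (hours)
```

The benchmark clip is 60 × 24 = 1440 frames, so any time on the chart converts to FPS as 1440 divided by the seconds shown.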
 
Joined
Sep 4, 2022
Messages
247 (0.35/day)
For real?

HUB has been called the biggest AMD fanboy reviewers for years.

Personally I think neither is true, but AMD CPU benchmarks are usually a bit better at HUB than TPU.

I don't blame TPU for that.
Wait for 9800x3d then it will be back to Intel bashing. lol.
 
Joined
Sep 20, 2021
Messages
315 (0.30/day)
Processor Ryzen 7 9700x
Motherboard Asrock B650E PG Riptide WiFi
Cooling Underfloor CPU cooling
Memory 2x32GB 6400MT/s
Video Card(s) RX 7900 XT OC Edition
Storage Kingston Fury Renegade 1TB, Seagate Exos 12TB
Display(s) MSI Optix MAG301RF 2560x1080@200Hz
Case Phanteks Enthoo Pro
Power Supply NZXT C850 850W Gold
Mouse Bloody W95 Max Naraka
For real?

HUB has been called the biggest AMD fanboy reviewers for years.

Personally I think neither is true, but AMD CPU benchmarks are usually a bit better at HUB than TPU.

I don't blame TPU for that.
HUB are useful idiots, their reviews/results are sometimes "too close" to average users = misleading.
TPU give you few different results including OC - I hope that you can see now the difference ;)
 
Joined
Apr 12, 2013
Messages
7,245 (1.75/day)
For real?

HUB has been called the biggest AMD fanboy reviewers for years.

Personally I think neither is true, but AMD CPU benchmarks are usually a bit better at HUB than TPU.

I don't blame TPU for that.
Everyone's been called everything at some point in time, including Toms/AT/TPU/[H] or any number of review sites from 20 years back! It doesn't mean anything as long as your data's accurate.
 
Joined
Sep 10, 2018
Messages
6,321 (2.91/day)
Location
California
System Name His & Hers
Processor R7 5800X/ R7 7950X3D Stock
Motherboard X670E Aorus Pro X/ROG Crosshair VIII Hero
Cooling Corsair h150 elite/ Corsair h115i Platinum
Memory Trident Z5 Neo 6000/ 32 GB 3200 CL14 @3800 CL16 Team T Force Nighthawk
Video Card(s) Evga FTW 3 Ultra 3080ti/ Gigabyte Gaming OC 4090
Storage lots of SSD.
Display(s) A whole bunch OLED, VA, IPS.....
Case 011 Dynamic XL/ Phanteks Evolv X
Audio Device(s) Arctis Pro + gaming Dac/ Corsair sp 2500/ Logitech G560/Samsung Q990B
Power Supply Seasonic Ultra Prime Titanium 1000w/850w
Mouse Logitech G502 Lightspeed/ Logitech G Pro Hero.
Keyboard Logitech - G915 LIGHTSPEED / Logitech G Pro
For real?

HUB has been called the biggest AMD fanboy reviewers for years.

Personally I think neither is true, but AMD CPU benchmarks are usually a bit better at HUB than TPU.

I don't blame TPU for that.

Different configurations, Different test scenes.

When I start to question results is when a CPU that is typically 10% faster than CPU X is 30% faster in a particular reviewer's results.
 
Joined
Jan 19, 2008
Messages
327 (0.05/day)
Location
Planet Earth
System Name V I K I N G
Processor Intel Core i5 11400F
Motherboard Gigabyte AORUS Z590 PRO AX
Cooling ASUS ROG Strix LCII ARGB 360mm AIO + Corsair Commander PRO + Corsair QL 120 x 3 + Node Pro
Memory 32 GB / Corsair VENGEANCE RGB PRO SL @ 3200Mhz
Video Card(s) ASUS Republic of Gamers RTX 3060 Ti 8GB OC edition
Storage 500GB WesternDigital BLACK PCI Gen 4 M.2 NVMe
Display(s) ASUS TUF 27" 180Hz Fast IPS Gaming Monitor
Case Thermaltake View 51 TG Snow ARGB
Audio Device(s) ASUS ROG Delta S Quad DAC Hi-Res Audio + Samsung 1000W modded Home teather speakers
Power Supply CORSAIR RM750x SHIFT series - 750W
Mouse Corsair SABRE RGB PRO Champion series
Keyboard Corsair K70 TKL Champion series
Software Windows 11 PRO X64
according to Intel overclockers:
-we have to disable cool & quiet
-PBO
-XMP
-pch
-rebar
-IGD/shared memmory
-ErP
-power fault protection
-usb standby power at s4/s5
-ccpt
-cpu overheat protection
-legacy usb
-CSM
-UEFI
-secure boot
-fTPM
-core isolation
-spectre/meltdown protection
-chasis intruder
-SMT
-we'll end up disabling AMD

but we'll gain 10 cinebench points /s
Hahaha, this comment is GOLD in these harsh CPU times.
 
Joined
Mar 14, 2014
Messages
1,343 (0.35/day)
Processor i7-4790K 4.6GHz @1.29v
Motherboard ASUS Maximus Hero VII Z97
Cooling Noctua NH-U14S
Memory G. Skill Trident X 2x8GB 2133MHz
Video Card(s) Asus Tuf RTX 3060 V1 FHR (Newegg Shuffle)
Storage OS 120GB Kingston V300, Samsung 850 Pro 512GB , 3TB Hitachi HDD, 2x5TB Toshiba X300, 500GB M.2 @ x2
Display(s) Lenovo y27g 1080p 144Hz
Case Fractal Design Define R4
Audio Device(s) AKG Q701's w/ O2+ODAC (Sounds a little bright)
Power Supply EVGA Supernova G2 850w
Mouse Glorious Model D
Keyboard Rosewill Full Size. Red Switches. Blue Leds. RK-9100xBRE - Hate this. way to big
Software Win10
Benchmark Scores 3DMark FireStrike Score : needs updating
It's not that big of a difference. Unless you have a 4090 and you game at 1080p or lower, I wouldn't bother.

Still interesting results, though.
The real difference is in the 1% lows.

Hogwarts Legacy and The Last of Us act totally different from one another. One loves SMT while the other struggles with it.

It looks like a game by game setting and not so much "found missing performance."

It seems like AMD needs to implement a program similar to Intel's APO.
 
Joined
Sep 20, 2021
Messages
315 (0.30/day)
Processor Ryzen 7 9700x
Motherboard Asrock B650E PG Riptide WiFi
Cooling Underfloor CPU cooling
Memory 2x32GB 6400MT/s
Video Card(s) RX 7900 XT OC Edition
Storage Kingston Fury Renegade 1TB, Seagate Exos 12TB
Display(s) MSI Optix MAG301RF 2560x1080@200Hz
Case Phanteks Enthoo Pro
Power Supply NZXT C850 850W Gold
Mouse Bloody W95 Max Naraka
What do you mean with now? What did you think I meant with "not blaming TPU" lol
You still haven't understood, I'll try to explain in another way.

I have a 5600X that scored 12300 points in Cinebench 23, and a friend of mine scored 9600 points with his 5600X; he is an "average user", as I pointed out.

I hope that's "clearer" now?

HUB are not a fanboy of this company or that, they are a fanboy of their followers.
 
Joined
Jun 30, 2019
Messages
42 (0.02/day)
Processor Ryzen 7 7700X
Motherboard ASRock B650E PG Riptide WiFi
Cooling Noctua NH-D15
Memory Kingston Fury Beast 32GB 5600 MHz CL36 @ 6200 MHz
Video Card(s) AMD Radeon RX 6600
Case Fractal Design Define R5
Power Supply Corsair RM550x
@W1zzard What are the results with SMT on, PBO max, CO compared to SMT off, PBO max, CO?
 
Joined
Apr 29, 2014
Messages
4,231 (1.12/day)
Location
Texas
System Name SnowFire / The Reinforcer
Processor i7 10700K 5.1ghz (24/7) / 2x Xeon E52650v2
Motherboard Asus Strix Z490 / Dell Dual Socket (R720)
Cooling RX 360mm + 140mm Custom Loop / Dell Stock
Memory Corsair RGB 16gb DDR4 3000 CL 16 / DDR3 128gb 16 x 8gb
Video Card(s) GTX Titan XP (2025mhz) / Asus GTX 950 (No Power Connector)
Storage Samsung 970 1tb NVME and 2tb HDD x4 RAID 5 / 300gb x8 RAID 5
Display(s) Acer XG270HU, Samsung G7 Odyssey (1440p 240hz)
Case Thermaltake Cube / Dell Poweredge R720 Rack Mount Case
Audio Device(s) Realtec ALC1150 (On board)
Power Supply Rosewill Lightning 1300Watt / Dell Stock 750 / Brick
Mouse Logitech G5
Keyboard Logitech G19S
Software Windows 11 Pro / Windows Server 2016
Interesting results. I guess that means there are some needed improvements on the platform to fix these issues.
 