
AMD Talks Zen 4 and RDNA 3, Promises to Offer Extremely Competitive Products

Frick

Fishfaced Nincompoop
Joined
Feb 27, 2006
Messages
19,581 (2.86/day)
Location
Piteå
System Name White DJ in Detroit
Processor Ryzen 5 5600
Motherboard Asrock B450M-HDV
Cooling Be Quiet! Pure Rock 2
Memory 2 x 16GB Kingston Fury 3400mhz
Video Card(s) XFX 6950XT Speedster MERC 319
Storage Kingston A400 240GB | WD Black SN750 2TB |WD Blue 1TB x 2 | Toshiba P300 2TB | Seagate Expansion 8TB
Display(s) Samsung U32J590U 4K + BenQ GL2450HT 1080p
Case Fractal Design Define R4
Audio Device(s) Plantronics 5220, Nektar SE61 keyboard
Power Supply Corsair RM850x v3
Mouse Logitech G602
Keyboard Cherry MX Board 1.0 TKL Brown
Software Windows 10 Pro
Benchmark Scores Rimworld 4K ready!
One only has to take a look at the recently released M1 chip by Apple to see that big IPC improvements are apparently still quite doable. The M1 can keep up with or even exceed Intel's and AMD's best at single-threaded performance, while the M1's clock speed is quite a bit lower.

Apples to oranges, literally.
 
Joined
Jun 16, 2015
Messages
36 (0.01/day)
Processor Ryzen 9 5800X3d
Motherboard Gigabyte X570 I Aeorus Pro Wifi
Cooling Noctua NH-U12A
Memory G.SKILL 32GB KIT DDR4 3600 MHz CL16 Trident Z @3666MHz tuned by Ryzen calculator
Video Card(s) EVGA 3080Ti XC3 ULTRA@1800MHz 0.8v
Storage Samsung 980 PRO 2 TB, ADATA XPG SX8200 Pro 2TB
Display(s) 42" LG C2 OLED
Case Cooler Master MasterBox NR200P
Audio Device(s) Grado
Power Supply Corsair SF750
Mouse Logitech G PRO X Superlight
Keyboard custom
Probably there is some of that, however Geekbench itself is without question tuned for their chips.



No it doesn't, show me an example where an x86 processor is decode bound.



There isn't a single reason to believe they would generate the same amount of instructions. And that wouldn't even mean anything, the problem is with the optimizations that the compilers themselves apply.



IPC fluctuates according to architecture, in fact it even fluctuates within the same architecture. A processor never has a constant IPC, that's quite literally impossible.

You can come up with an "average IPC" but that wouldn't mean much either.
M1 has 50% more IPC than the latest Ryzen, can you see that bound? There is no way that the next Ryzen will have more than 4 decoders, and there is no way that the next Ryzen will have 50% more IPC, can you see that bound? This is the main change the M1 has.
 
Joined
Jan 8, 2017
Messages
9,438 (3.27/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
M1 has 50% more IPC than the latest Ryzen, can you see that bound? There is no way that the next Ryzen will have more than 4 decoders, and there is no way that the next Ryzen will have 50% more IPC, can you see that bound? This is the main change the M1 has.

I don't think you understood the question.

Can you prove that the decoder is a limitation on x86 processors and that it is indeed a problem? If it were, it would have been impossible to increase IPC by widening the back end of these CPUs, which is exactly what AMD and Intel are doing.

On the other hand ARM inherently needs a higher decode throughput, for example, because of the lack of complex addressing modes.
 
Last edited:
Joined
Jul 21, 2016
Messages
144 (0.05/day)
Processor AMD Ryzen 5 5600
Motherboard MSI B450 Tomahawk
Cooling Alpenföhn Brocken 3 140mm
Memory Patriot Viper 4 - DDR4 3400 MHz 2x8 GB
Video Card(s) Radeon RX460 2 GB
Storage Samsung 970 EVO PLUS 500, Samsung 860 500 GB, 2x Western Digital RED 4 TB
Display(s) Dell UltraSharp U2312HM
Case be quiet! Pure Base 500 + Noiseblocker NB-eLoop B12 + 2x ARCTIC P14
Audio Device(s) Creative Sound Blaster ZxR,
Power Supply Seasonic Focus GX-650
Mouse Logitech G305
Keyboard Lenovo USB
With a price increase of only +50 $/€ MSRP, so our new 6 core cpus will be available at only 400 $/€

/sarcasm

Not really.
 
Joined
Jun 10, 2014
Messages
2,987 (0.78/day)
Processor AMD Ryzen 9 5900X ||| Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR ||| Asus P9X79 WS
Cooling Noctua NH-U14S ||| Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz ||| Corsair 8 x 8 GB 1333 MHz
Video Card(s) MSI GTX 1060 3GB ||| MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB ||| Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" ||| Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
I cannot "see" any big IPC improvements from now on. 10% max from gen to gen is my prediction. Zen3 made a huge jump. Clocks and efficiency will determine the progress until new materials for transistors are used that will allow big clock jumps (graphite anyone?).
Upcoming Sapphire Rapids / Golden Cove is a major architectural overhaul, greatly extending the CPU front-end beyond Sunny Cove. I would be surprised if AMD didn't attempt something comparable.

AMD expects clock speeds to decrease over the next nodes, so don't expect much there.

But even if materials allowed significantly higher clock speeds, the current bottlenecks would just become more apparent. Pretty much all non-SIMD workloads are bound by cache misses and branch mispredictions. Cache misses have a nearly fixed time cost (the memory latency), so increasing the CPU clock actually increases the relative cost of a cache miss. The cost of a branch misprediction in isolation is fixed in clock cycles, but mispredictions can cause secondary cache misses. My point is, if we don't reduce these issues, performance will be greatly hindered even if we are able to increase clock speeds significantly.
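To put rough numbers on the cache-miss argument above, here is a minimal sketch. The 80 ns latency, the miss count, and the base IPC are made-up illustration values, and the model assumes every miss stall is fully exposed (no overlap with other work), which real out-of-order cores partly avoid.

```python
# Sketch: a cache miss costs a roughly fixed time (memory latency),
# so its cost measured in clock cycles grows with the clock frequency.
# All numbers are illustrative assumptions, not measurements.

def miss_cost_cycles(latency_ns: float, clock_ghz: float) -> float:
    """Cycles lost to one cache miss with a fixed latency in nanoseconds."""
    return latency_ns * clock_ghz  # ns * (cycles per ns)

def runtime_ns(instructions: float, base_ipc: float, misses: float,
               latency_ns: float, clock_ghz: float) -> float:
    """Very simple model: compute cycles plus fully exposed miss stalls."""
    compute_cycles = instructions / base_ipc
    stall_cycles = misses * miss_cost_cycles(latency_ns, clock_ghz)
    return (compute_cycles + stall_cycles) / clock_ghz

for ghz in (4.0, 8.0):
    cost = miss_cost_cycles(80.0, ghz)
    total = runtime_ns(1_000_000, base_ipc=4.0, misses=2_000,
                       latency_ns=80.0, clock_ghz=ghz)
    print(f"{ghz} GHz: one miss = {cost:.0f} cycles, runtime = {total:.0f} ns")
```

In this toy model, doubling the clock halves only the compute part of the runtime. The time spent waiting on memory stays the same, so the overall gain is far smaller than 2x.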

The changes we know Intel will bring are more of the same as what both Sunny Cove and Zen 3 brought us: larger instruction windows, larger uop caches, larger register files, and possibly more execution ports. IPC gains are certainly still possible; I believe further generational jumps in the 20-30% range are still achievable. But these changes will obviously have diminishing returns, and there are limits to how much parallelization can be extracted without rewriting the code and/or extending the ISA. I know Intel is working on something called "threadlets", which may solve a lot of the pipeline flushes and stalls. If this is successful, we could easily be looking at a 2-3x performance increase.

So you're saying Apple has magic technology that makes general purpose code run on fixed-function hardware accelerators? Or did they tune their chip specifically for GeekBench? ;)
Geekbench is showcasing Apple's accelerators. This is a benchmark of various implementations, not something that translates into generic performance. Geekbench is useless for anything but showcasing special features.

IPC is instructions per clock. If the performance of the new M1 in some application or benchmark, say darktable or SPEC, is the same as on the latest Intel Tiger Lake processor, the number of instructions in the executables is similar, and the clock speed of the M1 is lower than that of the Tiger Lake processor, then the IPC is higher on the M1 than on Tiger Lake.
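The reasoning in that paragraph can be written out as plain arithmetic. This is only a sketch: the instruction count, runtime, and clock speeds are hypothetical, and it takes the "similar number of instructions" premise at face value, which is exactly the part other posters dispute across ISAs.

```python
# Sketch: IPC derived from instruction count, runtime and clock speed.
# The figures are hypothetical placeholders, not benchmark results.

def ipc(instructions: float, runtime_s: float, clock_hz: float) -> float:
    """Average instructions per clock over a whole run."""
    cycles = runtime_s * clock_hz
    return instructions / cycles

# Premise from the post: same runtime, similar instruction count,
# but one chip runs at a lower clock.
instructions = 3.0e10
runtime_s = 2.0

lower_clock_chip = ipc(instructions, runtime_s, clock_hz=3.2e9)
higher_clock_chip = ipc(instructions, runtime_s, clock_hz=4.8e9)

print(f"lower-clock chip:  {lower_clock_chip:.3f} instructions/clock")
print(f"higher-clock chip: {higher_clock_chip:.3f} instructions/clock")
print(f"ratio: {lower_clock_chip / higher_clock_chip:.2f}x")
```

If the two executables do not retire a similar number of instructions, the ratio above no longer says anything about the cores themselves.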

What am I missing here?

edit: for example take these results:
View attachment 183964
<snip>
And to clarify, the above test makes no use of fixed function accelerators.
You are right about this not using special acceleration, but you're missing that this is a "pure" ALU and FPU benchmark. There is nothing preventing an ARM design from having comparable ALU or FPU performance. This only stresses a small part of the CPU, while the front-end and the caches are not stressed at all. Such benchmarks are fine to showcase various strengths and weaknesses of architectures, but they don't translate into real world performance. The best benchmark is always real workloads, and even if synthetic workloads are used, there should be a large variety of them, and preferably algorithms, not single unrealistic workloads.

M1 has 50% more IPC than the latest Ryzen, can you see that bound? There is no way that the next Ryzen will have more than 4 decoders, and there is no way that the next Ryzen will have 50% more IPC, can you see that bound? This is the main change the M1 has.
Comparing IPC across ISAs means you don't even know what IPC means.

I don't believe there is anything preventing microarchitectures from Intel or AMD from having more than 4 decoders; you can pipeline many of them, and there is a good chance that the upcoming Sapphire Rapids / Golden Cove actually will.
 
Joined
Apr 24, 2020
Messages
2,710 (1.61/day)
Sorry, but that's not IPC. The actual CPU cores aren't that fast; the reason these SoCs keep up is the help of lots and lots of accelerators that speed up tasks where the CPU cores are too slow. By utilising task-specific co-processors (as do almost all ARM CPUs), it's possible to offer good system performance without having great IPC.

The M1 chips have 8-wide decoders on the front end and over 700-depth to search for out-of-order execution.

In contrast, x86 chips are 4-wide decoders with only 350-ish depth for out-of-order.

-----

The "weakness" of the M1 chip is the absurd size. The M1 is far larger than the Zen chiplets but only delivers 4-cores. Apple is betting on HUGE cores for maximum IPC / single threaded performance, above and beyond what x86 (either Intel or AMD) delivers.

There is no way that next Ryzen will have more than 4 decoder
Wait, why not?

It's a question of corporate will, not technical feasibility. You can make a 16-wide decoder if you wanted. The question is if making the decoder 4x wider is worth the 4x larger area (and yes, I'm pretty sure that parallel decoding is an O(n) problem in terms of chip size. It's non-obvious but I can discuss my proof if you're interested). It isn't necessarily 4x faster either (indeed: a 16-wide decoder wouldn't be helpful at all for small loops that are less than 16 instructions long).
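A toy model of the small-loop point, for anyone who wants to play with the numbers. The big assumption here (mine, not the poster's) is that a decode group cannot span the loop's taken back-edge, so each iteration needs a whole number of decode cycles. Real front ends with op caches and loop buffers behave differently.

```python
import math

# Sketch: effective decode throughput for a tight loop.
# Assumption: one decode group cannot cross the loop's taken branch,
# so an iteration of loop_len instructions needs ceil(loop_len / width) cycles.

def decoded_per_cycle(loop_len: int, width: int) -> float:
    """Average instructions decoded per cycle for a loop of loop_len instructions."""
    cycles_per_iteration = math.ceil(loop_len / width)
    return loop_len / cycles_per_iteration

loop_lengths = (4, 6, 12, 16, 40)
for width in (4, 8, 16):
    rates = [round(decoded_per_cycle(n, width), 2) for n in loop_lengths]
    print(f"{width:>2}-wide decoder: {dict(zip(loop_lengths, rates))}")
```

Under this assumption a 6-instruction loop never decodes more than 6 instructions per cycle no matter how wide the decoder is, so the extra width only pays off on longer straight-line code.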

AMD is clearly a "many core" kind of company. They pushed for more cores, instead of bigger cores, for Bulldozer, and even Zen. AMD could have made a bigger core than Intel for better IPC, but instead wanted to go with more cores. All Apple has done is show that there's room for bigger-cores.
 
Last edited:
Joined
Dec 26, 2006
Messages
3,837 (0.59/day)
Location
Northern Ontario Canada
Processor Ryzen 5700x
Motherboard Gigabyte X570S Aero G R1.1 BiosF5g
Cooling Noctua NH-C12P SE14 w/ NF-A15 HS-PWM Fan 1500rpm
Memory Micron DDR4-3200 2x32GB D.S. D.R. (CT2K32G4DFD832A)
Video Card(s) AMD RX 6800 - Asus Tuf
Storage Kingston KC3000 1TB & 2TB & 4TB Corsair MP600 Pro LPX
Display(s) LG 27UL550-W (27" 4k)
Case Be Quiet Pure Base 600 (no window)
Audio Device(s) Realtek ALC1220-VB
Power Supply SuperFlower Leadex V Gold Pro 850W ATX Ver2.52
Mouse Mionix Naos Pro
Keyboard Corsair Strafe with browns
Software W10 22H2 Pro x64
Step 1 - get ample supply on shelves
 
Joined
Oct 12, 2005
Messages
708 (0.10/day)
There are plenty of ways to improve IPC in the future; we are very far from perfection right now.

Just having larger cores, with a larger front end to feed them, will increase IPC. There are still improvements to be made in branch prediction, in feeding the cores with data and instructions, in prefetch algorithms to improve the cache hit rate, in cache management algorithms, etc.

It's not because Intel took baby steps with its tick-tock strategy that we are anywhere near the end of improving IPC. It's a matter of design choices: very large cores, more cores, more cache, etc. You need to balance everything and make the right choices to get the most overall performance.

CPU design is all about choices and trade-offs.
 
Joined
Oct 18, 2013
Messages
6,193 (1.53/day)
Location
Over here, right where you least expect me to be !
System Name The Little One
Processor i5-11320H @4.4GHZ
Motherboard AZW SEI
Cooling Fan w/heat pipes + side & rear vents
Memory 64GB Crucial DDR4-3200 (2x 32GB)
Video Card(s) Iris XE
Storage WD Black SN850X 4TB m.2, Seagate 2TB SSD + SN850 4TB x2 in an external enclosure
Display(s) 2x Samsung 43" & 2x 32"
Case Practically identical to a mac mini, just purrtier in slate blue, & with 3x usb ports on the front !
Audio Device(s) Yamaha ATS-1060 Bluetooth Soundbar & Subwoofer
Power Supply 65w brick
Mouse Logitech MX Master 2
Keyboard Logitech G613 mechanical wireless
Software Windows 10 pro 64 bit, with all the unnecessary background shitzu turned OFF !
Benchmark Scores PDQ
To sum it up, TSMC's manufacturing progress and capacity will be the limiting factor for the PC sector's performance progress.
Although I agree on both counts, I believe capacity is the biggest limitation right now and for the foreseeable future, at least until they can get some more fabs built & producing. I don't think that their progress is an issue, since they have publicly stated their intentions to move forward with 5, 3 & 2nm nodes as fast as possible.

But, as it stands right now, they have way more orders from existing customers than they can reasonably be expected to fulfill in a timely manner, which I guess is a somewhat good problem to have in most respects....
 
Joined
Jul 7, 2020
Messages
128 (0.08/day)
Zen 4 sometime close to the end of 2021

I think this is wrong, look closer:
1610656414140.png
 
Joined
May 3, 2018
Messages
2,881 (1.20/day)
If x86 can only support 4 decoders, how did Skylake manage to have 5 decoders and why did they get rid of the 5th?
 

Space Lynx

Astronaut
Joined
Oct 17, 2014
Messages
17,269 (4.67/day)
Location
Kepler-186f
Processor 7800X3D -25 all core ($196)
Motherboard B650 Steel Legend ($179)
Cooling Frost Commander 140 ($42)
Memory 32gb ddr5 (2x16) cl 30 6000 ($80)
Video Card(s) Merc 310 7900 XT @3100 core $(705)
Display(s) Agon 27" QD-OLED Glossy 240hz 1440p ($399)
Case NZXT H710 (Red/Black) ($60)
None of this matters, they won't be in stock anyway.
 

SL2

Joined
Jan 27, 2006
Messages
2,449 (0.36/day)
Zen 4 sometime close to the end of 2021

I think this is wrong, look closer:
Yeah, AMD has been launching its Ryzen models with more than 12 months between. I just looked at review dates here, as they usually are published when NDA ends (I'd assume).

1800X - 2 March 2017

2700X - 19 April 2018, 413 days after the launch before

3700X - 7 July 2019, 445 days after the launch before

5600X - 5 November 2020, 487 days after the launch before
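The gaps in the list above can be re-checked with a few lines of date arithmetic. A small sketch, assuming the review dates as listed, with the 3700X taken as 7 July 2019 (the year that the quoted 445-day and 487-day gaps imply):

```python
from datetime import date

# Review dates as listed in the post; the 3700X year is assumed to be 2019,
# which is what the quoted day counts imply.
launches = [
    ("1800X", date(2017, 3, 2)),
    ("2700X", date(2018, 4, 19)),
    ("3700X", date(2019, 7, 7)),
    ("5600X", date(2020, 11, 5)),
]

def gaps_in_days(entries):
    """Days between each launch and the one before it."""
    return [
        (later_name, (later_date - earlier_date).days)
        for (_, earlier_date), (later_name, later_date) in zip(entries, entries[1:])
    ]

for name, days in gaps_in_days(launches):
    print(f"{name}: {days} days after the previous launch")
```

Depending on whether the end date is counted, a result can differ by a day from the figures quoted in the post. Either way, every gap is comfortably over 12 months.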


Given that the latest Ryzen models are the most competitive for their time, and that AMD never had such a hard time meeting demand as they do now, I'd be surprised if we see any major launch from AMD this year. Boring -XT models don't count.

7 nm production shortage would be one reason to move faster, but I'm not sure about the 5 nm production capacity anyway.
 

Space Lynx

Astronaut
Joined
Oct 17, 2014
Messages
17,269 (4.67/day)
Location
Kepler-186f
Processor 7800X3D -25 all core ($196)
Motherboard B650 Steel Legend ($179)
Cooling Frost Commander 140 ($42)
Memory 32gb ddr5 (2x16) cl 30 6000 ($80)
Video Card(s) Merc 310 7900 XT @3100 core $(705)
Display(s) Agon 27" QD-OLED Glossy 240hz 1440p ($399)
Case NZXT H710 (Red/Black) ($60)
Yeah, AMD has been launching its Ryzen models with more than 12 months between. I just looked at review dates here, as they usually are published when NDA ends (I'd assume).

1800X - 2 March 2017

2700X - 19 April 2018, 413 days after the launch before

3700X - 7 July 2019, 445 days after the launch before

5600X - 5 November 2020, 487 days after the launch before


Given that the latest Ryzen models are the most competitive for their time, and that AMD never had such a hard time meeting demand as they do now, I'd be surprised if we see any major launch from AMD this year. Boring -XT models don't count.

7 nm production shortage would be one reason to move faster, but I'm not sure about the 5 nm production capacity anyway.

Agree with this 100%, companies at some point have to milk their leading products, because eventually node shrinks just won't cut it
 
Joined
Jan 14, 2021
Messages
16 (0.01/day)
Location
Australia
Processor 11800H
Motherboard Intel NUC X15 Laptop
Cooling Thermaltake Massive 20 RGB
Memory BL2K32G32C16S4B
Video Card(s) RTX 3060 Laptop
Storage 2X 2TB 980 Pro
Display(s) 2X Dell S2721DGF. 1X Laptop display 1080P 240Hz
Software Windows 11
Would love a new APU that isn't OEM only.
 

SL2

Joined
Jan 27, 2006
Messages
2,449 (0.36/day)
Agree with this 100%, companies at some point have to milk their leading products, because eventually node shrinks just won't cut it
Besides, developing new generations costs a lot of money, shrunk or not. I'd guess launching less than 12 months apart isn't feasible in the long run, although there are exceptions.
Rocket Lake is supposed to be launched less than 12 months after Comet Lake, but that's also the first departure from Skylake, and I'd guess Intel have to speed things up right now (even if it still isn't <14 nm).

Would love a new APU that isn't OEM only.
The Pro versions are easier to find, although I have no info about where you live. Here in Europe they're available tho.

I'd guess that the 5000 APUs will be both OEM and retail. I remember last year that AMD was talking about launching some other retail APUs at a later date.
Welcome to TPU! :toast:
 
Last edited:
Joined
Nov 3, 2011
Messages
695 (0.15/day)
Location
Australia
System Name Eula
Processor AMD Ryzen 9 7900X PBO
Motherboard ASUS TUF Gaming X670E Plus Wifi
Cooling Corsair H150i Elite LCD XT White
Memory Trident Z5 Neo RGB DDR5-6000 64GB (4x16GB F5-6000J3038F16GX2-TZ5NR) EXPO II, OCCT Tested
Video Card(s) Gigabyte GeForce RTX 4080 GAMING OC
Storage Corsair MP600 XT NVMe 2TB, Samsung 980 Pro NVMe 2TB, Toshiba N300 10TB HDD, Seagate Ironwolf 4T HDD
Display(s) Acer Predator X32FP 32in 160Hz 4K FreeSync/GSync DP, LG 32UL950 32in 4K HDR FreeSync/G-Sync DP
Case Phanteks Eclipse P500A D-RGB White
Audio Device(s) Creative Sound Blaster Z
Power Supply Corsair HX1000 Platinum 1000W
Mouse SteelSeries Prime Pro Gaming Mouse
Keyboard SteelSeries Apex 5
Software MS Windows 11 Pro
You are incorrect in your assumption. The M1 performs as well as intel’s fastest in most single threaded tasks even without the accelerators. Read the anandtech article about it: https://www.anandtech.com/show/16252/mac-mini-apple-m1-tested
What the accelerators do enable is truly excellent performance per watt in select use cases like watching videos or video calls, or encoding stuff.
For M1, Apple has added features to the hardware to improve the translation, hence it's approaching X86 post-RISC hybrid designs.

M1 has 50% more IPC than the latest Ryzen, can you see that bound? There is no way that the next Ryzen will have more than 4 decoders, and there is no way that the next Ryzen will have 50% more IPC, can you see that bound? This is the main change the M1 has.
1. M1 has faster on-chip memory to support wider instruction issue rates. Extra decoders are of little use when memory bandwidth is the limiting factor.

2. X86 instructions can generate two RISC instructions that involve AGU and ALU, hence IPC comparison is not 1 to 1 when comparing X86 instruction set to the RISC's atomic instruction set.
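Point 2 can be illustrated with a quick normalisation from instructions to micro-ops. This is a sketch with invented numbers: the expansion ratios and IPC figures below are placeholders, not measured values for any real core.

```python
# Sketch: why raw instructions-per-clock is not a 1-to-1 comparison across ISAs.
# An x86 instruction with a memory operand (e.g. an add that reads from memory)
# can expand into an address/load operation plus an ALU operation, while a
# load/store RISC ISA needs two separate instructions for the same work.
# All figures are hypothetical.

def uops_per_clock(instructions_per_clock: float, uops_per_instruction: float) -> float:
    """Work per clock expressed in micro-ops rather than ISA instructions."""
    return instructions_per_clock * uops_per_instruction

x86_like = uops_per_clock(instructions_per_clock=4.0, uops_per_instruction=1.3)
risc_like = uops_per_clock(instructions_per_clock=5.0, uops_per_instruction=1.0)

print(f"x86-like core:  4.0 instr/clock -> {x86_like:.1f} uops/clock")
print(f"RISC-like core: 5.0 instr/clock -> {risc_like:.1f} uops/clock")
```

With these made-up ratios the RISC-like core looks 25% ahead on instructions per clock, yet slightly behind once both are counted in micro-ops.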

Ryzen fetches its instructions from 4 decoder units and 4 to 8 instructions from the op cache. The op cache as an instruction source could be replaced by extra decoder units, but that would increase the bandwidth required from the L0/L1/L2 caches.

If x86 can only support 4 decoders, how did Skylake manage to have 5 decoders and why did they get rid of the 5th?
Intel Coffee Lake still has 5 x86 decoders.
 
Last edited:
Joined
Dec 29, 2010
Messages
3,809 (0.75/day)
Processor AMD 5900x
Motherboard Asus x570 Strix-E
Cooling Hardware Labs
Memory G.Skill 4000c17 2x16gb
Video Card(s) RTX 3090
Storage Sabrent
Display(s) Samsung G9
Case Phanteks 719
Audio Device(s) Fiio K5 Pro
Power Supply EVGA 1000 P2
Mouse Logitech G600
Keyboard Corsair K95
Generally in the (review) industry, IPC is being used roughly as saying "average amount of work done per clock cycle", at least AFAIK. It is in that spirit that I was using the term and assumed you were as well. If you want to limit the use of the term "IPC" to just the actual instructions a CPU core can decode and process per clock on average, then that's fine with me :) Not sure which use is best for the discussion at hand however.

Whichever way you look at it, seeing the M1 run software that's not even compiled for the architecture, and doing it this quickly, to me shows that increasing the processing per clock at least isn't an impossibility. x86 makers might need to further virtualize their decoding hardware/stack to reach that state however.

PS: Regarding the single-core performance, decoders and instructions, does your view somewhat relate to this story I just found? Exclusive: Why Apple M1 Single "Core" Comparisons Are Fundamentally Flawed (With Benchmarks) (wccftech.com)
The industry is doing it wrong or completely missing the nuance. You cannot compare the single-threaded performance directly because of their architectural differences. In any case it will be unfair to one or the other, but in the race to get those clicks, the truth gets thrown to the wayside.

 
Joined
Dec 3, 2009
Messages
1,301 (0.24/day)
Location
The Netherlands
System Name PC ||Zephyrus G14 2023
Processor Ryzen 9 5900x || R9 7940HS @ 55W
Motherboard MAG B550M MORTAR WIFI || default
Cooling 1x Corsair XR5 360mm Rad||
Memory 2x16GB HyperX 3600 @ 3800 || 32GB DDR5 @ 4800MTs
Video Card(s) MSI RTX 2080Ti Sea Hawk EK X || RTX 4060 OC
Storage Samsung 9801TB x2 + Striped Tiered Storage Space (2x 128Gb SSD + 2x 1TB HDD) || 1TB NVME
Display(s) Iiyama PL2770QS + Samsung U28E590, || 14' 2560x1600 165Hz IPS
Case SilverStone Alta G1M ||
Audio Device(s) Asus Xonar DX
Power Supply Cooler Master V850 SFX || 240W
Mouse ROG Pugio II
Software Win 11 64bit || Win 11 64bit
The industry is doing it wrong or completely missing the nuance. You cannot compare the single-threaded performance directly because of their architectural differences. In any case it will be unfair to one or the other, but in the race to get those clicks, the truth gets thrown to the wayside.

That article did help me understand the actual difference you're trying to convey, thanks.
However, I don't think the industry is doing it wrong. I agree that single threaded benchmarks cannot fully utilize the cores by AMD and Intel, and are thus not a completely accurate way of showing per core performance, while the M1 doesn't suffer from the same problem. Single threaded software doesn't care about this though (and usually neither does the end user), and as the article you linked also stated, this is actually a weakness of current x86 architecture cores.

If I had the choice, for the same money and total performance, between a core that can reach 100% of its potential performance in a single thread, or a core that needs more than one thread to reach that performance level, I'd choose the 1st every time. Many tasks still just don't scale to multiple threads, and would thus simply run slower on the 2nd option.
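One common way to reason about "tasks that don't scale to multiple threads" is Amdahl's law, which is not mentioned in the thread but fits the point. A minimal sketch, with the parallel fractions picked purely for illustration:

```python
# Sketch: Amdahl's law, speedup from extra threads when only part of a task
# can run in parallel. The fractions below are illustrative assumptions.

def amdahl_speedup(parallel_fraction: float, threads: int) -> float:
    """Speedup over one thread when parallel_fraction of the work scales perfectly."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)

for fraction in (0.0, 0.5, 0.9):
    speedups = [round(amdahl_speedup(fraction, n), 2) for n in (1, 2, 8)]
    print(f"parallel fraction {fraction:.1f}: speedup at 1/2/8 threads = {speedups}")
```

A task with no parallel part gets nothing from a second thread, so for that kind of work a core that only reaches its full throughput with two threads is simply the slower option.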

This is why the M1 is such an achievement, and even though I do agree that single threaded benchmarks don't show the full potential of the x86 cores, it doesn't matter for the most part. This is just how they perform in the real world. For multithreaded performance we've got multithreaded benchmarks, with their own set of explanations, gotchas and peculiarities. Properties which, for the most part, also don't matter to software and the end user.
 
Joined
Dec 29, 2010
Messages
3,809 (0.75/day)
Processor AMD 5900x
Motherboard Asus x570 Strix-E
Cooling Hardware Labs
Memory G.Skill 4000c17 2x16gb
Video Card(s) RTX 3090
Storage Sabrent
Display(s) Samsung G9
Case Phanteks 719
Audio Device(s) Fiio K5 Pro
Power Supply EVGA 1000 P2
Mouse Logitech G600
Keyboard Corsair K95
That article did help me understand the actual difference you're trying to convey, thanks.
However, I don't think the industry is doing it wrong. I agree that single threaded benchmarks cannot fully utilize the cores by AMD and Intel, and are thus not a completely accurate way of showing per core performance, while the M1 doesn't suffer from the same problem. Single threaded software doesn't care about this though (and usually neither does the end user), and as the article you linked also stated, this is actually a weakness of current x86 architecture cores.

If I had the choice, for the same money and total performance, between a core that can reach 100% of its potential performance in a single thread, or a core that needs more than one thread to reach that performance level, I'd choose the 1st every time. Many tasks still just don't scale to multiple threads, and would thus simply run slower on the 2nd option.

This is why the M1 is such an achievement, and even though I do agree that single threaded benchmarks don't show the full potential of the x86 cores, it doesn't matter for the most part. This is just how they perform in the real world. For multithreaded performance we've got multithreaded benchmarks, with their own set of explanations, gotchas and peculiarities. Properties which, for the most part, also don't matter to software and the end user.
That's not real world though. Real world would be like setting up an encoding job then timing and seeing which finishes first. These tests are very flawed cuz they ignore architectural differences that come into play making said test irrelevant.
 
Joined
Dec 3, 2009
Messages
1,301 (0.24/day)
Location
The Netherlands
System Name PC ||Zephyrus G14 2023
Processor Ryzen 9 5900x || R9 7940HS @ 55W
Motherboard MAG B550M MORTAR WIFI || default
Cooling 1x Corsair XR5 360mm Rad||
Memory 2x16GB HyperX 3600 @ 3800 || 32GB DDR5 @ 4800MTs
Video Card(s) MSI RTX 2080Ti Sea Hawk EK X || RTX 4060 OC
Storage Samsung 9801TB x2 + Striped Tiered Storage Space (2x 128Gb SSD + 2x 1TB HDD) || 1TB NVME
Display(s) Iiyama PL2770QS + Samsung U28E590, || 14' 2560x1600 165Hz IPS
Case SilverStone Alta G1M ||
Audio Device(s) Asus Xonar DX
Power Supply Cooler Master V850 SFX || 240W
Mouse ROG Pugio II
Software Win 11 64bit || Win 11 64bit
That's not real world though. Real world would be like setting up an encoding job then timing and seeing which finishes first. These tests are very flawed cuz they ignore architectural differences that come into play making said test irrelevant.
Sorry, maybe I should've been more specific. With real world in this case I mean real world usage of the CPU core in its normal day-to-day functioning, simulated by running benchmarks that try to replicate some aspect of that real world usage (e.g. encoding a movie, converting some images, running down a decision tree, simulating some physics, running a game).

How would you go about setting up a test that would accurately compare both cores on equal ground?
Edit: which also directly reflects how they are actually used by end-users.
 
Joined
Jan 8, 2017
Messages
9,438 (3.27/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
Single threaded software doesn't care about this though

There is no such thing as single threaded software these days, practically everything is written to use multiple threads.
 
Joined
Dec 29, 2010
Messages
3,809 (0.75/day)
Processor AMD 5900x
Motherboard Asus x570 Strix-E
Cooling Hardware Labs
Memory G.Skill 4000c17 2x16gb
Video Card(s) RTX 3090
Storage Sabrent
Display(s) Samsung G9
Case Phanteks 719
Audio Device(s) Fiio K5 Pro
Power Supply EVGA 1000 P2
Mouse Logitech G600
Keyboard Corsair K95
Sorry, maybe I should've been more specific. With real world in this case I mean real world usage of the CPU core in its normal day-to-day functioning, simulated by running benchmarks that try to replicate some aspect of that real world usage (e.g. encoding a movie, converting some images, running down a decision tree, simulating some physics, running a game).

How would you go about setting up a test that would accurately compare both cores on equal ground?
Edit: which also directly reflects how they are actually used by end-users.
Ok you seem to be rather obsessed with cores and are still missing the point. You can't directly compare the cores, let me repeat that again, ya can't compare the cores. What we can do is compare how they perform real world tasks, measure, and compare. This hasn't changed ever since ppl compared macs to pc decades ago. For ex. check comparisons like Puget.
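For what "perform real world tasks, measure, and compare" can look like in practice, here is a bare-bones timing harness. It is only a sketch: `example_workload` is a hypothetical stand-in for a real job such as an encode or an export, and the same script would have to be run on each machine being compared.

```python
import statistics
import time

# Sketch: time a whole task end to end instead of comparing cores directly.

def time_task(task, repeats: int = 5) -> float:
    """Median wall-clock seconds over several runs of a callable."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

def example_workload() -> int:
    # Hypothetical placeholder; swap in the real job you care about.
    return sum(i * i for i in range(200_000))

seconds = time_task(example_workload)
print(f"median time: {seconds * 1000:.2f} ms")
```

Using the median of a few runs keeps one slow outlier (background tasks, thermal throttling) from deciding the result.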

 
Joined
Dec 3, 2009
Messages
1,301 (0.24/day)
Location
The Netherlands
System Name PC ||Zephyrus G14 2023
Processor Ryzen 9 5900x || R9 7940HS @ 55W
Motherboard MAG B550M MORTAR WIFI || default
Cooling 1x Corsair XR5 360mm Rad||
Memory 2x16GB HyperX 3600 @ 3800 || 32GB DDR5 @ 4800MTs
Video Card(s) MSI RTX 2080Ti Sea Hawk EK X || RTX 4060 OC
Storage Samsung 9801TB x2 + Striped Tiered Storage Space (2x 128Gb SSD + 2x 1TB HDD) || 1TB NVME
Display(s) Iiyama PL2770QS + Samsung U28E590, || 14' 2560x1600 165Hz IPS
Case SilverStone Alta G1M ||
Audio Device(s) Asus Xonar DX
Power Supply Cooler Master V850 SFX || 240W
Mouse ROG Pugio II
Software Win 11 64bit || Win 11 64bit
Ok you seem to be rather obsessed with cores and are still missing the point. You can't directly compare the cores, let me repeat that again, ya can't compare the cores. What we can do is compare how they perform real world tasks, measure, and compare. This hasn't changed ever since ppl compared macs to pc decades ago. For ex. check comparisons like Puget.

How odd, you're trying to make the exact point I had the feeling I was trying to make. I think something is getting lost in translation here. Oh well, I think we're actually mostly in agreement so I'll leave it at that for now.
 