
IPC Comparisons Between Raptor Cove, Zen 4, and Golden Cove Spring Surprising Results

Joined
Nov 28, 2012
Messages
427 (0.10/day)
I don't think that's true. The 5800X has one CCD measuring 85 mm². The 5950X has two CCDs, measuring double that in total.

When it comes to cooling, what I mean by easy or hard to cool is iso-wattage with the same cooler. With a TDP of 170 W, the 7950X will probably go above 200 W even at stock.
The number of cores per CCD is the same (8). If anything, the 5950X should put out more heat since it has two of them, but the opposite is true, as AMD refined their manufacturing process.
By "hard to cool", everyone compares similar coolers in similar use conditions, not watt per watt.
Also, if the TDP is 170 W, it will do at most that at stock.
 
Joined
Jun 14, 2020
Messages
3,549 (2.14/day)
System Name Mean machine
Processor 12900k
Motherboard MSI Unify X
Cooling Noctua U12A
Memory 7600c34
Video Card(s) 4090 Gamerock oc
Storage 980 pro 2tb
Display(s) Samsung crg90
Case Fractal Torent
Audio Device(s) Hifiman Arya / a30 - d30 pro stack
Power Supply Be quiet dark power pro 1200
Mouse Viper ultimate
Keyboard Blackwidow 65%
The number of cores per CCD is the same (8). If anything, the 5950X should put out more heat since it has two of them, but the opposite is true, as AMD refined their manufacturing process.
By "hard to cool", everyone compares similar coolers in similar use conditions, not watt per watt.
Also, if the TDP is 170 W, it will do at most that at stock.
You don't understand the fundamentals of thermodynamics. The 5950X is power-limited to the same wattage as the 5800X, but that wattage is spread over double the die area; that's why it's easier to cool.

Zen 4 has an even smaller die size but even higher power draw, which will make it way harder to cool than Zen 3. On the other hand, Raptor Lake will have a bigger die than Alder Lake but similar power draw, which makes it easier. Assuming the Zen 4 rumors are true and the 7950X draws north of 200 W, it will be way harder to cool than the 13900K at 250 W. That's just physics.
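The power-density argument can be put into numbers with a rough sketch. It assumes the 85 mm² per-CCD figure quoted in this thread and a hypothetical shared power limit of 140 W; it ignores the I/O die, hotspots inside a CCD, and the heat spreader, so treat it as an illustration of the reasoning rather than a thermal model.

```python
def power_density(package_power_w: float, die_area_mm2: float) -> float:
    """Average heat flux in W/mm^2 over the compute silicon."""
    if die_area_mm2 <= 0:
        raise ValueError("die area must be positive")
    return package_power_w / die_area_mm2


CCD_AREA_MM2 = 85.0  # per-CCD figure quoted in the thread, not a measured value
POWER_W = 140.0      # hypothetical shared power limit, for illustration only

one_ccd = power_density(POWER_W, 1 * CCD_AREA_MM2)
two_ccd = power_density(POWER_W, 2 * CCD_AREA_MM2)

print(f"one CCD:  {one_ccd:.2f} W/mm^2")
print(f"two CCDs: {two_ccd:.2f} W/mm^2")
```

At the same power, doubling the area halves the average heat flux, which is the whole of the "easier to cool" claim.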
 
Joined
Nov 28, 2012
Messages
427 (0.10/day)
You don't understand the fundamentals of thermodynamics. The 5950X is power-limited to the same wattage as the 5800X, but that wattage is spread over double the die area; that's why it's easier to cool.

Zen 4 has an even smaller die size but even higher power draw, which will make it way harder to cool than Zen 3. On the other hand, Raptor Lake will have a bigger die than Alder Lake but similar power draw, which makes it easier. Assuming the Zen 4 rumors are true and the 7950X draws north of 200 W, it will be way harder to cool than the 13900K at 250 W. That's just physics.
The die size is the same: the 5950X uses two 8-core chiplets while the 5800X uses one of them.
We'll see once the chips are out, but something tells me the 13900K will be another miniature stove while the 7950X will be reasonable.
 
Joined
Jun 14, 2020
Messages
3,549 (2.14/day)
The die size is the same: the 5950X uses two 8-core chiplets while the 5800X uses one of them.
We'll see once the chips are out, but something tells me the 13900K will be another miniature stove while the 7950X will be reasonable.
You are confusing the IHS with the die. The IHS is the same, yes; the die isn't. The 5950X has two CCDs of 85 mm² each. The 5800X has one.
 
Joined
Jan 28, 2021
Messages
854 (0.60/day)
Raptor Cove is still superior on an older node. Intel's architecture is more advanced.
But AMD is matching Intel's performance using significantly fewer transistors, so clearly AMD is still superior.

The reality is that they are very different, and it looks like both are good designs; AMD and Intel will pretty much be directly competing overall.
Those scores are pretty close, if not within the margin of error. It's like splitting hairs here... I also think BIOS immaturity with RPL could be a handicap.
It's just one test, but it is pretty insane just how closely these very different architectures perform when normalized to the same clock. I would not have expected that at all.
 
Joined
Jun 10, 2014
Messages
2,995 (0.78/day)
Processor AMD Ryzen 9 5900X ||| Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR ||| Asus P9X79 WS
Cooling Noctua NH-U14S ||| Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz ||| Corsair 8 x 8 GB 1333 MHz
Video Card(s) MSI GTX 1060 3GB ||| MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB ||| Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" ||| Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
IPC is a constant (and depends on task), and it is independent of core frequency (which is why you multiply both together to approximate performance, FYI).

The higher the core frequency, the more it will be skewed by buses/IMC/DRAM performance, and the higher the chance of throttling based on cooling/power requirements …
This is a typical misconception.
Real IPC is a constant given by the architectural design: it's the architecture's ability to process instructions across "any" workload, measured per clock. Real IPC isn't possible for us to measure, so we approximate it by locking the clock speed far below any throttling point, choosing memory that is hopefully fast enough not to cause a bottleneck, and hopefully selecting a good number of workloads able to saturate a single core. What we get is a relative IPC, which is an approximation, and the quality of this approximation depends on the aforementioned factors, which will affect the benchmark scores.
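As a minimal sketch of the "relative IPC" approximation described here: divide a benchmark score by the locked clock and compare the results for the same workload. The function names and the scores are hypothetical and only illustrate the arithmetic; no real benchmark data is used.

```python
def relative_ipc(score: float, clock_ghz: float) -> float:
    """Score per GHz: a proxy for IPC, meaningful only when comparing
    runs of the same workload at a locked clock."""
    if clock_ghz <= 0:
        raise ValueError("clock must be positive")
    return score / clock_ghz


def ipc_ratio(score_a: float, clock_a: float,
              score_b: float, clock_b: float) -> float:
    """How much more work per clock core A does than core B."""
    return relative_ipc(score_a, clock_a) / relative_ipc(score_b, clock_b)


# Hypothetical scores, both cores locked to the same 3.6 GHz.
ratio = ipc_ratio(7.2, 3.6, 6.0, 3.6)
print(f"core A does {ratio:.2f}x the work per clock of core B")
```

The ratio is only as good as the assumptions behind it: if memory or throttling limits either run, the score no longer scales with the core alone.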
 
Joined
Jan 3, 2021
Messages
3,616 (2.49/day)
Location
Slovenia
Processor i5-6600K
Motherboard Asus Z170A
Cooling some cheap Cooler Master Hyper 103 or similar
Memory 16GB DDR4-2400
Video Card(s) IGP
Storage Samsung 850 EVO 250GB
Display(s) 2x Oldell 24" 1920x1200
Case Bitfenix Nova white windowless non-mesh
Audio Device(s) E-mu 1212m PCI
Power Supply Seasonic G-360
Mouse Logitech Marble trackball, never had a mouse
Keyboard Key Tronic KT2000, no Win key because 1994
Software Oldwin
This is a typical misconception.
Real IPC is a constant given by the architectural design: it's the architecture's ability to process instructions across "any" workload, measured per clock. Real IPC isn't possible for us to measure, so we approximate it by locking the clock speed far below any throttling point, choosing memory that is hopefully fast enough not to cause a bottleneck, and hopefully selecting a good number of workloads able to saturate a single core. What we get is a relative IPC, which is an approximation, and the quality of this approximation depends on the aforementioned factors, which will affect the benchmark scores.
How do you account for the fact that different instructions take a different number of cycles to execute, from zero (sometimes, if the front end manages to fuse two instructions into one micro-op) to several tens (division, whose execution time also depends on the actual data being divided)?
How do you account for the fact that, as an example, a Skylake core can do four non-vector additions at the same time (they probably execute in one cycle, but I haven't checked) but only one division (which, again, takes many cycles to execute)?
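To make the point concrete, here is a toy throughput model, not a description of any real core. It takes the "four additions per cycle" figure from above and assumes a hypothetical 20 cycles per division; it also assumes the two instruction classes never overlap, which is pessimistic.

```python
def mix_ipc(mix: dict[str, int], cycles_per_instr: dict[str, float]) -> float:
    """Average instructions per cycle for an instruction mix, assuming each
    class costs a fixed number of cycles per instruction and nothing overlaps."""
    total_instr = sum(mix.values())
    total_cycles = sum(n * cycles_per_instr[kind] for kind, n in mix.items())
    if total_cycles <= 0:
        raise ValueError("mix must contain at least one instruction")
    return total_instr / total_cycles


# 0.25 cycles per add = four adds per cycle (figure from the post above).
# 20 cycles per division is a made-up value for illustration.
COST = {"add": 0.25, "div": 20.0}

print(mix_ipc({"add": 1000, "div": 0}, COST))   # adds only
print(mix_ipc({"add": 900, "div": 100}, COST))  # 10% divisions
```

Even with this crude model, the same core gives very different IPC figures depending on the mix, which is why a single number needs the workload attached to it.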
 
Joined
Nov 4, 2005
Messages
12,016 (1.72/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400Mhz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs, 24TB Enterprise drives
Display(s) 55" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores Its fast. Enough.
How do you account for the fact that different instructions take a different number of cycles to execute, from zero (sometimes, if the front end manages to fuse two instructions into one micro-op) to several tens (division, whose execution time also depends on the actual data being divided)?
How do you account for the fact that, as an example, a Skylake core can do four non-vector additions at the same time (they probably execute in one cycle, but I haven't checked) but only one division (which, again, takes many cycles to execute)?

Because that is the actual real-world effect of architecture on IPC in real-world software at a set frequency, so we can determine the efficiency of an architecture at a given task.

I seriously don't know how that is so hard to understand for so many.

Architecture A may be great at software X, while architecture Y may excel with software Z, and it's a balancing act to make one great at everything. That is also why an architecture that is great at in-order execution has a long/deep pipeline, but an out-of-order architecture must have either a shallow pipeline and/or a great branch prediction unit and lots of cache.

Why are Arm CPUs so good on phones and in closed environments? They have a closed environment and can be optimized for typical handheld devices. The same program can run significantly faster on a desktop CPU through an emulator though, so which architecture is superior? Which has higher IPC?

 
Joined
Nov 28, 2012
Messages
427 (0.10/day)
Arm is built on a RISC architecture, which means it has a simpler and smaller instruction set, which means less die space and lower power.
x86 is a CISC architecture, which means it has a wider set of instructions, some of which are very complex and take a lot of hardware and power to implement.

The advantage of RISC is efficiency for small tasks; the advantage of CISC is performance on highly complex tasks. Neither is superior in absolute terms.
In other words, the x86 CPU can do the same thing with fewer instructions, so this doesn't really reflect IPC.
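The relationship behind that last sentence is the usual one: run time = instruction count / (IPC × clock). A small sketch with made-up numbers shows why a higher IPC does not by itself mean a faster program when the instruction counts differ. The instruction counts, IPC values, and clock below are hypothetical and do not describe any real Arm or x86 CPU.

```python
def run_time_s(instructions: float, ipc: float, clock_hz: float) -> float:
    """Run time in seconds = instruction count / (IPC * clock frequency)."""
    if ipc <= 0 or clock_hz <= 0:
        raise ValueError("ipc and clock must be positive")
    return instructions / (ipc * clock_hz)


CLOCK_HZ = 3.0e9  # same hypothetical clock for both CPUs

# CPU A needs fewer instructions for the task but retires fewer per clock.
time_a = run_time_s(1.0e9, ipc=2.0, clock_hz=CLOCK_HZ)
# CPU B needs 30% more instructions but has 30% higher IPC.
time_b = run_time_s(1.3e9, ipc=2.6, clock_hz=CLOCK_HZ)

print(f"CPU A: {time_a:.4f} s, CPU B: {time_b:.4f} s")
```

In this example both CPUs finish at the same time even though their IPC differs by 30%, so IPC comparisons across instruction sets need the instruction count alongside them.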
 
Joined
Jan 3, 2021
Messages
3,616 (2.49/day)
Real IPC is a constant and is given by the architectural design, it's the architecture's ability to process instructions across "any" workload
So the real IPC of the Haswell or Skylake architecture is 6, is that what you mean? It's been calculated by people who seem to know the architecture well enough.

The big surprise here is just how good the "Gracemont" E-cores are in SPECint. OneRaichu made a distinction between the "Gracemont" E-cores of "Alder Lake" (GLC-12) and those of "Raptor Lake" (GLC-13,) as the latter have double the amount of shared L2 cache per E-core cluster. The E-core is fast approaching IPC levels comparable to that of "Skylake," which really is Intel's calculation in giving its processors a large number of E-cores next to a small number of P-cores. The idea is that the E-cores will soak up all the moderately-intensive compute workloads and background processes, keeping the P-cores free for gruelling compute-heavy tasks.
This was single-threaded benchmarking. While it does reveal a lot, it would have been great if it had also been done with two threads and four threads.

2 threads on a single P core vs. 2 threads on the same E core cluster: each thread's performance on P should drop sharply (by 35% or so) but what about E?

4 threads on two P cores vs. 4 threads on the same E core cluster: similar but the E cores would be even more constrained because they share L2 and access to L3 and bus.

There may be optimisations (or regressions, for that matter) in how a P core handles SMT, and such benchmarking would have exposed that.
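For a sense of what such a test would be measuring, here is a back-of-the-envelope sketch. It uses the 35% per-thread drop guessed above for a P-core with SMT; the 10% figure for threads sharing an E-core cluster is purely hypothetical, since that is exactly the unknown the benchmark would reveal.

```python
def aggregate_throughput(threads: int, per_thread_drop: float) -> float:
    """Combined throughput relative to one thread running alone, given the
    fractional slowdown each thread suffers when sharing resources."""
    if threads < 1:
        raise ValueError("need at least one thread")
    if not 0.0 <= per_thread_drop < 1.0:
        raise ValueError("drop must be in [0, 1)")
    return threads * (1.0 - per_thread_drop)


# Two SMT threads on one P-core, each about 35% slower (guess from the post).
p_core_smt = aggregate_throughput(2, 0.35)
# Two threads on two E-cores of one cluster; 10% drop is a made-up value.
e_cluster = aggregate_throughput(2, 0.10)

print(f"P-core, 2 threads:    {p_core_smt:.2f}x of one thread alone")
print(f"E-cluster, 2 threads: {e_cluster:.2f}x of one thread alone")
```

The baselines differ (one P thread versus one E thread), so the two results are not directly comparable without the single-thread scores.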
 