Intel Core i7-11700K "Rocket Lake" CPU Outperforms AMD Ryzen 9 5950X in Single-Core Tests

Joined
Nov 6, 2016
Messages
1,778 (0.60/day)
Location
NH, USA
System Name Lightbringer
Processor Ryzen 7 2700X
Motherboard Asus ROG Strix X470-F Gaming
Cooling Enermax Liqmax Iii 360mm AIO
Memory G.Skill Trident Z RGB 32GB (8GBx4) 3200Mhz CL 14
Video Card(s) Sapphire RX 5700XT Nitro+
Storage Hp EX950 2TB NVMe M.2, HP EX950 1TB NVMe M.2, Samsung 860 EVO 2TB
Display(s) LG 34BK95U-W 34" 5120 x 2160
Case Lian Li PC-O11 Dynamic (White)
Power Supply BeQuiet Straight Power 11 850w Gold Rated PSU
Mouse Glorious Model O (Matte White)
Keyboard Royal Kludge RK71
Software Windows 10
Here comes the Ryzen 5000 XT series on 7nm EUV (improved node)...willing to bet on it
 
Joined
Apr 24, 2020
Messages
2,723 (1.60/day)
I have no issue with the possibility of increasing the decoder width or even adding more execution ports. But I question how likely it is, if Cypress Cove is basically a backport of Sunny Cove, since these kinds of changes usually require a total overhaul of the cache, register files and everything on the front-end.

Wouldn't it be more likely to backport something from the execution side from e.g. Sapphire Rapids, or to simply add more execution units on existing execution ports? (like one extra MUL unit?)

I think ARM has an advantage on decoder width. That's the only weak point of the x86 ISA I can think of.

x86 requires a byte-by-byte decoder, because you have 2-byte, 3-byte, 4-byte... 15-byte instructions (some of which are macro-op fused and/or micro-op split). ARM standardized upon 4-byte instructions with an occasional 8-byte macro-op fused. That means if you want to perform 4-wide decoding (and assume an average of 4-bytes per instruction), you need 64-parallel decoders: one for every byte (byte0, byte1, byte2) of the cache line.

ARM, on the other hand, is always 4 bytes or 8 bytes at a time (in the case of macro-op fused operations). That means for a 64-byte decoder, ARM only needs 16 parallel decoders, knowing there are no 2-byte or 3-byte instructions that could be "in-between". Just hypothetically speaking of course, I dunno really how these things are organized.
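
A rough sketch of the counting argument above, in Python. This is only an illustration of the arithmetic, not a description of how any real decoder is built. The function names `candidate_starts` and `instruction_starts`, the 64-byte window, and the sample instruction lengths are all assumptions made up for the example.

```python
def candidate_starts(window_bytes: int, alignment: int) -> list[int]:
    """Byte offsets in a fetch window where an instruction could begin.

    alignment=1 models a variable-length ISA such as x86, where an
    instruction may start at any byte. alignment=4 models fixed 4-byte
    instructions, as on 64-bit ARM.
    """
    return list(range(0, window_bytes, alignment))


def instruction_starts(lengths: list[int]) -> list[int]:
    """Actual start offsets for a run of instructions with the given lengths.

    Each start is the sum of all earlier lengths, so with variable-length
    instructions a start is not known until the earlier ones are sized.
    """
    starts, offset = [], 0
    for length in lengths:
        starts.append(offset)
        offset += length
    return starts


WINDOW = 64  # assumed fetch window: one 64-byte cache line

x86_slots = candidate_starts(WINDOW, alignment=1)
arm_slots = candidate_starts(WINDOW, alignment=4)
print(len(x86_slots), len(arm_slots))  # prints: 64 16

# Hypothetical x86-style instruction lengths in bytes (illustrative only).
print(instruction_starts([2, 3, 4, 15, 1]))  # prints: [0, 2, 5, 9, 24]
```

The second function shows why the variable-length case is awkward: with fixed 4-byte instructions every start offset is known up front, while here each one depends on the lengths before it.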

Anyway: Apple M1 shot a broadside at the x86 camp with its 8-wide decoder. I do think it's relevant to bring up. However, ARM Neoverse is still only 4-wide decoding. It hasn't really been proven yet that an ultra-wide decoder (like Apple's M1) is really the best path forward.
 
Joined
Sep 1, 2020
Messages
2,400 (1.52/day)
Location
Bulgaria
LoL, I see the claim that x86 would be OK with a better decoder. But it isn't possible to make a better decoder, because that depends on how the ISA encodes its instructions. That is the same as saying the x86 ISA itself is not OK.
 
Joined
Mar 17, 2011
Messages
159 (0.03/day)
Location
Christchurch, New Zealand
tbh the “next big thing” should be Alder Lake, not Rocket Lake.


Let’s be honest: it almost was a paper launch with skyrocket prices for Zen 3.
I don’t know about New Zealand, but here in Europe it is very difficult to find one at a decent price.

Well, what's a decent price? This OK? Considering that you guys suffer more with higher taxes on stuff, it's probably a decent price.
 
Joined
Jul 26, 2019
Messages
419 (0.21/day)
Processor R5 5600X
Motherboard Asus TUF Gaming X570-Plus
Memory 32 GB 3600 MT/s CL16
Video Card(s) Sapphire Vega 64
Storage 2x 500 GB SSD, 2x 3 TB HDD
Case Phanteks P300A
Software Manjaro Linux, W10 if I have to
Well... they hit 5Ghz on 14nm++++++ which we know they can do quite easily. Let's see if Alder Lake hits 5Ghz at 10nm.
Oh, silly me for thinking Intel was finally on a new process lmao
 
Joined
Mar 7, 2010
Messages
993 (0.18/day)
Location
Michigan
System Name Daves
Processor AMD Ryzen 3900x
Motherboard AsRock X570 Taichi
Cooling Enermax LIQMAX III 360
Memory 32 GiG Team Group B Die 3600
Video Card(s) Powercolor 5700 xt Red Devil
Storage Crucial MX 500 SSD and Intel P660 NVME 2TB for games
Display(s) Acer 144htz 27in. 2560x1440
Case Phanteks P600S
Audio Device(s) N/A
Power Supply Corsair RM 750
Mouse EVGA
Keyboard Corsair Strafe
Software Windows 10 Pro
Got to love that headline..:shadedshu:
 
Joined
May 10, 2020
Messages
738 (0.44/day)
Processor Intel i7 13900K
Motherboard Asus ROG Strix Z690-E Gaming
Cooling Arctic Freezer II 360
Memory 32 Gb Kingston Fury Renegade 6400 C32
Video Card(s) PNY RTX 4080 XLR8 OC
Storage 1 TB Samsung 970 EVO + 1 TB Samsung 970 EVO Plus + 2 TB Samsung 870
Display(s) Asus TUF Gaming VG27AQL1A + Samsung C24RG50
Case Corsair 5000D Airflow
Power Supply EVGA G6 850W
Mouse Razer Basilisk
Keyboard Razer Huntsman Elite
Benchmark Scores 3dMark TimeSpy - 26698 Cinebench R23 2258/40751
Well, what's a decent price? This OK? Considering that you guys suffer more with higher taxes on stuff, it's probably a decent price.
It is one of the best prices in Europe... if you consider the UK part of Europe (which it is not...).
By the way, that website is one of the best, but for some items they don’t ship outside the UK.
 
Joined
Mar 17, 2011
Messages
159 (0.03/day)
Location
Christchurch, New Zealand
It is one of the best prices in Europe... if you consider the UK part of Europe (which it is not...).
By the way, that website is one of the best, but for some items they don’t ship outside the UK.

The UK is not part of the European Union. It will always be a part of Europe by dint of its geographical location. In any case, the offer is not available to you if you're not a UK resident. But if you knew someone in the UK that might do you a favour....
 
Joined
May 10, 2020
Messages
738 (0.44/day)
Processor Intel i7 13900K
Motherboard Asus ROG Strix Z690-E Gaming
Cooling Arctic Freezer II 360
Memory 32 Gb Kingston Fury Renegade 6400 C32
Video Card(s) PNY RTX 4080 XLR8 OC
Storage 1 TB Samsung 970 EVO + 1 TB Samsung 970 EVO Plus + 2 TB Samsung 870
Display(s) Asus TUF Gaming VG27AQL1A + Samsung C24RG50
Case Corsair 5000D Airflow
Power Supply EVGA G6 850W
Mouse Razer Basilisk
Keyboard Razer Huntsman Elite
Benchmark Scores 3dMark TimeSpy - 26698 Cinebench R23 2258/40751
The UK is not part of the European Union. It will always be a part of Europe by dint of its geographical location. In any case, the offer is not available to you if you're not a UK resident. But if you knew someone in the UK that might do you a favour....
yep... I was speaking about not being in Europe in a commercial way, not geographically :D
 
Joined
Dec 22, 2011
Messages
3,890 (0.82/day)
Processor AMD Ryzen 7 3700X
Motherboard MSI MAG B550 TOMAHAWK
Cooling AMD Wraith Prism
Memory Team Group Dark Pro 8Pack Edition 3600Mhz CL16
Video Card(s) NVIDIA GeForce RTX 3080 FE
Storage Kingston A2000 1TB + Seagate HDD workhorse
Display(s) Samsung 50" QN94A Neo QLED
Case Antec 1200
Power Supply Seasonic Focus GX-850
Mouse Razer Deathadder Chroma
Keyboard Logitech UltraX
Software Windows 11
Here comes the Ryzen 5000 XT series on 7nm EUV (improved node)...willing to bet on it

Oh God, let's hope not: 2% more performance for way more money, and they don't include a cooler as a bonus!
 

SL2

Joined
Jan 27, 2006
Messages
2,460 (0.36/day)
tbh the “next big thing” should be Alder Lake, not Rocket Lake.
You missed the "since 2015" part. Intel have been using the same Skylake design all these years, and Rocket Lake is the departure from that. Alder Lake may be better, but it's the next big thing after RL.
 
Joined
Mar 10, 2010
Messages
11,878 (2.20/day)
Location
Manchester uk
System Name RyzenGtEvo/ Asus strix scar II
Processor Amd R5 5900X/ Intel 8750H
Motherboard Crosshair hero8 impact/Asus
Cooling 360EK extreme rad+ 360$EK slim all push, cpu ek suprim Gpu full cover all EK
Memory Corsair Vengeance Rgb pro 3600cas14 16Gb in four sticks./16Gb/16GB
Video Card(s) Powercolour RX7900XT Reference/Rtx 2060
Storage Silicon power 2TB nvme/8Tb external/1Tb samsung Evo nvme 2Tb sata ssd/1Tb nvme
Display(s) Samsung UAE28"850R 4k freesync.dell shiter
Case Lianli 011 dynamic/strix scar2
Audio Device(s) Xfi creative 7.1 on board ,Yamaha dts av setup, corsair void pro headset
Power Supply corsair 1200Hxi/Asus stock
Mouse Roccat Kova/ Logitech G wireless
Keyboard Roccat Aimo 120
VR HMD Oculus rift
Software Win 10 Pro
Benchmark Scores 8726 vega 3dmark timespy/ laptop Timespy 6506
yep... I was speaking about not being in Europe in a commercial way, not geographically :D
Yep, well, 6800s, most GeForces and in fact most GPUs are scarce in the UK; an RX 580 still sells for £220 new.
Plenty of Intel CPUs, but few others in stock, and favourites like the 5900X aren't about anymore.
Oh to be rich.
 
Joined
Apr 24, 2020
Messages
2,723 (1.60/day)
Yep, well, 6800s, most GeForces and in fact most GPUs are scarce in the UK; an RX 580 still sells for £220 new.
Plenty of Intel CPUs, but few others in stock, and favourites like the 5900X aren't about anymore.
Oh to be rich.

AMD is supply constrained. They were only planning to reach 10% or 15% marketshare around now, and didn't expect that their chips would be such a hit. If AMD produced too many chips, they could risk bankruptcy as well as damage to their brand.

The Xilinx purchase probably helps, since it will give them a stable source of revenue, allowing them to play a bit more aggressively in the months and years ahead.
 
Joined
Jan 8, 2017
Messages
9,511 (3.27/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
I think ARM has an advantage on decoder width. That's the only weak point of the x86 ISA I can think of.

x86 requires a byte-by-byte decoder, because you have 2-byte, 3-byte, 4-byte... 15-byte instructions (some of which are macro-op fused and/or micro-op split). ARM standardized upon 4-byte instructions with an occasional 8-byte macro-op fused. That means if you want to perform 4-wide decoding (and assume an average of 4-bytes per instruction), you need 64-parallel decoders: one for every byte (byte0, byte1, byte2) of the cache line.

ARM, on the other hand, is always 4 bytes or 8 bytes at a time (in the case of macro-op fused operations). That means for a 64-byte decoder, ARM only needs 16 parallel decoders, knowing there are no 2-byte or 3-byte instructions that could be "in-between". Just hypothetically speaking of course, I dunno really how these things are organized.

Anyway: Apple M1 shot a broadside at the x86 camp with its 8-wide decoder. I do think it's relevant to bring up. However, ARM Neoverse is still only 4-wide decoding. It hasn't really been proven yet that an ultra-wide decoder (like Apple's M1) is really the best path forward.

Regardless, I do not think x86 designs are limited in any way by current decode width. Actually, they clearly can't be, since the back end on these CPUs keeps getting wider and wider and there doesn't seem to be any problem feeding all of those execution ports. And after all, x86 is still more compact when it comes down to how much decode is necessary to get the same amount of work done.

Apple's obsession with an ultra-wide front end (and ultra-wide everything, really) seems to be rather wasteful. There is no obvious reason why that's actually required at the moment; I bet you everything that with half the decode stage the performance regression would be marginal.

I mean most of the performance that's worth extracting through ILP sits in loops and those don't put pressure on the decode stage because you'll keep hitting the instruction cache anyway which is colossal on something like M1. Actually the more I think about it the more absurd Apple's design choices appear to me.
 
Joined
Apr 24, 2020
Messages
2,723 (1.60/day)
I mean most of the performance that's worth extracting through ILP sits in loops and those don't put pressure on the decode stage because you'll keep hitting the instruction cache anyway which is colossal on something like M1. Actually the more I think about it the more absurd Apple's design choices appear to me.

Well, the transistors spent on the uOp cache could be used for more registers, or L1 cache instead. So I'm not sure if the existence of the uOp cache is favorable to your argument.
 
Joined
Jan 8, 2017
Messages
9,511 (3.27/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
Well, the transistors spent on the uOp cache could be used for more registers, or L1 cache instead. So I'm not sure if the existence of the uOp cache is favorable to your argument.

It's really, really unlikely for a real-world single program to have its instructions pushed out of the micro-op cache, or to hit some kind of weird uop count that makes the whole mechanism inefficient. I am not saying it doesn't happen, but the micro-op cache is typically the least of your worries.
 
Joined
Apr 24, 2020
Messages
2,723 (1.60/day)
It's really, really unlikely for a real-world single program to have its instructions pushed out of the micro-op cache. I am not saying it doesn't happen, but the micro-op cache is typically the least of your worries.

Oh, what I'm saying is that the M1 has an advantage, because its decoder is 8-wide, while Intel / AMD has a disadvantage, because their uOp caches are only 6-wide. Instead of having a uOp cache, the M1 spent its transistors on more L1 cache and a larger register-file.

The M1 can fit 192kB into its L1 i-cache, which performs a bit faster than the Intel/AMD uOp cache thanks to 8-wide decoding. Intel / AMD only have 48kB i-L1 (for Rocket Lake) or 32kB i-L1 (AMD Zen 3), and their uOp caches are smaller than that.

-------

EDIT: I should say "Seems to have an advantage". It's not very clear if Apple's big decoder is a good strategy yet IMO. But it's interesting, and worth keeping an eye on. Especially because it seems like an area that may be harder to implement in x86.
 
Joined
Jan 8, 2017
Messages
9,511 (3.27/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
Oh, what I'm saying is that the M1 has an advantage, because its decoder is 8-wide, while Intel / AMD has a disadvantage, because their uOp caches are only 6-wide. Instead of having a uOp cache, the M1 spent its transistors on more L1 cache and a larger register-file.

I really doubt that a micro-op cache is that expensive a mechanism to add; it's definitely not comparable to the size and power of a larger L1 I-cache. They probably didn't add one because it just wasn't required. You're really gonna tell me that Apple is that conscious about their transistor budget? :) And after all, a lot of CPUs out there don't have one either; it's a pretty recent addition.

What I am also saying is that I haven't actually seen any evidence that such a wide decoder is actually worth it. Wide decode means a lot of delay in the circuitry, which means poor clock speed scaling.
 
Joined
Nov 4, 2005
Messages
12,016 (1.72/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400Mhz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs, 24TB Enterprise drives
Display(s) 55" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores Its fast. Enough.
Oh, what I'm saying is that the M1 has an advantage, because its decoder is 8-wide, while Intel / AMD has a disadvantage, because their uOp caches are only 6-wide. Instead of having a uOp cache, the M1 spent its transistors on more L1 cache and a larger register-file.

The M1 can fit 192kB into its L1 i-cache, which performs a bit faster than the Intel/AMD uOp cache thanks to 8-wide decoding. Intel / AMD only have 48kB i-L1 (for Rocket Lake) or 32kB i-L1 (AMD Zen 3), and their uOp caches are smaller than that.

-------

EDIT: I should say "Seems to have an advantage". It's not very clear if Apple's big decoder is a good strategy yet IMO. But it's interesting, and worth keeping an eye on. Especially because it seems like an area that may be harder to implement in x86.


And there's the fact that ARM designs are essentially hardware-based accelerators taped together, with good power gating, on the most advanced nodes. x86-64 has the advantage that if you want to do X in the future, software and brute force will do it. With ARM designs, you need to buy a whole new device.

What happens when 8K or whatever is next becomes a thing? Apple products become obsolete and cheap, which is why used iPad Pros get sold for dirt cheap. A 2-3 year old one is now $270 vs. the initial price of $1k. That's almost as bad as other ASIC hardware like GPUs, but $1k of x86 hardware will retain its value longer, and you can upgrade RAM and GPUs, increase storage, and it just works.
 
Joined
Apr 24, 2020
Messages
2,723 (1.60/day)
I really doubt that a micro-op cache is that expensive a mechanism to add; it's definitely not comparable to the size and power of a larger L1 I-cache. They probably didn't add one because it just wasn't required. You're really gonna tell me that Apple is that conscious about their transistor budget? :) And after all, a lot of CPUs out there don't have one either; it's a pretty recent addition.

What I am also saying is that I haven't actually seen any evidence that such a wide decoder is actually worth it. Wide decode means a lot of delay in the circuitry, which means poor clock speed scaling.

I guess I feel like the uop cache in x86 (both Skylake and Zen3) exists because of the decode width problem. In performance-critical sections, Skylake / Zen3 go from 4 uops/tick (from the decoder) to 6 uops/tick (from the uop cache). In effect, it's a way for x86 to reach higher uops/tick... but only in select areas of code (the areas that fit inside a uop cache).

Apple has a superior decoder: just 8 uops/tick no matter what. It's the "more expensive transistor budget" option compared to a uop cache. Apple can achieve 8 uops/tick across the entire 192kB L1 instruction cache, while Intel Skylake / AMD Zen3 can only achieve 4 uops/tick across a 48kB L1 cache (Skylake) / 32kB L1 cache (Zen3), and 6 uops/tick across a smaller region inside the uOp cache.
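
To put rough numbers on that comparison, here is a toy Python model. It is only a sketch under loud assumptions: the front end is the only bottleneck, every uop comes either from the uop cache or from the decoder, and the 4, 6 and 8 uops/tick rates are simply the figures quoted in this post. The function name `blended_uops_per_tick` and the hit fractions are hypothetical.

```python
def blended_uops_per_tick(cache_hit_fraction: float,
                          cache_rate: float = 6.0,
                          decoder_rate: float = 4.0) -> float:
    """Average front-end uops/tick when a fraction of uops hit the uop cache.

    Ticks are what add up, so the two rates are combined as a weighted
    harmonic mean (ticks per uop), not a plain average. Rates default to
    the figures quoted in the post and are assumptions, not measurements.
    """
    if not 0.0 <= cache_hit_fraction <= 1.0:
        raise ValueError("cache_hit_fraction must be between 0 and 1")
    ticks_per_uop = (cache_hit_fraction / cache_rate
                     + (1.0 - cache_hit_fraction) / decoder_rate)
    return 1.0 / ticks_per_uop


FLAT_8_WIDE = 8.0  # the post's figure for an always-8-wide decoder

# Hypothetical fractions of uops served from the uop cache.
for fraction in (0.0, 0.5, 0.8, 1.0):
    rate = blended_uops_per_tick(fraction)
    print(f"{fraction:.0%} from uop cache: {rate:.2f} uops/tick "
          f"(flat 8-wide would be {FLAT_8_WIDE:.0f})")
```

In this model the blended rate always sits between 4 and 6, which is the point being argued here. It says nothing about whether the back end could actually consume 8 uops/tick, which is the counter-argument made earlier in the thread.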
 
Joined
Nov 7, 2016
Messages
159 (0.05/day)
Processor 5950X
Motherboard Dark Hero
Cooling Custom Loop
Memory Crucial Ballistix 3600MHz CL16
Video Card(s) Gigabyte RTX 3080 Vision
Storage 980 Pro 500GB, 970 Evo Plus 500GB, Crucial MX500 2TB, Crucial MX500 2TB, Samsung 850 Evo 500GB
Display(s) Gigabyte G34WQC
Case Cooler Master C700M
Audio Device(s) Bose
Power Supply AX850
Mouse Razer DeathAdder Chroma
Keyboard MSI GK80
Software W10 Pro
Benchmark Scores CPU-Z Single-Thread: 688 Multi-Thread: 11940
There goes AMD's brief lead in gaming. :roll:

But it was never a real lead since the Ryzen 5000 launch was a paper launch.

The 5950X has been sitting on my desk for about a month now, as I have been awaiting the shipment of the Dark Hero. When I placed the order, I saw that a lot of high-end motherboards had been sold out, and a few gaming monitors were sold out as well.
 
Joined
Jan 27, 2015
Messages
1,747 (0.48/day)
System Name Legion
Processor i7-12700KF
Motherboard Asus Z690-Plus TUF Gaming WiFi D5
Cooling Arctic Liquid Freezer 2 240mm AIO
Memory PNY MAKO DDR5-6000 C36-36-36-76
Video Card(s) PowerColor Hellhound 6700 XT 12GB
Storage WD SN770 512GB m.2, Samsung 980 Pro m.2 2TB
Display(s) Acer K272HUL 1440p / 34" MSI MAG341CQ 3440x1440
Case Montech Air X
Power Supply Corsair CX750M
Mouse Logitech MX Anywhere 25
Keyboard Logitech MX Keys
Software Lots
Joined
Oct 7, 2020
Messages
90 (0.06/day)
Location
N California
Processor 5930k @ 3.7 normally; can do 4.1 stable
Motherboard asus rog rampage 5
Cooling noctua server two fan air
Memory 16GB @2133
Video Card(s) 1670 ti
Storage several SSD adding nvme soon
Display(s) lg 77" cx 4k, lg 55 b8 4k
Case big tower lots of fans including side and top fans
Audio Device(s) on board
Power Supply seasonic 1200watt
Mouse adjustable vertical
Keyboard really cheap
Software windows 8.1, Linux -dual boot different drives
They should call them Turbo Rockets--they'd sell like hot cakes and maybe cook them too.
But seriously I'm starting to really feel, between this and AMD's new offerings, the upgrade itch. Just need a little patience for all these new goodies to become obtainable.
 
Joined
Dec 31, 2009
Messages
19,372 (3.54/day)
Benchmark Scores Faster than yours... I'd bet on it. :)
I started reading this thread and all I saw were people taking jabs at Intel... ridiculous. This forum man... I swear........ :(

Anyway, who knows how true this is... but it's a good sign so far. Wondering what the power draw will be (more than AMD I'd guess), but if IPC is back up there along with clocks and they keep the more reasonable pricing... it sounds like a solid option in the market to me...
 