
AMD Zen 2 12-Core, 24-Thread Matisse CPU Spotted in UserBenchmark

Joined
May 7, 2009
Messages
5,392 (0.95/day)
Location
Carrollton, GA
System Name ODIN
Processor AMD Ryzen 7 5800X
Motherboard Gigabyte B550 Aorus Elite AX V2
Cooling Dark Rock 4
Memory G Skill RipjawsV F4 3600 Mhz C16
Video Card(s) MSI GeForce RTX 3080 Ventus 3X OC LHR
Storage Crucial 2 TB M.2 SSD :: WD Blue M.2 1TB SSD :: 1 TB WD Black VelociRaptor
Display(s) Dell S2716DG 27" 144 Hz G-SYNC
Case Fractal Meshify C
Audio Device(s) Onboard Audio
Power Supply Antec HCP 850 80+ Gold
Mouse Corsair M65
Keyboard Corsair K70 RGB Lux
Software Windows 10 Pro 64-bit
Benchmark Scores I don't benchmark.
it auto submits.

And they know that. They can test this stuff in house and never show anything, but AMD wants and needs us to be talking about their new CPU. We have months before they are available, so this is what they do: "Oh, we just so happened to use some software that posts a score." They will do the same with a real performance test paired with a Radeon VII and top specs close to release.
 
Joined
Jun 10, 2014
Messages
2,985 (0.78/day)
Processor AMD Ryzen 9 5900X ||| Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR ||| Asus P9X79 WS
Cooling Noctua NH-U14S ||| Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz ||| Corsair 8 x 8 GB 1333 MHz
Video Card(s) MSI GTX 1060 3GB ||| MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB ||| Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" ||| Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
All "leaks" like this that are genuine are intentional.
 
Joined
Mar 16, 2017
Messages
2,095 (0.75/day)
Location
Tanagra
System Name Budget Box
Processor Xeon E5-2667v2
Motherboard ASUS P9X79 Pro
Cooling Some cheap tower cooler, I dunno
Memory 32GB 1866-DDR3 ECC
Video Card(s) XFX RX 5600XT
Storage WD NVME 1GB
Display(s) ASUS Pro Art 27"
Case Antec P7 Neo
It’s quite possible it’s intentional. It might be another case of a demo to compare to Intel’s best options. The CES demo was matching a 9900K’s performance for 50W less TDP. This one appears to match a 10C/20T 9900X in multicore for 60W less, and on single-channel DDR4 no less. Of course it’s just the one benchmark, but it also doesn’t appear to be AMD’s strongest possible effort based on the design.
 
Joined
Jun 10, 2014
Messages
2,985 (0.78/day)
Processor AMD Ryzen 9 5900X ||| Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR ||| Asus P9X79 WS
Cooling Noctua NH-U14S ||| Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz ||| Corsair 8 x 8 GB 1333 MHz
Video Card(s) MSI GTX 1060 3GB ||| MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB ||| Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" ||| Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
I put this Zen 2 12-core up against the i9-9900X (10-core) in my table in post #6 for a reason; there are rumors of a stop-gap "Comet Lake" 10-core socket 1151 CPU until Ice Lake is fully here. If true, this will probably have similar characteristics to the i9-9900X with some tweaks, and is probably the CPU the 12-core Zen 2 will be competing against.

While I do expect a final 12-core Zen 2 to be slightly higher clocked and get slightly better single and quad core scores, and the Zen 2 to have the upper hand in energy efficiency, I don't expect there to be a 60W difference in TDP. Let's hope Intel at least ditches the integrated graphics; it has no place in a 10-core.

I do wonder though what place these CPUs deserve in the market. Don't get me wrong, options are fine, but while they look compelling, what market demand do they serve?
It's obviously not gaming. And many heavily multithreaded workloads also consume a lot of memory bandwidth. I guess these are relevant for people looking for a "HEDT lite", perhaps image editing or coding, but probably not heavy encoding or simulations. Personally I would probably not consider these high-core "mainstream" CPUs from either company, as I value the flexibility of more memory bandwidth and capacity, and when investing this much money anyway, the expandability of the platform is also something to consider.
 
Joined
Mar 16, 2017
Messages
2,095 (0.75/day)
Location
Tanagra
System Name Budget Box
Processor Xeon E5-2667v2
Motherboard ASUS P9X79 Pro
Cooling Some cheap tower cooler, I dunno
Memory 32GB 1866-DDR3 ECC
Video Card(s) XFX RX 5600XT
Storage WD NVME 1GB
Display(s) ASUS Pro Art 27"
Case Antec P7 Neo
Skylake-X doesn’t have integrated graphics already. It also uses quad channel memory. Being a different design, I’m not sure how easy it will be for Intel to get it into their desktop socket. And that TDP value is for sustained all-core speed at base clock; turbo will take it way over 165W. I do actually expect there to be a big delta in TDP, as Zen2 is on 7nm and Skylake-X is 14nm. AMD’s individual cores are actually pretty efficient; it’s the InfinityFabric that consumes a fair amount of power, especially when there is more than one CCX on a CPU. I’m curious if the chiplet design of Zen2 will improve this.
 
Joined
Jun 10, 2014
Messages
2,985 (0.78/day)
Processor AMD Ryzen 9 5900X ||| Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR ||| Asus P9X79 WS
Cooling Noctua NH-U14S ||| Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz ||| Corsair 8 x 8 GB 1333 MHz
Video Card(s) MSI GTX 1060 3GB ||| MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB ||| Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" ||| Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
Skylake-X doesn’t have integrated graphics already. It also uses quad channel memory. Being a different design, I’m not sure how easy it will be for Intel to get it into their desktop socket. And that TDP value is for sustained all-core speed at base clock; turbo will take it way over 165W. I do actually expect there to be a big delta in TDP, as Zen2 is on 7nm and Skylake-X is 14nm.
A potential "Comet Lake" 10-core on socket 1151 will probably be an extended Skylake (non-X) design like the recent 8-core Coffee Lake refresh. My reasoning behind this is that the Skylake-X/SP 10-core design has a 6-channel memory controller, 2× UPI links and AVX-512, all of which are "wasted" die space compared to the competition. But who knows, they have done strange things in the past.
 

hat

Enthusiast
Joined
Nov 20, 2006
Messages
21,745 (3.31/day)
Location
Ohio
System Name Starlifter :: Dragonfly
Processor i7 2600k 4.4GHz :: i5 10400
Motherboard ASUS P8P67 Pro :: ASUS Prime H570-Plus
Cooling Cryorig M9 :: Stock
Memory 4x4GB DDR3 2133 :: 2x8GB DDR4 2400
Video Card(s) PNY GTX1070 :: Integrated UHD 630
Storage Crucial MX500 1TB, 2x1TB Seagate RAID 0 :: Mushkin Enhanced 60GB SSD, 3x4TB Seagate HDD RAID5
Display(s) Onn 165hz 1080p :: Acer 1080p
Case Antec SOHO 1030B :: Old White Full Tower
Audio Device(s) Creative X-Fi Titanium Fatal1ty Pro - Bose Companion 2 Series III :: None
Power Supply FSP Hydro GE 550w :: EVGA Supernova 550
Software Windows 10 Pro - Plex Server on Dragonfly
Benchmark Scores >9000
One might wonder whether the single stick of abysmally slow DDR4 has to do with showing off the CPU performance by using a terrible stick of RAM, or if they had to use that RAM to get the system to run...
 
Joined
Jun 10, 2014
Messages
2,985 (0.78/day)
Processor AMD Ryzen 9 5900X ||| Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR ||| Asus P9X79 WS
Cooling Noctua NH-U14S ||| Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz ||| Corsair 8 x 8 GB 1333 MHz
Video Card(s) MSI GTX 1060 3GB ||| MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB ||| Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" ||| Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
DDR4-2667 isn't "abysmally slow".
I wouldn't read too much into the specifics of the setup as an indicator of problems with the platform. It's highly likely that this is some kind of validation setup, and the BIOS which was just a few days old might be an indicator of a BIOS or motherboard testing lab.
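To put rough numbers on the memory discussion, here is a minimal sketch of theoretical peak bandwidth, assuming a standard 64-bit (8-byte) DDR4 channel and ignoring real-world efficiency losses. The function name `peak_bandwidth_gbs` and its parameters are made up for illustration; the channel counts are the single-channel setup from the leak, an ordinary dual-channel desktop, and the quad-channel Skylake-X mentioned earlier in the thread.

```python
def peak_bandwidth_gbs(transfers_mt_s: float, channels: int = 1, bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s (decimal) for DDR memory.

    transfers_mt_s: data rate in megatransfers per second (e.g. 2667 for DDR4-2667)
    channels:       number of populated memory channels
    bus_bytes:      bytes moved per transfer per channel (assumed 8 for a 64-bit channel)
    """
    return transfers_mt_s * 1e6 * bus_bytes * channels / 1e9


if __name__ == "__main__":
    # Compare the leaked single-channel setup with dual- and quad-channel configs
    for label, channels in [("single", 1), ("dual", 2), ("quad", 4)]:
        gbs = peak_bandwidth_gbs(2667, channels=channels)
        print(f"DDR4-2667 {label}-channel: {gbs:.1f} GB/s peak")
```

On these assumptions a single DDR4-2667 channel tops out at roughly 21.3 GB/s, so the handicap in the leaked run comes mostly from running one channel instead of two, not from the memory speed itself.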
 
Joined
Dec 5, 2017
Messages
157 (0.06/day)
That might be possible. Theoretically, that arrangement could have logistical, time and cost savings.

Correct me if I'm wrong, but I don't believe it's possible to have the L3 off-die with the current CCX design. My understanding is that the L3 is a fundamental part of making a CCX a CCX. Each has its own shared pool of L3 victim cache which is fully inclusive of the L2; cores within a CCX use the L3 to communicate with one another, and CCX's communicate with each other by transferring data between each others' L3 pools. Then there's the issue of latency which is already not very good with CCX design. We'll see though, it would make some amount of sense as caches are very costly in terms of die area and don't benefit much from node shrinks, and I suppose having the CCX's all use a single unified L3 could be considered an architectural advancement.
 
Joined
Mar 16, 2017
Messages
2,095 (0.75/day)
Location
Tanagra
System Name Budget Box
Processor Xeon E5-2667v2
Motherboard ASUS P9X79 Pro
Cooling Some cheap tower cooler, I dunno
Memory 32GB 1866-DDR3 ECC
Video Card(s) XFX RX 5600XT
Storage WD NVME 1GB
Display(s) ASUS Pro Art 27"
Case Antec P7 Neo
Correct me if I'm wrong, but I don't believe it's possible to have the L3 off-die with the current CCX design. My understanding is that the L3 is a fundamental part of making a CCX a CCX. Each has its own shared pool of L3 victim cache which is fully inclusive of the L2; cores within a CCX use the L3 to communicate with one another, and CCX's communicate with each other by transferring data between each others' L3 pools. Then there's the issue of latency which is already not very good with CCX design. We'll see though, it would make some amount of sense as caches are very costly in terms of die area and don't benefit much from node shrinks, and I suppose having the CCX's all use a single unified L3 could be considered an architectural advancement.
This is correct, at least with current Zen. We don’t really know how Zen 2 works yet. However, I’d be very surprised if Zen 2 deviated from this, as that would not only be a pretty significant design change, but it would also likely add latency to the L3 and intercore communication, resulting in IPC decreases.
 
Joined
Jun 10, 2014
Messages
2,985 (0.78/day)
Processor AMD Ryzen 9 5900X ||| Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR ||| Asus P9X79 WS
Cooling Noctua NH-U14S ||| Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz ||| Corsair 8 x 8 GB 1333 MHz
Video Card(s) MSI GTX 1060 3GB ||| MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB ||| Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" ||| Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
Then there's the issue of latency which is already not very good with CCX design. We'll see though, it would make some amount of sense as caches are very costly in terms of die area and don't benefit much from node shrinks, and I suppose having the CCX's all use a single unified L3 could be considered an architectural advancement.
Well, cache is one of the things that benefits the most from die shrinks. Cache is on the least thermally intensive end of the scale, which means it can be packed tighter, compared to FPUs, ALUs and register files, which are on the opposite end of the scale. When it comes to packing cache tightly, it comes down more to placement, in terms of latency and in relation to the other parts of the design that need to interact with it. So cache has traditionally been challenging to place due to its size and the increasing core complexity, but shrinks should generally help this.

I'm not sure making a unified L3 cache for several chiplets is a good idea in general, not primarily because of latency as Darmok N Jalad mentioned, but because of the way L3 works. As you said, L3 is a victim cache, and in most designs it's an inclusive cache. There is a reason why Skylake-X changed this: it's a very inefficient use of die space, and it also means that increasing L2 will decrease the efficiency of L3. As you probably know, modern CPUs typically split L1 cache into instruction and data caches, while L2 and L3 hold both. And while L3 cache is shared between multiple cores, the actual sharing is commonly very minimal. The entire cache is overwritten every few microseconds, so the chance of two cores needing data from the same cache line is very small, because when you have multiple threads working, they have to use separate data, otherwise they would stall all the time. So the only thing that is generally shared between cores is instructions, if the cores are executing the same part of the code of course. And the few times the L3 victim cache is useful for data, it's usually for the same core that evicted it. So to sum up, L3 is largely wasteful in its current application, and only gives minor benefits.

I think it's time to re-evaluate L3 cache's role, and the changes Intel made in Skylake-X are probably just the beginning. Perhaps a split L3 cache, or an instructions-only L3 cache? Perhaps L3 shouldn't be shared and be data only, but L4 be instructions only and shared?
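For anyone who wants to see the victim-cache behaviour described above in concrete terms, here is a toy sketch. It assumes a single core, LRU replacement in both levels, and an exclusive victim L3 that is filled only by lines evicted from L2. The class `VictimHierarchy` and its sizes are hypothetical and are not a model of any real Zen or Skylake-X part.

```python
from collections import OrderedDict


class VictimHierarchy:
    """Toy model: one core's L2 backed by an exclusive victim L3 (LRU in both)."""

    def __init__(self, l2_lines: int, l3_lines: int):
        self.l2 = OrderedDict()  # oldest entry first, most recently used last
        self.l3 = OrderedDict()
        self.l2_lines = l2_lines
        self.l3_lines = l3_lines

    def _fill_l2(self, addr: int) -> None:
        """Insert a line into L2; the LRU line pushed out becomes an L3 victim."""
        self.l2[addr] = True
        if len(self.l2) > self.l2_lines:
            victim, _ = self.l2.popitem(last=False)  # evict least recently used
            self.l3[victim] = True                   # L3 is filled only by L2 evictions
            if len(self.l3) > self.l3_lines:
                self.l3.popitem(last=False)          # oldest victim is dropped entirely

    def access(self, addr: int) -> str:
        """Return which level served the access: 'L2', 'L3' or 'MEM'."""
        if addr in self.l2:
            self.l2.move_to_end(addr)  # refresh LRU position
            return "L2"
        if addr in self.l3:
            del self.l3[addr]          # exclusive: the line moves back up to L2
            self._fill_l2(addr)
            return "L3"
        self._fill_l2(addr)            # miss everywhere: fetch from memory into L2
        return "MEM"


if __name__ == "__main__":
    h = VictimHierarchy(l2_lines=2, l3_lines=4)
    for a in [0, 1, 2, 0, 2]:
        print(a, h.access(a))
```

In this toy model the L3 only ever pays off when a core comes back to a line it evicted itself a short while ago, which is the case the post argues is the common one for data.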
 