
AMD Ryzen 9 7900X CPU-Z Benched, Falls Short of Core i7-12700K in ST, Probably Due to Temperature Throttling

Anyone who's actually been paying attention already knew that Zen 4's clock-normalized IPC only matches Alder Lake, not Raptor Lake. In FP, Raptor is over 4% faster per clock than Zen 4; in integer, it is about 0.5% faster.

With Raptor Lake clocking higher, and apparently able to hold those clocks better, the raw performance win is easy to predict.

How this plays out in real-life applications is going to have a lot to do with cache and memory performance, though. Every indication is that Raptor Lake will have a DDR5 memory-speed advantage but be at a disadvantage against Zen 4's cache.

Reminder:

[attached slide]
 
It has nothing to do with cooling issues.

AMD already showed in their slides that CPU-Z is the least relevant metric for their next-gen Zen 4 CPUs. Go read those slides from the reveal.

CPU-Z is totally meaningless and doesn't proxy for average performance at all.

You people will come up with anything, won't you?

IPC is IPC.
 
Actually, Skylake and Zen 2/3 have similarly sized structures, but Zen 3 has a superior front end (much better branch prediction and larger micro op cache) and cache hierarchy. Skylake's derivatives only put up a fight due to high clock speeds and low memory latency.

I agree that AMD needs to focus on increasing IPC, as increasing clocks is not a winning strategy for servers. However, servers don't need the highest single-thread performance; overall throughput with acceptable single-thread performance is the need there. Increasing core counts is the primary lever for this, and as interconnect power is already too high, they may opt to increase the number of cores per CCD in the standard (non-Zen 4c) CCDs as well.
 
You people will come up with anything, won't you?

IPC is IPC.
"IPC" in the way it's used around x86 CPUs has essentially nothing to do with the technical term IPC (which can be calculated relatively simply from hardware features present in the core, at least for in-order cores), and is instead a summation of general clock-normalized performance typically derived from running a broad selection of benchmark workloads while normalizing for clock speeds. This is the only really useful understanding of the term for these CPUs, as they are so complex and their workloads are so diverse as to render any simplistic, technical understanding of IPC entirely useless in light of real-world performance. There's nothing fanboyish about this - it's just accepting the reality that a massively complex OoO CPU with advanced instruction handling, ILP optimizaitons, etc., makes any technical understanding of IPC useless. This is also exactly how Intel uses the term IPC - as "performance in a selection of workloads per core per clock".
 
You fanboys will come up with anything, won't you?

IPC is IPC.
That's like saying you can read one page of a difficult mathematics textbook in the same time as a page of your favourite young-adult book.
 
Actually, Skylake and Zen 2/3 have similarly sized structures, but Zen 3 has a superior front end (much better branch prediction and larger micro op cache) and cache hierarchy. Skylake's derivatives only put up a fight due to high clock speeds and low memory latency.

I agree that AMD needs to focus on increasing IPC, as increasing clocks is not a winning strategy for servers. However, servers don't need the highest single-thread performance; overall throughput with acceptable single-thread performance is the need there. Increasing core counts is the primary lever for this, and as interconnect power is already too high, they may opt to increase the number of cores per CCD in the standard (non-Zen 4c) CCDs as well.

Servers for many, many uses are really not as straightforward as just measuring IPC.

A lot of uses don't really need floating point math, or frankly don't need much compute at all. This is one of the problems with looking at something like SPEC for servers. SPEC is good for measuring HPC application performance, but what percent of servers are spending any significant time doing that? My guess is something well below 5%.

Many implementations will use a DB2 database from a mainframe to serve up the data, and use an HPC compute platform to analyze that data, and then send results back. You use the right tool for the right task.

AMD's real power in the server space is concentrated in high core counts in small spaces with low heat density. Selling those cores cheap, with low maintenance cost, targets cloud providers and websites, neither of which really needs high IPC on any single thread. That is really where AMD excels: when 10,000 people are hitting your web page and you just need a bunch of cores to prevent context-switch thrashing, but they don't need to be fast cores. I would call these 'front-end' applications.

For back-end applications, core density becomes a minor expense and IPC matters, because of licensing costs. I'd rather have 48 very fast cores on a DB server than 96 slow ones, because every core I put on that database license costs me about $12,000. Do the math.
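To actually do that math (a trivial sketch; the ~$12,000-per-core figure is just the one quoted above, and real licensing prices vary by vendor and contract):

```python
# Hypothetical per-core licensing cost, from the figure quoted above.
PRICE_PER_CORE = 12_000

for label, cores in (("48 fast cores", 48), ("96 slow cores", 96)):
    print(f"{label}: ${cores * PRICE_PER_CORE:,}")
# 48 fast cores: $576,000
# 96 slow cores: $1,152,000 -- the slower box costs $576,000 more to license
```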
 
Servers for many, many uses are really not as straightforward as just measuring IPC.

A lot of uses don't really need floating point math, or frankly don't need much compute at all. This is one of the problems with looking at something like SPEC for servers. SPEC is good for measuring HPC application performance, but what percent of servers are spending any significant time doing that? My guess is something well below 5%.

Many implementations will use a DB2 database from a mainframe to serve up the data, and use an HPC compute platform to analyze that data, and then send results back. You use the right tool for the right task.

AMD's real power in the server space is concentrated in high core counts in small spaces with low heat density. Selling those cores cheap, with low maintenance cost, targets cloud providers and websites, neither of which really needs high IPC on any single thread. That is really where AMD excels: when 10,000 people are hitting your web page and you just need a bunch of cores to prevent context-switch thrashing, but they don't need to be fast cores. I would call these 'front-end' applications.

For back-end applications, core density becomes a minor expense and IPC matters, because of licensing costs. I'd rather have 48 very fast cores on a DB server than 96 slow ones, because every core I put on that database license costs me about $12,000. Do the math.
Well summarized. And that seems to be pretty much why AMD is expanding into an ever-growing set of slightly different EPYC lines - lower-power, density-optimized, clock-optimized, and now 3D cache-equipped too. But there are other challenges to adding cores: as you reach higher core counts, the impact of any increase diminishes. If you have a 32-core CPU and add 32, that's a 100% increase, but when you're at 128 cores and add 32, that's just 25% more - yet the difficulty of adding those additional 32 cores is going to be higher (and thus more expensive). At some point, increasing cores per socket just doesn't make much sense (though this is a moving target as various technologies develop), as diminishing returns are eclipsed by costs even for those workloads that can actually utilize all those cores. That's why maintaining a focus on IPC improvements is so important - it's the one thing that makes nearly everything better (even if you don't need the performance, you can clock down and save power).
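A quick illustration of that diminishing-returns arithmetic (a trivial Python sketch; the core counts are just the ones from the example above):

```python
# The same +32 cores is a smaller relative gain the higher the base count.
for base in (32, 64, 96, 128):
    print(f"{base:3d} -> {base + 32:3d} cores: +{32 / base:.0%}")
# prints:
#  32 ->  64 cores: +100%
#  64 ->  96 cores: +50%
#  96 -> 128 cores: +33%
# 128 -> 160 cores: +25%
```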
 
Servers for many, many uses are really not as straightforward as just measuring IPC.

A lot of uses don't really need floating point math, or frankly don't need much compute at all. This is one of the problems with looking at something like SPEC for servers. SPEC is good for measuring HPC application performance, but what percent of servers are spending any significant time doing that? My guess is something well below 5%.

Many implementations will use a DB2 database from a mainframe to serve up the data, and use an HPC compute platform to analyze that data, and then send results back. You use the right tool for the right task.

AMD's real power in the server space is concentrated in high core counts in small spaces with low heat density. Selling those cores cheap, with low maintenance cost, targets cloud providers and websites, neither of which really needs high IPC on any single thread. That is really where AMD excels: when 10,000 people are hitting your web page and you just need a bunch of cores to prevent context-switch thrashing, but they don't need to be fast cores. I would call these 'front-end' applications.

For back-end applications, core density becomes a minor expense and IPC matters, because of licensing costs. I'd rather have 48 very fast cores on a DB server than 96 slow ones, because every core I put on that database license costs me about $12,000. Do the math.
I agree that for a lot of applications, having fewer fast cores is better than a lot of slower cores, and you're right again that FP is useless for most server applications. This is why AMD has specific SKUs like the EPYC 75F3 for applications with high licensing costs. You're also right that SPEC is oriented towards workstations; that is why you should always test your own application on these servers. However, dismissing AMD's cores as fit only for web servers is very short-sighted.

Well summarized. And that seems to be pretty much why AMD is expanding into an ever-growing set of slightly different EPYC lines - lower-power, density-optimized, clock-optimized, and now 3D cache-equipped too. But there are other challenges to adding cores: as you reach higher core counts, the impact of any increase diminishes. If you have a 32-core CPU and add 32, that's a 100% increase, but when you're at 128 cores and add 32, that's just 25% more - yet the difficulty of adding those additional 32 cores is going to be higher (and thus more expensive). At some point, increasing cores per socket just doesn't make much sense (though this is a moving target as various technologies develop), as diminishing returns are eclipsed by costs even for those workloads that can actually utilize all those cores. That's why maintaining a focus on IPC improvements is so important - it's the one thing that makes nearly everything better (even if you don't need the performance, you can clock down and save power).
Yes, increasing IPC is probably the best way, but IPC gains aren't uniform, as different applications have different bottlenecks. Before the much-lamented end of Dennard scaling, the best way to increase performance for all non-memory-bound applications was to increase clock speed.
 
Anyone who's actually been paying attention already knew that Zen 4's clock-normalized IPC only matches Alder Lake, not Raptor Lake. In FP, Raptor is over 4% faster per clock than Zen 4; in integer, it is about 0.5% faster.

With Raptor Lake clocking higher, and apparently able to hold those clocks better, the raw performance win is easy to predict.

How this plays out in real-life applications is going to have a lot to do with cache and memory performance, though. Every indication is that Raptor Lake will have a DDR5 memory-speed advantage but be at a disadvantage against Zen 4's cache.

Reminder:

[attached slide]
Zen 4c will be a threat to Intel on the mobile front... Meteor Lake needs a big IPC uplift to keep the lead in the mobile market.
 
I agree that for a lot of applications, having fewer fast cores is better than a lot of slower cores, and you're right again that FP is useless for most server applications. This is why AMD has specific SKUs like the EPYC 75F3 for applications with high licensing costs. You're also right that SPEC is oriented towards workstations; that is why you should always test your own application on these servers. However, dismissing AMD's cores as fit only for web servers is very short-sighted.

I didn't say that. I said their strong suit was front end.

When you get into this area, for example benchmarking using TPC, nobody cares what CPU you are using. You're benchmarking a system, not a chip.

"Submission of a TPC-C result also requires the disclosure of the detailed pricing of the tested configuration, including hardware and software maintenance with 7/24 coverage over a three-year period. The priced system has to include not only the system itself, but also sufficient storage to hold the data generated by running the system at the quoted tpmC rate over a period of 60 days. "

 
Well summarized. And that seems to be pretty much why AMD is expanding into an ever-growing set of slightly different EPYC lines - lower-power, density-optimized, clock-optimized, and now 3D cache-equipped too. But there are other challenges to adding cores: as you reach higher core counts, the impact of any increase diminishes. If you have a 32-core CPU and add 32, that's a 100% increase, but when you're at 128 cores and add 32, that's just 25% more - yet the difficulty of adding those additional 32 cores is going to be higher (and thus more expensive). At some point, increasing cores per socket just doesn't make much sense (though this is a moving target as various technologies develop), as diminishing returns are eclipsed by costs even for those workloads that can actually utilize all those cores. That's why maintaining a focus on IPC improvements is so important - it's the one thing that makes nearly everything better (even if you don't need the performance, you can clock down and save power).
Zen 4c cores are brilliant... I'm sure Zen 5 EPYC will have over 350 cores lol
 
I didn't say that. I said their strong suit was front end.

When you get into this area, for example benchmarking using TPC, nobody cares what CPU you are using. You're benchmarking a system, not a chip.

"Submission of a TPC-C result also requires the disclosure of the detailed pricing of the tested configuration, including hardware and software maintenance with 7/24 coverage over a three-year period. The priced system has to include not only the system itself, but also sufficient storage to hold the data generated by running the system at the quoted tpmC rate over a period of 60 days. "

I regret my misunderstanding; however, I still stand by my broader point. These are excellent CPUs for most server applications and, barring AVX-512-based code, would only lose to the Intel alternatives in cases where a single process uses more than 8 cores with frequent inter-core communication. Fortunately, there are very few applications like that. For TPC-C, the CPU is an important component, but the rest of the system matters just as much. However, I was thinking more of TPC-H or TPC-DS, as there the CPU is more important than in TPC-C.
 
looks solid to me tbh
 
Well summarized. And that seems to be pretty much why AMD is expanding into an ever-growing set of slightly different EPYC lines - lower-power, density-optimized, clock-optimized, and now 3D cache-equipped too. But there are other challenges to adding cores: as you reach higher core counts, the impact of any increase diminishes. If you have a 32-core CPU and add 32, that's a 100% increase, but when you're at 128 cores and add 32, that's just 25% more - yet the difficulty of adding those additional 32 cores is going to be higher (and thus more expensive). At some point, increasing cores per socket just doesn't make much sense (though this is a moving target as various technologies develop), as diminishing returns are eclipsed by costs even for those workloads that can actually utilize all those cores. That's why maintaining a focus on IPC improvements is so important - it's the one thing that makes nearly everything better (even if you don't need the performance, you can clock down and save power).

Actually, memory bandwidth and I/O subsystem bandwidth are more important than IPC in most cases.

Compute takes a big back seat to that 95% of the time. HPC is fun to talk about, but in TPC-C, for example, which basically simulates a complete supply chain from order entry to warehouse and inventory operations, performance is going to depend a whole lot more on how quickly you can move data than on how quickly you can analyze it. A lot of that performance goes beyond any single CPU core, and starts to look at how well you can scale up to, say, 100 CPUs and 4,000 cores and still be able to move data around quickly.

Data analysis - the HPC arena where "IPC" as measured by something like SPEC actually matters - has grown a lot, but the core is still all those mundane operations: entering data, reporting on it, updating it, backing it up, and serving it up to front-end apps.
 
Well, it depends on the cooling used. But apparently we will either need a very good CPU cooler for Zen 4 or a higher-wattage PSU for Raptor Lake.

There are physical limits... putting more energy into smaller die sizes makes cooling problematic. Even with water you will hit problems at some point, and you don't want to pay the bill for an LN2-cooled system. We should expect larger chips again - maybe even with a drop in clock speeds - and a massive increase in core counts. Intel is already moving in that direction with Rocket Lake, and AMD will most likely do the same with Zen 5. Zen 4 is probably not the best time to switch sockets.
 
Did anyone notice the uncore frequency in the CPU-Z validation screen? If true, that would be an incredible improvement over Zen 3.
 
CPU-Z is simply a bad benchmark for comparing different processor families. It ran very well on the original Ryzen, beating Skylake at the same clock speed in ST. In version 1.79 the benchmark engine was changed significantly in Intel's favor. It doesn't seem to have any meaning for real-world performance; it's more or less just a microbenchmark that shows some numbers.
 
We should expect larger chips again - maybe even with a drop in clock speeds - and a massive increase in core counts. Intel is already moving in that direction with Rocket Lake, and AMD will most likely do the same with Zen 5. Zen 4 is probably not the best time to switch sockets.
Larger chips cost more to make, and consumer workloads don't need more cores, so that's a bit of a bind for AMD and Intel. There's no reason for consumers to wait if what they're waiting for is more cores. Even the 5950X has unnecessarily many threads. Games are becoming more and better threaded, but they won't move meaningfully past 8c/16t any time soon.
 
Anyone who's actually been paying attention already knew that Zen 4's clock-normalized IPC only matches Alder Lake, not Raptor Lake. In FP, Raptor is over 4% faster per clock than Zen 4; in integer, it is about 0.5% faster.

With Raptor Lake clocking higher, and apparently able to hold those clocks better, the raw performance win is easy to predict.
That IPC comparison was done at 3.6 GHz, but boost clocks will be way higher, and we don't know yet how clock-speed scaling affects IPC. We do know that Intel chips tend to throttle clocks more under heavy AVX utilization, so an IPC comparison at 3.6 GHz might not mean much for actual performance differences at 5+ GHz, especially for FP throughput. Integer performance also matters more on average, and it's practically the same for Zen 4 and Golden/Raptor Cove. It's also wrong that Raptor Lake will have higher clock speeds: 5.7 GHz 7950X vs 5.8 GHz 13900K, 5.6 GHz 7900X vs 5.4 GHz 13700K, 5.4 GHz 7700X vs 5.1 GHz 13600K, 5.3 GHz 7600X. Only Raptor Lake's top model clocks slightly higher. I also wouldn't expect Raptor Lake to maintain its clocks better, because Zen 4 is likely using a lot less power for a single core.
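One concrete mechanism behind that: DRAM latency is roughly constant in nanoseconds, so the same cache miss burns more core cycles at a higher clock, dragging effective IPC down as frequency rises. A toy model (the ~80 ns miss latency is an assumption purely for illustration):

```python
MISS_NS = 80.0  # assumed DRAM miss latency in nanoseconds (illustrative)

for ghz in (3.6, 5.5):
    # cycles lost per miss = latency (ns) * clock (GHz)
    print(f"{ghz} GHz: one DRAM miss ~= {MISS_NS * ghz:.0f} core cycles")
# 3.6 GHz: one DRAM miss ~= 288 core cycles
# 5.5 GHz: one DRAM miss ~= 440 core cycles
```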
 
That IPC comparison was done at 3.6 GHz, but boost clocks will be way higher, and we don't know yet how clock-speed scaling affects IPC. We do know that Intel chips tend to throttle clocks more under heavy AVX utilization, so an IPC comparison at 3.6 GHz might not mean much for actual performance differences at 5+ GHz, especially for FP throughput. Integer performance also matters more on average, and it's practically the same for Zen 4 and Golden/Raptor Cove. It's also wrong that Raptor Lake will have higher clock speeds: 5.7 GHz 7950X vs 5.8 GHz 13900K, 5.6 GHz 7900X vs 5.4 GHz 13700K, 5.4 GHz 7700X vs 5.1 GHz 13600K, 5.3 GHz 7600X. Only Raptor Lake's top model clocks slightly higher. I also wouldn't expect Raptor Lake to maintain its clocks better, because Zen 4 is likely using a lot less power for a single core.

The equivocation is breathtaking.

The SPEC measurements were done at a set clock speed - yes, that's how you measure Instructions Per Clock.

CPU-Z wasn't.
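For reference, textbook Instructions Per Clock really is just retired instructions divided by elapsed core cycles - two counters you can read on Linux with perf stat, which prints an "insn per cycle" line itself. A trivial sketch with illustrative counter values:

```python
def ipc(instructions_retired, core_cycles):
    # Literal IPC: retired instructions over elapsed core cycles.
    return instructions_retired / core_cycles

# Illustrative counter values, not from a real run.
print(ipc(instructions_retired=48_000_000_000, core_cycles=20_000_000_000))  # 2.4
```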
 
Outside of disabled AVX units, Intel does not have meaningfully more dark silicon than AMD does. Their cores simply require more transistors and thus more space.

I clearly remember Lisa saying there would be less "empty space" on the new chips, and therefore smaller dies. Not sure when she said it, but she said it.
 
What mistake?

You've got 8 cores crammed into a tiny little CCD with a bunch of cache that needs to be powered on at all times.

Once you start crossing the 5 GHz mark, temps just fly upwards, and it won't get better the higher or faster you go.

Both camps are pretty much at the limit of what they can do here, really. Going smaller would only yield a heat source that's more difficult to cool.

But really, if I look at those graphs... 50 points behind in ST but a massive improvement in MT. I know my pick.

I don't accept that. It is clearly a strategic mistake driven by the pursuit of profit margins.
It's already 2023; we need more than just 8 cores.

They could have offered a larger die with more cores and lower clocks, but overall higher performance.
 
I clearly remember Lisa saying there would be less "empty space" on the new chips, and therefore smaller dies. Not sure when she said it, but she said it.
That sounds weird to me, but I guess it's possible? That's a very strange thing to be talking about - after all, leaving dark silicon to alleviate thermal density is a long-standing chipmaking convention, and not one that's viewed negatively, nor one that anyone outside of chip designers tends to care about, so it doesn't make much sense to present that as a positive change either.

The equivocation is breathtaking.

The SPEC measurements were done at a set clock speed - yes, that's how you measure Instructions Per Clock.

CPU-Z wasn't.
I mean, there are some minor variations in IPC related to clock speed, as not all parts of the core are necessarily clock-matched - caches, interconnects, etc. But those variations tend to be very small - small enough to be drowned out by the rest of the noise from testing.
 
Not complaining, just adding perspective. <5% is nothing. 13% is a bit more than nothing.

Come on, the context here is important. First of all, Alder Lake did very well in Cinebench - it is literally the best-performing bench for those CPUs - and yet people use those improvements as representative of everything. Actually, Alder Lake's average IPC improvements in real-world tests like those in the AMD slide are quite low. Don't compare this 13-percent number to a Cinebench number for ADL.

Secondly, ADL succeeded 11th gen, which had the same high clock speeds. AMD gets both IPC and massive clock-speed improvements with Zen 4. The jump from Zen 3 to Zen 4 will be quite large. I'm looking forward to making a Zen generation comparison chart.
 
Come on, the context here is important. First of all, Alder Lake did very well in Cinebench - it is literally the best-performing bench for those CPUs - and yet people use those improvements as representative of everything. Actually, Alder Lake's average IPC improvements in real-world tests like those in the AMD slide are quite low. Don't compare this 13-percent number to a Cinebench number for ADL.

Secondly, ADL succeeded 11th gen, which had the same high clock speeds. AMD gets both IPC and massive clock-speed improvements with Zen 4. The jump from Zen 3 to Zen 4 will be quite large. I'm looking forward to making a Zen generation comparison chart.
Don't get me wrong, I like the fact that CPUs are making slightly larger generational jumps, but none of this makes or breaks anything new. We're also going to see higher power draw, partly in exchange for more performance. Not in IPC, but let's see how much of a difference we really get in real life.
 