Monday, October 17th 2022

AMD Ryzen 9 7950X Posts Significantly Higher Gaming Performance with a CCD Disabled

The AMD Ryzen 9 7950X 16-core processor exhibits some strange behavior in how its maximum boost frequency is spread among its cores. The chip is a multi-chip module with two 8-core CCDs (core complex dies), and we noticed early on in our review that the cores located in CCD-1 boost to a higher frequency than the ones in CCD-2, with differences as high as 300 MHz. CapFrameX found that when CCD-2 is disabled on a machine running Windows 11 22H2, the processor actually delivers higher gaming performance, by as much as 10%. This is mainly because the cores in CCD-2, which have a lower maximum boost frequency, no longer handle processing load from the game; and with CCD-2 disabled, CCD-1 has all of the processor's power budget (up to 230 W) to itself, giving it much higher boost residency across its 8 cores.
Source: CapFrameX

47 Comments on AMD Ryzen 9 7950X Posts Significantly Higher Gaming Performance with a CCD Disabled

#1
Selaya
any plans for a tpu review/bench of that?
#2
InVasMani
That's basically what I said in the forums on the 7950X review, after reading it and spotting the CCX behavior issue W1zzard highlighted.
InVasMani: What does it look like in terms of performance, temps, and boosts if you disable the second CCX, similar to disabling E-cores? I can see that as a reasonable thing to do while gaming or doing lighter tasks, especially if you can do it easily with software from the desktop rather than venturing into the BIOS. I'm not sure whether AMD has set them up to be disabled from the desktop software, though they've done quite a lot with software, so I wouldn't doubt it.

Provided you can still access the full L3 cache with one of the two CCXs disabled, I really don't see an issue with it at all, and if anything it could even provide better performance depending on application and system usage. Perhaps AMD can work with Microsoft on the Windows scheduler to determine what's best based on usage: when the other CCX is effectively idle, disable it until it needs to be activated, or assign it to background time-slice tasks only, not foreground tasks.
I also spotted this earlier on the 5800X3D at 5.5 GHz, and from what I can see they dropped the uncore frequency to reduce CPU temps and/or voltage requirements, slowing the CPU cache frequency to a more pedestrian 1066 MHz rather than something more like 4x that amount. In turn that allows for higher CPU scaling, but with some caveats, though I think the stacked cache softens the performance penalties you'd normally encounter quite a lot, because it avoids lots of cache misses that are actually an order of magnitude worse.
#3
tabascosauz
I noticed @W1zzard tested on Win 11. Even on Zen 3, the Windows 11 scheduler is far less accommodating of 2CCD. I've been saying for a long time that it not only treats 2CCD like a big.LITTLE CPU (CCD2 acting as "little"), but also regularly disrespects CCX hierarchy by juggling load from CCD1 preferred cores all the way onto Windows' designated CCD2 background core, which inevitably incurs inter-CCD performance penalties. Windows 10 at the very least still kept loads within CCDs. Wouldn't be surprised if the 7950X isn't the only CPU suffering this way. Gamers Nexus' review, with the 7950X's abysmal showing in a few games, seems to suggest that.

Can avoid some scheduler behaviours by disabling CPPC Preferred Cores on 1CCD CPUs, but for 2CCD it doesn't do much to avoid Windows picking some CCD2 core.
InVasMani: That's basically what I said in the forums on the 7950X review, after reading it and spotting the CCX behavior issue W1zzard highlighted.

I also spotted this earlier on the 5800X3D at 5.5 GHz, and from what I can see they dropped the uncore frequency to reduce CPU temps and/or voltage requirements, slowing the CPU cache frequency to a more pedestrian 1066 MHz rather than something more like 4x that amount. In turn that allows for higher CPU scaling, but with some caveats, though I think the stacked cache softens the performance penalties you'd normally encounter quite a lot, because it avoids lots of cache misses that are actually an order of magnitude worse.
The 5800X3D result is not the same.

Uncore in CPU-Z is Fabric FCLK for Ryzens.

L3 runs on its own clock that usually (but not always, especially for X3D) mirrors core clocks. It shares neither a clock domain nor a voltage domain with Fabric.
#4
InVasMani
All I know is that dropping uncore frequency on Intel drops temps quite a lot, and I would expect something similar for AMD. It's promising if the scaling in fact performs well enough to be worthy of consideration. It has me interested to see more about it and the possible trade-offs.
#5
Crackong
Isn't this a problem of bug11?
#6
phanbuey
software scheduler issue but can be fixed with updates to kernel?

I thought they already addressed such issues with Zen 3
#7
InVasMani
I think I recall Linus indicating that either Threadripper or Epyc behaves similarly. I'm not sure it's the CCXs at fault so much as that disabling a good number of them confuses the scheduler to a lesser extent. Beyond that, disabling half the cores obviously makes thermals easier to tackle. It seems complicated, much like Hyper-Threading.
#8
Rowsol
Fascinating. Turning off SMT helps too. I'd like to see some benches for a pure gaming setup like that.
#9
ratirt
I think it depends on the CCD and which one is disabled. I'm sure one is a bit slower than the other? Maybe it is the latency issue again but considering win 11, anything is possible.
#10
zmeul
what's the point of buying an expensive platform then disabling 1/2 of it?
#11
Ferrum Master
It just shows that CPPC2 isn't working as it should again... it needs smarter profiling.
#12
Bwaze
Ryzen Gaming Mode ON!
#13
mama
Microsoft strikes again!
#14
The King
So is the 7700X not single CCD? If it is, should it not be outperforming the 7950X in benchmarks?

When you disable 1 CCD on the 7950X, does that mean the L3 cache available is 32MB or is it still at 64MB?

1 CCD with access to 64MB of L3 should be faster in games??
#15
Prima.Vera
What happens if you disable E-Cores for an Intel CPU? Do you gain the same performance in games?
#16
Ferrum Master
mama: Microsoft strikes again!
Nope... this could be much more complicated than you think.

In September there was an ACPI fix speeding up all AMD archs on Linux... TPU doesn't cover that kind of news as it ain't yellow enough.

I like the quote... you don't have to guess who said this.
ACPI is a complete design disaster in every way. But we're kind of stuck with it. If any Intel people are listening to this and you had anything to do with ACPI, shoot yourself now, before you reproduce.
#17
clopezi
zmeul: what's the point of buying an expensive platform then disabling 1/2 of it?
Because it's not a CPU for gaming, it's a CPU for work, and it has very good gaming performance too. If you want TOP gaming performance, maybe it's better to buy another model...
Prima.Vera: What happens if you disable E-Cores for an Intel CPU? Do you gain the same performance in games?
E-cores and P-cores are better handled this way, because they are "smart" cores. In Ryzen, the CCD is a design feature, take it or leave it hehe
#18
tabascosauz
The King: So is the 7700X not single CCD? If it is, should it not be outperforming the 7950X in benchmarks?

When you disable 1 CCD on the 7950X, does that mean the L3 cache available is 32MB or is it still at 64MB?

1 CCD with access to 64MB of L3 should be faster in games??
The idea that 1 core has access to 64MB L3 in 2CCD should theoretically be possible, but pretty clear from history of 2CCD parts that cache on the other CCD either does not matter, or the Ryzen design is simply not designed to take advantage of it.

There's a good reason why CPU-Z describes 2CCD L3 as being 4x16MB (Matisse) or 2x32MB (Vermeer, Raphael), not simply 64MB.
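
On Linux, the same per-CCD split of L3 can be read from sysfs instead of CPU-Z. What follows is only a minimal sketch: it assumes `index3` is the L3 cache (typical on x86, though the `level` file next to it should confirm that), the helper names are hypothetical, and the sample strings in the demo are made up for illustration rather than read from a 7950X.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a Linux CPU list such as '0-7,16-23' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def l3_domains(shared_lists):
    """Group CPUs into distinct L3 domains.

    shared_lists maps a CPU number to its shared_cpu_list string.
    Returns a sorted list of domains, each a tuple of CPU numbers.
    """
    return sorted({tuple(parse_cpu_list(s)) for s in shared_lists.values()})


def read_shared_lists(root="/sys/devices/system/cpu"):
    """Read cache/index3/shared_cpu_list for every CPU that exposes it (Linux only)."""
    result = {}
    for path in Path(root).glob("cpu[0-9]*/cache/index3/shared_cpu_list"):
        cpu = int(path.parts[-4][3:])  # 'cpu12' -> 12
        result[cpu] = path.read_text()
    return result


if __name__ == "__main__":
    # Hypothetical sample, not read from real hardware.
    sample = {0: "0-7,16-23", 8: "8-15,24-31"}
    for domain in l3_domains(sample):
        print(len(domain), "logical CPUs share one L3:", domain)
```

Two separate domains in the output would correspond to the "2x32MB" description above, while a single domain would indicate one shared L3.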
#19
john_
Maybe we have an indication of what to expect from a 3D cache model?
I mean, does one CCD of the 7950X have access to much more cache compared to the 7700X and 7600X?
#20
DrGrossman
It's not the boost, it's the latency penalty between CCDs. My 7950X is locked on all cores at 5.5 in all games and exhibits the same behavior when you set the affinity to just one CCD. For example, in Riftbreaker the difference is ~30 fps. Other games respond well with affinity on physical cores only, like Cyberpunk and Battlefield 2042. In most of the games I saw an uplift in average and 1% lows by manually setting the affinity. The issue is old, but for some reason it is accentuated on this platform. Windows doesn't care, AMD doesn't care, developers don't care either... so don't cripple your CPU, just use Process Lasso :rolleyes:
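
For anyone who wants to see what "affinity on just one CCD" or "physical cores only" means as a bitmask, here is a rough sketch. It assumes logical processors are numbered CCD by CCD and that, with SMT on, the two threads of a core sit next to each other (2k and 2k+1); that is a common layout but not guaranteed, so check your own topology first. The function name is hypothetical.

```python
def ccd_affinity_mask(ccd_index, cores_per_ccd=8, smt=True, physical_only=False):
    """Return an affinity bitmask covering the logical processors of one CCD.

    Assumption: logical processors are enumerated CCD by CCD, and with SMT
    enabled the two threads of a core are adjacent (2k, 2k+1).
    With physical_only=True, only the first thread of each core is selected.
    """
    threads_per_core = 2 if smt else 1
    first = ccd_index * cores_per_ccd * threads_per_core
    mask = 0
    for core in range(cores_per_ccd):
        base = first + core * threads_per_core
        mask |= 1 << base  # first thread of this core
        if smt and not physical_only:
            mask |= 1 << (base + 1)  # SMT sibling
    return mask


if __name__ == "__main__":
    print(hex(ccd_affinity_mask(0)))                      # 0xffff
    print(hex(ccd_affinity_mask(1)))                      # 0xffff0000
    print(hex(ccd_affinity_mask(0, physical_only=True)))  # 0x5555
```

Under those assumptions, a hex mask like this is the kind of value the Windows `start /affinity` switch takes; Process Lasso does the equivalent per process and remembers it.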
#21
HenrySomeone
Just goes to show - for gaming at least, monolithic is still the way to go!
#22
Wirko
tabascosauz: The idea that 1 core has access to 64MB L3 in 2CCD should theoretically be possible, but pretty clear from history of 2CCD parts that cache on the other CCD either does not matter, or the Ryzen design is simply not designed to take advantage of it.

There's a good reason why CPU-Z describes 2CCD L3 as being 4x16MB (Matisse) or 2x32MB (Vermeer, Raphael), not simply 64MB.
Latency when accessing other CCD's L3 is about the same as when going to RAM. Anand's big charts, those that show each core's latency when communicating with each other core, show that clearly.
#23
Vayra86
zmeul: what's the point of buying an expensive platform then disabling 1/2 of it?
2% moar fps bro, you need this to get your chicken dinner
#24
ToTTenTranz
If games make no use of more than 7 cores and disabling the second CCD still makes its L3 cache available to the first (though at a latency penalty, sure), then this makes some sense.
There's more energy and effective bandwidth available to the enabled CCD. There's also the fact that the disabled CCD makes zero requests to the system memory.

All this just means that Zen4 non-X3D is probably more bandwidth-limited than anything else, just like Zen3.
#25
Wirko
btarunr: with CCD-2 disabled, CCD-1 has all of the processor's power budget (up to 230 W) to itself, giving it much higher boost residency across its 8 cores
Regarding the concentration of heat, wouldn't it be better to disable half of the cores in each CCD? Also the full L3 would remain active, and even if it's not really acting as one 64MB block, it still seems better to have two 32MB blocks than only one.