Monday, April 24th 2023

AMD Ryzen 7000X3D Processors Prone to Physical Damage with Voltage-assisted Overclocking, Motherboard Vendors Rush BIOS Updates with Voltage Limiters

AMD Ryzen 7000X3D processors are prone to irreversible physical damage if CPU overclocking is attempted at some of the higher VDDCR voltages (VDDCR being the main power domain for the CPU cores). A Redditor who goes by Speedrookie attempted to overclock their Ryzen 7 7800X3D, leading to an irreversible failure. The motherboard socket and the processor's land-grid contacts show signs of overheating damage caused by the contacts melting from too much current draw.

A Ryzen 7000X3D processor features a special CPU complex die (CCD) with stacked 3D Vertical Cache memory. This cache die is located in the central region over the CCD where its 32 MB on-die L3 cache is located, while the difference in Z-height of the stacked die is filled up by structural silicon, which sits over the regions of the CCD with the 8 "Zen 4" CPU cores. It stands to reason that besides having an inferior thermal transfer setup to conventional "Zen 4" CCDs (without the 3DV cache), the CCD itself has a higher power draw at any given clock speed than a conventional CCD (since it's also powering the L3D). This is the main reason why overclocking capabilities on the 7000X3D processors are almost non-existent, and why the processors' power limits are generally lower than those of their regular Ryzen 7000X counterparts. Attempting to dial up voltage creates a perfect storm for these processors.

Igor's Lab posted a detailed analysis of the region of the Socket AM5 land-grid most susceptible to a burn-out in the above scenario. The central region of the LGA has 93 pins dedicated to the VDDCR power domain, dispersed in a mostly checkered pattern toward the center of the land-grid. Igor isolated 6 of these VDDCR pins in particular, which are most prone to physical damage, as they are located in a region below the CCD that sees it sandwiched between the L3D (stacked 3D Vertical Cache die) and the fiberglass substrate below. Apparently, AMD's thermal and electrical protection mechanisms aren't able to prevent a runaway overheating of the pins that causes the substrate to melt, deform, and bulge outward, resulting in irreversible damage to both the processor and the socket.
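
To get a feel for the currents involved, the arithmetic can be sketched as below. This is a rough illustration only: the 93-pin count comes from the analysis above, while the function name and the power and voltage figures are hypothetical assumptions, not measurements. It also assumes the current is shared evenly across all VDDCR pins, which is exactly what the analysis suggests does not happen near the 6 most exposed pins, so real per-pin figures there could be considerably higher.

```python
# Hypothetical helper for illustration; not taken from the article or Igor's Lab.
VDDCR_PINS = 93  # VDDCR pin count quoted in the article


def average_pin_current(package_power_w: float, vddcr_v: float,
                        pins: int = VDDCR_PINS) -> float:
    """Return the mean current per pin in amperes, assuming perfectly even sharing."""
    if package_power_w < 0 or vddcr_v <= 0 or pins <= 0:
        raise ValueError("power must be non-negative; voltage and pin count must be positive")
    total_current_a = package_power_w / vddcr_v  # I = P / V
    return total_current_a / pins


if __name__ == "__main__":
    # Assumed operating points, chosen only to show the trend.
    for power_w, volts in [(90.0, 1.10), (120.0, 1.20), (160.0, 1.35)]:
        amps = average_pin_current(power_w, volts)
        print(f"{power_w:.0f} W at {volts:.2f} V: {amps:.2f} A per pin on average")
```

Because core power tends to rise faster than voltage when voltage is increased, the total current still goes up with an overvolt, and any uneven distribution concentrates that increase on a handful of contacts.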

Meanwhile, AMD's motherboard partners are rushing to release UEFI BIOS updates for their entire lineups of motherboards, which enforce tighter limits on the VDDCR voltage. MSI is the first motherboard manufacturer with such updates. In a press statement, MSI said that it has redesigned automated overclocking for 7000X3D processors. "The BIOS now only supports negative offset voltage settings, which can reduce the CPU voltage only," the MSI statement to Tom's Hardware reads. "MSI Center also restricts any direct voltage and frequency adjustments, ensuring that the CPU won't be damaged due to over-voltage." On the other hand, the update introduces an automated overclocking feature called Enhanced Mode Boost, which optimizes PBO settings to improve boost frequency residency, without any manual voltage adjustments.
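
The "negative offset only" rule described in the MSI statement can be pictured as a simple clamp. The sketch below is not MSI's firmware logic; the function name and the -300 mV floor are hypothetical values picked for illustration.

```python
def clamp_voltage_offset_mv(requested_mv: int, lowest_allowed_mv: int = -300) -> int:
    """Clamp a requested CPU voltage offset so it can only lower the voltage.

    Positive offsets (over-voltage) are forced to 0, and very deep undervolts are
    limited to `lowest_allowed_mv`, an assumed floor for this sketch.
    """
    if lowest_allowed_mv > 0:
        raise ValueError("lowest_allowed_mv must be zero or negative")
    return max(lowest_allowed_mv, min(requested_mv, 0))


if __name__ == "__main__":
    for request in (50, 0, -30, -500):
        print(f"requested {request:+d} mV -> applied {clamp_voltage_offset_mv(request):+d} mV")
```

With a rule like this, a user request for a positive offset is silently reduced to zero, while undervolting requests pass through up to the floor.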
Sources: Tom's Hardware 1, 2, Igor's Lab, Speedrookie (Reddit)

258 Comments on AMD Ryzen 7000X3D Processors Prone to Physical Damage with Voltage-assisted Overclocking, Motherboard Vendors Rush BIOS Updates with Voltage Limiters

#201
Zubasa
AusWolf: I wonder if there's been any more news about this, or any more cases confirmed.
Wait for GN's video on the topic, it is going to be spicy.
#202
AusWolf
Zubasa: Wait for GN's video on the topic, it is going to be spicy.
Nice. I gather it's actually Asus at fault here.
#203
Zubasa
AusWolf: Nice. I gather it's actually Asus at fault here.
It was pretty clear from the start that this is a board vendor fault. They have been sneaking in auto OCs to make themselves look good in reviews, and it is nothing new.
With them constantly trying to one-up each other, it was only a matter of time until one of them went too far.

The CPU by itself is just an inert object. It cannot generate any of the voltages it requires to function.
I have never heard of CPUs killing motherboards, only the other way around.
#204
Dirt Chip
Zubasa: It is pretty clear from the start it is a board vendor fault, they have been sneaking in Auto OCs to make themselves look good in reviews, and it is nothing new.
By constantly trying to one up each-other, It is only a matter of time until one of them went too far.

The CPU by itself is just an inert object. It cannot generate any of the voltages it requires to function.
I have never heard of CPUs killing motherboards, only the other way around.
Thing is, even at extreme voltage and temperature, the CPU's power/voltage/temperature protection should kick in and shut down before any motherboard damage occurs.
Unless all vendors bypass those protections (and then it will be an amusing shit-show to watch), the CPU protection design is responsible as well.
The bus is long, plenty of room for all under it..

Also, it will be an interesting debate about whose fault it is when you get an out-of-the-box OC from the mobo settings that practically cancels your warranty before you even use the product.
#205
AusWolf
Zubasa: It is pretty clear from the start it is a board vendor fault, they have been sneaking in Auto OCs to make themselves look good in reviews, and it is nothing new.
By constantly trying to one up each-other, It is only a matter of time until one of them went too far.

The CPU by itself is just an inert object. It cannot generate any of the voltages it requires to function.
I have never heard of CPUs killing motherboards, only the other way around.
Exactly my thoughts! Yet, here we are with a headline article titled "Ryzen 7000X3D CPUs have a voltage problem". No, they don't. Certain Asus boards with certain BIOS versions do. This is some Der8auer level of flaming and shaming. At least TPU should be more technically literate than this. :shadedshu:
Dirt Chip: Thing is, even at extreme voltage and temperature, the CPU's power/voltage/temperature protection should kick in and shut down before any motherboard damage occurs.
Unless all vendors bypass those protection (and then it will be an amusing shit-show to watch) than the CPU protection design is responsible as well.
The bus is long, plenty of room for all under it..
I'm wondering how much of that protection is built into the CPU itself, and how much of it is part of the motherboard BIOS.
Dirt Chip: Also, it will be an interesting debate about whose fault it is when you get an out-of-the-box OC from the mobo settings that practically cancels your warranty before you even use the product.
That's been an ongoing debate ever since boost clocks were invented. Interesting question, nonetheless.
#206
Zubasa
Dirt Chip: Thing is, even at extreme voltage and temperature, the CPU's power/voltage/temperature protection should kick in and shut down before any motherboard damage occurs.
Unless all vendors bypass those protection (and then it will be an amusing shit-show to watch) than the CPU protection design is responsible as well.
The bus is long, plenty of room for all under it..

Also, it will be interesting debate about whos fault is when you get out-of-the-box OC from the mobo settings that practically cancel your warranty before even using the product.
AusWolf: I'm wondering how much of that protection is built into the CPU itself, and how much of it is part of the motherboard BIOS.


That's been an ongoing debate ever since boost clocks were invented. Interesting question, nonetheless.
I am leaning towards the motherboard BIOS, in most cases.
For example, on Intel's page the TjMax of my 12700K is 100C. Yet my motherboard has the option to increase that to 118C, and when I tested it with Prime95 smallFFT, the CPU no longer throttled at 100C.
It shows that the mobo can override the protection, at least to a certain degree.
#207
Dirt Chip
AusWolf: I'm wondering how much of that protection is built into the CPU itself, and how much of it is part of the motherboard BIOS.


That's been an ongoing debate ever since boost clocks were invented. Interesting question, nonetheless.
If the problem occurs on mobos from different vendors, then AMD is the root source.
Also, the design of the CPU sensors is solely AMD's, so if they overlooked a (rare) case that lets the CPU reach extreme temperatures, then it's their head on the spike (read as: just RMA and move on).
I guess that AMD makes a basic BIOS and then each vendor modifies it to suit themselves, so it may be that the CPU hardware/design/sensors are spotless and the problem is the basic BIOS from AMD not using the data in the right way (and then it's an easy fix).
#208
Zubasa
Dirt Chip: If the problem occurs on mobos from different vendors, then AMD is the root source.
Also, the design of the CPU sensors is solely AMD so if they over-look a (rear) case that enable the CPU to reach extreme temp than it`s there head on the spike (read as: just RMA and move on).
I guess that AMD make a basic bios and then each vendor modify to their suite so it may be that the CPU hardware\design\sensors is spotless and the problem is the basic bios from AMD that doesn't use the data the right way (and then easy fix).
It is not as simple as you think. On my old X399 board there is actually a physical switch labeled ProcHot that disables the thermal shutdown.
It is one thing for the CPU to signal the mobo to shut down; whether the board keeps going or not is another.
#209
Dirt Chip
Zubasa: It is not as simple as you think. On my old X399 board there is actually a physical switch labeled ProcHot that disables the thermal shutdown.
It is one thing for the CPU to signal the mobo to shutdown, it is another thing if the board keeps going or not.
Yep, and if this is the case, a power play (with and without quotes :)) between mobo vendors and AMD over who controls critical CPU protection, all done on the consumer's back, then as I said it will be a very good show to watch.

It can put off many mainstream consumers who just want a solid, reliable build. Hope it will never come to that.
#210
Zubasa
Dirt Chip: Yep, and if this is the case, a power play (with and without quotes :) between mobo vendors and AMD over who controls critical CPU protection, then as I said it will be a very good show to watch.

It can put off many mainstream consumer that just want a solid, reliable build. Hope it will never come to that.
TBH the option already exists, for people who just want something reliable.
Those are the machines from the major OEMs like Dell / HP / Lenovo / Acer.
These machines tend to have almost no options in their BIOS, thus very few points of failure. No XMP / EXPO / MCE / PBO whatsoever.
Also no finger pointing over who is at fault when things go south, you just call the brand you bought it from.
There are many reasons why these are the machines you see in businesses.

I would assume even Asus' own non-ROG OEM machines are much more failsafe than their gaming stuff.
#211
Dirt Chip
Zubasa: TBH the option already exists, for people who just want something reliable.
Those are the machines from the major OEMs like Dell / HP / Lenovo / Acer.
These machines tends to have almost no options in their bios thus very little points of failure. No XMP / EXPO / MCE / PBO whatsoever.
Also no finger pointing with who is at fault when things goes south, you just call the brand you brought it from.
There are many reasons why these are the machines you see in businesses.

I would assume even Asus' own non-ROG OEM machines are much more failsafe than their gaming stuff.
This is very bad advice, opting for those OEMs even for a normal day-to-day PC..
Better to go with a hand-built Intel system instead, isn't it?
#212
Zubasa
Dirt Chip: This is very bad advice, opting for those OEMs even for a normal day-to-day PC..
better go Intel hand-build instead, isn't it?
FYI, DIY boards are doing all kinds of hackery on the Intel side as well.
You seem to be under the impression that this can only happen on AMD machines.

There is one difference on the Intel side: there are the non-K CPUs and B / H series boards that are much more locked down.
Until 12th gen you could not even do memory OC on B series boards; those effectively perform like OEM machines.
#213
Dirt Chip
Zubasa: I am leaning towards the motherboard BIOS, in most cases.
Wasn't that phenomenon confirmed on boards from multiple vendors?
So all of them, separately, are doing the same hidden off-spec overvolting and/or bypassing fundamental CPU protection?
Because all of them removed all the old BIOS versions, I suspect that AMD contacted them and instructed them what to do (in the form of a new restrictive BIOS) until further info emerges.
#214
Zubasa
Dirt Chip: Wasn't that phenomenon confirmed on boards from multiple vendors?
So all of them, separately, doing the same hidden off-spec overvoltageing and\or bypass fundamental CPU protection?
Because all of them removed all old bios I suspect that AMD contact them and instruct what to do (as a new restrictive bios) until further info emerge.
Yes, you can look up different articles by the likes of GN and Hardware Unboxed over the years.
There is specifically a "normal" setting instead of "auto" in each voltage setting on Gigabyte boards, which was implemented upon reviewers' request.
That setting means the board actually respects the Intel / AMD voltage spec instead of the presets "optimized" by the board vendor.
Additionally, there is an "Intel POR" setting that sets both PL1 and PL2 to the Intel spec (on the 12700K @190W) instead of auto, which is 4095W / unlimited.
This was added so that reviewers have an easier time testing the CPU actually at stock.
Vayra86: Correct, I remember this one vividly

www.gamersnexus.net/guides/3590-dont-run-z490-motherboards-with-default-settings-for-your-build
I could easily fish for another handful of generations/sockets where this has occurred at one or more mobo mfgrs.

So again... user due diligence is needed, has been needed and will always be needed with DIY building. It's why we call it DIY. People need to get a life and a good bit of common sense. And AMD/mobo partners need to lock this down on X3Ds just as well, because there is literally no purpose in leaving it unlocked. And if unlockable, it needs to come with a warning on the UEFI toggle.

Frankly if you fried an X3D, you're just a n00b. And you just learned a thing. RMA it / get new one
/thread
Just in case you missed this post from another user.
#215
Dirt Chip
Zubasa: FYI, DIY boards are doing all kinds of hackery on the Intel side as well.
You seems to be under the impression that this can only happen on AMD machines.

There is one difference on the Intel side, on Intel there are the Non-K CPUs and B / H series boards that are much more lock down.
Until 12th gen you cannot even do memory OC on B series boards, those effectively performs like OEM machines.
I'm talking about a possible 'hot potato' of responsibility being thrown between AMD and the vendors (as Steve from GN suggested) causing consumers to choose the 'less risky option', simply because of the bad product image this fight will create.
That is not a far-fetched scenario, just as it wouldn't be on the Intel side.
Shit can happen and has happened on the Intel side more than once, but now it is the AMD side that takes the heat.
If you see a fight, you probably go elsewhere to buy ice cream (except us tech nerds, who prey on every sensational detail and debate it to death..).

And on top of it all, just as you said, going non-K Intel is pretty safe. And just like that, the restrictiveness of the platform ('safer' non-K) becomes its best merit...
#216
Zubasa
Dirt Chip: I'm talking about a possible 'hot potato' of responsibility being thrown between AMD and the vendors (as Steve from GN suggested) causing consumers to choose the 'less risky option', simply because of the bad product image this fight will create.
That thing is not a far fetched scenario as with Intel side.
Shit can and have happen on Intel side more than once, but now it is AMD side who take the heat.
If you see a fight , you probably go elsewhere to buy ice-cream (except us, tech nerds, that pray on each yellowish detail and debate it to death..).

And on top of all, just as you said- going non-K Intel is pretty safe. And just like that, the restrictiveness of the platform ('safer' non-K) became it`s best merit...
FYI, they figured out ways to overclock non-k CPUs too on 12th gen....
On the older gen, as soon as you increase BCLK over 102 the system will refuse to boot, on 12th gen somehow it is no longer the case.
Intel Alder Lake non-K Overclocking via BCLK – der8auer

That Vcore though....
#217
Dirt Chip
Zubasa: Yes, you can look up different articles by likes of GN and Hardware Unboxed over the years.
There is specifally a "normal" setting instead of "auto" in each voltage setting on Gigabyte boards which were implemeneted up on reviewers request.
That setting means the board actually respects the Intel / AMD voltage spec instead of the presets "optimized" by the board vendor.
Additionally there is an "Intel POR" setting that sets both PL1 and PL2 to the Intel spec (on the 12700K @190W) instead of auto which is 4095W / unlimited.
This was added so that reviewers have an easier time testing the CPU actully at stock.



Just in case you missed this post from another user.
I know about vendors changing turbo limits, on both Intel and AMD boards, to gain more perf, and that it is much more pronounced on Intel boards (13900K to 350+ W, I know, I have one with the same AERO GB board you have), but here we are talking about something else: fundamental CPU protection that somehow didn't work even with extreme off-spec values.
As if somehow the hierarchy of protection changed (or was just plain cancelled) and the CPU was allowed to first go to through-the-roof voltage (analogous to the 4095 W setting) and only then apply the temperature control (95-105 degrees), as opposed to "go to unlimited W as long as you stay under x temp".
Zubasa: FYI, they figured out ways to overclock non-k CPUs too on 12th gen....
On the older gen, as soon as you increase BCLK over 102 the system will refuse to boot, on 12th gen somehow it is no longer the case.
Intel Alder Lake non-K Overclocking via BCLK – der8auer

That Vcore though....
A nice fun fact (and limited compared to real OC, I might add) that I know about, thank you. And so, it just strengthens your own argument that non-K is 'safer'. You need quite a few manual BIOS changes to do that BCLK OC. No ordinary consumer will do that, plus no vendor will ship a non-K BCLK-OC PC out of the box (if they have any sense in them..)
#218
AusWolf
How hard would it be for Intel and AMD to have stricter specs and enforce motherboard manufacturers to use them as the default setting?
#219
Bomby569
It's not just ASUS. Gigabyte, Asrock, Biostar, so something else is broken.

It's a bit like the Nvidia cables, if they were better designed, clear rules on how to use, systems in place to prevent misuse, QA, this would not have happened.

Now it's blame shifting, but I blame AMD and the partners. AMD surely tests their CPUs on the actual mobos they will be working on. Come on. This is like the most basic QA. But just like in games these days, we do the QA and beta testing, and engineering control.

Cases and cases keep piling up; these companies, software and hardware, just don't do QA, PERIOD.
#220
Dirt Chip
AusWolf: How hard would it be for Intel and AMD to have stricter specs and enforce motherboard manufacturers to use them as the default setting?
I guess very easy, but not very welcome all the same. It might lead AMD to restrict OC just as Intel does on non-K CPUs.
We don't need more restrictions on every level of OC, just for the very basics to work rock solid (that is, shutting down before any permanent and/or physical damage to the CPU/mobo).

My guess: a mixed responsibility that starts with AMD overlooking a rare case (as those incidents are very few so far), triggered by mobo vendors that changed the optimal settings (probably EXPO related) in order to get better out-of-the-box perf for reviews, without saying that it cancels the warranty.
Zen 4 is a new platform, so despite all the QA, a glitch found its way out and made a mess.
#221
chrcoluk
kapone32: Why do people not just leave the CPUs as they are and as for all the bashing about rushed product and being the fault of AMD. That is not true. One of the things that people who have built many PCs knows is that Gigabyte especially likes to push more voltage than reported by software but MSI and AS Rock are not far behind. Only buy Asus TUF and above as Prime is treated like budget by Asus. I have not seen my 7900X3D go past 1.36 volts and I don't mind at all. I know that there is a narrative out there that OC is always a good thing but the current generation of chips are already tuned to perform as it is not like these chips are being released in Dead space. Thank Intel for taking up the gauntlet and challenging AMD. When I had my 5800X3D I lamented about the clock speed. Well looks like the 7800X3D can also boost over 5 GHZ and that is not something to diminish. DDR5 pricing is no longer a mitigating factor either but the only thing I would do for OC would be to use AMD Software and the one button choices like undervolting.
Good that AMD chips are not going to 1.5v at stock anymore?

Agree, the only time I tinker now would be undervolting to reduce power, modern CPUs and GPUs are basically pre binned to run at almost as fast as they can go out of the box. I remember trying on my 2700X, PBO was actually slower than stock as it hit temp throttling earlier, and when I tried manual o/c, even an extra 100mhz would freeze during cinebench.

The 5600G is so fast out of the box that I didn't even bother trying; I just disabled XFR for lower voltage and power consumption and have run it that way since (it's been used mostly for server-type workloads, so XFR is not important).
Zubasa: Yes, you can look up different articles by likes of GN and Hardware Unboxed over the years.
There is specifally a "normal" setting instead of "auto" in each voltage setting on Gigabyte boards which were implemeneted up on reviewers request.
That setting means the board actually respects the Intel / AMD voltage spec instead of the presets "optimized" by the board vendor.
Additionally there is an "Intel POR" setting that sets both PL1 and PL2 to the Intel spec (on the 12700K @190W) instead of auto which is 4095W / unlimited.
This was added so that reviewers have an easier time testing the CPU actully at stock.



Just in case you missed this post from another user.
An eye opener, on my ASRock boards I observe enabling XMP pumps excessive voltage to System Agent. Also voltage to CPU at default was above intel stock vcore.
#222
Redwoodz
Dirt Chip: If the problem occurs on mobos from different vendors, then AMD is the root source.
Also, the design of the CPU sensors is solely AMD so if they over-look a (rare) case that enable the CPU to reach extreme temp than it`s their head on the spike (read as: just RMA and move on).
I guess that AMD make a basic bios and then each vendor modify to their suite so it may be that the CPU hardware\design\sensors is spotless and the problem is the basic bios from AMD that doesn't use the data in the right way (and then easy fix).
So if a buyer buys a AM5 motherboard with an old bios(preX3D) and inserts an X3D chip without flashing the new bios with proper support, who's at fault?
#223
Dirt Chip
Redwoodz: So if a buyer buys a AM5 motherboard with an old bios(preX3D) and inserts an X3D chip without flashing the new bios with proper support, who's at fault?
I think it should be impossible to do (the PC shouldn't POST at all in that case), and it's first AMD's responsibility to set those guidelines and then each vendor's to make sure their BIOS is solid about it.

The bare minimum is to display a warning message on POST/BIOS that requires a BIOS update and warns that the warranty is void if it's not done.

And even if the user disregards all of that, the CPU shouldn't kill itself and take the mobo with it.
#224
Klemc
I applied STRIX B650E-E GAMING BIOS 1413...
#225
Serhend
Everything works the best just out of the box.
The time and effort spent to achieve a stable OC is not worth it especially the first 3 years of having a new gen GPU or CPU. Only after that it might make sense to extract that 5 to 15% at best extra. By that time the product is also worth less than what it is on day 1 anyways.