Monday, April 24th 2023

AMD Ryzen 7000X3D Processors Prone to Physical Damage with Voltage-assisted Overclocking, Motherboard Vendors Rush BIOS Updates with Voltage Limiters

AMD Ryzen 7000X3D processors are prone to irreversible physical damage if CPU overclocking is attempted at some of the higher VDDCR voltages (the main power domain for the CPU cores). A Redditor who goes by Speedrookie attempted to overclock their Ryzen 7 7800X3D, leading to an irreversible failure. The motherboard socket and the processor's land-grid contacts show signs of overheating damage caused by the contacts melting from too much current draw.

A Ryzen 7000X3D processor features a special CPU complex die (CCD) with stacked 3D Vertical Cache memory. This cache die is located in the central region over the CCD where its 32 MB on-die L3 cache is located, while the difference in Z-height of the stacked die is filled up by structural silicon, which sits over the regions of the CCD with the 8 "Zen 4" CPU cores. It stands to reason that besides having an inferior thermal transfer setup to conventional "Zen 4" CCDs (without the 3DV cache), the CCD itself has a higher power draw at any given clock speed than a conventional CCD (since it's also powering the L3D). This is the main reason why overclocking capabilities on the 7000X3D processors are almost non-existent, and the processors' power limits are generally lower than those of their regular Ryzen 7000X counterparts. Attempting to dial up voltage creates a perfect storm for these processors.
Igor's Lab posted a detailed analysis of the region of the Socket AM5 land-grid most susceptible to a burn-out in the above scenario. The central region of the LGA has 93 pins dedicated to the VDDCR power domain, dispersed in a mostly checkered pattern, toward the center of the land-grid. Igor isolated 6 of these VDDCR pins in particular, which are most prone to physical damage, as they are located in a region below the CCD that sees it sandwiched between the L3D (stacked 3D Vertical cache die), and the fiberglass substrate below. Apparently, AMD's thermal and electrical protection mechanisms aren't able to prevent a runaway overheating of the pins that causes the substrate to melt, deform, and bulge outward, resulting in irreversible damage to both the processor and the socket.

Meanwhile, AMD's motherboard partners are rushing to release UEFI BIOS updates for their entire motherboard lineups, which enforce tighter limits on the VDDCR voltage. MSI is the first motherboard manufacturer with such updates. In a press statement, MSI said that it has redesigned automated overclocking for 7000X3D processors. "The BIOS now only supports negative offset voltage settings, which can reduce the CPU voltage only," the MSI statement to Tom's Hardware reads. "MSI Center also restricts any direct voltage and frequency adjustments, ensuring that the CPU won't be damaged due to over-voltage." On the other hand, the update introduces an automated overclocking feature called Enhanced Mode Boost, which optimizes PBO settings to improve boost frequency residency, without any manual voltage adjustments.
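As a rough illustration of the "negative offset only" policy MSI describes, the Python sketch below clamps a requested voltage offset so that it can only lower the CPU voltage. The function name and the millivolt unit are assumptions made for this example; this is not MSI's firmware logic.

```python
def clamp_voltage_offset(requested_offset_mv: int) -> int:
    """Return the offset a negative-only policy would accept.

    Hypothetical helper: positive offsets (over-volting) are reduced to 0,
    while zero and negative offsets (under-volting) pass through unchanged.
    """
    return min(requested_offset_mv, 0)


# A +50 mV request is rejected down to 0; a -30 mV request is kept.
print(clamp_voltage_offset(50))
print(clamp_voltage_offset(-30))
```

Under such a policy, a user can still undervolt the chip, but no setting exposed by the BIOS can raise the voltage above the default.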
Sources: Tom's Hardware 1, 2, Igor's Lab, Speedrookie (Reddit)

258 Comments on AMD Ryzen 7000X3D Processors Prone to Physical Damage with Voltage-assisted Overclocking, Motherboard Vendors Rush BIOS Updates with Voltage Limiters

#151
Hxx
trparky: I'm watching that video now. I'm at the ten-minute mark on the video and Buildzoid said that (whether you want to think of this as good or bad depends upon how you want to look at it) if this issue were to occur to your processor, there's a high chance that it would not be the kind of situation that gradually kills your processor. It would be the kind of situation in which you'd know your processor was dead because in a fraction of a second, your processor would be toast. This isn't a gradual thing at all, at least as he says.

And knowing the kind of content that Buildzoid puts out, I'd have to say that there's a good chance that he knows what he's talking about.

Suffice it to say... if your processor isn't dead by now, you're in the clear.
a high enough vcore yes will most def. kill your chip in an instant. But i always thought that a higher than recommended vcore will degrade your chip to a point where its lifespan will be shortened significantly. I would imagine feeding 1.5V+ (or a higher figure) on the SOC will kill that board rather quickly but probably not in an instant.

In my case when i enable EXPO the SOC vcore goes up from ~1V to 1.35V which is insanely high. This was on an MSI board with the "fixed" bios. Can't imagine if that value hits 1.5V how long until your system goes belly up. Maybe i need to watch Buildzoid's video :)
Posted on Reply
#152
rv8000
The odd thing is though, plenty of users have been pushing 1.3, 1.35, and even 1.38v to VSOC on non 3D chips for months now without reports of failures. If this was actually due to VSOC, and specifically VSOC in relation to EXPO and auto bios rules, there would’ve been hundreds if not thousands of dead chips at this point.

Which leads me to believe the ASUS statement is just to cover a wide net and their ass, in addition to pulling bioses with full voltage control mainly so end users don’t get the option to kill their CPU until they can actually confirm what’s causing this; as opposed to confirmation that it actually is VSOC.

Now we have a bunch of conflicting information and misinformation/panic with no conclusive testing. The next wave will be the loveable YouTube morons like Jay who are not technically inclined enough to make statements on such topics causing more panic through sensationalist videos.
Posted on Reply
#153
jh_berg
TheDeeGee: Why would you even want to OC the fastest gaming CPU in the world?

Why would you even want to OC to begin with in 2023, everything clocks so high out of the box that 15C for 100 MHz isn't worth it.
Indeed.
Posted on Reply
#154
Klemc
How many CPUs have been reported? Between 100 and 1000, or more?
Posted on Reply
#155
Redwoodz
TheDeeGee: This issue is recall worthy, cuz good luck to the average joe flashing a BIOS.
nguyen: Well, the actual user errors here are trusting AMD with their dysfunctional QA.

Every AM5 mobo maker is frantically deleting old BIOSes from their websites and put out latest BIOS with limited voltages, user errors....yeah right
No one is blaming only the user. What you have here is a bad bios from Asus and a user trying to see if he could kill his chip. Period.
Posted on Reply
#157
Klemc
So we have to unmount our AIO to verify, because even with everything on Auto the CPU could burn? Right?
Posted on Reply
#158
wheresmycar
oh this sucks!!

to be frank, i "was" at some point considering splurging for a fresh DDR5 platform with a 7800X3D and one of the first pieces of news (brief overview of the marketing/reviewer material) which came to light was "no overclocking" "3D sensitive to higher voltages" "danger imminent". Actually even before NDA's were lifted there were speculations flying around over "no overclocking" for the X3D counterparts.

So what is being suggested here?

Are the boards out-of-the-box a threat to X3D chips?

Or, are the boards at stock safe for 3D-chips but BIOS-level options allow for peril-driven overclocking?

If the latter, i'm surprised someone would purchase a ~$500 CPU and not partially consider the DO's and DON'Ts. I guess at the same time i can understand why AMD/board partners should employ no-entry zones with locked OC features considering not everyone is going to bother with looking into matters for even partial awareness. But i can't shake off not seeking some level of know-how before being brave enough to play with sensitive settings at the BIOS-level.

One thing i'm a little confused about.... what happened to thermal throttling thresholds? Or any other automated counter measure to keep these CPUs safe from overheating/burning up. This seems like a bigger cockup we can definitely point our fingers towards AMD/B-partners. IMO, in 2023 we shouldn't have to worry about burning shit up unless extreme overclockers enable unlimited flexibility for whatever cause.
Posted on Reply
#159
Chrispy_
A perfect example of why it's always wise to avoid the first iteration of a new platform if you have the self-control and patience to do so.

LGA1700 - issues with the socket bending and first-gen non-heterogenous cores requiring numerous bios fixes, patches, and scheduler re-writes to iron out all the problems.
AM5 - DDR5 stability issues galore, stupid motherboard price hikes, and now motherboard vendors breaking the rules to fry your chip.

Rocket Lake turned out okay, AMD will probably have most of these dumb issues ironed out by Zen5.
Posted on Reply
#160
AusWolf
Klemc: So we have to unmount our AIO to verify, because even with everything on Auto the CPU could burn? Right?
Huh? How?
Chrispy_: motherboard vendors breaking the rules to fry your chip.
That has been an issue with at least one manufacturer on every platform ever since boost clocks were invented. Except that on 11th gen Intel, some boards were heavily criticised for keeping the recommended guidelines. There's no way to please people (especially reviewers), that's for sure.
Posted on Reply
#161
Klemc
AusWolf: Huh? How?


That has been an issue with at least one manufacturer on every platform ever since boost clocks were invented. Except that on 11th gen Intel, some boards were heavily criticised for keeping the recommended guidelines. There's no way to please people (especially reviewers), that's for sure.
AIO cooler!
Posted on Reply
#162
AusWolf
Klemc: AIO cooler!
AIO cooler. Yes. I know. But why would I have to unmount it? What would I want to check?
Posted on Reply
#163
phanbuey
Chrispy_: A perfect example of why it's always wise to avoid the first iteration of a new platform if you have the self-control and patience to do so.

LGA1700 - issues with the socket bending and first-gen non-heterogenous cores requiring numerous bios fixes, patches, and scheduler re-writes to iron out all the problems.
AM5 - DDR5 stability issues galore, stupid motherboard price hikes, and now motherboard vendors breaking the rules to fry your chip.

Rocket Lake turned out okay, AMD will probably have most of these dumb issues ironed out by Zen5.
I mean yes - but as a first adopter of LGA1700 - the "socket bending" was really a non issue and non-hetero core design was basically massively overblown by people who didn't own the CPUs. Most reviews and users didn't run into any issues. I was also a first adopter of AM4 Zen1 (bought it on release) and there were some issues with that for sure (it hated my corsair ram kit with all of its silicon)... that rig is still in service though and never fried the CPU.

Memory stability issues and chips frying are a little worse though - I would say AM5 is sketchier than AM4. AM4 had some memory issues but it was dirt cheap and the MT performance was huge. You were really getting a lot with that setup for the $ so having to fiddle with ram sort of felt ok. This time it's insanely expensive, to use the 7950X3D and 7900X3D you need to perform 25 steps to get them to work most of the time, and also the motherboards might kill the CPU :/.

Doesn't feel as great.
Posted on Reply
#164
Why_Me
phanbuey: I mean yes - but as a first adopter of LGA1700 - the "socket bending" was really a non issue and non-hetero core design was basically massively overblown by people who didn't own the CPUs. Most reviews and users didn't run into any issues. I was also a first adopter of AM4 Zen1 (bought it on release) and there were some issues with that for sure (it hated my corsair ram kit with all of its silicon)... that rig is still in service though and never fried the CPU.

Memory stability issues and chips frying are a little worse though - I would say AM5 is sketchier than AM4. AM4 had some memory issues but it was dirt cheap and the MT performance was huge. You were really getting a lot with that setup for the $ so having to fiddle with ram sort of felt ok. This time it's insanely expensive, to use the 7950X3D and 7900X3D you need to perform 25 steps to get them to work most of the time, and also the motherboards might kill the CPU :/.

Doesn't feel as great.
Lest we forget AM4 and USB issues.
Posted on Reply
#165
wheresmycar
phanbuey: This time it's insanely expensive, to use the 7950X3D and 7900X3D you need to perform 25 steps to get them to work most of the time, and also the motherboards might kill the CPU :/.
But are there reports on damaged CPUs via mobo stock configurations?

I'm also wondering whether PBO or any other auto-OC setting is enabled at stock? If yes, are these contributing factors to running the chip 6 feet under, or is this issue purely related to additional tweak settings?
Posted on Reply
#166
phanbuey
wheresmycar: But are there reports on damaged CPUs via mobo stock configurations?

I'm also wondering whether PBO or any other auto-OC setting is enabled at stock? If yes, are these contributing factors to running the chip 6 feet under, or is this issue purely related to additional tweak settings?
But even so - PBO and XMP are not settings that should ever be able to bypass the CPUs thermal and voltage protection.

Modern CPUs can withstand you pulling off their cooler while they’re at load. There should be no casual settings in the bios that can kill your chip. If you get the LN2 overclocker board and pump 2V through your chip and turn off the OVP then fine - you set it to fry and that’s what it did. Setting EXPO or PBO should never do that.
Posted on Reply
#167
wheresmycar
phanbuey: But even so - PBO and XMP are not settings that should ever be able to bypass the CPUs thermal and voltage protection.

Modern CPUs can withstand you pulling off their cooler while they’re at load. There should be no casual settings in the bios that can kill your chip. If you get the LN2 overclocker board and pump 2V through your chip and turn off the OVP then fine - you set it to fry and that’s what it did. Setting EXPO or PBO should never do that.
ok so its just manual tweaks which are reporting damaged CPUs. I was a little concerned since i've had 2 people buy into 7800X3D/AM5. A third potential buyer is a mate of mine considering the upgrade.
Posted on Reply
#168
tpa-pr
Based on more recent reports of it potentially affecting non-X3D chips, I am thanking my lucky stars that I have never turned on any of the auto-overclock features like PBO and EXPO. I've also let my boss know since he just finished building a 7950X3D build.

I've only gotten back into the new PC building game recently, have new platform launches always been this "interesting"? And if AMD keep making the news it's going to make it harder to playfully rib my Intel/Nvidia enthusiast workmates, sheesh.
Posted on Reply
#169
phanbuey
wheresmycar: ok so its just manual tweaks which are reporting damaged CPUs. I was a little concerned since i've had 2 people buy into 7800X3D/AM5. A third potential buyer is a mate of mine considering the upgrade.
Yeah definitely keep it stock and updated to latest bios for now. Hopefully it’s not widespread.
Posted on Reply
#170
Gica
Chrispy_: LGA1700 - issues with the socket bending and first-gen non-heterogenous cores requiring numerous bios fixes, patches, and scheduler re-writes to iron out all the problems.
Rare cases that did not endanger the processor.
At 12500 I use a two-piece cooler: the sole from the time of AM3 (between lga 1366 and 1700 there is only a 1mm difference in the holes) and a cheap and good cooler, but with a horrible grip and sent for recycling.
The results are ok (capture). Not perfect, but it's ok for me. The important thing is that the temperature of the hottest core dictates the behavior of the processor protections. The first to reach the critical temperature triggers the protection.
In the case of the AM5 socket, it is a much more serious problem. It seems to be a design error, the protections are useless and the risks are huge. You can burn a processor with overclocking, but not that fast.
If you put a 13900KS on the AM5 socket, you blow up the whole neighborhood. :rockout:


Posted on Reply
#171
mama
Hxx: a high enough vcore yes will most def. kill your chip in an instant. But i always thought that a higher than recommended vcore will degrade your chip to a point where its lifespan will be shortened significantly. I would imagine feeding 1.5V+ (or a higher figure) on the SOC will kill that board rather quickly but probably not in an instant.

In my case when i enable EXPO the SOC vcore goes up from ~1V to 1.35V which is insanely high. This was on an MSI board with the "fixed" bios. Can't imagine if that value hits 1.5V how long until your system goes belly up. Maybe i need to watch Buildzoid's video :)
Like you I have enabled EXPO but my SOC vcore is a constant 1.239V. I'm using Kingston Fury DDR5 6000 so I guess horses for courses. I thought about overclocking the ram but will pause to see what happens.
Posted on Reply
#172
Berfs1
Alrighty, here's the deal with warranty. If you are in the US, you are covered under the Magnuson Moss Warranty Act, because applying EXPO also changes other settings that the motherboard auto defined. Because the MOTHERBOARD entered in the unsafe settings and not you, you wouldn't be liable for this kind of damage, the motherboard manufacturer would. While yes the manufacturer of a product has to prove you caused the damage, in this particular case since the motherboard was auto defining unsafe values as safe, it would be the motherboard manufacturer that would be liable for this. Also why did AMD not test these chips beforehand? If over 1.3V SOC voltage instantly kills CPUs, AMD should have known about this prior to launching them. This is mainly AMD's fault for not testing their products and telling motherboard manufacturers what values to NOT auto define. Imagine Ford shipped you a car that should work in ideal conditions, but when driving below sea level, the engine blows up because of the higher air density. Why would that be YOUR fault because you drove it in a place that was 200ft below sea level? It's not like you are visiting a volcano or going to the core of the earth which would need a special kind of car, it's 200 feet below sea level.
Posted on Reply
#173
nguyen
Berfs1: Alrighty, here's the deal with warranty. If you are in the US, you are covered under the Magnuson Moss Warranty Act, because applying EXPO also changes other settings that the motherboard auto defined. Because the MOTHERBOARD entered in the unsafe settings and not you, you wouldn't be liable for this kind of damage, the motherboard manufacturer would. While yes the manufacturer of a product has to prove you caused the damage, in this particular case since the motherboard was auto defining unsafe values as safe, it would be the motherboard manufacturer that would be liable for this. Also why did AMD not test these chips beforehand? If over 1.3V SOC voltage instantly kills CPUs, AMD should have known about this prior to launching them. This is mainly AMD's fault for not testing their products and telling motherboard manufacturers what values to NOT auto define. Imagine Ford shipped you a car that should work in ideal conditions, but when driving below sea level, the engine blows up because of the higher air density. Why would that be YOUR fault because you drove it in a place that was 200ft below sea level? It's not like you are visiting a volcano or going to the core of the earth which would need a special kind of car, it's 200 feet below sea level.
Yeah no, almost every reviewer benchmarked Ryzen 7000 with 6000MT DDR5 and EXPO enabled and didn't encounter any problem, so it should be common sense that people think enabling EXPO is very safe.

It could be false advertising for AMD to specifically tell reviewers to benchmark with EXPO enabled, when doing so could damage their CPUs (and motherboards) in mere 2-3 weeks.

In your example, if FORD shows in an advertisement that their cars can run submerged and when you do it, it kills the engine, that is false advertising, isn't it?
Posted on Reply
#174
trparky
trparky: It looks like VDDCR_VDD reached a maximum voltage of 1.377 volts while VDDCR_SOC only reached 1.245 volts.

I ran a Cinebench R23 test and the only change was that the VDDCR_SOC only reached 1.246 volts.
OK, so I disabled EXPO and these are the results...

Enabling EXPO automatically added .21v (1.245 - 1.035 = .21) for a maximum of 1.245 volts on the SOC. That's two tenths of a volt here that my motherboard added by default just by enabling EXPO.

What's really stupid here is that outside of benchmarks, turning off EXPO has no discernable change to how my system feels. I don't notice any kind of slowdown in how long it takes to boot Windows, there's no slowdowns in how web pages are rendered, the program that I compile from C++ source code using Microsoft's Visual C++ Compiler takes very nearly the same amount of time that it took with EXPO enabled, nor has it really affected Cinebench R23 scores (multi-core score of 19179).

So, I'm going to keep EXPO off for the time being until new UEFI versions come out. So far, Gigabyte hasn't come out with a new version for my board yet.
Posted on Reply
#175
AusWolf
tpa-pr: Based on more recent reports of it potentially affecting non-X3D chips, I am thanking my lucky stars that I have never turned on any of the auto-overclock features like PBO and EXPO. I've also let my boss know since he just finished building a 7950X3D build.

I've only gotten back into the new PC building game recently, have new platform launches always been this "interesting"? And if AMD keep making the news it's going to make it harder to playfully rib my Intel/Nvidia enthusiast workmates, sheesh.
AM5 has been around for about half a year now. Somebody fried their 7800X3D, and only now "suddenly", we get sporadic reports (Der8auer?) from "people" with their non-3D chips dying. I'd say it's more than a little suspicious. I'm smelling Youtube sensationalism again.
trparky: OK, so I disabled EXPO and these are the results...

Enabling EXPO automatically added .21v (1.245 - 1.035 = .21) for a maximum of 1.245 volts on the SOC. That's two tenths of a volt here that my motherboard added by default just by enabling EXPO.

What's really stupid here is that outside of benchmarks, turning off EXPO has no discernable change to how my system feels. I don't notice any kind of slowdown in how long it takes to boot Windows, there's no slowdowns in how web pages are rendered, the program that I compile from C++ source code using Microsoft's Visual C++ Compiler takes very nearly the same amount of time that it took with EXPO enabled, nor has it really affected Cinebench R23 scores (multi-core score of 19179).

So, I'm going to keep EXPO off for the time being until new UEFI versions come out. So far, Gigabyte hasn't come out with a new version for my board yet.
Yep. Memory OC is overrated.
Posted on Reply