Monday, April 24th 2023

AMD Ryzen 7000X3D Processors Prone to Physical Damage with Voltage-assisted Overclocking, Motherboard Vendors Rush BIOS Updates with Voltage Limiters
AMD Ryzen 7000X3D processors are prone to irreversible physical damage if CPU overclocking is attempted at some of the higher VDDCR voltages (VDDCR being the main power domain for the CPU cores). A Redditor who goes by Speedrookie attempted to overclock their Ryzen 7 7800X3D, leading to an irreversible failure. The motherboard socket and the processor's land-grid contacts show signs of overheating damage, with the contacts having melted from excessive current draw.
A Ryzen 7000X3D processor features a special CPU complex die (CCD) with stacked 3D Vertical Cache memory. This cache die is located in the central region over the CCD where its 32 MB on-die L3 cache is located, while the difference in Z-height of the stacked die is filled up by structural silicon, which sits over the regions of the CCD with the 8 "Zen 4" CPU cores. It stands to reason that besides having an inferior thermal transfer setup to conventional "Zen 4" CCDs (without the 3DV cache), the CCD itself has a higher power-draw at any given clock-speed than a conventional CCD (since it's also powering the L3D). This is the main reason why overclocking capabilities on the 7000X3D processors are almost non-existent, and the processors' power limits are generally lower than those of their regular Ryzen 7000X counterparts. Attempting to dial up voltage creates a perfect storm for these processors.
Igor's Lab posted a detailed analysis of the region of the Socket AM5 land-grid most susceptible to a burn-out in the above scenario. The central region of the LGA has 93 pins dedicated to the VDDCR power domain, dispersed in a mostly checkered pattern toward the center of the land-grid. Igor isolated 6 of these VDDCR pins in particular as the most prone to physical damage, since they are located in a region below the CCD where it is sandwiched between the L3D (stacked 3D Vertical Cache die) and the fiberglass substrate below. Apparently, AMD's thermal and electrical protection mechanisms aren't able to prevent a runaway overheating of the pins that causes the substrate to melt, deform, and bulge outward, resulting in irreversible damage to both the processor and the socket.
Meanwhile, AMD's motherboard partners are rushing to release UEFI BIOS updates for their entire motherboard lineups, which enforce tighter limits on the VDDCR voltage. MSI is the first motherboard manufacturer with such updates. MSI said in a press statement that it has redesigned automated overclocking for 7000X3D processors. "The BIOS now only supports negative offset voltage settings, which can reduce the CPU voltage only," the MSI statement to Tom's Hardware reads. "MSI Center also restricts any direct voltage and frequency adjustments, ensuring that the CPU won't be damaged due to over-voltage." At the same time, the update introduces an automated overclocking feature called Enhanced Mode Boost, which optimizes PBO settings to improve boost frequency residency without any manual voltage adjustments.
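As a rough illustration of what a "negative offset only" policy means in practice, the sketch below clamps a requested core-voltage offset so that it can only lower the voltage. This is not MSI's firmware logic; the function name, the millivolt units, and the 100 mV undervolt floor are all hypothetical values chosen for the example.

```python
def clamp_voltage_offset(requested_offset_mv: int, max_undervolt_mv: int = 100) -> int:
    """Return the offset a negative-only policy would accept, in millivolts.

    Both the function and the default floor are hypothetical; real BIOS
    limits are vendor-specific and are not stated in the article.
    """
    if max_undervolt_mv < 0:
        raise ValueError("max_undervolt_mv must be non-negative")
    # Any request that would raise the voltage is rejected outright.
    if requested_offset_mv > 0:
        return 0
    # Negative requests are allowed, but only down to the assumed floor.
    return max(requested_offset_mv, -max_undervolt_mv)


if __name__ == "__main__":
    for request in (50, 0, -30, -500):
        print(request, "->", clamp_voltage_offset(request))
```

Under these assumptions, a positive request is reduced to zero, a modest undervolt passes through unchanged, and an extreme undervolt is limited to the floor.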
Sources:
Tom's Hardware 1, 2, Igor's Lab, Speedrookie (Reddit)
258 Comments on AMD Ryzen 7000X3D Processors Prone to Physical Damage with Voltage-assisted Overclocking, Motherboard Vendors Rush BIOS Updates with Voltage Limiters
You have 3-4 gen's of support so don't be an idiot and become their beta tester/early adopter.
For example enabling xmp is not covered by warranty but sure as shit no one designs a bios where there is a big fat warning for it and ofc 99.99% users enable it . Same goes for most bios “tuning” options outside of specific voltage adjustments so ofc users will poke around thinking it’s “safe”. There has to be a clear line between what users can do and cannot do in bios that would not be covered and manufacturers have zero interest in clarifying it but they will hold u accountable when shit breaks
Igor from igorsLAB posted some numbers in the forums. Enabling EXPO 6000 ramps up the VDDCR_SOC to 1,35v. Reduced manually to 1,30v and he can only run 5800 solid. Might be just a BIOS bug, but if it turns out that you can't run DDR5-6000 without putting your CPU in danger because you need that juice to run the OC RAM solid, then that would be a big PR nightmare. DDR5-5200 vs DDR5-6000 can make a difference of up to 10%, so you would lose quite some performance.
There's also a reddit post gaining traction:
Pin analysis of the destroyed Ryzen 7800X3D - All burned pins supply the VDDCR (CPU Core Power Supply) | igor'sLAB
because this ability to use fast memory and 4 of them (yes the trash limit of 2 modules) is overclocking in the eyes of AMD
limit of 2xddr5-5200 or 4xddr5-3600
and means the X3D could burn your money in multiple ways
You are right there..never leave the fire service!
Also, if anyone remembers the Surface “hot bag” woes, it was because MS implemented a v1.0 power feature on Kaby Lake. Why did no other OEM have trouble with their Kaby Lake rollout? Because they all had a practice of not implementing a v1.0 feature from Intel. They waited for the next release.
"b-b-b-but all these years we could go and do whatever and it all went fine!!!!!!!!!! !THIS IS ALL AMDS FAULT!!!!!!!"
Yes, my idiot, that's where you see that it's a proper new technology. It breaks when you pull on it. It's not just an automatic step up from what you're used to, and actually comes with increased fragility/risks. Edit: I was clearly too late to the thread! :roll:
I'd really like to know how widespread the issue is, or if it's just a one-off case? Not reading many details in the info.
Let's see how "not cool" the issue is.
Shifting blame to the user is the standard approach, which also never works.
Within manufacturer specifications.
If I overclock my 65 inch LG tv and it breaks- do I now get to blame LG? Of course not- that’s braindead.
This is exactly why nobody should ever trust all those great "100% 24/7 stable 6.7GHz 1.1V OC" stories from forum bros.
any manual overvolting on CPU or change of multiplier, will disable manufacturer-set power and current limits for the CPU
Then it's up to motherboard or user to set limits
Now go play with the EDC 1A bug
Literally getting a new technology and going pompously pointing fingers and throwing shade at the creator of the new tech for "not mitigating" well enough.
Hey genius, that's what being NEW is. We don't know the mitigations because they're new. That's the entire concept of "new", or "unknown" or "innovative" or really anything that isn't just walking the same inroads we've been doing for 15 years. Ignoring instruction sets, 3D Vcache is probably the first very big CPU improvement since multi core.
And here we have these Holy Judges passing their Holy Judgement on AMD for their users playing around with a completely new tech and not somehow magically guessing how to stop them doing something stupid before they even do any of these things. Well, that is quite boring.
I mean it's safe and all, sure. But boring.
read here:
Amd/comments/12tlk7s/_/jhbfh8y
I love how you stick to this argument that stands up to 0 scrutiny. The guy didn't even OC it just enabled XMP. What an absolute idiot that guy is -- clearly everyone knows not to enable XMP on the 'NEW' products because their NEWNESS just means added fragility.
All the motherboard manufacturers rushing out bios updates to fix their default settings is definitely not a sign of a flaw... it's just new! these users are so dumb and cant handle the newness. Maybe send them a dictionary with the definition of 'new' tabbed for 'em... that'll fix it.
>it breaks the component
HOW DID THE MANUFACTURER NOT MITIGATE THIS
And also, ASUS already had 2 X3D CPUs blow up that showed up on Reddit. There's a thread about some extremely shady behaviour there too:
Amd/comments/12uvcsm
But I'm sure I'll hear more sarcasm from you in about 10 seconds and it'll be all about AMD's fault and not ASUS or anything.