Monday, April 24th 2023

AMD Ryzen 7000X3D Processors Prone to Physical Damage with Voltage-assisted Overclocking, Motherboard Vendors Rush BIOS Updates with Voltage Limiters
AMD Ryzen 7000X3D processors are prone to irreversible physical damage if CPU overclocking is attempted at some of the higher VDDCR voltages (VDDCR being the main power domain for the CPU cores). A Redditor who goes by Speedrookie attempted to overclock their Ryzen 7 7800X3D, leading to an irreversible failure. The motherboard socket and the processor's land-grid contacts show signs of overheating damage caused by the contacts melting from too much current draw.
A Ryzen 7000X3D processor features a special CPU complex die (CCD) with stacked 3D Vertical Cache memory. This cache die (the L3D) is located in the central region over the CCD, where its 32 MB on-die L3 cache is located, while the difference in Z-height of the stacked die is filled up by structural silicon, which sits over the regions of the CCD with the 8 "Zen 4" CPU cores. It stands to reason that besides having an inferior thermal transfer setup to conventional "Zen 4" CCDs (without the 3DV cache), the CCD itself has a higher power draw at any given clock speed than a conventional CCD (since it is also powering the L3D). This is the main reason why overclocking capabilities on the 7000X3D processors are almost non-existent, and why their power limits are generally lower than those of their regular Ryzen 7000X counterparts. Attempting to dial up voltage creates a perfect storm for these processors.
Igor's Lab posted a detailed analysis of the region of the Socket AM5 land-grid most susceptible to a burn-out in the above scenario. The central region of the LGA has 93 pins dedicated to the VDDCR power domain, dispersed in a mostly checkered pattern toward the center of the land-grid. Igor isolated 6 of these VDDCR pins in particular as the most prone to physical damage, since they are located in a region below the CCD, which is itself sandwiched between the L3D (stacked 3D Vertical Cache die) above and the fiberglass substrate below. Apparently, AMD's thermal and electrical protection mechanisms aren't able to prevent a runaway overheating of the pins that causes the substrate to melt, deform, and bulge outward, resulting in irreversible damage to both the processor and the socket.
Meanwhile, AMD's motherboard partners are rushing to release UEFI BIOS updates for their entire lineups of motherboards, which enforce tighter limits on the VDDCR voltage. MSI is the first motherboard manufacturer with such updates. In a press statement, MSI said it has redesigned automated overclocking for 7000X3D processors. "The BIOS now only supports negative offset voltage settings, which can reduce the CPU voltage only," the MSI statement to Tom's Hardware reads. "MSI Center also restricts any direct voltage and frequency adjustments, ensuring that the CPU won't be damaged due to over-voltage." At the same time, the update introduces an automated overclocking feature called Enhanced Mode Boost, which optimizes PBO settings to improve boost frequency residency without any manual voltage adjustments.
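To illustrate what a "negative offset only" policy means in practice, here is a minimal sketch in Python. It is not MSI's firmware logic: the function name, the millivolt units, and the 100 mV undervolt floor are all hypothetical values chosen for illustration. The only behavior taken from the MSI statement is that a requested offset may reduce the CPU voltage but never raise it.

```python
def clamp_voltage_offset(requested_offset_mv: float,
                         max_undervolt_mv: float = 100.0) -> float:
    """Return the offset a hypothetical negative-offset-only policy would apply.

    requested_offset_mv: offset the user asks for, in millivolts
                         (positive = more voltage, negative = less).
    max_undervolt_mv:    assumed floor on how far the voltage may be lowered;
                         this limit is illustrative, not a published figure.
    """
    # Any positive request is rejected by clamping it to zero,
    # so the policy can only reduce the CPU voltage.
    applied = min(requested_offset_mv, 0.0)
    # Keep the undervolt within the assumed floor.
    return max(applied, -max_undervolt_mv)


if __name__ == "__main__":
    for request in (50.0, -30.0, -500.0):
        print(f"requested {request:+.0f} mV -> applied "
              f"{clamp_voltage_offset(request):+.0f} mV")
```

Under these assumptions, a request to add voltage is reduced to no offset at all, a modest undervolt passes through unchanged, and an extreme undervolt is capped at the floor.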
Sources:
Tom's Hardware 1, 2, Igor's Lab, Speedrookie (Reddit)
258 Comments on AMD Ryzen 7000X3D Processors Prone to Physical Damage with Voltage-assisted Overclocking, Motherboard Vendors Rush BIOS Updates with Voltage Limiters
Although I set up AMD EXPO with memory timings for a 6000 MHz memory module, which I then overclocked to 6200 MHz (or underclocked, depending on how you look at it), I set the voltages manually.
Vsoc 1.2v
MemVDD 1.28v
MemVDDQ 1.18v
Judging by ZenTimings below, I may be able to lower the starting Vsoc voltage even further, as it currently reads 0.0150 V lower than what I specified. I just have to confirm this voltage is accurate with other software and perhaps a multimeter.
What are the chances though that this will cause the damage discussed?
We've had literally zero such reports since AM5 released, and now we "suddenly" have this. It smells fishy to me.
What is so difficult to understand? Please explain.
The socket bending is an issue with every single motherboard and CPU in existence. They all do that. Some more, others less. I never experienced it myself. That's nonsense. Do you understand the difference between a CPU dying because of electromigration and the CPU frying itself? The latter should not happen just by "overclocking" or, god forbid, enabling EXPO, lol
Anyways, you do bring up a good point. AMD even told reviewers to use 6000 MHz for their reviews, so in AMD's eyes, that should be perfectly fine. Unless they were like Intel telling that one person to overclock that 28-core Xeon on a chiller at the last minute, lol
www.gamersnexus.net/guides/3590-dont-run-z490-motherboards-with-default-settings-for-your-build
I could easily fish for another handful of generations/sockets where this has occurred at one or more mobo mfgrs.
So again... user due diligence is needed, has been needed and will always be needed with DIY building. It's why we call it DIY. People need to get a life and a good bit of common sense. And AMD/mobo partners need to lock this down on X3Ds just as well, because there is literally no purpose in leaving it unlocked. And if unlockable, it needs to come with a warning on the UEFI toggle.
Frankly if you fried an X3D, you're just a n00b. And you just learned a thing. RMA it / get new one
/thread
We're still talking about an extremely rare occurrence and a world full of people looking for ad revenue with cool videos
If this was a board default thing even on ONE board of ONE vendor, we would have seen many more issues. Shit just doesn't line up
And honestly, I challenge myself to read articles from any of the random tech sites online, and every time I'm literally astounded by the lack of actual article in each of them. Basically it's regurgitating everything every gen and looking for storms in teacups 99% of the time. Sure, things also truly, rarely happen, but boy are we making it big and boy are we bad at fact checking.
Similarly, look at the Nvidia space invaders, the supposed faulty memory from vendor X or Y, the 12 pin horror show et al. While truly issues, they're also blown way out of proportion. I look at myself too - I'm not immune to those hype trains either, but in retrospect, every time, it's a lot smaller than we are led to believe.
no scientific conclusion, the best you can get is a theory from derbauer
SA Voltage: locked
Overclocking: locked
All the voltages provided by the processor are within the parameters imposed by Intel.
If with Alder they had a small escape with overclocking (non-k processors with very expensive motherboards), with Raptor not even God can overclock a non-k processor.
This means the processor is blocked!
However, this is not the problem. The serious problem is that the socket and/or the processor deteriorate quickly and without warning. I remain of the opinion that it is a design error, namely insufficient pins allocated to power the processor.
Of course, I'm not an expert, you don't have to listen to me and try to break world overclocking records with AM5. God be with you!