Easier said than done. It's more complex than this, but generally speaking, voltage, total power, temperature, utilization... all of the main ones are locked in on sort of a set curve. Or at least, that's the ultimate behavior. My experience is that any means you can use to try to lower voltage will also decrease boosts. Say your temperature is low and your usage is high, but you've manually set a voltage offset. XFR and PBO don't acknowledge that you just want to lower the voltage. The boost logic sees the temp/util and wants to boost higher, but it can't because the voltage available isn't in line with what it wants for those higher clocks. It wants what it wants to hit a given clock. Same thing happens if you go into your PBO settings and try to change the power limit. All you're doing by cutting voltage is nipping the upper straps off of the boost curve. It seems to slide all of the parameters back so that it won't boost as high as it would've otherwise. Basically, if all of the conditions aren't met, including voltage from the mobo, it absolutely will not hit its max frequency for any load. It'll always be lower in step with the voltage drop.
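To picture the "nipping the top off the curve" effect, here's a toy sketch in Python. The voltage/frequency table is made up for illustration (I don't know the real one), and `VF_CURVE_MV` and `max_clock_mhz` are just hypothetical names. All it shows is that if each clock step needs a minimum voltage, taking voltage away removes the top steps.

```python
# Toy model only: these (clock MHz, minimum millivolts) pairs are invented
# for illustration and are NOT AMD's real voltage/frequency curve.
VF_CURVE_MV = [
    (3800, 1100),
    (4000, 1200),
    (4200, 1300),
    (4400, 1400),
    (4600, 1480),
]

def max_clock_mhz(available_mv):
    """Return the highest clock whose minimum voltage is available, or None."""
    reachable = [clock for clock, min_mv in VF_CURVE_MV if min_mv <= available_mv]
    return max(reachable) if reachable else None

stock_mv = 1500    # the 1.5 V ceiling mentioned later in this post
offset_mv = -100   # hypothetical manual undervolt offset

print(max_clock_mhz(stock_mv))              # 4600: top step is reachable
print(max_clock_mhz(stock_mv + offset_mv))  # 4400: top step is gone
```

The real thing is obviously smoother and depends on load and temperature too, but the shape of the problem is the same: less voltage, lower ceiling.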
It's kind of crazy, how tight the power curve is. We're seeing different numbers because of mobo manufacturers doing different things with their configs, but under the hood, the behavior from chip to chip is astonishingly uniform. Like, I could get another 3900X, drop it in this board, and I'm betting it'll run the same voltages, same clocks, same wattage. There really just is no wiggle to it. What it pulls out for you is pretty much all there is.
The general consensus is that it doesn't hurt anything. For overclocks, I think the common limit is 1.35-1.4v, or at least that's what I go by when I'm not okay with cooking a chip. But that's constant voltage. Ryzen's boost doesn't sustain it constantly. I'm betting it looks like it's flat-lining at 1.4-1.5 volts all of the time because the spikes and dips are too brief and slip between polling. But it's certainly not the same as running constant voltage. Guarantee you if you take that 3950X and pump 1.5 volts to it while putting it under load, the temperatures will be astronomically higher than it boosting on its own and hitting that same voltage.
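Here's a rough Python sketch of the "slips between polling" idea. The trace is invented and deliberately contrived (the dips are placed so a once-per-second poll never lands on one), so treat it as an illustration of the sampling problem, not a measurement of any real chip.

```python
# Toy trace: one voltage sample per millisecond for 10 seconds.
# The numbers are invented; the dips are deliberately placed between polls.
HIGH_V, LOW_V = 1.45, 0.95

def true_voltage(ms):
    """Voltage at a given millisecond: a 10 ms dip in every 20 ms window."""
    return LOW_V if 5 <= ms % 20 < 15 else HIGH_V

trace = [true_voltage(ms) for ms in range(10_000)]

# A monitoring tool that polls once per second only sees every 1000th sample.
polled = trace[::1000]

true_average = sum(trace) / len(trace)
polled_average = sum(polled) / len(polled)

print(f"polled readings: {sorted(set(polled))}")
print(f"polled average:  {polled_average:.2f} V")
print(f"true average:    {true_average:.2f} V")
```

In this made-up trace the monitor only ever reports the high value, while the voltage is actually low half the time. Real polling won't line up that neatly, but the readout can still look a lot flatter than what the chip is doing.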
The temperatures are kind of a similar thing. It's got those little spikes even under light loads... seems that when you give a Ryzen some meager tasks to munch on, it likes to max the voltage out... probably because the current is so low. But again, if your cooling is up for it, not a problem. Worst case is that it stops boosting or even throttles down. But on the flipside, you might not even see much better temperatures with better cooling. If it had headroom outside of temperature, it might just eat up a good chunk of any temperature headroom you give it. It's pretty much gonna keep going until it hits one of a few limits. You have the clock speed limit (they won't automatically run over advertised boost clocks). And then you have wattage/current and temperature. Voltage is more of a soft limit. It has a max of 1.5, but that 1.5 doesn't limit clock speed... like it doesn't stop going higher because it wants the full 1.5. It does, however, seem to have a minimum, per a given clock speed and load. Again, to do certain clocks under certain loads, it requires that voltage. Mobo manufacturers could do better on their end, but from what I've seen the only ones who could really change it much are AMD themselves. But it seems they've decided this is the optimal behavior, because they all do it.
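If it helps, the "keeps going until it hits one of a few limits" idea can be sketched like this in Python. The limit values and the two sample readings are assumptions I picked for illustration, and `LIMITS` and `binding_limits` are hypothetical names, not anything from AMD's firmware. Voltage is left out on purpose, since it behaves more like a soft limit.

```python
# Illustrative limits only. These are assumptions for the sketch, not values
# read from any real chip or BIOS.
LIMITS = {"clock_mhz": 4600, "power_w": 142, "temp_c": 95}

def binding_limits(reading, limits=LIMITS):
    """Return the names of every limit the current reading has reached."""
    return [name for name, cap in limits.items() if reading[name] >= cap]

# Made-up sensor readings for a light load and a heavy all-core load.
light_load = {"clock_mhz": 4600, "power_w": 48, "temp_c": 62}
heavy_load = {"clock_mhz": 4100, "power_w": 142, "temp_c": 81}

print(binding_limits(light_load))  # ['clock_mhz']: the clock cap stops it
print(binding_limits(heavy_load))  # ['power_w']: the power cap stops it
```

Give it better cooling in the heavy case and nothing changes in this toy, because temperature wasn't the limit it was sitting on.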
Nah, I think in your case a BIOS update might help. But the behavior you're describing is pretty typical. It's always gonna want those high voltages and little spikes. Over time I'm sure AMD will refine it and smooth it out so that maybe it doesn't spike so much and permits lower voltages... but it's not exactly a malfunction. Just really stiff rules on their end. They decide if it's a problem or not. Is what it is... I feel a little conflicted, myself... but then I kinda think who am I to think I know better than AMD how their chips should run. I'm sure there are some issues with Windows power management and just... all sorts of hiccups getting everything in this admittedly pretty complex boosting system to communicate and respond as intended. They've always liked to overvolt their stuff out of the box, but this is probably the first time you couldn't undervolt an AMD product that runs hot and high. There's gotta be a reason for that.
I'd say they're actually pretty good at self-regulating. They're nothing if not consistent, and while the behavior takes some getting used to, I don't see one killing itself from it. It seems like their main concern is current and I assume there's a good reason for that. I've seen the voltage dip under heavy threaded workloads, when the current was the highest I've seen it. I think that, below a certain voltage, it's the combination of high current AND high voltage that kills many a CPU, not the voltage alone. You look at your voltage and see 1.5 and think gee, that's high! But when you look at your total wattage it could be like... 48. Which also means the current is very low. Hence why on the flipside, as current starts getting legitimately high, voltage will start to dip a little... more and more if the load keeps increasing. So it seems to me there is a voltage where even a small amount of current would be enough to fry a Ryzen, but 1.5v isn't it. I wish we all knew for sure, but it seems that IT at least knows how much of each it can take and will take steps to never hit a point of damage/degradation. I trust AMD has put in the footwork to figure out what voltage and current levels show signs of degradation and, with their long-gathered knowledge of how CPUs behave, found a safe line to draw for their boost system.
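The wattage point is just P = V x I rearranged, so here's the arithmetic in Python. The light-load numbers are the ones from above (about 48 W at 1.5 V). The heavy-load numbers are made up for comparison. It's a rough estimate either way, since reported package power isn't all going through the cores at that one voltage.

```python
def current_amps(power_watts, voltage_volts):
    """Estimate current from P = V * I, so I = P / V."""
    return power_watts / voltage_volts

# Light load from the post: about 48 W at 1.5 V.
light = current_amps(48, 1.5)

# Heavy load: 140 W at 1.25 V is a made-up example, not a measurement.
heavy = current_amps(140, 1.25)

print(f"light load: {light:.0f} A")  # 32 A
print(f"heavy load: {heavy:.0f} A")  # 112 A
```

So in this example the "scary" 1.5 V reading goes with well under a third of the current of the lower-voltage heavy load.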
They really are a different beast, though. We're talking about a little mobile/SoC arch turned desktop powerhouse, here. All of the parameters and behavior are going to be different from what we're all used to with Intel, who's been rehashing the same chip for years. I think they're probably a lot more finicky to get the performance out of than those older, more refined architectures. Just a lot more going on with em, across the board.
And just so we're clear, that's me saying I don't really know what's going on with them, but yes, it is normal, and no, there's not much to be done.