OK, thanks. I tried your method but couldn't get it right; I was getting a BSOD as soon as I changed the voltage to Default.
With my method I had a seemingly stable setup at 4.4 GHz by lowering the voltage by 0.1 V from the 4.3 GHz settings; however, when I ran Cinebench as you suggested, it crashed midway through the multi-thread test.
Running Cinebench at 4.3 GHz is very stable and no exceptions were thrown. Temps are well contained (around 75C, max 78C; the stock cooling is very effective even at 15% from the low), and max power was 137 W, which is close to the 140 W manufacturer TDP. The results below look really good given this is a budget system.
I may spend some time figuring out how to get 4.4 GHz stable by fine-tuning further, but given the tiny performance increase I'm not sure it's worth it.
In terms of gaming impact versus the original 3.5 GHz setup, it seems significant: in one game I tested I got 7% higher average fps and a 25% higher 1% low fps.
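In case it's useful, this is roughly how I pull the average and 1% low figures out of a frame-time log. It's only a sketch: the file name is just an example, it assumes one frame time in milliseconds per line, and I'm using the "average of the slowest 1% of frames" definition of 1% lows (some tools use the 99th-percentile frame time instead).

# Rough sketch: average fps and 1% low fps from a frame-time log.
# Assumes frametimes.txt (example name) has one frame time in ms per line.
frame_times_ms = [float(line) for line in open("frametimes.txt") if line.strip()]

avg_fps = 1000 / (sum(frame_times_ms) / len(frame_times_ms))

# 1% low here = average fps over the slowest 1% of frames (one possible definition).
worst = sorted(frame_times_ms, reverse=True)
slowest_1pct = worst[:max(1, len(worst) // 100)]
low_1pct_fps = 1000 / (sum(slowest_1pct) / len(slowest_1pct))

print(f"avg: {avg_fps:.1f} fps, 1% low: {low_1pct_fps:.1f} fps")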
Somewhere I heard that increasing the cache voltage can have a performance impact too, but I'm not sure, since it is already set to the Default voltage and inherits the same VCCIN.
By the way, having SpeedStep ticked or unticked didn't make any difference.
I made one more attempt with the following setup: multiplier at 43T and SpeedStep ticked.
Then I set the ratios in FVR to 44 (so higher than the multiplier).
For some reason this seemed to work better: max power was 131.6 W (less than before), and at the same time the Cinebench score improved slightly versus the previous run (7356 vs 7188).
This could just be run-to-run variance, I'm not sure. No exceptions were reported either.
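For what it's worth, a quick back-of-the-envelope check of how small that gap really is (just plain arithmetic on the two scores above):

# Size of the Cinebench score difference between the two runs.
old_score = 7188   # previous multi-thread run
new_score = 7356   # run with multiplier 43T + FVR ratio 44

delta_pct = (new_score - old_score) / old_score * 100
print(f"Improvement: {delta_pct:.1f}%")  # about 2.3%, small enough to be noise

A ~2.3% difference is within what I'd expect from normal run-to-run variation, so I wouldn't read too much into it without repeating the runs a few times.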