Thursday, April 23rd 2020

AMD FX-8350 Pushed to 8.1 GHz via Extreme Overclocking by Der8auer

AMD's Bulldozer architecture is a well-known quantity by now, and seemingly straddles a line between loathing and love among tech enthusiasts. Slow and power-hungry compared to Intel's options, it harkens back to a time when the roles were reversed, and AMD was looking to compensate for architectural deficiencies (and architectural design decisions that can be claimed as either erroneous or ahead of their time) via increased clock speeds. However you look at these Bulldozer CPUs, the fact is that they remain some of the best overclockers of all time, at least when it comes to maximum operating frequencies, especially at absolutely scorching vCore values.

To achieve that operating frequency, Der8auer used an Elmor EVC2 controller and diagnostics chip which, connected to a usually unpopulated pin area on the ASUS 970 PRO GAMING/AURA motherboard, allowed him to read out everything that was running through the motherboard's VRM circuitry and perform manual adjustments. Corsair Vengeance 2,666 MHz DDR3 memory was also used in the system. An accident happened along the way, though: when pulling AMD's stock cooler from the motherboard, the CPU remained attached to the cooler, which resulted in some bent pins (screams in horror). Luckily, things were fixed with a screwdriver - let that serve as a warning, alert, and tip, should this happen to you.

Anyway, the AMD FX-8350 achieved 8,127 MHz at a vCore of 1.920 V, which is an absolutely incredible voltage for a 32 nm CPU. Running at 7,500 MHz for a single-core performance benchmark, the CPU was pulling 100 W of power - for a single core to operate at that speed, mind you. Even so, the AMD FX-8350 only achieved a single-core score of 172 points - for comparison's sake, AMD's six-core Ryzen 5 2600X, running at stock clocks of 3.6 GHz with all cores enabled, achieves 176 points in the same benchmark. Watch the video below for the full rundown on this experiment.
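To put the two scores above on a common footing, here is a minimal sketch that divides each score by its clock speed. The helper name `points_per_ghz` is made up for illustration, and the Ryzen figure assumes the core really sits at the quoted 3.6 GHz base clock during the run; the 2600X can boost as high as 4.2 GHz on a single core, so its true per-clock lead is likely somewhat smaller than this suggests.

```python
def points_per_ghz(score: float, clock_ghz: float) -> float:
    """Benchmark points delivered per GHz of core clock (hypothetical helper)."""
    return score / clock_ghz


# Scores and clocks as quoted in the article.
fx_8350 = points_per_ghz(172, 7.5)       # overclocked single-core run
ryzen_2600x = points_per_ghz(176, 3.6)   # ASSUMPTION: no boost above 3.6 GHz

ratio = ryzen_2600x / fx_8350

print(f"FX-8350:       {fx_8350:.1f} points/GHz")
print(f"Ryzen 5 2600X: {ryzen_2600x:.1f} points/GHz")
print(f"Per-clock advantage: {ratio:.2f}x")
```

Under that assumption the Ryzen core does roughly twice the work per clock cycle, which is why a 7.5 GHz overclock still cannot catch a stock chip.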


64 Comments on AMD FX-8350 Pushed to 8.1 GHz via Extreme Overclocking by Der8auer

#51
ARF
de.das.dude said: Proud FX owner of an 8320 :D I can still run 4.5 GHz stable on all cores... doesn't make much sense though. Feet get warm :lol:
You can game at Ultra HD 3840 x 2160 and probably will forget what your CPU is :D
Posted on Reply
#52
ARF
Would you like to know a fun fact?
Well, the FX-9590 at up to 5 GHz www.amd.com/en/products/cpu/fx-9590 is as fast as Core i7-3960X and Core i7-4790K in the proper CPU-Z 1.78.3.x64 test.
This is the last CPU-Z version which came before Ryzen 1 launch back in 2017.
Posted on Reply
#53
Jism
The problem was that they focused so heavily on MT in a period when single-core performance mattered the most. If you threw large workloads at it, such as video encoding or anything comparable, it would beat the i7 with ease. You just had a lot of games that relied on single-core performance more than they do today.
Posted on Reply
#54
ARF
Jism said: The problem was that they focused so heavily on MT in a period when single-core performance mattered the most. If you threw large workloads at it, such as video encoding or anything comparable, it would beat the i7 with ease. You just had a lot of games that relied on single-core performance more than they do today.
They probably had a process problem and missed their frequency targets, too. It would be quite interesting if they built it today on TSMC's newest N7 node with 16 modules / 32 threads, for example.
Posted on Reply
#55
Jism
It was the best chip when it came to overclocking. You'd buy a 3 GHz model and get a guaranteed 4.2 ~ 5 GHz overclock. And some were lucky enough to pass 5500 MHz. I had an 8320 with a base of 3.5 GHz and a 24/7 overclock of 4.8 GHz for at least a good year. That's 1300 MHz for free, if you don't take the cooling into account. It brought a 25% to 45% uplift in performance too, and the power consumption? Really, it just depends on which workload you throw at it. Games usually taxed just 4 out of 8 cores. Basic Windows stuff was an average of just 2 to 4 cores. Video playback is pretty much decoded by the GPU these days. Put some power saving features in and you have an efficient FX platform. It could take 2400 MHz DDR3 too if you were lucky enough, knowing they were made for just 1600 MHz DDR3.
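The clock figures in this post are easy to check. The sketch below works out the absolute and relative gain for the 3.5 GHz to 4.8 GHz overclock described above; the function name `clock_uplift_pct` is hypothetical, and clock uplift is not the same thing as measured performance uplift, which depends on the workload.

```python
def clock_uplift_pct(base_mhz: float, overclock_mhz: float) -> float:
    """Relative clock gain of an overclock, in percent (hypothetical helper)."""
    return (overclock_mhz - base_mhz) / base_mhz * 100.0


base_mhz = 3500       # FX-8320 base clock quoted in the post
overclock_mhz = 4800  # 24/7 overclock quoted in the post

gain_mhz = overclock_mhz - base_mhz
gain_pct = clock_uplift_pct(base_mhz, overclock_mhz)

print(f"Gain: {gain_mhz} MHz ({gain_pct:.1f}% over base)")
```

That works out to about 37%, which sits inside the 25% to 45% performance range the post mentions.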

The biggest gain in FPS for games came from increasing the CPU/NB clock, which was responsible for the speed of the L3 cache as well. That improved games seriously a lot. I wouldn't say I miss it, but it was one of the platforms that I enjoyed most. It could keep you busy for hours once you started to overclock these. With Ryzen you slap a big cooler on it, enable PBO and call it a day.
Posted on Reply
#56
seronx
ARF said: They probably had a process problem and missed their frequency targets, too.
AMD probably didn't have issues with 32nm PDSOI.
Husky text in 2010:
"The 32nm implementation of an AMD x86-64 core [1,2,5], occupies 9.69mm2, contains more than 35 million transistors (excluding L2 cache), and operates at frequencies in excess of 3GHz. The core incorporates numerous design and power improvements to enable an operating range of 2.5 to 25W and a near zero-power gated state, which makes the core well-suited to a broad range of mobile and desktop products."

Bulldozer text in 2011:
"Frequency at constant voltage is improved by more than 20% (Fig. 4) while the dual-core switching capacitance is reduced to 84% of two previous cores"


3 GHz on legacy core is >3.6 GHz on Bulldozer.
Jism said: The problem was that they focused so heavily on MT in a period when single-core performance mattered the most.
They did target single-threaded performance.

Bulldozer per core is packed with more OoO than Greyhound/Husky.
Posted on Reply
#57
Jism
Their single-core performance was weak; compared to the Thuban it was almost equal. It was just the clock advantage that made the FX "faster". The Opteron series, which came in 8-module / 16-thread CPUs, was weak due to its low clock (2.3 GHz on average). This is why AMD lost so much market share in the enterprise market in the first place.

In order to get an FX platform running as it should, you definitely need to invest in good memory, i.e. 16 GB with tight timings and high speeds (1600 MHz and above). If you wanted to overclock, you couldn't rely on a budget board; you needed a high-end one. I had a Crosshair Z and I learned pretty quickly that the VRM was capable of over 250 W easily, which provided all the headroom I needed. Now with AMD FX CPUs, there's a limit set by AMD of just 25 A of current. You need a premium board to make the CPU use more than 25 A if you want to reach or exceed certain speeds. You can test this very easily when you seem to hit a wall no matter what voltage you throw at it: disable one or two cores, and if it then passes at the same settings, you're running into a current limit.

Now, overclocking on air up to 4.1 to 4.4 GHz IS possible, but anything beyond that requires water, and the higher you go the more you need. My CPU scored 781 CB points (CB15) at 4.8 GHz with a 300 MHz HTT (FSB). 5 GHz was possible, but it required a jump in voltage and thus much more heat than was acceptable. They were very nice CPUs, and if you were lucky enough the CPU/NB would go beyond 2700 MHz, which is amazing if you have such a chip in the first place.

People always focused on multiplier overclocking: set a voltage and work from there. Really, the best way is and has always been by FSB. You increase the overall speed of the system, and because of that you need lower CPU clocks to achieve the same result. For example, a 4.8 GHz / 300 MHz FX CPU is faster than one at 5.2 or even 5.4 GHz using a stock FSB. But again, you've got to be lucky that your system can handle it. Because of that advantage, my FX platform did really well compared to other systems. It played every game and handled everything I did with it (media, web, etc.).
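The relationship behind the two overclocking routes described above is that the core clock is the reference clock (what the post calls FSB/HTT) times the CPU multiplier. The sketch below shows which multiplier reaches 4.8 GHz at each reference clock. The 200 MHz stock value is an assumption based on the AM3+ default and is not stated in the post, and the function names are hypothetical.

```python
STOCK_REF_MHZ = 200  # ASSUMPTION: default AM3+ reference clock, not from the post


def core_clock_mhz(ref_mhz: float, multiplier: float) -> float:
    """Core clock is the reference clock multiplied by the CPU multiplier."""
    return ref_mhz * multiplier


def multiplier_for(target_mhz: float, ref_mhz: float) -> float:
    """Multiplier needed to reach a target core clock at a given reference clock."""
    return target_mhz / ref_mhz


target_mhz = 4800  # the 4.8 GHz overclock described in the post

m_stock = multiplier_for(target_mhz, STOCK_REF_MHZ)  # multiplier-only route
m_300 = multiplier_for(target_mhz, 300)              # raised reference clock route

print(f"Stock {STOCK_REF_MHZ} MHz reference: x{m_stock:g} multiplier")
print(f"300 MHz reference: x{m_300:g} multiplier")
```

Both routes land on the same core clock. The difference the post points to is that other clocks derived from the reference clock rise with it unless their own ratios are lowered, which is where the extra system-wide speed (and the extra instability risk) comes from.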

However, the moment I replaced it with a 2700X, that pretty much bulldozed the whole FX altogether, requiring only half of the total system power while still being 2.5x faster. There's just a fundamental issue with the FX, and it is its shared modules / threads. However, now in 2020 you still see the FX holding up really well; that's because the software ecosystem has pretty much turned to more multithreading, and even consoles are 'optimized' for 8 cores (Jaguar). This will change though, as Zen is becoming the primary ecosystem for the new generation of consoles.

Really, I loved it. Brew coffee, start at 01:00 AM and end at like 06:00 AM with a well-running system. AMD FX should be the basis for anyone wanting to learn how to overclock.
Posted on Reply
#58
seronx
Jism said: Their single-core performance was weak; compared to the Thuban it was almost equal. It was just the clock advantage that made the FX "faster". The Opteron series, which came in 8-module / 16-thread CPUs, was weak due to its low clock (2.3 GHz on average). This is why AMD lost so much market share in the enterprise market in the first place.
Magny-Cours => Busy reservation(3x24 retire/3x8 sched) and non-simultaneous ALU/AGU execution.
Interlagos => No busy reservation(1x128 retire/1x40 sched) and simultaneous ALU/AGU execution.

In non-FPU/core-only workloads, Bulldozer should complete thicker integer workloads faster, since its core will have fewer wait/busy times.

The FPU co-processor isn't a deal breaker, but Bulldozer's FPU is more advanced in OoO (better rename), width (64-wide vs 42-wide), and extension support (3-op AVX128 support). Meaning it is much better in datacenter/HPC workloads than Magny-Cours.

~Better Integer aka better actual core.
~Good-enough FPU/SIMD but with more modern features aka better fpu.

Due to some really weird design decisions, Bulldozer is also more tolerant of memory bandwidth. It is faster with slower memory and with faster memory.
Posted on Reply
#59
ARF
seronx said: AMD probably didn't have issues with 32nm PDSOI.
6 months before the FX-8350 launch, Intel was already on the 22nm process node with the Core i7-3770K. So yeah, AMD definitely did have a process node problem.
Where Bulldozer is different is AMD insists the design didn't aggressively pursue frequency like the P4, but rather aggressively pursued gate count reduction per stage. According to AMD, the former results in power problems while the latter is more manageable.

AMD's target for Bulldozer was a 30% higher frequency than the previous generation architecture. Unfortunately that's a fairly vague statement and I couldn't get AMD to commit to anything more pronounced, but if we look at the top-end Phenom II X6 at 3.3GHz a 30% increase in frequency would put Bulldozer at 4.3GHz.

Unfortunately 4.3GHz isn't what the top-end AMD FX CPU ships at. The best we'll get at launch is 3.6GHz, a meager 9% increase over the outgoing architecture. Turbo Core does get AMD close to those initial frequency targets, however the turbo frequencies are only typically seen for very short periods of time.
www.anandtech.com/show/4955/the-bulldozer-review-amd-fx8150-tested/3
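The frequency arithmetic in the quoted passage can be reproduced directly. This is only a sketch of the numbers as quoted (3.3 GHz previous generation, 30% target, 3.6 GHz at launch); the variable names are made up for illustration.

```python
# Figures as quoted from the AnandTech review above.
phenom_ii_x6_ghz = 3.3   # top-end previous-generation clock
target_uplift = 0.30     # AMD's stated frequency target
fx_launch_ghz = 3.6      # best base clock at launch

target_ghz = phenom_ii_x6_ghz * (1 + target_uplift)
actual_uplift_pct = (fx_launch_ghz / phenom_ii_x6_ghz - 1) * 100
shortfall_ghz = target_ghz - fx_launch_ghz

print(f"Target clock:     {target_ghz:.2f} GHz")
print(f"Actual uplift:    {actual_uplift_pct:.1f}%")
print(f"Shortfall:        {shortfall_ghz:.2f} GHz")
```

The target comes out at 4.29 GHz, which the review rounds to 4.3 GHz, and the launch part delivers about 9% instead of 30%.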
Posted on Reply
#60
seronx
ARF said: 6 months before the FX-8350 launch, Intel was already on the 22nm process node with the Core i7-3770K. So yeah, AMD definitely did have a process node problem.
FX-8350 isn't Bulldozer, it is physically an upgraded Piledriver pulled from Trinity.

In general, AMD usually is behind on nodes if it is with the AMD/GloFo Fabs.
Intel 32nm Hexa-core => 248 mm² in January 7, 2010
AMD 45nm Hexa-core => 346 mm² in April 27, 2010

No frequency/power issues were observed for 45nm/32nm/28nm at GloFo in relation to hitting stock performance targets. All products got their clock target at a given voltage, including Bulldozer.
Posted on Reply
#61
EarthDog
60+ posts on something accomplished initially almost 8 years ago... lol
Posted on Reply
#62
lexluthermiester
seronx said: FX-8350 isn't Bulldozer, it is physically an upgraded Piledriver pulled from Trinity.
That is not correct. The FX-8350 has the Vishera series cores and that line is most definitely Bulldozer architecture. Trinity-based CPUs are exclusively FM2 socket-based APUs.
See post below...
Posted on Reply
#63
seronx
lexluthermiester said: That is not correct. The FX-8350 has the Vishera series cores and that line is most definitely Bulldozer architecture. Trinity-based CPUs are exclusively FM2 socket-based APUs.
The product is called Vishera.
Zambezi is OR-(B2 or B3) and Vishera is OR-C0.
Zambezi is Bulldozer, so it has a few issues. No perceptron branch predictor. Only decodes a single AVX256 per cycle. Has a reduced TLB size. Doesn't support FMA3.
Vishera is Piledriver, which has a perceptron branch predictor, can decode two AVX256 ops per cycle, and has an increased TLB (larger than Trinity's Piledriver). Up to a certain frequency/voltage it uses a resonant clock mesh, which reduces active power by 5%-10% by reducing the power intake of the clocking macros. Supports FMA3.
Posted on Reply
#64
lexluthermiester
seronx said: The product is called Vishera.
Zambezi is OR-(B2 or B3) and Vishera is OR-C0.
Zambezi is Bulldozer, so it has a few issues. No perceptron branch predictor. Only decodes a single AVX256 per cycle. Has a reduced TLB size. Doesn't support FMA3.
Vishera is Piledriver, which has a perceptron branch predictor, can decode two AVX256 ops per cycle, and has an increased TLB (larger than Trinity's Piledriver). Up to a certain frequency/voltage it uses a resonant clock mesh, which reduces active power by 5%-10% by reducing the power intake of the clocking macros. Supports FMA3.
You're right. I got it partly wrong. At one time I had both an 8150 and an 8350. Mistakenly thought they were a part of the same family.
Posted on Reply