Wednesday, December 14th 2011
AMD Gives Bulldozer 6-core a Speed-Bump with FX-6200
AMD launched its AMD FX processor family with two eight-core parts (FX-8150, FX-8120), a six-core part (FX-6100), and a quad-core one (FX-4100). Apparently, a newer, slightly faster six-core FX processor, the FX-6200, is just around the corner. Since all AMD FX processors are unlocked out of the box, the FX-6200 is essentially a speed-bump. Out of the box, it is clocked at 3.80 GHz, with 4.10 GHz maximum TurboCore speed. It features six cores, 6 MB total L2 cache, and 8 MB total L3 cache. Its TDP is rated at 125W. In a presentation to retailers sourced by DonanimHaber, AMD pitched the FX-6200 as delivering about 10% higher performance in MainConcept HD-to-Flash conversion than the FX-6100 (3.30 GHz nominal, 3.90 GHz max. turbo).
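For context, the size of the clock bump can be worked out from the figures quoted above. This is only a rough sketch: the dictionary and function names are illustrative, and real performance does not necessarily scale linearly with clock speed.

```python
# Clock speeds in GHz, as quoted in the article.
FX_6100 = {"base": 3.30, "turbo": 3.90}
FX_6200 = {"base": 3.80, "turbo": 4.10}


def percent_gain(new, old):
    """Relative increase of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


base_gain = percent_gain(FX_6200["base"], FX_6100["base"])
turbo_gain = percent_gain(FX_6200["turbo"], FX_6100["turbo"])

print(f"base clock gain:  {base_gain:.1f}%")   # about 15%
print(f"turbo clock gain: {turbo_gain:.1f}%")  # about 5%
```

The roughly 10% gain AMD pitched falls between the turbo-clock and base-clock increases.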
Source:
DonanimHaber
79 Comments on AMD Gives Bulldozer 6-core a Speed-Bump with FX-6200
Could be wrong, but the whole point of bulldozer is that it is a modular design so there is no more need to have disabled cores, they can simply cut them away.
Notice the near excellent scaling vs. the (faster clock for clock) Phenom II. It's not just that two extra cores failed to mean 33% extra performance, it's that it actually meant less performance in many areas. This, as I've repeated above, is because Bulldozer "cores" are not cores in the same sense as Phenom cores. You could just as easily call it a quad core in which each core has two of some things.
The hype definitely was a big mistake by AMD - this should have been promoted as a budget multitasking chip. I think what really killed it, though, was the single-threaded performance and single-threaded performance per watt. Too much of what we do is still dependent on this.
There's also an issue (fixed in Windows 8 iirc) with how Windows assigns tasks to cores. If you have a dual-threaded task, for example, Windows may well send it to cores 0 and 1 of a Bulldozer CPU - because it sees it as a regular 8 core. But depending on the application, performance could sometimes be almost doubled by sending that task to cores on different modules, such as 0 and 2. Tests in Windows 8 show BD close the gap on but not catch up with SB.
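To make the scheduling idea concrete, here is a minimal sketch of a "one thread per module first" placement. It assumes logical cores are numbered module by module (cores 0 and 1 in module 0, cores 2 and 3 in module 1, and so on), which matches the example above. The function names are hypothetical, and this is not a description of how the Windows scheduler is actually implemented.

```python
def spread_across_modules(n_threads, n_modules=4, cores_per_module=2):
    """Pick logical core indices so threads land on different modules first.

    Assumption (for illustration only): logical cores 2k and 2k+1
    share module k, as on an 8-"core" Bulldozer with 4 modules.
    """
    total = n_modules * cores_per_module
    if not 0 < n_threads <= total:
        raise ValueError("n_threads must be between 1 and the core count")
    # Visit the first core of every module, then the second core of each.
    order = [
        module * cores_per_module + core
        for core in range(cores_per_module)
        for module in range(n_modules)
    ]
    return order[:n_threads]


def affinity_mask(cores):
    """Build a bitmask with one bit set per selected logical core."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask


cores = spread_across_modules(2)
print(cores, bin(affinity_mask(cores)))  # cores 0 and 2, mask 0b101
```

On Linux, a list like this could be handed to `os.sched_setaffinity(0, cores)`. On Windows, the bitmask is the form taken by `SetThreadAffinityMask`. Whether pinning helps depends on the application, as noted above.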
Anyhow nice speed bump from FX-6100 to FX-6200. Though it does sound odd that they didn't name it the FX-6120/6150. I believe the FX-**70 and FX-**90 are reserved for clock speeds higher than 4GHz such as the upcoming FX-4170 @ 4.20GHz, FX-8170 @ 4.00GHz and FX-8190 @ 4.60 GHz at stock speeds.
Even with this problem resolved, SB still leaves BD in the dust.
Anyways, that does show a 10% increase in performance.
Then another 20% get to SB... but yes, they’re playing catch-up to IB. Although, I don't necessarily subscribe to the idea that they have to beat Intel in every benchmark to be taken seriously; as long as the CPU/Mobo are priced right and available, they'll stay in the game.
I used that benchmark PRECISELY FOR THAT REASON. Using an 8-threaded application clearly would not demonstrate this effect at all, but that's not all that significant as so few applications are optimised for 8 cores. The 10% performance boost is a best case scenario. It's explained in my post above and in the TH article, I cannot be bothered to go over it again for your benefit.
If you want to turn this into a benchmark contest, we can post images of multi threaded games and applications leaving "SB still leaves BD in the dust". - But that would be immature. You agree :)
What we need from Bulldozer or, better yet, from Piledriver is at least 5% higher IPC and lower power consumption...for a start. If Piledriver doesn't deliver, AMD will fall even further behind Intel, since IB is supposed to have ~8% higher IPC than SB.
The point is that (insert any 2-6 core optimised workload here) can perform noticeably better on BD with Windows 8 than with Windows 7, because of how Windows 7 sees it as a "full" 8 core.
If you look at the TH article, or go google "Windows 8 Bulldozer benchmark" (without the quotation marks of course) you can see plenty of other applications showing similar effects.
I think you are reaching, I fail to see what Bulldozer's performance has to do with World of War Craft Cataclysm, and why WWCC is even cared about in the enthusiast community.
Seems like Chief design Engineer from P IV got fired and got hired to design the FX line :laugh:
Your 8120 @ 4.4 is like a Phenom II X6 @ 4 GHz when apps use all of its cores.
It's not that Bulldozer's performance is somehow related to WoW:C. I don't see how that point is relevant or interesting, though.
I certainly don't care about WoW:C, but, as explained above, it's an excellent way of comparing and analysing how CPUs deal with applications optimised for two cores (which is a LOT of applications and games right now).
The flaw in the Tom's Hardware guide is that it doesn't note the implications:
Windows 7:
Both Cores in a module being used
Core A1 <-- 30-50 ns --> Core A2
Windows 8:
Two different cores in two different modules being used
Core A2 <-- 100-200 ns --> Core B2
The only reason Windows 8 is showing an increase is that World of Warcraft is optimized for Intel architectures, where there is an odd number of decoders.
Intel Sandy Bridge and Nehalem can have 5 macro-op decodes (3 simple, 1 complex).
Bulldozer, meanwhile, can decode 8 macro-ops per module, 4 per core. That means for Bulldozer to have relatively the same performance, it would need a uop cache and 6 decoders to align perfectly with the code in World of Warcraft. Otherwise you will have a bleed of 2 macro-ops with World of Warcraft, unless Blizzard recodes the game for AMD FX.
What is interesting is that the benchmark on the front page is single-threaded:
4.1 GHz / ~17 = 0.2412, and 0.2412 x 14 = ~3.4 GHz
i5 2400 = 3.4GHz single core turbo
FX-6200 = 4.1GHz single core turbo
meaning that Bulldozer has a 16-17 stage pipeline compared to Sandy Bridge's 14-stage pipeline
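The arithmetic in the comment above can be checked with a few lines. This sketch only reproduces the commenter's calculation; the stage counts are the commenter's own figures, and whether clock speed per pipeline stage is a valid way to compare the two chips is the commenter's assumption, not an established method.

```python
bd_turbo_ghz = 4.1  # FX-6200 single-core turbo, from the comment
bd_stages = 17      # commenter's estimate of Bulldozer's pipeline depth
sb_stages = 14      # commenter's figure for Sandy Bridge

# Divide the clock by one stage count, then scale by the other.
per_stage = bd_turbo_ghz / bd_stages
equivalent_ghz = per_stage * sb_stages

print(f"{per_stage:.4f} GHz per stage -> {equivalent_ghz:.2f} GHz")
```

The result comes out at roughly 3.4 GHz, which is the i5 2400 turbo figure the comment compares against.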
Until then, I think we're better off with the generally accepted explanation, which is that in treating BD CPUs as "normal" 8-cores, Windows 7 isn't (yet) understanding the Bulldozer architecture properly.
Two modules have a peak throughput of sixteen macro-ops, much more than the eight macro-ops of one module.
When two modules are used, you have higher throughput and thus seemingly higher FPS, but you then get blocked by the slowest cache. Windows 7 sees Bulldozer correctly: it has 8 normal cores.
Windows 8 will just fix the problem that legacy programs show when a new architecture is introduced that changes the number of decoders.
A normal core is registers feeding ALU clusters.
Bulldozer has 8 individual registers feeding 8 ALU clusters, and thus can be called 8 physical cores.
Sandy Bridge has 8 registers shared in pairs across 4 ALU clusters, and thus can be called 4 physical cores.
Other than that, you can go by database licensing, which counts the number of cores as the number of logical cores. By that measure, Bulldozer is 8 cores and Sandy Bridge is 8 cores.