Monday, October 5th 2015
AMD Zen Features Double the Per-core Number Crunching Machinery to Predecessor
AMD's "Zen" CPU micro-architecture has a design focus on significantly increasing per-core performance, particularly per-core number-crunching performance, according to a 3DCenter.org report. It sees a near doubling of the number of decoders, ALUs, and floating-point units per core, compared to its predecessor. In essence, a Zen core is AMD's idea of "what if a Steamroller module of two cores was just one big core, and supported SMT instead."
In the micro-architectures following "Bulldozer," which debuted with the company's first FX-series socket AM3+ processors, and running up to "Excavator," which will debut with the company's "Carrizo" APUs, AMD's approach to CPU cores involved modules, which packed two physical cores, with a combination of dedicated and shared resources between them. It was intended to take Intel's Core 2 idea of combining two cores into an indivisible unit further.
AMD's approach was less than stellar, and was hit by implementation problems: software loaded cores sequentially in a multi-module processor, so both cores of one module were busy and competing for shared resources before the next module was used at all. Loading one core per module first, and only then the second core of each module, would have performed better. AMD's workaround tricked software (particularly OS schedulers) into thinking that a "module" was a "core" which had two "threads" (e.g. an eight-core FX-8350 would be seen by software as a 4-core processor with 8 threads).
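The scheduling problem described above can be illustrated with a small Python sketch. It is only a toy model, not how any real OS scheduler works: the function names, the `(module, core)` tuples, and the idea of counting "contended" modules are assumptions made for illustration. It compares filling cores sequentially with spreading threads one per module first.

```python
from collections import Counter

def sequential_order(modules, cores_per_module=2):
    # Fill both cores of module 0, then module 1, and so on.
    return [(m, c) for m in range(modules) for c in range(cores_per_module)]

def spread_order(modules, cores_per_module=2):
    # Use the first core of every module before any second core.
    return [(m, c) for c in range(cores_per_module) for m in range(modules)]

def contended_modules(order, threads):
    # Count modules running more than one thread, i.e. modules whose
    # shared resources are being fought over (hypothetical metric).
    used = Counter(module for module, _core in order[:threads])
    return sum(1 for count in used.values() if count > 1)

if __name__ == "__main__":
    # A four-module, eight-core part such as the FX-8350, running 4 threads.
    print(contended_modules(sequential_order(4), 4))
    print(contended_modules(spread_order(4), 4))
```

With four threads on a four-module chip, the sequential order leaves two modules sharing resources between two threads each, while the spread order leaves none. Once all eight cores are loaded, both orders are equivalent.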
In AMD's latest approach with "Zen," the company did away with the barriers that separated two cores within a module. It's one big monolithic core, with four decoders (parts which tell the core what to do), four ALUs ("Bulldozer" had two per core), and four 128-bit wide floating-point units, paired into two 256-bit FMACs. This approach nearly doubles the per-core number-crunching muscle. AMD also implemented an Intel-like SMT technology, which works much like HyperThreading.
Source:
3DCenter.org
85 Comments on AMD Zen Features Double the Per-core Number Crunching Machinery to Predecessor
AMD is a company out to make a profit, not be your best friend.
I think it's all speculation, and it's just really too early to start beating any Zen drum.
While it's fun to think about what Jim Keller might have come up with, I think the premise of... Does anyone believe it was that straightforward a revamp?
I say temper the enthusiasm, as this is "speculation," with nothing factual about what AMD/Keller actually laid out.
EDIT: I talked about it in this old thread a bit, and my expectation hasn't changed: if Zen outperforms SKL/KBL, Intel will just change core counts at the different price points and match AMD again, the obvious move being enabling HT on i5 and moving i7 to 6-core with HT.
We are close to the end of the line with IPC improvements for Intel or AMD; the rest will come through process, cache, and instruction set/hardware support. We are close to the end of the line with silicon in the high-performance categories. We can slap more cores in, more sockets, more memory. But the next big thing is going to be either quantum computing or photon-based.
Either way, until I see hard numbers from a source I trust, AMD is a zombie.
The only thing AMD will mess up is the way mainstream and enthusiast are now separated. If Intel has to bump mainstream up quickly, they'll have to bump up enthusiast as well; otherwise the two become equal, which means they'll sell fewer of the more expensive enthusiast platforms and CPUs. But that's good as far as consumers go, assuming the prices stay low...
Intel's immediate response would probably be cutting prices across the board which likely moves LGA 2011 into mainstream prices. As I said in the post, a shocker from AMD could turn LGA 1151 into a budget socket overnight and LGA 2011 into mainstream.
Obviously there were improvements and new instruction sets, but they were underwhelming in current applications, to say the least.
Of course Intel will still have the performance crown for a long time from now on as they have strong Enthusiast parts, but that is not the mainstream market. The mainstream market is formed by the i3, i5 and i7 non E series.
If anyone can produce a competitive product here, then they are in business, and Intel has pretty much been sleeping in this area over the years. You said it yourself in those long posts: Ivy was a die shrink, and Haswell brought very good power consumption. Broadwell is a different beast, but I'd exclude it as it has expensive eDRAM, and it's mostly a very expensive GPU with CPU cores, good for laptops ... but not very viable for desktop from a cost perspective, in my opinion. Skylake is mostly a die shrink ... again a bit underwhelming.
I also found an article which compares performance across generations at the same clock.
It doesn't include the latest architecture, but it is focused on gaming, which is basically the reason most people buy these processors; for browsing or office work, even an Atom is good enough. If you think that is too extreme, fine, go with an i3.
See here: wccftech.com/intel-sandy-bridge-ivy-bridge-haswell-graphics-compared-10-difference-average/
I put them in a table also, and compared similar products:
Lowest details    i7 2600K @ 4.5 GHz    i7 4770K @ 4.5 GHz    % increase
Crysis 3          91                    93                    2%
Black Ops 2       355.4                 382.8                 8%
Bioshock          243.2                 265.9                 9%
Battlefield 3     199.8                 200                   0%
Unigine Heaven    4243                  4280                  1%
Firestrike        7292                  7466                  2%
Average:                                                      4%
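For anyone who wants to check the arithmetic, here is a quick Python sketch that recomputes the percentage column from the scores in the table. The dictionary layout and function name are just my own choices for illustration; the numbers are the ones listed above.

```python
# Scores from the table: (i7 2600K, i7 4770K), both at 4.5 GHz.
scores = {
    "Crysis 3": (91, 93),
    "Black Ops 2": (355.4, 382.8),
    "Bioshock": (243.2, 265.9),
    "Battlefield 3": (199.8, 200),
    "Unigine Heaven": (4243, 4280),
    "Firestrike": (7292, 7466),
}

def pct_increase(old, new):
    # Relative gain of the newer chip over the older one, in percent.
    return (new - old) / old * 100

gains = {name: pct_increase(old, new) for name, (old, new) in scores.items()}
average_gain = sum(gains.values()) / len(gains)

if __name__ == "__main__":
    for name, gain in gains.items():
        print(f"{name}: {gain:.1f}%")
    print(f"Average: {average_gain:.1f}%")
```

The average is a plain mean of the six per-test gains, so a frame-rate test and a synthetic score each count equally.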
You would probably get this increase just by shrinking the original Sandy without any architecture change. Maybe things will change once the new instruction sets start to be used ... but if they are not supported by most of the computers in use, they will mostly be scattered optimizations in one app or another.
So yes, if anybody can get sandy level performance, good clocks and reasonable power consumption they are back in business.
You'll see gains in synthetic performance and mostly encoding and rendering. Other things are in the same ballpark as Sandy Bridge.