Tuesday, August 23rd 2016

AMD Details ZEN Microarchitecture IPC Gains

AMD on Tuesday hosted a ZEN microarchitecture deep-dive presentation against the backdrop of Hot Chips, outlining its road to a massive 40 percent gain in IPC (instructions per clock, which translates roughly to per-core performance at a given clock speed) over the current "Excavator" microarchitecture. The company credits the gains to three major changes with ZEN: a better core engine, a better cache system, and lower power. With ZEN, AMD moved away from its "Bulldozer" approach, in which two cores share certain number-crunching components to form "modules," and returned to a self-sufficient core design.
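To put the 40 percent figure in context, per-core throughput can be approximated as IPC multiplied by clock speed. The snippet below is a rough sketch of that arithmetic, not AMD's methodology; the function name and the 0.9 clock ratio are hypothetical values chosen only for illustration.

```python
def relative_performance(ipc_gain: float, clock_ratio: float = 1.0) -> float:
    """Approximate per-core throughput relative to a baseline core.

    ipc_gain    -- fractional IPC improvement (0.40 for the claimed 40 percent)
    clock_ratio -- new clock / baseline clock (hypothetical; AMD quoted no clocks here)
    """
    return (1.0 + ipc_gain) * clock_ratio


if __name__ == "__main__":
    # Same clock as the baseline: the IPC gain carries over directly.
    print(f"same clock:      {relative_performance(0.40):.2f}x")
    # Hypothetical case: a 10 percent lower clock eats into the gain.
    print(f"10% lower clock: {relative_performance(0.40, 0.9):.2f}x")
```

The point of the second call is that an IPC gain only equals a performance gain when clock speeds are held constant.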

Beyond cores, the next-level subunit of the ZEN architecture is the CPU-Complex (CCX), in which four cores share an 8 MB L3 cache. As with current Intel architectures, the cores share nothing beyond the L3 cache, making them truly independent. What makes ZEN a better core, besides its independence from other cores, is a set of additional integer pipelines; subtle upscaling in key ancillaries such as micro-op dispatch, instruction schedulers, and the retire, load, and store queues; and a larger quad-issue FPU.
AMD also improved the cache system. The hierarchy is similar to pre-Bulldozer AMD architectures, with the L3 cache being shared between full-fledged cores, and each core having a dedicated L2 cache. The L1 cache is now write-back (and not write-through), and the SRAM that makes up the L2 and L3 caches is faster.
The L3 cache SRAM has 5 times the bandwidth of the L3 cache found on current AMD architectures. The L1 and L2 caches have 2 times the bandwidth. Loads from cache to the FPU are now faster. Each core is endowed with 64 KB of L1I cache, 32 KB of L1D cache, and 512 KB of dedicated L2 cache, while 8 MB of L3 cache is shared between the four cores in a CCX.
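As a quick worked example, the cache sizes quoted above can be added up for one CCX. This is only a sketch of the arithmetic using the figures in the article; the constant and function names are made up for illustration.

```python
KB = 1024
MB = 1024 * KB

# Per-core private caches, as quoted in the article.
L1I_BYTES = 64 * KB
L1D_BYTES = 32 * KB
L2_BYTES = 512 * KB

# Shared by all cores in one CCX.
L3_SHARED_BYTES = 8 * MB
CORES_PER_CCX = 4


def ccx_cache_bytes(cores: int = CORES_PER_CCX) -> int:
    """Total cache in one CCX: private L1I + L1D + L2 per core, plus the shared L3."""
    private = cores * (L1I_BYTES + L1D_BYTES + L2_BYTES)
    return private + L3_SHARED_BYTES


def l3_share_per_core_bytes(cores: int = CORES_PER_CCX) -> int:
    """Even split of the shared L3 across cores (a notional share, not a hard partition)."""
    return L3_SHARED_BYTES // cores


if __name__ == "__main__":
    print(f"cache per CCX:     {ccx_cache_bytes() // KB} KB")
    print(f"L3 share per core: {l3_share_per_core_bytes() // MB} MB")
```

The per-core L3 figure is only a notional share, since any core in the CCX can use the whole 8 MB.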
ZEN introduces simultaneous multi-threading (SMT) to AMD processors. Intel's SMT implementation is the popular HyperThreading Technology. AMD's SMT is similar in that each core is addressed as two threads, with each thread competing for the resources on the core.
The third key area is lower power, and this is attributed not just to the silicon-level gains yielded by the move to the 14 nm FinFET process. The design team focused on power draw from the very inception of the ZEN core project. The write-back L1 cache and the Op cache lower power draw, and the various components on ZEN processors feature aggressive clock-gating, although there's no power-gating.
AMD expanded the CPU's instruction-set support, adding AVX, AVX2, BMI1, BMI2, AES, RDRAND, SMEP, SHA1/SHA256, ADX, CLFLUSHOPT, XSAVEC/XSAVES/XRSTORS, and SMAP. The company also introduced a few AMD-exclusive features, which can be taken advantage of for better performance, including CLZERO and PTE Coalescing.
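For readers who want to check which of these extensions a given machine reports, one approach is to compare the CPU's feature-flag list against the names above. The sketch below parses a flags line in the style of Linux's /proc/cpuinfo; the lowercase flag spellings (for example sha_ni for the SHA extensions) are assumed to follow the Linux kernel's naming, and the helper name and sample line are hypothetical.

```python
# Assumed Linux-style flag names for the extensions listed in the article.
# XRSTORS has no separate flag here; it is assumed to be covered by "xsaves".
ZEN_FLAGS = frozenset({
    "avx", "avx2", "bmi1", "bmi2", "aes", "rdrand", "smep", "sha_ni",
    "adx", "clflushopt", "xsavec", "xsaves", "smap", "clzero",
})


def missing_flags(flags_line: str, wanted=ZEN_FLAGS) -> set:
    """Return the wanted flags that do not appear in a 'flags : a b c' line."""
    head, sep, tail = flags_line.partition(":")
    # Accept either a full "flags : ..." line or a bare space-separated list.
    present = set((tail if sep else head).split())
    return set(wanted) - present


if __name__ == "__main__":
    # Hypothetical, truncated sample line; a real one is much longer.
    sample = "flags : fpu sse2 avx avx2 aes bmi1 bmi2 rdrand"
    print(sorted(missing_flags(sample)))
```

On a Linux system the same function could be fed the first line of /proc/cpuinfo that starts with "flags".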

80 Comments on AMD Details ZEN Microarchitecture IPC Gains

#76
overlord
Super XP: Question for the author or anybody that may have an answer. Will ZEN be using HyperTransport Technology? I visited the website and the latest upgrade was HTT 3.1 in 2009.
They also have something called Hyper Share.
www.hypertransport.org/default.cfm?page=Home
HT replaced by Coherent Fabric www.anandtech.com/show/10591/amd-zen-microarchiture-part-2-extracting-instructionlevel-parallelism/5
fudzilla.com/news/processors/38381-amd-s-new-interconnect-tech-is-coherent-fabric

Might be Freedom Fabric IP acquired when they bought Seamicro www.theinquirer.net/inquirer/news/2221111/amd-will-not-license-seamicros-freedom-fabric-to-cpu-vendors
#77
overlord
medi01: Larrabee was not a GPU.
It was an attempt to build something rather different, with the x86 instruction set and the idea that specialized hardware for z-buffering et al. is not needed and that it's better done in software.

It was basically a bunch of (simpler) x86 cores.

It would trounce usual GPUs at stuff like ray tracing and yadayada... but in the end Intel abandoned the idea as it didn't quite perform as expected.
Even Intel were confused www.cnet.com/news/intel-initial-larrabee-graphics-chip-canceled/
#80
BiggieShady
TheGuruStud: Might as well run 2004 PCmark lol.
2004 PCmark? The one where 6 GHz Phenom2 matches 5 GHz Ivy Bridge ... dunno, people are hoping geekbench results are worst case scenario, not the best case scenario :ohwell: