Thursday, July 19th 2018
MSI Drops First Hint of AMD Increasing AM4 CPU Core Counts
With Intel frantically working on an 8-core socket LGA1151 processor to convincingly beat the 8-core AMD Ryzen 2000 series processor, AMD could be working on the next cycle of core-count increases for the mainstream-desktop platform. Motherboard maker MSI may have dropped the first hint that AMD is bringing >8 cores to the socket AM4 mainstream-desktop platform by mentioning in a marketing video that its upcoming motherboards based on the AMD B450 chipset support 8-core "and up" CPUs.
AMD will get its next opportunity to tinker with key aspects of its CPU micro-architecture with "Zen 2," being built on the 7 nm silicon fabrication process. If it decides to stick with the CCX approach to multi-core processors, the company could increase per-CCX core counts. A 50 percent core-count increase enables 12-core processors, while a 100 percent increase brings 16 cores to the AM4 platform. The MSI video suggests that these >8-core processors will be backwards-compatible with existing 400-series chipsets, even if they launch alongside newer 500-series chipsets. The video follows.
88 Comments on MSI Drops First Hint of AMD Increasing AM4 CPU Core Counts
And take a look at the 7740X; it's the king of single-thread performance.
If Intel hasn't upped that for years now, is it even possible? Core count is something different; there was no push for that. But IPC... why not? People would buy such a CPU like crazy.
If you were to bench an R5 2400X against a Phenom II X4 980, the 2400X would absolutely crush it, even though both have the same number of cores and the same clock speed.
P.S. Yes, lol, it is possible to increase IPC. AMD increased it by ~55% between Piledriver and Zen 1. Intel hasn't really changed architectures since Haswell (2013).
In other words, if you compare, say, a Core2Quad, a Kaby Lake i5, a Ryzen 3 and a Phenom x4 (all of which have 4 cores and threads), over-/underclock them all to the same frequency, and then benchmark them, you'll end up with a chart of their relative IPC. Of course, this depends on both the benchmarks used and the various idiosyncrasies of each architecture, since different architectures perform differently in different tasks. As an example, Ryzen does better in Cinebench and rendering than in some other tasks when compared to modern Intel archs. Cache, uncore clock speeds, RAM speed, interconnect speeds and a whole host of lower-level factors all affect IPC. As an example, the IPC increase between Zen and Zen+ (Ryzen 1st and 2nd gen) comes mostly from lower cache latencies and better optimized inter-chip communication.
As such, IPC is
a) always an approximation, not to mention task-dependent, and really a summary of a lot of lower-level, difficult to identify performance parameters
b) still a reasonable term for speaking of the performance of an architecture regardless of core count and clock speed.
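The same-clock comparison described above can be sketched in a few lines. This is only an illustration: the architecture names and scores below are made-up placeholders, not real benchmark results, and the result only approximates relative IPC for the one benchmark the scores come from.

```python
# Hypothetical single-threaded benchmark scores, all taken with the chips
# locked to the same clock speed and the same core count.
# The names and numbers are placeholders, not measurements.
scores_at_same_clock = {
    "arch_a": 100.0,
    "arch_b": 150.0,
    "arch_c": 180.0,
}


def relative_ipc(scores, baseline):
    """Return each score as a multiple of the baseline's score.

    With clock speed and core count held equal, the ratio of scores
    approximates the ratio of IPC for this particular benchmark.
    """
    base = scores[baseline]
    return {name: score / base for name, score in scores.items()}


ipc_chart = relative_ipc(scores_at_same_clock, baseline="arch_a")
for name, ratio in sorted(ipc_chart.items(), key=lambda item: item[1]):
    print(f"{name}: {ratio:.2f}x the baseline")
```

Running a different benchmark would give a different chart, which is the "task-dependent" caveat in point a).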
For example, AMD's FX-series CPUs had lots of cores and sometimes crazy clock speeds, yet were trounced by Intel due to the Core architecture's significantly superior IPC. That is how a 3.5GHz 4c4t i5 could outperform an 8c8t FX in >90% of tasks. The cores and the clock speed don't matter if the cores are processing fewer instructions per clock (cycle).
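A rough way to see why this happens is to model throughput as IPC x clock per core, scaled by Amdahl's law for the portion of the work that can use more cores. The model and every number below are assumptions picked for illustration, not measurements of any real i5 or FX chip.

```python
def estimated_speed(ipc, clock_ghz, cores, parallel_fraction):
    """Rough throughput estimate in arbitrary units.

    Per-core speed is modeled as IPC * clock; multi-core scaling follows
    Amdahl's law, where parallel_fraction is the share of the work that
    can be spread across cores.
    """
    per_core = ipc * clock_ghz
    speedup = 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)
    return per_core * speedup


# Hypothetical chips: the IPC values are invented to show the effect.
high_ipc_quad = {"ipc": 1.6, "clock_ghz": 3.5, "cores": 4}
low_ipc_octa = {"ipc": 1.0, "clock_ghz": 4.0, "cores": 8}

for fraction in (0.0, 0.5, 0.9, 1.0):
    quad = estimated_speed(parallel_fraction=fraction, **high_ipc_quad)
    octa = estimated_speed(parallel_fraction=fraction, **low_ipc_octa)
    winner = "quad" if quad > octa else "octa"
    print(f"parallel={fraction:.1f}  quad={quad:.2f}  octa={octa:.2f}  -> {winner}")
```

Under these assumed numbers the 8-core part only pulls ahead once almost all of the work is parallel, which is not how most desktop software behaved at the time.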
As for how/why Intel's IPC hasn't improved in years, it's due to two things: their roadmap shakeup and their ever-expanding issues with getting their next-gen 10nm process node into volume production. The shakeup meant going from the "tick-tock" release rhythm, where one step was a production process node improvement and the other an architecture refresh, to the current (yet not really followed) "PAO", or "Process, Architecture, Optimization".
To clarify: Intel is currently using the 14nm node, either in 14nm+ or 14nm++ versions. The latter two are minor improvements upon the initial 14nm process node, which was launched alongside Broadwell in 2014 (mobile) and 2015 (desktop). As such, Broadwell was a "tick", or process shrink (although Broadwell was also architecturally different from its predecessor, but never mind that right now). Skylake, following Broadwell, was a "tock", or an architecture improvement. Architecture improvement = IPC increase, at least most of the time.
Then came the issues. 10nm was supposed to launch in 2016. That didn't happen. Instead we got Kaby Lake, which is identical to Skylake architecturally, but produced on an optimized process node (14nm+) which gives it higher clock speeds. Hence the jump in performance from the 6700K to the 7700K was tiny, since the only thing making it faster was a few-hundred-MHz increase in clock speed.
The 10nm issues and delays keep piling up. Currently, it's delayed to 2019, and some (SemiAccurate, among others) believe it won't even be ready by then. Intel has one or two gimped, barely-functional 10nm CPUs out now, which seems like an effort to calm irate investors demanding progress.
As such, Intel has launched Coffee Lake instead. Again, Coffee Lake is architecturally unchanged from Kaby Lake (and thus Skylake), at least in terms of IPC. Of course, Coffee Lake has more cores, is slightly more power efficient (but really not much), and clocks even higher. As such, Coffee Lake CPUs are faster, but their IPC is unchanged. The performance increase comes from small clock increases and a significant increase in cores and threads (i7 from 4/8 to 6/12, i5 from 4/4 to 6/6, and i3 from 2/4 to 4/4).
The reason for Intel doing this is likely that the next "process" step of their "PAO" cadence is yet to show up. As such, they're reiterating "O" (Optimization) steps to tread water until 10nm is ready. To recap: Broadwell/14nm was P(rocess), Skylake was A(rchitecture), and both KBL and CFL are O(ptimization). All the while, AMD is both improving IPC and moving to smaller process nodes rapidly. If this keeps up, it won't be long until they catch up.
I know nothing about all this, but it seems like people are incorrectly saying IPC increase instead of performance increase.
After all, if IPC was "clearly defined, and not affected by anything", you'd either have an incredibly simple CPU or an absolutely perfect CPU with zero bottlenecks (which is both impossible and would be extremely impractical to attempt). And of course faster cache can make the CPU process more instructions per clock if the various processing blocks in the CPU are bottlenecked by cache bandwidth or latency. Pretty much all CPUs have either too little or too slow cache (as increasing cache size also increases latency), so there's always a push to optimize cache size and latency and find a sweet spot for this. And that's just one of the myriad factors affecting CPU performance internally. Pipeline length and width affect IPC, as do all the various functional blocks along the pipeline - both how many of them there are and how they work, as well as how efficiently they can be fed instructions (where the various caches come into play). Then there's branch prediction (of Spectre fame) and all the stuff attached to that - time penalties from mispredictions, time gains from caching correct predictions for repeated instructions, and all the other tricks and tweaks CPU designers have figured out over the years.
Remember, even what constitutes an "instruction" varies across CPU architectures (as they all support varying instruction sets, even within the x86 family). And different instructions require different resources and different amounts of time to be processed. Intel's inclusion of AVX-512 instruction decoding hardware on recent HEDT chips is a good example. IPC, in other words, is an abstraction that approximates an extremely complex set of performance parameters. It's still an extremely useful term, as it lets us discuss performance on a less variable level than if we had to include clock speeds and core counts, which to a certain degree obfuscate the true performance of a CPU design.
Higher performance doesn't necessarily mean higher IPC because you can have better performance due to higher clock speeds while not touching IPC. The more recent Intel architectures are a prime example of this: the base / turbo clock speeds increased and so did the performance but the IPC stayed pretty much the same.
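To make that split concrete, here is a minimal sketch that assumes performance at equal core count is simply IPC x clock, so dividing a measured performance ratio by the clock ratio leaves the implied IPC ratio. The clock speeds and the 10% gain below are hypothetical, not figures for any specific Intel chip.

```python
def implied_ipc_ratio(perf_ratio, clock_ratio):
    """Estimate the IPC ratio between two chips with equal core counts.

    Assumes performance = IPC * clock, so
    IPC ratio = performance ratio / clock ratio.
    """
    return perf_ratio / clock_ratio


# Hypothetical generational update: 10% higher clocks, 10% higher scores.
old_clock_ghz = 4.0
new_clock_ghz = 4.4
measured_perf_ratio = 1.10

clock_ratio = new_clock_ghz / old_clock_ghz
ipc_ratio = implied_ipc_ratio(measured_perf_ratio, clock_ratio)

print(f"clock ratio: {clock_ratio:.2f}")
print(f"implied IPC ratio: {ipc_ratio:.2f}")
```

An implied IPC ratio of about 1.0 means the whole gain came from clock speed, which is the situation described above.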
Due to how the Zen architecture works, if you can reduce latency (by whatever means), the chip's performance increases: it's why it's much better to pair a Zen-based chip with faster memory (as long as it's supported by the board used). If AMD can manage to reduce latency in subsequent Zen-based architectures (as they apparently did in the Ryzen 2000 vs. 1000 series), that already translates into an IPC increase by itself.
Zen 2 could have zero architectural improvements and the node shrink alone would give them enough of a boost in IPC, die space, and power savings to beat Intel's upcoming offerings, which look very dim.
There's a reason they brought Jim Keller on board: they need something other than the annual 3% IPC boost, and the cost of increasing die sizes is getting to them. IPC is really the best measure of performance for people on this forum, so that's probably why it's so popular. If you want to talk about multi-threaded performance or memory performance, there are the HandBrake and Microsoft Certified Professional forums for those, respectively.
It would be great if more applications were able to tap more memory performance or more cores, but as it stands right now, IPC is king for consumers, enthusiasts, and gamers.
Wake UP, dude!!!
Ofc, it's not stock, but you also weren't specific in that regard.
Enough OT: if you wish to discuss this further, by all means PM me, and we can continue there.
Let's talk about the most recent casualty of the IPC argument, the 7700K. Even mondo overclocked, there is simply not enough power to keep up with 6- and 8-core CPUs. That thing went into obsolescence with the quickness. I suspect the 8700K will meet the same fate if we get 12-16 cores on mainstream desktops.