Tuesday, April 2nd 2019
Intel Unleashes 56-core Xeon "Cascade Lake" Processor to Preempt 64-core EPYC
Intel late Tuesday made a boat-load of enterprise-relevant product announcements, including the all-important update to its Xeon Scalable enterprise processor product-stack with the addition of the new 56-core Xeon Scalable "Cascade Lake" processor. This chip is believed to be Intel's first response to the upcoming AMD 7 nm EPYC "Rome" processor, which combines 64 cores with a monolithic memory interface. The 56-core "Cascade Lake" is a multi-chip module (MCM) of two 28-core dies, each with a 6-channel DDR4 memory interface, for a total of 12 channels per package. Each of the two 28-core dies is built on the existing 14 nm++ silicon fabrication process, and the IPC of the 56 cores is largely unchanged since "Skylake." Intel has, however, added several HPC- and AI-relevant instruction-sets.
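Because the package carries two physical dies, each with its own memory controllers, an operating system will typically expose them as separate NUMA nodes. Below is a minimal sketch (plain Linux sysfs, nothing specific to this processor) of how an administrator could list the CPUs and memory behind each node:

```python
# Minimal sketch: enumerate NUMA nodes and their CPUs/memory via Linux sysfs.
# Assumes the standard Linux sysfs layout; nothing here is Cascade Lake-specific.
import glob
import os

def numa_topology():
    nodes = {}
    for node_dir in sorted(glob.glob("/sys/devices/system/node/node[0-9]*")):
        node_id = int(node_dir.rsplit("node", 1)[-1])
        with open(os.path.join(node_dir, "cpulist")) as f:
            cpus = f.read().strip()                 # e.g. "0-27,56-83" with SMT
        with open(os.path.join(node_dir, "meminfo")) as f:
            # First line reads "Node N MemTotal: <kB> kB"
            mem_kb = int(f.readline().split()[3])
        nodes[node_id] = {"cpus": cpus, "mem_gib": mem_kb / 2**20}
    return nodes

if __name__ == "__main__":
    for node, info in numa_topology().items():
        print(f"node {node}: cpus={info['cpus']} mem={info['mem_gib']:.1f} GiB")
```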
To begin with, Intel introduced DL Boost, a fixed-function hardware matrix multiplier that accelerates the building and training of AI deep-learning neural networks. Next up are hardware mitigations against several speculative-execution CPU security vulnerabilities that have haunted the computing world since early 2018, including certain variants of "Spectre" and "Meltdown." A hardware fix carries a smaller performance impact than a software fix delivered as a firmware patch. Intel has also added support for Optane Persistent Memory, the company's grand vision for what succeeds volatile primary memory such as DRAM. Currently slower than DRAM but faster than SSDs, Optane Persistent Memory is non-volatile, so its contents can survive power outages. This allows sysadmins to power down entire servers to scale with workloads, without worrying about long waits to restore uptime when waking those servers back up. Added CPU instruction-sets include AVX-512 and AES-NI. Intel Speed Select is a fresh spin on a neglected feature most processors have had for decades, allowing administrators to select specific multipliers for CPU cores on the fly, remotely. Not too different from this is Resource Director Technology, which offers more fine-grained QoS (quality of service) controls for specific cores, PIDs, virtual machines, and so on.
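The instruction set behind DL Boost is AVX-512 VNNI, which fuses the multiply and accumulate steps of 8-bit integer dot products into a single operation. The snippet below is only a scalar Python sketch of that arithmetic (four unsigned 8-bit values times four signed 8-bit values summed into each 32-bit accumulator), not a use of the actual instruction:

```python
# Scalar sketch of a VNNI-style fused INT8 dot-product-accumulate:
# four unsigned 8-bit activations x four signed 8-bit weights per 32-bit accumulator.
# Illustrative Python only, not the AVX-512 instruction itself.

def dp_int8_accumulate(acc, activations, weights):
    """acc: 32-bit accumulators; activations: unsigned 8-bit values;
    weights: signed 8-bit values; four activation/weight pairs per accumulator."""
    assert len(activations) == len(weights) == 4 * len(acc)
    out = []
    for i, a in enumerate(acc):
        group = range(4 * i, 4 * i + 4)
        # Multiply-accumulate four 8-bit pairs into one 32-bit lane.
        out.append(a + sum(activations[j] * weights[j] for j in group))
    return out

# Example: two 32-bit accumulators, eight INT8 pairs.
acc = [0, 100]
activations = [10, 20, 30, 40, 5, 5, 5, 5]   # unsigned 8-bit
weights     = [1, -2, 3, -4, 2, 2, 2, 2]     # signed 8-bit
print(dp_int8_accumulate(acc, activations, weights))   # -> [-100, 140]
```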
Unlike previous Xeon Scalable models, the first Xeon Scalable "Cascade Lake" part, the Xeon Platinum 9200, comes in an FC-BGA package and is not socketed. The 5,903-ball BGA package places the two 28-core dies under a common integrated heatspreader. The two dies talk to each other over an on-package UPI x20 interconnect link, while each die routes its second UPI x20 link off-package, giving the package two external x20 links with which to scale up to two packages on a single board (112 cores).
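Taking the figures above at face value (two 28-core dies per package, six DDR4 channels per die, and two UPI x20 links per die of which one leaves the package), a quick tally of what one- and two-package boards add up to looks like this:

```python
# Quick tally of the Xeon Platinum 9200 topology described above:
# 2 x 28-core dies per package, 6 DDR4 channels per die,
# and 2 UPI x20 links per die (one used inside the package, one routed out).

def board_resources(packages):
    dies = packages * 2
    return {
        "cores": dies * 28,
        "threads": dies * 28 * 2,             # with Hyper-Threading
        "ddr4_channels": dies * 6,
        "off_package_upi_x20_links": dies,    # one off-package link per die
    }

for p in (1, 2):
    print(f"{p} package(s): {board_resources(p)}")
# One package: 56 cores, 12 DDR4 channels; two packages: 112 cores, 24 channels.
```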
Source: HotHardware
88 Comments on Intel Unleashes 56-core Xeon "Cascade Lake" Processor to Preempt 64-core EPYC
Did they use glue to put that together?
For dual-socket systems, it will be Intel's 112c vs AMD's 128c/256t: Intel's system will have to be much faster to counter AMD's core and thread advantage, but that will skyrocket TDP, so ...
www.anandtech.com/show/14146/intel-xeon-scalable-cascade-lake-deep-dive-now-with-optane
So this one, since it's pretty much two 8180s glued together. Twice the cores = twice the price?
Also, I was looking forward to BGA in desktops when the first rumors came out. Being a PC enthusiast doesn't imply being an enthusiast of replacing CPUs.
Simpler, less work, fewer problems, cheaper, more power. Even so - it's still a bargain for datacenters.
While there is no 28-core CPU in smaller categories, there is a 24-core Xeon Gold 6262 for $2900 (max 4 CPU). I would bet the lack of a 28-core option has something to do with harvesting cores for the 9200 series.
1.4/2.2GHz and 250W seems to be the expectation based on rumors so far and that should be realistic considering what we have seen of Zen2.
If that 56-core thingie can do 2.6/3.8GHz at 400W it might actually be rather competitive.
When buying a server, you're paying for the machine and for a particular service that comes with it.
The fact that a CPU is cheaper doesn't mean e.g. Dell will sell you the whole package for less.
Just the fact that Intel has a 20x larger market share means companies keep larger stocks of CPUs and other parts. The same SLA should cost less when going with Blue.
But even if there actually was a price difference, it's not exactly huge.
Let's assume every other part costs exactly the same and EPYC equivalent is $10k less per CPU in a 2P machine (because we can!).
Over 3 years you save $556 per month per server. Not much.
That $556 buys you a homogeneous architecture and simpler procedures/training inside the company.
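For what it's worth, the back-of-envelope figure above checks out under the commenter's stated assumptions ($10k saved per CPU, two CPUs per server, a three-year term):

```python
# Quick check of the savings math above, using the assumptions stated in the comment:
# $10,000 saved per CPU, a 2P server, amortized over 3 years.
savings_per_cpu = 10_000
cpus_per_server = 2
months = 3 * 12

monthly_savings = savings_per_cpu * cpus_per_server / months
print(f"${monthly_savings:.0f} per month per server")   # -> $556 per month per server
```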
So Intel won't be all that competitive. AMD have stated multiple times that when they designed Rome they expected to compete FAVORABLY against 10 nm.
Even Intel said they were afraid that AMD could grab around 20% of the market.
Again it's all just guesses, so everything might be wrong, but we will probably see at Computex. :)
Intel really needs to get their next die shrink out fast, or else AMD might take the long straw this time.
I wonder what sort of cooling they'll have to use if it's going in servers??....