Tuesday, April 2nd 2019
Intel Unleashes 56-core Xeon "Cascade Lake" Processor to Preempt 64-core EPYC
Intel late Tuesday made a boatload of enterprise-relevant product announcements, including the all-important update to its Xeon Scalable enterprise processor product stack, with the addition of the new 56-core Xeon Scalable "Cascade Lake" processor. This chip is believed to be Intel's first response to the upcoming AMD 7 nm EPYC "Rome" processor with 64 cores and a monolithic memory interface. The 56-core "Cascade Lake" is a multi-chip module (MCM) of two 28-core dies, each with a 6-channel DDR4 memory interface, totaling 12 channels for the package. Each of the two 28-core dies is built on the existing 14 nm++ silicon fabrication process, and the IPC of each of the 56 cores is largely unchanged since "Skylake." Intel, however, has added several HPC and AI-relevant instruction sets.
To begin with, Intel introduced DL Boost, which could be a fixed-function hardware matrix multiplier that accelerates building and training of AI deep-learning neural networks. Next up are hardware mitigations against several speculative-execution CPU security vulnerabilities that have haunted the computing world since early 2018, including certain variants of "Spectre" and "Meltdown." A hardware fix has a smaller performance impact than a software fix in the form of a firmware patch. Intel has also added support for Optane Persistent Memory, which is the company's grand vision for what succeeds volatile primary memory such as DRAM. Currently slower than DRAM but faster than SSDs, Optane Persistent Memory is non-volatile, and its contents can be made to survive power outages. This allows sysadmins to power down entire servers to scale down with workloads, without worrying about long wait times to restore uptime when waking up those servers. The CPU instruction sets added include AVX-512 and AES-NI.
Intel Speed Select is a fresh spin on a neglected feature most processors have had for decades, allowing administrators to select specific multipliers for CPU cores on the fly, remotely. Not too different from this is Resource Director Technology, which gives you more fine-grained QoS (quality of service) options for specific cores, PIDs, virtual machines, and so on.
Unlike previous models of Xeon Scalable, the first Xeon Scalable "Cascade Lake" processor, the Xeon Platinum 9200, is an FC-BGA package and not socketed. This 5,903-pin BGA package uses a common integrated heatspreader with the two 28-core dies underneath. The two dies talk to each other over a UPI x20 interconnect link on-package, while each die puts out its second UPI x20 link as the package's two x20 links, to scale up to two packages on a single board (112 cores).
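As a quick sanity check on the package figures above, here is a minimal sketch that derives the totals from the per-die numbers quoted in the article. The constant and function names are illustrative assumptions, not anything published by Intel, and the link count follows the article's description of one on-package and one external UPI link per die.

```python
# Per-die figures as quoted in the article; names are illustrative only.
CORES_PER_DIE = 28
DIES_PER_PACKAGE = 2
DDR4_CHANNELS_PER_DIE = 6
EXTERNAL_UPI_LINKS_PER_DIE = 1  # the other link is used on-package, per the article


def package_totals(packages=1):
    """Return aggregate resources for a board with the given number of packages."""
    dies = packages * DIES_PER_PACKAGE
    return {
        "dies": dies,
        "cores": dies * CORES_PER_DIE,
        "memory_channels": dies * DDR4_CHANNELS_PER_DIE,
        "external_upi_links": dies * EXTERNAL_UPI_LINKS_PER_DIE,
    }


if __name__ == "__main__":
    print(package_totals(1))  # one Xeon Platinum 9200 package
    print(package_totals(2))  # two packages on a single board
```

With two packages, the sketch reproduces the 112-core figure mentioned above.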
Source: HotHardware
88 Comments on Intel Unleashes 56-core Xeon "Cascade Lake" Processor to Preempt 64-core EPYC
If SQL Server worked fine on 10 cores, why give it more?
Per-core licensing is pretty normal and acceptable. And makes sense. I've seen worse.
For example: there's a database called Vertica (it's a columnar engine, designed for fast queries, BI, modelling etc).
You pay for data limit.
Vertica has very few data types and it's hard to optimize data usage (I imagine: not by coincidence :) ).
Best example: integers. On most databases you have many variants.
SQL Server provides four: 1, 2, 4, and 8 bytes.
Vertica provides just one: 8 bytes.
Possibly water. It's been used in servers for a while. But high-airflow fans can do miracles too.
ark.intel.com/content/www/us/en/ark/products/192467/intel-xeon-platinum-8256-processor-16-5m-cache-3-80-ghz.html
$7000 for a quad-core Xeon!
However, other software with core-based licenses (like databases) is priced according to the cores it can use. So if you install SQL Server on a VM with 4 cores, you pay for 4 cores (regardless of how many physical cores are in the server/cluster).
Hypervisors are usually licensed per socket.
BTW:
Enterprise Linux distros aren't free. Red Hat with 24/7 support costs $1300 per socket pair (per year).
Similar license for Windows (Server Standard) costs $972 per 16 cores (per year).
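To make the difference between per-socket and per-core pricing concrete, here is a rough sketch using only the two yearly prices quoted above. It assumes licenses are sold in whole units and rounded up, and it ignores real-world minimums, discounts, and edition rules; the function names and the 112-core example server are hypothetical.

```python
import math

# Yearly prices as quoted in the comment above (USD).
RHEL_PER_SOCKET_PAIR_USD = 1300
WINDOWS_PER_16_CORES_USD = 972


def rhel_yearly_cost(sockets):
    """Per-socket-pair licensing: core count does not matter."""
    return math.ceil(sockets / 2) * RHEL_PER_SOCKET_PAIR_USD


def windows_yearly_cost(cores):
    """Per-core licensing sold in 16-core blocks (assumed to round up)."""
    return math.ceil(cores / 16) * WINDOWS_PER_16_CORES_USD


if __name__ == "__main__":
    # Hypothetical two-socket server with 2 x 56 = 112 cores.
    print(rhel_yearly_cost(2))       # 1300
    print(windows_yearly_cost(112))  # 6804
```

Under these assumptions, the per-core license grows with core count while the per-socket one stays flat, which is why high-core-count parts shift the licensing math.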
the comparison becomes quite clear: the price per server was the same.
Specifically, the CPUs were:
Xeon Gold 6130: 16 cores, 2.1 GHz / all-core boost 2.8 GHz
VS
EPYC 7401: 24 cores, 2 GHz / all-core boost 2.8 GHz
For only the CPUs the exact quoted price was 34 350 NOK for the Intel CPU, and 32 180 NOK for the AMD.
Total cost for the Intel server was 136 599 NOK, the AMD server was 140 804 NOK.
This is with the same amount of RAM (8 sticks in the Intel server, 16 in the AMD), same storage, PSU, RAID controller and network card.
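For anyone who wants the per-core numbers, here is a small sketch that works from the quoted prices only. The dictionary layout and function names are my own, and since the quote does not say how many CPUs each server holds, the sketch looks only at the quoted CPU price per core and at the difference in total server price.

```python
# Prices in NOK exactly as quoted above; structure and names are illustrative.
QUOTES = {
    "Xeon Gold 6130": {"cores": 16, "cpu_nok": 34350, "server_nok": 136599},
    "EPYC 7401": {"cores": 24, "cpu_nok": 32180, "server_nok": 140804},
}


def cpu_price_per_core(name):
    """Quoted CPU price divided by that CPU's core count."""
    quote = QUOTES[name]
    return quote["cpu_nok"] / quote["cores"]


def amd_server_premium_percent():
    """How much more the AMD server costs in total, relative to the Intel one."""
    intel = QUOTES["Xeon Gold 6130"]["server_nok"]
    amd = QUOTES["EPYC 7401"]["server_nok"]
    return (amd - intel) / intel * 100


if __name__ == "__main__":
    for name in QUOTES:
        print(f"{name}: {cpu_price_per_core(name):.0f} NOK per core")
    print(f"AMD server premium: {amd_server_premium_percent():.1f}%")
```

On these figures the EPYC comes out at roughly 1341 NOK per core against roughly 2147 NOK for the Xeon, while the complete AMD server is about 3% more expensive.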
According to the tests, the only benchmark where a Gold 6130 beats a 7401 is one that uses AVX, and that was before the Spectre and Meltdown patches.
Power consumption is favourable for the Gold 6130 vs the 7401, I think. I have no measurements for that. But both use the same PSU (800 W) and same cooling fans.
So the old EPYC platform was already competitive with the Xeons. Give Rome more AVX power and it's understandable that Intel fears AMD will give them trouble.
We'll see if AMD improves it in Zen 2.
Yeah, sure.
Hold on, how do we know that?
PS
AMD stock gained 9% and is close to 30 bucks, lol:
www.nasdaq.com/symbol/amd/real-time
P.S.: Do you think Intel can make Xeon 9282 parts in high volume, or is it mostly for show?
For AMD it's just a Lego game of binning 8 tiny chiplets.
Sockets on the other hand make sense when you need to be able to configure a broad selection of configurations. Most high-end servers are built for a specific purpose, and cost per core is probably the least relevant metric in such cases. Performance on specific server workloads varies a lot, and the difference between AMD and Intel is much larger here, and in some cases the performance difference can be 2-3×. Intel does have the upper hand in generic performance per core, but this is usually less relevant for servers. What matters here is performance for a specific task, and Intel performs well in many of these, and even Zen 2 will probably not threaten Intel's place in the enterprise market. It's the mainstream desktop that Intel needs to worry about.
Also, enterprise customers usually have various deals which give them huge discounts. :D
If your use case is heavily bandwidth-bottlenecked, then 8×6 channels of DDR4 2933 MHz is going to kick some serious butt…
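For a rough idea of what that means in numbers, here is a back-of-the-envelope sketch of theoretical peak DDR4 bandwidth. It assumes the standard 64-bit (8-byte) DDR4 channel and treats "2933" as 2933 million transfers per second; the function name is made up for illustration, and real sustained bandwidth will be lower than this peak.

```python
BYTES_PER_TRANSFER = 8  # a DDR4 channel is 64 bits wide


def peak_bandwidth_gbs(channels, mega_transfers_per_s=2933):
    """Theoretical peak bandwidth in GB/s (decimal) for the given channel count."""
    return channels * mega_transfers_per_s * 1e6 * BYTES_PER_TRANSFER / 1e9


if __name__ == "__main__":
    # Channel counts from the article: 12 per package, 24 for two packages.
    for channels in (1, 12, 24):
        print(f"{channels:2d} channels: {peak_bandwidth_gbs(channels):.1f} GB/s")
```

That works out to about 23.5 GB/s per channel, so roughly 282 GB/s for one 12-channel package and 563 GB/s for a two-package board, on paper.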
We should compare 7401 to Xeon 6150: 18 cores, 2.7/3.7GHz - it's way faster than 6130 and should match 7401 even in EPYC's best case scenarios. The exact same page tested EPYC 7351: www.servethehome.com/amd-epyc-7351p-single-socket-cpu-linux-benchmarks-and-review/
7351 is weaker than 7401.
But the key is not having it, it's keeping it. Intel's dominance was a long-term effort; AMD will need nothing less.
Should a book enthusiast be able to write one?
Is a food enthusiast always a great cook?
What makes computers so special that for so many the word "enthusiast" means "assembling" and not using?
That's nothing if it forces you to hire one more person to tune and optimize it.
And honestly, $10-20k difference on CPUs per server that costs 10x as much... Not a big deal.
And as I said earlier: despite EPYC CPUs being cheaper, otherwise identical servers could cost the same. And someone already gave an example which supports this. And here we're back to the hardware side.
No offense, but who cares? Surely not the people buying these servers.
By "homogeneous architecture" I meant that the servers behave similarly, so moving systems between them is fast and cheap.
By all means, no. These processors behave differently. You move a system from one Intel server to another and it works more or less the same.
You move a system from an Intel server to an AMD server and it's a lottery.
You just can't look at these CPUs and say "AMD CPU costs $10k less, so it saves money". $10k is nothing in the scale we're talking about.
It could cost $100k to train people and tune systems for a different architecture. And then it could cost millions if the system doesn't work as you wanted.
That's why enterprises will go with the less risky Blue option. Because $10k is nothing, but a bad server is a big problem.