Tuesday, April 2nd 2019

Intel Unleashes 56-core Xeon "Cascade Lake" Processor to Preempt 64-core EPYC

Intel late Tuesday made a boatload of enterprise-relevant product announcements, including the all-important update to its Xeon Scalable enterprise processor product stack, with the addition of the new 56-core Xeon Scalable "Cascade Lake" processor. This chip is believed to be Intel's first response to the upcoming AMD 7 nm EPYC "Rome" processor with 64 cores and a monolithic memory interface. The 56-core "Cascade Lake" is a multi-chip module (MCM) of two 28-core dies, each with a 6-channel DDR4 memory interface, totaling 12 channels for the package. Each of the two 28-core dies is built on the existing 14 nm++ silicon fabrication process, and the IPC of each of the 56 cores is largely unchanged since "Skylake." Intel has, however, added several HPC and AI-relevant instruction sets.

To begin with, Intel introduced DL Boost, an AVX-512 instruction-set extension (Vector Neural Network Instructions, VNNI) that accelerates low-precision deep-learning inference. Next up are hardware mitigations against several speculative-execution CPU security vulnerabilities that have haunted the computing world since early 2018, including certain variants of "Spectre" and "Meltdown." A hardware fix carries a smaller performance penalty than a software fix delivered as a firmware patch. Intel has also added support for Optane Persistent Memory, which is the company's grand vision for what succeeds volatile primary memory such as DRAM. Currently slower than DRAM but faster than SSDs, Optane Persistent Memory is non-volatile, and its contents can be made to survive power outages. This allows sysadmins to power down entire servers to scale down with workloads, without worrying about long wait times to restore uptime when waking up those servers. The supported CPU instruction sets include AVX-512 and AES-NI.
Intel Speed Select is a fresh spin on a neglected feature most processors have had for decades, allowing administrators to select specific multipliers for CPU cores on the fly, remotely. Not too different from this is Resource Director Technology, which gives you more fine-grained QoS (quality of service) options for specific cores, PIDs, virtual machines, and so on.

Unlike previous models of Xeon Scalable, the first Xeon Scalable "Cascade Lake" processor, the Xeon Platinum 9200, is an FC-BGA package and not socketed. This 5,903-pin BGA package uses a common integrated heatspreader with the two 28-core dies underneath. The two dies talk to each other over a UPI x20 interconnect link on-package, while each die puts out its second UPI x20 link as the package's two x20 links, to scale up to two packages on a single board (112 cores).
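
As a quick sanity check of the package layout described above, the core and memory-channel totals can be tallied in a few lines of Python. This is only a sketch: the class and function names are made up for illustration, and the figures are simply the ones quoted in the article (28 cores and 6 DDR4 channels per die, two dies per package, two packages per board).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Die:
    """One 28-core die, using the figures quoted in the article."""
    cores: int = 28
    ddr4_channels: int = 6


@dataclass(frozen=True)
class Package:
    """Hypothetical model of the two-die multi-chip module."""
    dies: tuple = (Die(), Die())

    @property
    def cores(self) -> int:
        # Sum the cores of every die under the shared heatspreader
        return sum(d.cores for d in self.dies)

    @property
    def ddr4_channels(self) -> int:
        # Each die brings its own memory controller channels
        return sum(d.ddr4_channels for d in self.dies)


def board_cores(packages: int = 2) -> int:
    """Total cores on a board carrying the given number of packages."""
    return packages * Package().cores


pkg = Package()
print(pkg.cores, pkg.ddr4_channels, board_cores())
```

With the article's numbers this gives 56 cores and 12 channels per package, and 112 cores for a two-package board.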
Source: HotHardware

88 Comments on Intel Unleashes 56-core Xeon "Cascade Lake" Processor to Preempt 64-core EPYC

#26
notb
phill: MS will love that.... Server licences anything over 8 cores (not sure about the threads) I think you need to pay over and above for...
You don't have to pay for all the cores on a machine - just those used by your software.
If SQL Server worked fine on 10 cores, why give it more?

Per-core licensing is pretty normal and acceptable. And makes sense. I've seen worse.

For example: there's a database called Vertica (it's a columnar engine, designed for fast queries, BI, modelling etc).
You pay for data limit.

Vertica has very few data types and it's hard to optimize data usage (I imagine: not by coincidence :) ).
Best example: integers. On most databases you have many variants.
SQL Server provides four: 1, 2, 4 and 8 bytes.
Vertica provides just one: 8 bytes.
I wonder what sort of cooling they'll have to use if it's going in servers??....
Possibly water. It's been used in servers for a while. But high-airflow fans can do miracles too.
#27
phill
notb: You don't have to pay for all the cores on a machine - just those used by your software.
If SQL Server worked fine on 10 cores, why give it more?

Per-core licensing is pretty normal and acceptable. And makes sense. I've seen worse.

For example: there's a database called Vertica (it's a columnar engine, designed for fast queries, BI, modelling etc).
You pay for data limit.

Vertica has very few data types and it's hard to optimize data usage (I imagine: not by coincidence :) ).
Best example: integers. On most databases you have many variants.
SQL Server provides four: 1, 2, 4 and 8 bytes.
Vertica provides just one: 8 bytes.

Possibly water. It's been used in servers for a while. But high-airflow fans can do miracles too.
From what I've seen at work, when buying the OS, it's what's physically in the server rather than what you set it to use but I could be wrong? Either way I wouldn't like the bill for this one!! Linux for me....
#29
notb
phill: From what I've seen at work, when buying the OS, it's what's physically in the server rather than what you set it to use but I could be wrong? Either way I wouldn't like the bill for this one!! Linux for me....
This varies. Windows always has to be licensed based on physical cores on the server (but you can run multiple instances).

However, other software with core-based licenses (like databases) is priced according to the cores it can use. So if you install SQL Server on a VM with 4 cores, you pay for 4 cores (regardless of how many physical cores are in the server/cluster).

Hypervisors are usually licensed per socket.

BTW:
Enterprise Linux distros aren't free. Red Hat with 24/7 support costs $1300 per socket pair (per year).
Similar license for Windows (Server Standard) costs $972 per 16 cores (per year).
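
To see how the two models diverge on a high-core-count box, here is a back-of-the-envelope sketch that uses only the prices quoted above. The function names are hypothetical, and the rounding rules (whole 16-core blocks, whole socket pairs, each package counted as one socket) are simplifying assumptions rather than official licensing terms.

```python
import math

WINDOWS_PRICE_PER_16_CORES = 972    # USD, figure quoted in the post above
RHEL_PRICE_PER_SOCKET_PAIR = 1300   # USD per year, figure quoted in the post above


def windows_cost(physical_cores: int) -> int:
    """Cost when every physical core is licensed in 16-core blocks (assumed)."""
    blocks = math.ceil(physical_cores / 16)
    return blocks * WINDOWS_PRICE_PER_16_CORES


def rhel_cost(sockets: int) -> int:
    """Cost when licensing is per socket pair; a lone socket still needs one pair (assumed)."""
    pairs = math.ceil(sockets / 2)
    return pairs * RHEL_PRICE_PER_SOCKET_PAIR


# A two-package, 112-core server like the one in the article
print(windows_cost(112), rhel_cost(2))
```

Under these assumptions the core-based licence scales with the 112 cores (seven 16-core blocks), while the socket-based one stays at a single pair.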
#30
Vya Domus
Mad gluing skills, but they're too late, their competition has mastered this skill already.
#31
Vlada011
I hope Intel will give us some bigger improvement with new Sunny Cove cores.
#32
Wavetrex
Wait, is it still 1st of April ?
#33
Darmok N Jalad
Being an Intel TDP value, won’t 400W be the starting point for all core at base 2.6GHz clocks? Boosting to 3.8GHz with even some cores will set the TDP to ludicrous.
#34
Brusfantomet
notb: That's because you're using an extremely narrow definition of "a PC enthusiast".

Very unlikely.
When buying a server, you're paying for the machine and for a particular service that comes with it.
The fact that a CPU is cheaper doesn't mean e.g. Dell will sell you the whole package for less.
Just the fact that Intel has 20x larger market share means companies have larger stock of CPUs and other parts. The same SLA should cost less when going with Blue.

But even if there actually was a price difference, it's not exactly huge.
Let's assume every other part costs exactly the same and EPYC equivalent is $10k less per CPU in a 2P machine (because we can!).
Over 3 years you save $556 per month per server. Not much.
These $556 buy you a homogeneous architecture and simpler procedures/training inside the company.
Well, if you compare the HPE DL380 Gen10 (Intel) vs the HPE DL385 Gen10 (AMD), the comparison becomes quite clear: the price per server was nearly the same. Specifically, the CPUs were:
Xeon Gold 6130 16 core 2,1 GHz / All core boost 2,8 GHz
VS
EPYC 7401 24 core 2 GHz / All core boost 2,8 GHz

For only the CPUs the exact quoted price was 34 350 NOK for the Intel CPU, and 32 180 NOK for the AMD.

Total cost for the Intel server was 136 599 NOK, the AMD server was 140 804 NOK.

This is with the same amount of ram (8 sticks on the intel server, 16 on the AMD), same storage, PSU, raid controller and network card.

According to tests, the only case where a Gold 6130 beats a 7401 is when AVX is used, and that is before the Spectre and Meltdown patches.

Power consumption is favourable for the Gold 6130 vs the 7401, I think. Have no measurements for that. But both use the same PSU (800 W) and same cooling fans.

So the old EPYC platform was already competitive with the Xeons. Give Rome more AVX power and it's understandable that Intel fears AMD will give them trouble.
#35
notb
Vya Domus: Mad gluing skills, but they're too late, their competition has mastered this skill already.
You know that Intel mesh is better than IF at the moment, right?
We'll see if AMD improves it in Zen 2.
#36
R0H1T
No we don't, besides mesh is intra die only.
#38
Caring1
notb: You know that Intel mesh is better than IF at the moment, right?
Yep, according to Intel. ;)
#39
medi01
Haha, hilarious pricing:

notb: You know that Intel mesh is better than IF at the moment, right?
Yeah, sure.
Hold on, how do we know that?

PS
AMD stock gained 9%, and close to 30 bucks, lol:

www.nasdaq.com/symbol/amd/real-time
#40
TheGuruStud
Tomgang: So now Intel also makes glued-together CPUs :kookoo: and still on 14 nm. I am afraid to think of the TDP on this thing, or how low the core clock needs to be to stay within a TDP that doesn't need water cooling.

Intel really needs to get their next die shrink out fast, or else AMD might take the long straw this time.
Darmok N Jalad: Being an Intel TDP value, won’t 400W be the starting point for all core at base 2.6GHz clocks? Boosting to 3.8GHz with even some cores will set the TDP to ludicrous.
28 core uses 400 when boosting.... Yeah, this chip is on life support even before release.
#41
HwGeek
Why does the dual 28C 9282 have only 77 MB of cache while the 28C part has 39 MB?
P.S.: Do you think Intel can make Xeon 9282 parts in high volume, or is it mostly for show?
For AMD it's just a Lego game of binning 8 tiny chiplets.
#42
kapone32
If they had to use a whole home cooler to cool the 28 core CPU from last year's Computex they would need to use an entire HVAC system to cool this 56 core CPU.
#43
enxo218
Does it come with a custom LN2 cooler?!
#44
SoNic67
HwGeek: Why Dual 28C 9282 has only 77MB Cache while 28C part has 39MB Cache?
Rounding errors.
#45
Wavetrex
SoNic67: Rounding errors.
This CPU has 55.999994285729120736 cores.
#46
efikkan
There is a review over at Phoronix.
hat: I wonder if the fact that it's BGA means anything in this market. As a desktop enthusiast, the idea of a BGA chip is pretty horrific, but I don't suppose admins in charge of server farms are upgrading CPUs alone very often.
These CPUs will be used in purpose-built servers, probably with custom motherboards and cases. Machines like this are usually replaced when the warranty expires, if not before.

Sockets on the other hand make sense when you need to be able to configure a broad selection of configurations.
Noztra: Not really a bargain if the CPU costs 20-30K and uses 400 Watt. TCO would be fairly high compared to EPYC/ROME.
Most high-end servers are built for a specific purpose, and cost per core is probably the least relevant metric in such cases. Performance on specific server workloads varies a lot, and the difference between AMD and Intel is much larger here, and in some cases the performance difference can be 2-3×. Intel does have the upper hand in generic performance per core, but this is usually less relevant for servers. What matters here is performance for a specific task, and Intel performs well in many of these, and even Zen 2 will probably not threaten Intel's place in the enterprise market. It's the mainstream desktop that Intel needs to worry about.

Also, enterprise customers usually have various deals which give them huge discounts.
HwGeek: OMG - Look:

ark.intel.com/content/www/us/en/ark/products/192467/intel-xeon-platinum-8256-processor-16-5m-cache-3-80-ghz.html

7000$ for Quad Core XEON!.
:D
If your use case is heavily bandwidth-bottlenecked, then 8×6 channels of DDR4 2933 MHz is going to kick some serious butt…
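
For a rough idea of what that adds up to, the theoretical peak can be worked out as below. This is a sketch under stated assumptions: "8×6" is read as eight sockets with six channels each, DDR4-2933 is taken as 2933 MT/s on a 64-bit (8-byte) channel, and the function name is made up for illustration. Sustained bandwidth in practice will be lower than this ceiling.

```python
def peak_bandwidth_gb_s(sockets: int, channels_per_socket: int,
                        transfers_mt_s: float, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_second = (sockets * channels_per_socket
                        * transfers_mt_s * 1e6 * bus_bytes)
    return bytes_per_second / 1e9


per_channel = peak_bandwidth_gb_s(1, 1, 2933)    # one DDR4-2933 channel
eight_socket = peak_bandwidth_gb_s(8, 6, 2933)   # 8 sockets x 6 channels
print(f"{per_channel:.1f} GB/s per channel, "
      f"{eight_socket:.0f} GB/s across 8x6 channels")
```

That is about 23.5 GB/s per channel and a little over 1.1 TB/s in aggregate, shared by however few cores the chosen SKU has.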
#47
notb
R0H1T: No we don't, besides mesh is intra die only.
The post I quoted was about intra die realm ("glue").
Brusfantomet: Xeon Gold 6130 16 core 2,1 GHz / All core boost 2,8 GHz
VS
EPYC 7401 24 core 2 GHz / All core boost 2,8 GHz
That Xeon uses a lot less power. It's not a good comparison.
We should compare 7401 to Xeon 6150: 18 cores, 2.7/3.7GHz - it's way faster than 6130 and should match 7401 even in EPYC's best case scenarios.
Power consumption is favourable for the Gold 6130 vs the 7401, I think. Have no measurements for that. But both use the same PSU (800 W) and same cooling fans.
The exact same page tested EPYC 7351: www.servethehome.com/amd-epyc-7351p-single-socket-cpu-linux-benchmarks-and-review/
7351 is weaker than 7401.
#48
Vayra86
Tomgang: So now Intel also makes glued-together CPUs :kookoo: and still on 14 nm. I am afraid to think of the TDP on this thing, or how low the core clock needs to be to stay within a TDP that doesn't need water cooling.

Intel really needs to get their next die shrink out fast, or else AMD might take the long straw this time.
IMO AMD already has the long straw, Intel just tries really hard to hide that fact with a rain of press releases and powerpoint slides.

But the key is not having it, it's keeping it. Intel's dominance was a long term effort, AMD will need nothing less.
#49
hat
Enthusiast
notb: That's because you're using an extremely narrow definition of "a PC enthusiast".
I would imagine anyone calling themselves a PC enthusiast would at least be able to build one. But perhaps this is a matter of opinion?
notb: Very unlikely.
When buying a server, you're paying for the machine and for a particular service that comes with it.
The fact that a CPU is cheaper doesn't mean e.g. Dell will sell you the whole package for less.
Just the fact that Intel has 20x larger market share means companies have larger stock of CPUs and other parts. The same SLA should cost less when going with Blue.

But even if there actually was a price difference, it's not exactly huge.
Let's assume every other part costs exactly the same and EPYC equivalent is $10k less per CPU in a 2P machine (because we can!).
Over 3 years you save $556 per month per server. Not much.
These $556 buy you a homogeneous architecture and simpler procedures/training inside the company.
$556/mo over 3 years per server seems like a decent amount to me, especially when you have a lot of servers. This architecture is anything but homogeneous, though? Intel is falling back on the same old "glue" trick they criticized AMD for. It's not one big monolithic die. I'm also failing to see why anything would be any simpler with one chip maker vs the other. Whether you buy AMD or Intel, you're buying essentially the same thing. They're both processors that perform the same functions.
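
For reference, the $556 figure falls straight out of the assumptions in the quoted post ($10k saved per CPU, two CPUs per server, spread over 36 months). A minimal sketch of that arithmetic follows; the function name and the 100-server fleet are hypothetical, picked only to show how the number scales.

```python
def monthly_saving(saving_per_cpu: float, cpus_per_server: int, years: int) -> float:
    """Spread a one-off CPU price difference evenly over the server's lifetime."""
    return saving_per_cpu * cpus_per_server / (years * 12)


per_server = monthly_saving(10_000, 2, 3)   # figures from the quoted post
fleet = per_server * 100                    # hypothetical fleet of 100 servers
print(round(per_server), round(fleet))
```

So $20k over 36 months is roughly $556 per month per server, and the total only looks large once it is multiplied across many machines.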
#50
notb
hat: I would imagine anyone calling themselves a PC enthusiast would at least be able to build one. But perhaps this is a matter of opinion?
Do you also expect a car enthusiast to be able to build one?
Should a book enthusiast be able to write one?
Is a food enthusiast always a great cook?

What makes computers so special that for so many the word "enthusiast" means "assembling" and not using?
$556/mo over 3 years per server seems like a decent amount to me, especially when you have a lot of servers.
That's nothing if it forces you to hire one more person to tune and optimize it.
And honestly, $10-20k difference on CPUs per server that costs 10x as much... Not a big deal.

And as I said earlier: despite EPYC CPUs being cheaper, otherwise identical servers could cost the same. And someone already gave an example which supports this.
This architecture is anything but homogeneous, though? Intel is falling back on the same old "glue" trick they criticized AMD for. It's not one big monolothic die.
And here we're back to the hardware side.
No offense, but who cares? Surely not the people buying these servers.
By "homogeneous architecture" I meant that the servers behave similarly, so moving systems between them is fast and cheap.
Whether you buy AMD or Intel, you're buying essentially the same thing. They're both processors that perform the same functions.
By all means, no. These processors behave differently. You move a system from one Intel server to another and it works more or less the same.
You move a system from an Intel server to an AMD server and it's a lottery.

You just can't look at these CPUs and say "AMD CPU costs $10k less, so it saves money". $10k is nothing in the scale we're talking about.
It could cost $100k to train people and tune systems for a different architecture. And then it could cost millions if the system doesn't work as you wanted.
That's why enterprises will go with the less risky Blue option. Because $10k is nothing, but a bad server is a big problem.