Tuesday, April 2nd 2019
Intel Unleashes 56-core Xeon "Cascade Lake" Processor to Preempt 64-core EPYC
Intel late Tuesday made a boatload of enterprise-relevant product announcements, including the all-important update to its Xeon Scalable enterprise processor product stack, with the addition of the new 56-core Xeon Scalable "Cascade Lake" processor. This chip is believed to be Intel's first response to the upcoming AMD 7 nm EPYC "Rome" processor with 64 cores and a monolithic memory interface. The 56-core "Cascade Lake" is a multi-chip module (MCM) of two 28-core dies, each with a 6-channel DDR4 memory interface, totaling 12 channels for the package. Each of the two 28-core dies is built on the existing 14 nm++ silicon fabrication process, and the IPC of the 56 cores is largely unchanged since "Skylake." Intel, however, has added several HPC- and AI-relevant instruction sets.
To begin with, Intel introduced DL Boost, which could be a fixed-function hardware matrix multiplier that accelerates the building and training of AI deep-learning neural networks. Next up are hardware mitigations against several speculative-execution CPU security vulnerabilities that have haunted the computing world since early 2018, including certain variants of "Spectre" and "Meltdown." A hardware fix presents a smaller performance impact than a software fix in the form of a firmware patch. Intel has also added support for Optane Persistent Memory, the company's grand vision for what succeeds volatile primary memory such as DRAM. Currently slower than DRAM but faster than SSDs, Optane Persistent Memory is non-volatile, and its contents can be made to survive power outages. This allows sysadmins to power down entire servers to scale down with workloads, without worrying about long wait times to restore uptime when waking those servers back up. Newly supported CPU instruction sets include AVX-512 and AES-NI. Intel Speed Select is a fresh spin on a neglected feature most processors have had for decades, allowing administrators to select specific multipliers for CPU cores on the fly, remotely. Not too different from this is Resource Director Technology, which gives you more fine-grained QoS (quality of service) options for specific cores, PIDs, virtual machines, and so on.
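In practice, Cascade Lake's DL Boost is exposed through the AVX-512 VNNI instructions, most notably VPDPBUSD, which fuses an 8-bit multiply-accumulate chain into one instruction. As a rough sketch (not Intel's implementation — just the arithmetic one 32-bit lane performs), it multiplies four unsigned 8-bit values by four signed 8-bit values and adds the products to a 32-bit accumulator:

```python
# Sketch of a single 32-bit lane of AVX-512 VNNI's VPDPBUSD:
# four u8 activations times four s8 weights, accumulated into an int32.
def vpdpbusd_lane(acc: int, activations: list, weights: list) -> int:
    assert len(activations) == len(weights) == 4
    assert all(0 <= x <= 255 for x in activations)    # unsigned 8-bit
    assert all(-128 <= w <= 127 for w in weights)     # signed 8-bit
    return acc + sum(a * w for a, w in zip(activations, weights))

# One lane: 10 + (1*5 + 2*-6 + 3*7 + 4*-8) = 10 - 18 = -8
print(vpdpbusd_lane(10, [1, 2, 3, 4], [5, -6, 7, -8]))  # -8
```

A 512-bit register holds 16 such lanes, which is where the inference speed-up for quantized INT8 neural networks comes from.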
Unlike previous models of Xeon Scalable, the first Xeon Scalable "Cascade Lake" processor, the Xeon Platinum 9200, comes in an FC-BGA package and is not socketed. This 5,903-pin BGA package uses a common integrated heatspreader covering the two 28-core dies underneath. The two dies talk to each other over a UPI x20 interconnect link on-package, while each die exposes its second UPI x20 link as one of the package's two external x20 links, to scale up to two packages on a single board (112 cores).
Source:
HotHardware
88 Comments on Intel Unleashes 56-core Xeon "Cascade Lake" Processor to Preempt 64-core EPYC
But, surely the price tag means something, to a lot of people. Lots of businesses like to maximize profit, and if one way they can do that is to reduce hardware costs, why not go for it? What if you're running a bunch of old Sandy Bridge servers and it's time for an upgrade? You say $10k means nothing on this scale, but I think the bigger the scale, the more it would matter, no? What if you save $10k per server and you upgrade 20 servers, or more?
A similar increase in CPU price for EPYC gives you a 7501: 32 cores at 2.2 GHz base / 2.7 GHz all-core boost, and now both CPUs are close in power usage (165 W for the 6150 and 170 W for the 7501).
Also, the Gold 6150 is 18 cores at 2.7 GHz base and 3.4 GHz all-core boost; single-core boost is of less importance when using many-core server CPUs.
I mean: why would single-core performance not be important in servers? It is important in PCs. Nothing changes. It's still a computer, it can be used for the same tasks.
Some clients will prefer low core count and high clocks, some the opposite.
That's why there are so many variants of server CPUs - a client can get one that's best for his needs. The choice already got halved since I built my first custom PC 20 years ago.
We have 5-6 big consumer motherboard makers today (not including OEMs and e.g. Supermicro).
They make very similar products, using the same parts, at very similar price point. In many cases the key differentiating factor is color scheme or LEDs (both between companies and in one's lineup).
10-20 years ago we had more chipset manufacturers (Nvidia!), we had IGPs on mobo, we had more interfaces, we had more suppliers of network adapters and controllers.
There was an actual choice - it affected performance and features.
And if BGA was a thing in Intel desktop CPUs, I'd expect Intel to provide motherboards by default. And that would be fine, because their mobos were among the best before they left this market. And NUCs are just mind-blowingly good.
I mean I am expecting BGA to become the norm in desktop space as well, sooner or later. But not for a while yet.
Take the two servers I have mentioned here: the CPUs are roughly 25% of the price, so if the CPU price is doubled, then the price of the server increases by 25%. That is not a trivial price hike. If the use of the server will benefit from the extra computing power, the expensive CPU will be worth it.
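The arithmetic behind that claim is easy to check with made-up round numbers (the $40k server price below is purely illustrative):

```python
# If the CPUs are ~25% of the server price and the CPU price doubles,
# only that 25% slice doubles, so the whole server costs ~25% more.
server_price = 40_000                 # illustrative total
cpu_share = 0.25                      # CPUs as a fraction of the total
cpu_cost = server_price * cpu_share   # 10,000

new_total = (server_price - cpu_cost) + 2 * cpu_cost  # 50,000
increase = (new_total - server_price) / server_price

print(f"new total: ${new_total:,}, increase: {increase:.0%}")  # $50,000, 25%
```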
If you are buying a server with 16, 18, 24 or 32 cores per CPU, it should be because you need the number of cores; otherwise both Intel and AMD have CPUs with lower core counts but higher frequencies, usually for a better price than the ones with the maximum number of cores.
I can't get over how much they are asking for these CPUs...
Price per CPU, price per core or even CPU cores per socket is not really relevant. No one buying high-end servers compares an X-core server from AMD vs. an X-core server from Intel; instead they compare specific performance according to their constraints (price, thermals, server count etc.). And those who are running heavy simulations usually utilize custom software, which obviously normally uses AVX (or GPUs), and as you should know, Intel is really crushing it when it comes to raw AVX performance. Even with Zen 2, which is expected to be on par with Skylake in AVX2 performance, Skylake-SP/Cascade Lake-SP still have the upper hand with AVX-512, which offers over twice the peak throughput of AVX2. There are basically no benchmarks of software using AVX-512 yet, but if you're running custom software, then converting existing algorithms from AVX(2) is a trivial task and will yield a massive "free" performance upgrade.
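The "twice the peak throughput" figure follows directly from vector width. A rough per-core, per-cycle comparison (assuming two FMA ports, as on the higher-end Skylake-SP parts — lower SKUs have only one 512-bit FMA unit):

```python
# Theoretical fp32 FLOPs per core per cycle:
# AVX2:    256-bit = 8 fp32 lanes;  AVX-512: 512-bit = 16 fp32 lanes.
# An FMA counts as 2 FLOPs (multiply + add).
def peak_flops_per_cycle(simd_bits: int, fma_ports: int, dtype_bits: int = 32) -> int:
    lanes = simd_bits // dtype_bits
    return lanes * 2 * fma_ports

print(peak_flops_per_cycle(256, 2))  # 32 - AVX2 with two FMA ports
print(peak_flops_per_cycle(512, 2))  # 64 - AVX-512, twice the AVX2 peak
```

Real workloads land well below these peaks (memory bandwidth, AVX-512 clock offsets), but the 2x ratio between the ISAs holds at the ALU level.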
It's easy to understand the confusion. This site is not geared for that kind of usage, and "server" generally means "host Minecraft". When you have big-boy Nutanix nodes and Ceph clusters and hundreds of gigs or more of RAM, I can tell you I stopped caring how much the CPU for the server cost a long, long time ago.
EDIT: It's also hard to believe, when you start getting into the HA servers that drive all this technology, and the log servers that consolidate and display/chart/graph all the various sensors of all these network/server/data appliances, plus the software that runs on them, that in some cases these types of CPUs are, while new, still only barely enough to keep up with the environment they are put in.
And it's not like you have to look for exotic software. Here we are, discussing whether $20000 for a server CPU is a lot, when even a stupid SQL Server costs $14000 per core. Exactly. If this site is meant to cover "pro" stuff as well (and we're getting a lot of news about servers since EPYC arrived), there should be some effort to educate the community.
Why can't TPU just leave the subject alone and remain focused on gaming?
I mean: software discussions are limited to games and benchmarks, while in the hardware part there's hardly anything about laptops, about Macs, about ARM systems (phones).
Why force enterprise topics? It makes no sense.
www.microsoft.com/en-us/sql-server/sql-server-2017-pricing
But yeah, for small teams 4-16 cores will be enough easily - sometimes even better.
Databases have intrinsic issues with parallel operation, so it's usually a good idea to limit threads anyway (e.g. to 4 per user).
Here I just wanted to provide a context. We've spent 3 pages discussing whether spending $50k on a Xeon server is a good idea if *theoretically* an EPYC one could cost $40k.
It's worth understanding that if the client buys this to run a big database (which is by far the most popular scenario), he'll pay over $500k for the db license.
Also, let's be honest, implementation of the system (including training) will consume hundreds of thousands as well.
Could you give any examples of the sort of work these servers do please?
I guess the long and short of it is, if you are doing a job, you have to pay whatever you have to :)
Building a PC is akin to "replacing wheels" on a car, something definitely within "enthusiast" level. That's the most expensive version of SQL Server there is: the Enterprise edition. (I doubt any enterprise actually pays that much.)
Standard is about $3k.
Basic is $900.
Lovely attempt though. Do you need more time to somehow support this claim, or shall we pretend you never said it?
What would you call someone who really likes driving?
He doesn't have to be very good at it. He just really likes it.
So after work some people read books, some delid CPUs. He gets into the car and drives around for 4 hours. These licenses have different use cases. Standard can be used up to 24 cores. Look for Xeon vs EPYC database performance tests.
For fun. Because he likes driving so much, but he can't fix basic stuff in cars, because he loves driving, but not cars.
Yeah, well, ok. If such people exist I'd call them rather unusual. 24+ cores is an extreme scenario that is very rarely used.
I get what you are hinting at, someone vertically scaling a server, well... Perhaps.
I'd think this was targeted at companies that are in cloud business.
But I do feel the annoyance of the ignorance displayed among forum members every time a Quadro card or Xeon CPU is released. Clearly you aren't familiar with enterprise software and hardware. Prices are high, but usually there are support and service included.
But if this is a shock to you, let's hope you never get to see the price of (mostly crappy) consultant-made custom software, you might suffer a heart attack…
I mean: they can't afford the compromises AMD made to push so many cores at a low price point.
Also, even if Cascade Lake has problems in OLTP similar to those of EPYC, you can still get a "glue-less" Xeon with 28 cores, whereas the whole AMD lineup shares the same architecture. You call having pleasure from driving unusual? Seriously?
OK. So what about someone who likes the way cars look? There are countless people fascinated by car design. They take photos, they make 3D models etc.
What are they? By your definition they aren't car enthusiast as well, right?
You see where I'm going with this?
car enthusiasts
|--- car driving enthusiasts
|--- car mechanics enthusiasts
|--- car design enthusiasts
|--- car servicing enthusiasts
etc
computer enthusiasts
|--- computer building enthusiasts
|--- computer programming enthusiasts
|--- computer looks enthusiasts (yes, it's a thing!)
etc
I don't know why this is happening, but many people - like you - define an enthusiast of something as someone who likes physical tinkering, not using.
The reason cars exist is driving. The reason why computers exist is running software.
A system analyst or a programmer is just "a user" for you, but someone who can put a graphics card into a slot is "an enthusiast"... I find this bizarre.
Honestly, it seems superficial and really unfair for the whole phenomenon we call "computing". :-/ What...? Sure, it's great value for large datacenters. But it's also a nice product for normal companies that simply need a server for their work.
Anyway, the big selling point of Cascade Lake servers is Optane support. This is the feature that will push sales, not the core count. :-)
The other software is generally virtualization; licensing for virtual machines can be separated into per-core/node and is in itself expensive, and of course that doesn't include the cost of licensing the operating systems you install in this environment.
Additionally, you may also have virtualized appliances like Sophos, Palo Alto etc. for routing and switching. Big-boy routers have many different types of licensing depending on the manufacturer, and they can be split up all kinds of ways. Some are done by port/features (VPN etc.) or speed (allowed to negotiate past 1 Gb/10 Gb/40/100 Gb?), and often a collection of all of them. Lest we forget, like everything else, if you want to license your network equipment so it functions, that requires a separate support contract. Big-boy iron switches run tens of thousands of dollars for the equipment before licensing and support.
In many cases a very large or small complex (think science lab) build-out costs hundreds of thousands of dollars to several million, so $20k or so for a server is nothing. Not to mention we are talking enterprise-grade HDDs or SSD/NVMe drives; think $700+ per drive.
There will always be variations. Not all ENT switches are $20k, not all routers are 8 grand, not all devices need you to renew a support license, but the idea is that in this field the numbers are very big all the time. Think of your wallet having 1s, 5s and 10s in it; that's pretty normal and you are used to seeing it. In ENT network/systems design the wallets are 10,000 / 500,000 / 1,000,000; it's just the norm. The cost of playing in the field.
As for the convo itself, and not singling you out @phill, just for the record: let's make sure we are keeping the convo cool. I don't want people arguing and getting pissy. I encourage the open discussion of these types of systems, but everyone needs to play nice from both sides of the fence.
With all due respect, I do believe a big part of this community is limited to watercooler RGB gaming desktops.
Even if you look at this forum's structure it has to be one of the most focused among mainstream PC tech websites.
However, the forum (community) and the website are two separate things. News is meant to attract people from outside the "hardcore circle". Maybe it does, I don't know.
For me that content is very shallow, with hardly any technical, interesting stuff. I only come to TPU for gaming GPU reviews - I find them very clean and well organized.
But since Ryzen came out TPU decided to cover CPUs more. And, since Ryzen relatively sucked at gaming, they included that "productivity" part, which is really bad - clearly out of reviewers' comfort zone.
Again: why do it? Does AMD expect this (they provide the CPUs)? I'd be fine with that, but maybe they should also give some guideline.
Wouldn't it be nice if we got a "how stuff works" article once in a while? Or how to do something useful using a computer?
I recall only one such "feature series" during the last few years: the series of texts about cryptocurrencies. It still makes me sad.