Saturday, November 17th 2018
Intel Could Upstage EPYC "Rome" Launch with "Cascade Lake" Before Year-end
Intel is reportedly working tirelessly to launch its "Cascade Lake" Xeon Scalable 48-core enterprise processor before year-end, according to a launch window timeline slide leaked by datacenter hardware provider QCT. The slide suggests a late-Q4 2018 through Q1 2019 launch window for the XCC (extreme core count) version of "Cascade Lake," which packs 48 CPU cores across two dies on an MCM. This launch is part of QCT's "early shipment program," meaning select enterprise customers can obtain the hardware in pre-approved quantities. In other words, this is a limited launch, but one that's probably enough to upstage AMD's 7 nm EPYC "Rome" 64-core processor launch.
The Xeon "Cascade Lake" family would only see a substantial launch by late Q1 through Q2 2019, including lower core-count variants that are still 2-die MCMs. This timing is positioned to preempt or match AMD's 7 nm EPYC family rollout through 2019. "Cascade Lake" is probably Intel's final enterprise microarchitecture built on the 14 nm++ node, and consists of 2-die multi-chip modules featuring 48 cores, a 12-channel memory interface (6 channels per die), and 88 PCIe lanes from the CPU socket. The processor is capable of multi-socket configurations. It will also be the launch platform for Intel's Optane Persistent Memory product series.
Source:
Anandtech
67 Comments on Intel Could Upstage EPYC "Rome" Launch with "Cascade Lake" Before Year-end
In any case, this doesn't look like a good product launch... more like an attempt to bring in investors/appease shareholders.
AMD :toast: 7nm
AMD :slap: Intel
-----
Cascade Lake SP XCC means up to 28 cores; this die has been publicly known since summer last year. Intel's current roadmap shows Cooper Lake SP in late 2019 and Ice Lake SP in mid-2020. Cooper Lake SP will still be on 14 nm, but will share the LGA4189 socket with Ice Lake SP, featuring "architectural improvements" and 8 memory channels. So Epyc "Rome" will compete with Cascade Lake at the beginning of its product cycle and with Cooper Lake later on.
Epyc "Rome" may offer more and cheaper cores, but Intel still have faster cores. And when it comes to AVX workloads, which many enterprise workloads rely on, Intel will still have a 2× advantage over Epyc "Rome". I would expect AMD to get more of a foothold in the server market, but they don't yet have any product that will "crush" Intel's offering.
Don't forget that large shipments of Zen 2 are not right around the corner. While we could see something in Q1 2019, the transition to Zen 2 will be gradual for both consumer and enterprise markets. One telling sign is that AMD launched the Epyc 7000 (Zen 1) series just yesterday; it's not going to be replaced in 2-3 months. But we should look forward to Q3-Q4 2019; it will be the most interesting time for CPUs in decades: AMD will have Zen 2-based Threadripper, Ryzen and Epyc, Intel will have Ice Lake and Cooper Lake X/SP, and then we will finally start to see the effects of competition.
www.techpowerup.com/249450/amd-zen-2-ipc-29-percent-higher-than-zen#g249450-1
AMD Zen: 4x 128-bit FPU
AMD Zen 2: 4x 256-bit FPU (double FP width)
Unless Intel can do 4x 512-bit AVX, how could Intel maintain a 2x advantage in AVX workloads over Zen 2?
Skylake-X/SP has 2× 512-bit MUL, 2× 512-bit ADD and 2× 512-bit FMA. Uncertain execution-port details aside, that is still ~2× the maximum throughput.
Zen can do 2x 128-bit MUL and 2x 128-bit ADD, which can be fused into 2x 128-bit FMA:
www.anandtech.com/show/10591/amd-zen-microarchiture-part-2-extracting-instructionlevel-parallelism/4
The FP Unit uses four pipes rather than three on Excavator, and we are told that the latency in Zen is reduced as well for operations (though more information on this will come at a later date). We have two MUL and two ADD in the FP unit, capable of joining to form two 128-bit FMACs, but not one 256-bit AVX.
Doubling that would give 2x 256-bit FMA, or 2x 256-bit ADD + 2x 256-bit MUL.
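In plain numbers (assuming Zen 2 simply widens Zen's four FP pipes to 256-bit): 2 x 256-bit FMA = 2 × 4 doubles × 2 ops = 16 DP FLOPs per cycle, against Skylake-SP's 32, which is where the ~2× figure above comes from.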
www.anandtech.com/show/11544/intel-skylake-ep-vs-amd-epyc-7000-cpu-battle-of-the-decade/4
On the defensive and not afraid to speak their mind about the competition, Intel likes to emphasize that AMD's Zen core has only two 128-bit FMACs, while Intel's Skylake-SP has two 256-bit FMACs and one 512-bit FMAC. The latter is only useable with AVX-512. On paper at least, it would look like AMD is at a massive disadvantage, as each 256-bit AVX 2.0 instruction can process twice as much data compared to AMD's 128-bit units. Once you use AVX-512 bit, Intel can potentially offer 32 Double Precision floating operations, or 4 times AMD's peak.
As for Skylake-SP: it has 2x 512-bit FMA (ports 0+1 fused, plus port 5), or 2x 512-bit ADD/MUL.
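That lines up with the numbers in the quote: 2 x 512-bit FMA = 2 × 8 doubles × 2 ops = 32 DP FLOPs per cycle, versus Zen's 2 x 128-bit FMAC = 2 × 2 × 2 = 8, hence "4 times AMD's peak" (and roughly 2× if Zen 2 really widens to 256-bit).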
For servers running many largely independent threads, such as virtual machine hosts, AMD's parts should shine against Intel's.