Monday, January 13th 2020

Core i9-10990XE 22-core Processor Last Gasp of the X299 Platform?
Way back in June 2018, when the first Threadrippers made landfall, it was reported that Intel was working on a new 22-core "Skylake-X" silicon that sat in between the 18-core HCC (high core-count) die, and the 28-core XCC (extreme core-count) die. The roughly 700 mm² XCC die, with its 6 memory channels, couldn't be integrated with the LGA2066 package, and was reserved for the enterprise LGA3647 package that made a workstation/quasi-client debut with the 28-core Xeon W-3175X. It was hence rumored that an in-between 22-core silicon was under development that could be integrated with LGA2066. Fast forward to 2020, and Intel's client HEDT processor lineup doesn't look much different from its 2017 one. The 18-core i9-10980XE leads the pack, and despite its $1,000 price, has received largely lukewarm reviews. If screenshots surfacing on Chinese tech forums are to be believed, Intel is toying with the idea of the 22-core die meant for LGA2066 once again.
Referenced as Core i9-10990XE in straight-up CPU-Z screenshots, the processor is based on the "Cascade Lake-X" microarchitecture and, judging by the instruction sets listed, has the same feature set as the i9-10980XE. It has 22 cores, and HyperThreading enables 44 threads. Cache hierarchy and balance are characteristic of "Cascade Lake," with 1 MB of dedicated L2 cache per core, and 30.25 MB of shared L3 cache. The I/O is likely identical to the i9-10980XE, as that's a function of the platform and the socket. More interesting are the clock speeds. The name-string of the engineering sample references a nominal clock speed of 4.00 GHz, and in the screenshot, the chip is shown running at 5.00 GHz (at least on one core). There's also a performance benchmark to go with the leak, possibly CineBench R20 nT. Here, the i9-10990XE is shown scoring 14,005 points, which is in the same ballpark as the 24-core Ryzen Threadripper 3960X.
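The thread and cache figures in the screenshot follow from simple per-core arithmetic. Here is a minimal Python sketch, assuming an L3 slice of 1.375 MB per core (the figure implied by the 18-core i9-10980XE's 24.75 MB; the leak itself does not state a per-core size). The function name is purely illustrative.

```python
# Per-core figures for Skylake-X / Cascade Lake-X style dies.
L2_MB_PER_CORE = 1.0      # stated in the article
L3_MB_PER_CORE = 1.375    # assumption: 24.75 MB / 18 cores on the i9-10980XE
THREADS_PER_CORE = 2      # HyperThreading


def die_totals(cores):
    """Return (threads, total L2 in MB, total L3 in MB) for a core count."""
    return (
        cores * THREADS_PER_CORE,
        cores * L2_MB_PER_CORE,
        cores * L3_MB_PER_CORE,
    )


for name, cores in [("i9-10980XE", 18), ("i9-10990XE (leaked)", 22)]:
    threads, l2_mb, l3_mb = die_totals(cores)
    print(f"{name}: {threads} threads, {l2_mb:.2f} MB L2, {l3_mb:.2f} MB L3")
```

With 22 cores this reproduces the 44 threads and 30.25 MB of L3 shown in CPU-Z, which is consistent with the chip being a straightforward extension of the existing design.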
Sources:
ChipHell, ChipHell (2)
42 Comments on Core i9-10990XE 22-core Processor Last Gasp of the X299 Platform?
So, I ask you, how would you validate the existence of this (still) hypothetical CPU? What exactly does this product bring to the table that's new and exciting?
Let's face it, we can beat around the bush all day, it won't change the fact that Intel simply got obliterated this round in HEDT. Usually I try to avoid expressions like that, but there's really no other way to put it. It's like 6950X vs FX-9590 a few years back, except the roles are reversed and the gap is maybe even worse. It's not about your favorite team, it's an objective observation.
So it's not that workloads don't scale, it's that scaling in most workloads only goes so far. At that point, performance is derived from the combination of IPC and clock speed. It's the balanced breakfast of CPU performance scaling.
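One common way to put numbers on "scaling only goes so far" is Amdahl's law, which the comment does not name. The sketch below is only an illustration: the parallel fractions are made up, not measured from any real workload.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup over one core when only part of the work
    can be spread across cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical workloads: 50%, 90% and 99% of the runtime parallelizes.
for p in (0.50, 0.90, 0.99):
    s18 = amdahl_speedup(p, 18)
    s22 = amdahl_speedup(p, 22)
    print(f"parallel={p:.2f}: 18 cores -> {s18:.2f}x, 22 cores -> {s22:.2f}x")
```

The smaller the parallel fraction, the less the extra four cores add, which is where per-core speed (IPC times clock) starts to dominate.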
No consumer is spending 4k on a 64 core threadripper.
Just wanted to add that correction.
As far as per-core throughput goes, Zen 2 most definitely matches these mesh Intel parts. The ring parts still have better memory latency, so they tend to pull ahead in some stuff, and they can be clocked faster... That said, the diminishing returns you get as far as power consumption goes make the argument for pushing those clocks for "workloads that don't scale well past 10 cores" really marginal...
The only real advantage 2066 has at this point is the wide PCIe interface. The 20 PCIe 4.0 lanes of AM4 come pretty close in overall bandwidth, but 4.0 devices are still too sparse on the market to make use of that... Intel needs to push down the price of this platform more for it to be attractive. I'm talking on the order of two-thirds of the current prices for the motherboards, and a prospective 22-core part well under $900... Like 1151, it's a really dead-end platform and will be well outclassed by its AMD counterparts in 2-3 years' time.
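The bandwidth comparison above can be roughly checked with per-lane arithmetic. This sketch assumes 48 CPU PCIe 3.0 lanes on the LGA2066 side (the i9-10980XE's lane count, which the comment does not state) and counts only usable one-direction bandwidth after 128b/130b encoding.

```python
def link_gb_per_s(lanes, transfer_rate_gt_s):
    """One-direction bandwidth in GB/s for a PCIe 3.0 or 4.0 link.
    Both generations use 128b/130b encoding; 8 bits per byte."""
    per_lane = transfer_rate_gt_s * (128 / 130) / 8
    return lanes * per_lane


am4 = link_gb_per_s(lanes=20, transfer_rate_gt_s=16.0)      # PCIe 4.0
lga2066 = link_gb_per_s(lanes=48, transfer_rate_gt_s=8.0)   # PCIe 3.0, assumed 48 lanes

print(f"AM4, 20 lanes of PCIe 4.0:     {am4:.1f} GB/s")
print(f"LGA2066, 48 lanes of PCIe 3.0: {lga2066:.1f} GB/s")
print(f"AM4 as a share of LGA2066:     {am4 / lga2066:.0%}")
```

Under these assumptions AM4 lands within about a fifth of the LGA2066 total, but only if the attached devices actually run at PCIe 4.0 speeds.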
Take the i9-10980XE:
18 cores
3GHz base clock
4.6GHz Boost
4.8GHz Turbo Boost Max 3.0
Compare with a hypothetical 10990xe:
22 cores
4GHz base clock
4.xGHz Boost
5.0GHz Turbo Boost Max 3.0
Which of these chips is faster?
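As a very rough back-of-the-envelope check, the two spec lists can be compared by multiplying cores by base clock. This is a naive sketch: it assumes equal IPC (both chips share a microarchitecture), perfect scaling across cores, and that the base clock is what all cores sustain, none of which is guaranteed. The 18-core entry uses the specs listed above, which match the i9-10980XE.

```python
chips = {
    "18-core chip (specs match i9-10980XE)": {"cores": 18, "base_ghz": 3.0},
    "i9-10990XE (hypothetical)": {"cores": 22, "base_ghz": 4.0},
}


def aggregate_core_ghz(spec):
    """Naive throughput proxy: number of cores times base clock in GHz."""
    return spec["cores"] * spec["base_ghz"]


scores = {name: aggregate_core_ghz(spec) for name, spec in chips.items()}
for name, score in scores.items():
    print(f"{name}: {score:.0f} core-GHz")
```

On this proxy the hypothetical chip comes out well ahead, though real results would depend on power limits and how well the workload scales.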
As for the comparisons with Zen 2 uarch, some workloads still run better on Intel. Adobe products are a good example of this.
For me, I find Intel chips to be faster for my video editing workflow, thanks to higher clock speeds and AVX 512. Personally, I am holding out for Zen 3 before I make a final decision on my next workstation chip. Rumor has it AMD will finally introduce AVX 512 support with Zen 3.
Intel no longer has an advantage with AVX 512, not until they get more cores. I'm also yet to find out how much of the commercial software out there uses that; it's not very practical to implement and comes with some other issues. Are they really?
I'm being kind of pesky, I know, but that's the reality. No matter how you spin it, Intel lost pretty much every big advantage they had. I really hope they don't. That brings more problems than improvements. They've explicitly stated their dislike for larger SIMD, and for good reason: it doesn't have a place in a world where GPUs can do the same things orders of magnitude faster and more easily. If they do support it, I hope they do it by fusing 2x 256-bit instructions; it's just not worth ruining their power envelope with the number of cores that they have.
Let's say this 10990XE costs $1,300; I'm paying more than $300 extra for 4 more cores.
And he didn't say what his workflow is. And not everything is about rendering.
- For Adobe Premiere Pro and After Effects, the following CPUs are our recommendations depending on your budget:
- AMD Ryzen 7 3800X ($399)
- Intel Core i9 9900K ($499)
- Intel Core i9 10920X ($689)
- AMD Ryzen 9 3950X ($749)
- Intel Core i9 10940X ($784)
- Intel Core i9 10980XE ($979)
- AMD Threadripper 3960X ($1,399)
Source: www.pugetsystems.com/labs/articles/What-is-the-Best-CPU-for-Video-Editing-2019-1633/#PremiereProCPUPerformance
The Threadripper performs well in Premiere and After Effects, but the price difference isn't worth it. It's like 5% more performance for a $400 difference.
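The price argument can be made concrete with the list prices quoted above. This sketch assumes the comparison is between the Threadripper 3960X and the Core i9 10980XE (the comment does not say which chip it has in mind) and takes the roughly 5% figure at face value; it is the commenter's estimate, not a measured result.

```python
# List prices quoted in the recommendation list above (USD).
prices_usd = {
    "Core i9 10980XE": 979,
    "Threadripper 3960X": 1399,
}

claimed_gain_percent = 5.0  # the commenter's rough estimate, not a benchmark

gap_usd = prices_usd["Threadripper 3960X"] - prices_usd["Core i9 10980XE"]
usd_per_percent = gap_usd / claimed_gain_percent
price_increase_percent = 100 * gap_usd / prices_usd["Core i9 10980XE"]

print(f"Price gap: ${gap_usd}")
print(f"Cost per claimed percentage point: ${usd_per_percent:.0f}")
print(f"Price increase: {price_increase_percent:.1f}%")
```

If the 5% estimate holds, the extra spend buys far less performance than it costs in relative terms, which is the point the comment is making.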
Are more cores better? Sure, if you can take advantage of them. Most workloads simply can't. Intel still leads in workloads that only scale to low or moderate thread counts, again, thanks to IPC and clock speed.
I'm not saying this hypothetical CPU is the perfect CPU for every workload, but it would bring more performance to me. I hope they introduce AVX512 support, it's pretty much the only thing keeping me from switching at this point. They likely don't need to increase AVX unit width to 512-bits natively, I would be fine with a fused approach.
We're scraping the bottom of the barrel here. It doesn't assume anything about that; both Zen 2 and Cascade Lake drop in clock speed as soon as there is a vector load, Zen 2 much less so. Taking that into account, a 3990X would probably end up being a fair bit quicker than a 10990XE under vector loads.
Listen, you don't have to believe me that AVX512 isn't as amazing as you think it is:
That's pretty much the most ideal case for AVX 512 and two 7742 with just 256 bit AVX are just as fast as two 8280s.
Those are server CPUs but there is no reason to believe it's any different on desktops. And keep in mind those are 28 core Xeons, not 22 where it would have been faster. It's not about sustained peak, it's just how the math adds up. You wouldn't touch any of these CPUs to run workloads that don't. Come on, this argument is ... uninspired, to put it in a more elegant way. No, whoever buys these has a clear purpose for them in mind where they actually make sense, because they're made exactly for when the workloads scale well.
I've always found this argument to be exceedingly bizarre. Clearly neither AMD nor Intel actually believes in what you're saying, otherwise they wouldn't put dozens of cores in their CPUs and try to sell them for a premium. It's painfully obvious that core counts are the priority for both.
The advantage xLake has is that its cores have better throughput in certain programs which weren't yet optimised for Zen 2 cores; my understanding is that in some cases updates to the software have changed that.
The real advantage xLake has lies in the lower memory latency, so anything with unpredictable memory accesses will be held up less by waiting on memory. This can also interact back with the optimisation and how the programs play with the OoO schedulers in the different architectures, because the penalty for bad branch predicts becomes worse for distant memory accesses.
Anyone building with these should generally be earning income with these machines, not playing games. My gut says it will be implemented the same way AVX2 was in Ryzen one.
So Zen3 will process AVX512 in 256x2.
And they will wait for 5nm to do AVX512 in a single clock cycle.