
Intel Xeon Server Processor Shipments Fall to a 13-Year Low

I hope Intel continues to purposely disable AVX-512 instructions in its home user CPUs. For those who don't know, AVX-512 instructions have been present in all "Core ix" home CPUs since the 6th generation Skylake launched in 2015, but Intel chose to leave them disabled in home user CPUs (Core line) and keep them enabled only in its Xeon CPUs, because the Xeon CPUs are more expensive.

Intel's managers must think the world is still in the 80s and 90s... They will reap what they sow...
 
AVX-512 has been available on some consumer chips. (Ice Lake, Rocket Lake, Tiger Lake)
It was disabled in Alder Lake because the E-cores didn't have it, though some early chips on early BIOS versions can use it if the E-cores are turned off.
 
Intel MUST stop making home CPUs with E-cores because home applications don't use many cores.

Intel needs to realize that people don't want these nonsense E-cores. Dozens of cores are only useful for server applications, not for home PCs.

Much of the software is still poorly optimized for many cores, and even those that are optimized for multicore still overload 1 or 2 CPU cores. Therefore, Intel should abandon these nonsense/useless E-cores and put only P-cores along with 1 "Super-core" for every 3 P-cores.

This way:
qKB2HDi.png
 
He has a point. I tend to main servers that are all effectively retired home desktops or enthusiast builds mixed with enterprise equipment.
Some chips seem like exceptions to the usual home user build pattern but they definitely had their time.
Pentium 233, K6-2/300, Pentium 4, Athlon 2650e, Phenom II X4 955, FX-8370, Ryzen 5 3600...These are split into three categories:
Single core low clock (antique 32-bit), Single core high clock (aging 32/64-bit), Multicore high clock (64-bit modern standard).
Sooo all of these have a place in modern day computing. The antiques are great for running old games and apps at native speed/resolution.
The newer systems are still fine for anything but modern compute and gaming. They have been obsoleted by newer chips with weird technologies.

Tell me where an E-core sits in this formula. You can't. I'm not sold on any chips with this E-core technology as there's no measurable improvement.
E-cores typically do a great job with low power idle tasks which is where my servers tend to sit 98% of the time, standing by to stand by.
So why don't they? WRONG application. If there were something relevant to having these E-cores, that impact would be visible. It's not.
So I stick with the classic general purpose core count. None of this P-core E-core mix and match. Super core is a wild idea though. Pair those for hot dual cores.
 
Here are specs for Intel Xeon Phi Processor 7210 I was working with:


Intel Xeon Phi Processor 7210 ( 16GB, 1.30 GHz, 64 core )
Processor name: Intel(R) Xeon Phi(TM) 7210
Packages ( sockets ): 1
Cores: 64
Processors ( CPUs ): 256
Cores per package : 64
Threads per core: 4
Peak Processing Power: 2.662 TFLOPs
Note: Calculated as follows: 1.30 * 64 * ( 512-bit / 32-bit ) * 2 = 2662.4 GFLOPs for Single-Precision FPU data type
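The arithmetic in the note above can be reproduced with a small sketch. The function name and parameter names are made up for illustration, and the final factor of 2 is taken from the post as-is; the post does not say what it stands for, so it is treated here as an opaque multiplier.

```python
def peak_gflops(clock_ghz, cores, vector_bits, element_bits, ops_factor):
    """Peak GFLOPs = clock (GHz) * cores * SIMD lanes * ops_factor.

    ops_factor is the trailing "* 2" from the post; its meaning is an
    assumption left open here (hypothetical parameter name).
    """
    lanes = vector_bits // element_bits  # 512-bit / 32-bit = 16 lanes
    return clock_ghz * cores * lanes * ops_factor


# Values quoted in the post for the Xeon Phi 7210, single precision.
sp = peak_gflops(clock_ghz=1.30, cores=64, vector_bits=512,
                 element_bits=32, ops_factor=2)
print(f"{sp:.1f} GFLOPs")  # 2662.4 GFLOPs
```

Changing `element_bits` changes the lane count, which is the only place the data type enters this estimate.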

Another tech-mess was related to AVX-512, because of the fragmentation of the AVX-512 ISA.

It was hard to imagine that Intel did it!
...
Intel AVX-512 family of instructions:
- Intel AVX-512F Foundation instructions.
- Intel AVX-512CD Conflict Detection instructions.
- Intel AVX-512ER Exponential and Reciprocal instructions.
- Intel AVX-512PF Prefetch instructions.
- Intel AVX-512BW Integer operations on 8-bit and 16-bit operands.
- Intel AVX-512DQ Enhanced Integer and Floating-Point operations on 32-bit and 64-bit operands.
- Intel AVX-512VL Vector Length Extensions.

Intel Xeon Phi processor x200 products support:
- Intel AVX-512F Foundation instructions.
- Intel AVX-512CD Conflict Detection instructions.
- Intel AVX-512ER Exponential and Reciprocal instructions.
- Intel AVX-512PF Prefetch instructions.

Intel Xeon processors support:
- Intel AVX-512F Foundation instructions.
- Intel AVX-512CD Conflict Detection instructions.
- Intel AVX-512BW Integer operations on 8-bit and 16-bit operands.
- Intel AVX-512DQ Enhanced Integer and Floating-Point operations on 32-bit and 64-bit operands.
- Intel AVX-512VL Vector Length Extensions.
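To make the fragmentation in the lists above concrete, here is a minimal sketch that compares the two support lists as sets. The flag spellings follow the names Linux reports in /proc/cpuinfo; the flag strings are transcribed from the lists in this post, not read from real hardware, and the helper name is hypothetical.

```python
# Map Linux-style cpuinfo flag names to the subset suffixes used in the post.
AVX512_FLAGS = {
    "avx512f": "F", "avx512cd": "CD", "avx512er": "ER", "avx512pf": "PF",
    "avx512bw": "BW", "avx512dq": "DQ", "avx512vl": "VL",
}


def avx512_subsets(flags):
    """Return the AVX-512 subsets named in a whitespace-separated flag string."""
    return {AVX512_FLAGS[f] for f in flags.lower().split() if f in AVX512_FLAGS}


# Transcribed from the two lists above (assumption: no other subsets matter here).
XEON_PHI_X200 = avx512_subsets("avx512f avx512cd avx512er avx512pf")
XEON = avx512_subsets("avx512f avx512cd avx512bw avx512dq avx512vl")

common = XEON_PHI_X200 & XEON
print(sorted(common))                # ['CD', 'F']
print(sorted(XEON_PHI_X200 - XEON))  # ['ER', 'PF']
print(sorted(XEON - XEON_PHI_X200))  # ['BW', 'DQ', 'VL']
```

Only F and CD appear in both lists, so code meant to run on both families would have to stay within those two subsets or dispatch at run time.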
...


Update for a broken weblink to the processor specs:


Intel MUST stop making home CPUs with E-cores because home applications don't use many cores.
...

This is actually Not a problem of Intel. Why? Because software vendors of many home applications still do Not fully understand how to correctly use multithreading ( I'm serious about it! ) and how to properly set thread affinity of logical processors to distribute processing between processor cores.

Example 1: Correct processing on Red Hat Linux on a server with Intel Xeon Phi 64-Core processor ( 256 Logical Processors )

IntelKNLThreadsToPUsBindings.jpg


Example 2: Correct processing on Windows on a Mobile Workstation with Extreme Edition of 4-Core Intel CPU ( 8 Logical Processors )

OnWindows.Slide09.jpg


Example 3: Not Correct processing on Windows on a Mobile Workstation with Extreme Edition of 4-Core Intel CPU ( 8 Logical Processors )

OnWindows.Slide11.jpg
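The thread-to-logical-processor binding described in this post can be sketched roughly as follows. This is not the code behind the screenshots; the function names, the "compact"/"scatter" policy names, and the core-major CPU numbering (cpu = core * threads_per_core + slot) are all assumptions for illustration, and real systems may number logical processors differently. The pinning helper uses `os.sched_setaffinity`, which Python only provides on some platforms such as Linux.

```python
import os


def plan_bindings(n_threads, n_cores, threads_per_core, policy="scatter"):
    """Return a list mapping worker index -> logical CPU index.

    Assumes core-major numbering: cpu = core * threads_per_core + slot.
    "compact" fills every hardware thread of a core before moving on;
    "scatter" gives each worker its own core first, then reuses cores.
    """
    if n_threads > n_cores * threads_per_core:
        raise ValueError("more threads than logical CPUs")
    plan = []
    for i in range(n_threads):
        if policy == "compact":
            cpu = i
        elif policy == "scatter":
            core, slot = i % n_cores, i // n_cores
            cpu = core * threads_per_core + slot
        else:
            raise ValueError(f"unknown policy: {policy}")
        plan.append(cpu)
    return plan


def pin_to_cpu(cpu):
    """Pin the caller to one logical CPU where supported; return True on success."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # not available on this platform (e.g. Windows, macOS)
    if cpu not in os.sched_getaffinity(0):
        return False  # CPU not in the currently allowed set
    os.sched_setaffinity(0, {cpu})
    return True


# 4 workers on a 4-core CPU with 8 logical processors, as in Examples 2 and 3.
print(plan_bindings(4, 4, 2, "scatter"))  # [0, 2, 4, 6]
print(plan_bindings(4, 4, 2, "compact"))  # [0, 1, 2, 3]
```

Under the numbering assumed here, "scatter" puts the four workers on four different cores, while "compact" stacks them on two cores and leaves the other two idle.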
 
Intel MUST stop making home CPUs with E-cores because home applications don't use many cores.

Intel needs to realize that people don't want these nonsense E-cores. Dozens of cores are only useful for server applications, not for home PCs.

Much of the software is still poorly optimized for many cores, and even those that are optimized for multicore still overload 1 or 2 CPU cores. Therefore, Intel should abandon these nonsense/useless E-cores and put only P-cores along with 1 "Super-core" for every 3 P-cores.

Well, the server customers want no part of heterogeneous processors either, and that is why Intel and AMD don't offer any. They do offer E-core processors, but not mixed with P-cores in the same machine, because the CPU scheduling is pretty much impossible to sort out unless you hardcode for specific applications.

So it's not just the home segment that has useless E-cores.

Ironically I have one of the few workloads that would benefit from heterogeneous CPUs - compiling software in parallel. But I do AMD these days and older Xeons.
 
5th Gen Scalable (EMR) and Xeon 6 (GNR) are genuinely great processors. But monolithics and tightly integrated tiled processors will never keep up with chiplet designs, especially in highly scalable architectures like AMD's at the level most of these servers operate at.

These Xeons... belong in workstations. But they're too costly for that... so I expect Intel's situation to significantly worsen before it gets better on this front.

Intel MUST stop making home CPUs with E-cores because home applications don't use many cores.

Intel needs to realize that people don't want these nonsense E-cores. Dozens of cores are only useful for server applications, not for home PCs.

Much of the software is still poorly optimized for many cores, and even those that are optimized for multicore still overload 1 or 2 CPU cores. Therefore, Intel should abandon these nonsense/useless E-cores and put only P-cores along with 1 "Super-core" for every 3 P-cores.

This way:
qKB2HDi.png

This is the Intel Royal Core design and it was scrapped

 
Yes, the page indicates SMT4. The Xeon Phi line introduced AVX-512 so the fragmentation is on its mainstream successors: Skylake X and all of the other Xeons that followed. In addition, Intel didn't support AVX-512 in consumer processors after Tiger Lake which serves to limit adoption and access to developers.
 
Super core is a wild idea though. Pair those for hot dual cores.

So, a home CPU with 7 P-cores could have 1 "super core", instead of 2. As I said, applications and especially games, even if optimized for multithreading, almost always overload 1 of the cores. And it is precisely in this situation of overloading 1 of the cores that the "super-core" would be super useful (with the app directing the overload to the super-core). The Ryzen 9700X has 8 P-cores and doesn't get as hot as an electric shower.

eQRm6j7.png
 

Do you have a system with P- and E-cores, or just P-cores? If yes, could you play a game on the system for 5 or 10 minutes and upload a Windows Task Manager screenshot ( CPU Performance tab )?

I really would like to see what it looks like.
 
For the record..

13 years ago might as well be 26 in computer years.

That is brutal.
 
Do you have a system with P- and E-cores, or just P-cores? If yes, could you play a game on the system for 5 or 10 minutes and upload a Windows Task Manager screenshot ( CPU Performance tab )?

I really would like to see what it looks like.

Zenless Zone Zero - E-cores are not a waste. Any overhead tasks, usually driver threading, all go on them. The top 16 boxes are the 8C16T P-cores; the bottom 16 boxes are the 16 E-cores.

1738978084957.png
 
So, a home CPU with 7 P-cores could have 1 "super core", instead of 2. As I said, applications and especially games, even if optimized for multithreading, almost always overload 1 of the cores. And it is precisely in this situation of overloading 1 of the cores that the "super-core" would be super useful (with the app directing the overload to the super-core). The Ryzen 9700X has 8 P-cores and doesn't get as hot as an electric shower.

eQRm6j7.png

Another nightmare for the OS kernel scheduler.
 

I think that a nightmare for the scheduler is this:

 
Right? That statement was a bit silly.

Core scheduling may seem daunting to the human mind, but it is trivial for an OS that is properly optimized and most are. Windows and the latest Linux kernels all are.

It is mildly amusing to watch people yammer on about things though..
 

Trivial if your scheduler can look into the future :) At the time when you have to place a thread or process, you don't know what it will be doing a couple of microseconds from now.

My understanding is that the Win11 scheduler is full of hardcoded recognition of well-known software.
 