Monday, December 6th 2021
Intel Prepares Raptor Lake Designs With 24 Cores and 32 Threads, More E-Cores This Time
With the launch of its Alder Lake processors, Intel switched from a homogeneous to a heterogeneous processor design, where smaller, high-efficiency cores are mixed with high-performance cores to create an efficient, high-performance processor for all kinds of workloads. And it seems Intel is not done adding E-cores to its future products, as the latest leaks suggest. According to BAPCo's CrossMark benchmark database, Intel's upcoming Raptor Lake processors will feature more E-cores than high-performance P-cores in the SoC design. We don't have a definitive answer as to why Intel made this design choice.
E-cores are suitable for background tasks, and adding more would potentially leave the P-cores free for heavier workloads. In the benchmark submission, which is now offline, the sample used a configuration with eight P-cores and sixteen E-cores. Since the big cores are hyperthreaded, this makes for a total of 24 cores with 32 threads. The platform "RPL-S ADP-S DDR5 UDIMM OC CRB" was used with DDR5-4800 memory, indicating an early-stage engineering sample with a probably unfinished memory controller. The Raptor Lake generation will also use the LGA 1700 socket and DDR5 memory, and will be present in the desktop and mobile sectors once it launches in Q4 of 2022. It will also use the Intel 7 semiconductor manufacturing process, like Alder Lake. Apart from the added E-cores, the only difference in the next-generation design is the updated Raptor Cove core design that brings a significant IPC uplift.
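The core and thread totals above follow from simple arithmetic. Here is a minimal sketch, assuming each P-core exposes two threads (Hyper-Threading) and each E-core exposes one; the function name `hybrid_totals` is made up for illustration. The Alder Lake line uses the 8P+8E layout of the Core i9-12900K for comparison.

```python
# Sketch: core and thread totals for a hybrid (P-core + E-core) CPU.
# Assumption: 2 threads per P-core (Hyper-Threading), 1 thread per E-core.

def hybrid_totals(p_cores, e_cores, threads_per_p=2, threads_per_e=1):
    """Return (total_cores, total_threads) for a hybrid configuration."""
    cores = p_cores + e_cores
    threads = p_cores * threads_per_p + e_cores * threads_per_e
    return cores, threads

# Leaked Raptor Lake sample: 8 P-cores + 16 E-cores.
raptor_cores, raptor_threads = hybrid_totals(8, 16)
print(f"Raptor Lake sample: {raptor_cores} cores, {raptor_threads} threads")

# Alder Lake Core i9-12900K for comparison: 8 P-cores + 8 E-cores.
alder_cores, alder_threads = hybrid_totals(8, 8)
print(f"Alder Lake 12900K: {alder_cores} cores, {alder_threads} threads")
```

Doubling the E-cores adds eight cores and eight threads, since E-cores are not hyperthreaded.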
Sources:
Tom's Hardware, KOMACHI_ENSAKA (Twitter), via VideoCardz
81 Comments on Intel Prepares Raptor Lake Designs With 24 Cores and 32 Threads, More E-Cores This Time
I'd rather have 4-8 E-cores dedicated to the OS, and the big boys for programs and games
The situation you describe is exactly where the strength of E-cores lies, actually: background CPU utilization contention that bogs down P-core performance. Since you bog down the P-cores less by moving that work to E-cores, which occupy less die space, you get higher overall performance than you would otherwise under certain general-use circumstances. There are certainly design balances between the core types, but I like the trade-off myself. Your results look encouraging. I've wanted to see more of this kind of overclocking on Alder Lake and how it impacts results. That is actually 3W less than the stock 12600K multithreaded result TPU measured. Is that undervolted? Seems wild given you've got both the P-cores and E-cores overclocked over stock, though maybe that wasn't measured while stress testing under the same workload circumstances with Cinebench. Intel can't push the P-core frequency curve much higher at this point because the voltage curve and heat output needed to do so are completely asinine. Even with carbon nanotubes and a move away from silicon, the power draw would still be crazy as a loon for a rather tiny increase in frequency. E-cores are the right choice, and more of them. A better balance between E-cores and P-cores, with additional core designs, would further improve things, but it won't happen overnight. I think we'll eventually see a kind of stacked pyramid and inverted pyramid design of sorts with TSV shingling.
or
How to get away with 300W TDPs.
In sales now.
With Alder and Raptor Lake, Intel's laying a foundation for high-performance manycore processors in the future. I believe the company will focus on increasing E-cores' performance, while retaining the density advantage. With Foveros 3D packaging + densely packed E-core clusters, they may very well achieve GPU-like core counts per socket without giving up on IPC, and that's where the master stroke is.
I would not be surprised to see HEDT processors with wild configs like 16 P-cores + 128 E-cores in the future.
www.fullh4rd.com.ar/prod/12425/micro-amd-ryzen-5-3600
AMD RYZEN 5 3600 $35.637,00
www.fullh4rd.com.ar/prod/17680/micro-amd-ryzen-5-5600x
AMD RYZEN 5 5600X $42.290,00
www.fullh4rd.com.ar/prod/18814/micro-intel-core-i5-11400f-sin-video
INTEL CORE I5 11400F $34.890,00
I have actually - my issue on this board with BCLK OC is that if I touch it at all, one of my SATA drives disappears in Windows and my USB ports randomly shut off, so I just leave it at 100. It does help to dial in max ring / E-core clocks, but I don't have separate clock domains. I want to take some time to see if I can get frame pacing software set up to show the difference between E-cores on and off with all my garbage and YouTube running in the background. This is what my gaming task manager usually looks like when I fire up a game:
So I measure using HWiNFO -- I'm not sure if TPU uses a different methodology. Here is a shot during CB R23:
^ I actually draw around 189-192W in R23 (not 187, so I was a tiny bit off). Let me know if you want me to run any before / after benches on E core OC. I am sure if I go full FPU load using another stress software I can push that past 200W (still not terrible).
CB R23 full run with e cores @ 4.3
Late next year will be very exciting and really can't go wrong IMO with either camp. Torn between updating my 2016 Zen 1700X system with Zen4/RL or waiting for Zen 5/Meteor Lake and pushing back update to 2024. Zen 5 introduces big.little and Meteor Lake cores bring large architectural changes and probably sees the end of ringbus topology. Zen 5's little cores will be 4c cores from Zen 4.
Mainly I don't trust the scheduler to handle things right.
And the unknown performance drop / crashing when the scheduler decides to move my tasks from P-cores to E-cores is concerning.
On the other hand, I agree a "pure E-core" CPU is interesting.
A 12900K-sized 40-core CPU would be extremely handy.
Both camps are looking at density for MT applications, it seems. Thing is, I don't think AMD is planning on launching those to consumers, so a pure E-core CPU, if Intel decided to launch one, would be super interesting for people who need tons of MT.
For gaming, the cheapest i5 or i3 are the only valid options.
E-cores to win MT benchmarks
And low CPU prices, to in the darkness bind them
Intel seem to be in trouble with these P cores, they need to shrink these things down dramatically to get the thermals and power under control, and Intel suck at new process nodes.
And another thing is that AMD don't seem to have a problem with 128 full-performance cores in the server line next year, and yes, they will be low clocked, but they have all the features and IPC, unlike Intel's E-cores.
SMT is the other but given the size and possible density of E-cores it can be mitigated by adding more cores.
64-core EPYCs run at 2GHz base clock (highest SKU was 2.25GHz IIRC). 40-core Ice Lake Xeon runs at 2.3GHz. That is quite a bit less than what we see E-cores in Alder Lake running at.
E-core IPC today is in the same range as Skylake or Zen+ which is not bad at all.
By the way, AMD's 128-core is Zen4C, whatever that exactly ends up being. Space-optimized (=smaller) they said but looks like it is power optimized as well.
What you describe sounds like PCIe getting overclocked due to BCLK; that's the thing causing issues with the SATA/USB ports tied to PCIe. Makes me think of all the classic VIA chipsets that had those same basic overclocking issues. Vicious cycle of fixed then broken in regard to that.
The consistency of the temps on the E-cores is kind of surprising. It looks to me like temps on the P-cores could get in the way more readily than those on the E-cores. The E-cores don't look overly hot, but the P-cores certainly heat up a bit more, and combined they are probably the bigger heat concern, or so it seems. Or maybe they want to put a PC in a phone and tax it like Apple.
Most new architectures are up to ~40-50% faster than their predecessors. We should expect this much.
We have to wait and see how much a massive L3 cache matters for various real-world use cases. The ringbus vs. mesh design has to do with core layout. We've had this discussion since the quad-core days, yet the ring bus is keeping up just fine. I see no reason why the ringbus would be a problem for mainstream use for even 16 cores. Sure, synthetic benchmarks matter a lot to the enthusiast market, but you're missing the bigger picture. The main reason for the big-little design in desktops is that they have hit the clock speed "wall" and the (big) core count "wall", and the big PC makers like Dell, HP, Lenovo, etc. mostly sell upgrades based on "specs". With a shared L2, the real-world performance would be quite different with load on multiple small cores. This is one of the reasons why it's important to distinguish performance and IPC.
I have no time and resources to do that, so I would avoid these products for some types of use cases for now. The largest ring ever created by Intel was in Xeon E5 v4.
With a total of 17 ring stops in the largest ring.
An Intel mainstream CPU needs 1 ring stop for each of the following: IMC, PCI-E controller, QPI link, iGPU.
That leaves 13 ring stops.
Since Intel does not do odd-numbered core counts anymore, it is 12 cores max.
Maybe, just maybe, sometime they will come up with a 16-core single ringbus.
But that means a 20-ring-stop ringbus.
Will core-to-core latency become a huge concern?
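The ring stop budget above can be checked with a few lines of arithmetic. This is only a sketch built on the commenter's figures (17 stops in the largest ring, four stops taken by the IMC, PCI-E controller, QPI link and iGPU). The `avg_hops` helper is a hypothetical, idealized proxy for latency: it assumes a bidirectional ring where traffic takes the shorter direction and every hop costs the same, which real silicon does not guarantee.

```python
# Sketch of the ring stop budget from the comment above.
# Assumptions: one ring stop per core, plus 4 non-core stops
# (IMC, PCI-E controller, QPI link, iGPU), as the commenter states.
UNCORE_STOPS = 4

def max_cores(total_stops, uncore_stops=UNCORE_STOPS, even_only=True):
    """Cores that fit on a ring, optionally rounded down to an even count."""
    cores = max(total_stops - uncore_stops, 0)
    if even_only and cores % 2:
        cores -= 1
    return cores

def stops_needed(cores, uncore_stops=UNCORE_STOPS):
    """Ring stops required for a given core count."""
    return cores + uncore_stops

def avg_hops(stops):
    """Mean shortest-path hop count between two distinct stops on an
    idealized bidirectional ring (a rough latency proxy, not a measurement)."""
    if stops < 2:
        raise ValueError("a ring needs at least two stops")
    total = sum(min(k, stops - k) for k in range(1, stops))
    return total / (stops - 1)

print("cores on a 17-stop ring:", max_cores(17))
print("stops for 16 cores:", stops_needed(16))
print("average hops, 17 stops:", avg_hops(17))
print("average hops, 20 stops:", round(avg_hops(20), 2))
```

Under this simple model the average hop count grows roughly linearly with the number of stops, so going from 17 to 20 stops is a modest step rather than a cliff. Actual core-to-core latency also depends on clocks, arbitration and traffic.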
Mesh is one, but there are many others, and Ian Cutress at AnandTech wrote a great article about that.
www.anandtech.com/show/16930/does-an-amd-chiplet-have-a-core-count-limit
Intel, like any other CPU vendor can do something else if they want to...
The numbers seem quite stratospheric, but in reality, it's more of Americans often not being aware of how good they've got things. It's mostly a result of the combination of it being a small store, taxes and the Argentinian economy's very poor performance. In general, though, the COVID-19 pandemic's economic recoil in South America (including Brazil) has been generally felt recently because of our devaluing currency. 1 USD is currently trading for 5.60 BRL and just north of 101 ARS. As long as the prices are kept in check, I welcome this development with open arms.
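To put the Argentinian listings quoted earlier in the thread into perspective, here is a rough conversion sketch. It assumes the prices are in ARS written in the local format (period as thousands separator, comma as decimal separator) and rounds the quoted rate of "just north of 101 ARS" down to exactly 101, so the results are approximate. The helper name `parse_ars` is made up for illustration.

```python
from decimal import Decimal

# Assumption: rounded from "just north of 101 ARS" per USD in the comment above.
ARS_PER_USD = Decimal("101")

def parse_ars(text):
    """Parse a price like '$35.637,00' (period = thousands, comma = decimal)."""
    cleaned = text.replace("$", "").replace(".", "").replace(",", ".").strip()
    return Decimal(cleaned)

# Prices exactly as quoted in the thread.
listings = {
    "AMD Ryzen 5 3600": "$35.637,00",
    "AMD Ryzen 5 5600X": "$42.290,00",
    "Intel Core i5 11400F": "$34.890,00",
}

for name, price in listings.items():
    usd = parse_ars(price) / ARS_PER_USD
    print(f"{name}: {price} ARS is roughly {usd:.0f} USD")
```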