
E-cores still evolve. But is there a reason for it?

But... why not make P-cores a little less complex so they take up less space? We'd get, like, 5 to 15% less speed per P-core, but 16 P-cores instead of 8+16. What is wrong with this approach? It's a genuine question, I'm not saying it's definitely better.
That's a good question. It should be possible; AMD's cores are significantly smaller than Intel's bloated P cores and that is despite the handicap of the ability to sustain up to 4 AVX-512 operations. If things keep going as they are, we may see the E cores displace the P cores altogether as only high clock speeds are keeping the P cores ahead now.
 
E-Cores aren't all that glamorous IMO

From 8 cores (16T), the max CBR20 score I got was 10K pts. With ALL 28 threads, that score went up to 15.5K. (14700K)

A whopping 5.5K increase, and that's the OC best-case scenario both ways too. I'm talking about the E-cores running at 4950 MHz (I have screenshots on request)

The E-cores are efficient at taking up space!! Sure a 285K should have better E-cores and will do better than 14th gen, but at the cost of HT?? I'm not buying into that personally.
 
• Hybrid structure cries for an impeccable prediction mechanism which can never be invented. At least with our current state of knowledge.
What do you mean by that? Hybrid CPUs have existed for ages, just look at how it's done in the mobile space. It works fine in there.
Issue seems to be mostly with Windows' scheduler.
• E-cores are mocked by last gen architectures in gaming even if the game is coded so well E-cores actually improve the experience in all aspects.
Games have not really made great use of multiple cores ever, so E-cores become moot if you have another 6~8 P-cores lying around.
• E-cores are mocked by P-cores in terms of performance per watt if you downclock the latters to around 4.3 (Alder Lake) or 4.6 (Raptor Lake) GHz.
E-cores are not meant to beat P-cores in performance per watt, but rather performance per area. In the same space of a single P-core from Intel you can fit 4 E-cores.
For AMD, a Zen 5c core is 25% smaller than the regular core, so more cores per CCD as well.
That's really relevant for tasks that can scale across multiple cores; just take a look at Sierra Forest, a Xeon with E-cores only. Great perf/watt for tasks that can scale to many cores.
• Software development is currently in a state that promotes fast releases but doesn't tolerate actual bug fixing if it takes more than a manhour to deploy. Which means scheduling is virtually thrown outta window.
Agreed on that, but seems like you're referring specifically to Windows. See linux as a counter example where the hybrid scheduling just works.
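To make the Linux point a bit more concrete, here is a minimal sketch of how a process could find out which logical CPUs are P-cores and which are E-cores, and pin itself to the P-cores. It assumes the kernel exposes the two core types under `/sys/devices/cpu_core/cpus` and `/sys/devices/cpu_atom/cpus` (that is my assumption about hybrid Intel systems, so check your own machine), and the helper names are made up for illustration.

```python
import os
from pathlib import Path


def parse_cpu_list(text: str) -> set[int]:
    """Parse a kernel-style CPU list such as '0-15,20,22-23' into a set of ints."""
    cpus: set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty input or trailing comma
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def read_core_type(name: str) -> set[int]:
    """Read the CPU list for one core type; empty set if the file is absent.

    The sysfs path is an assumption, not something stated in this thread.
    """
    try:
        return parse_cpu_list(Path(f"/sys/devices/{name}/cpus").read_text())
    except OSError:
        return set()


if __name__ == "__main__":
    p_cores = read_core_type("cpu_core")
    e_cores = read_core_type("cpu_atom")
    print("P-core CPUs:", sorted(p_cores))
    print("E-core CPUs:", sorted(e_cores))
    # os.sched_setaffinity is Linux-only, hence the guard.
    if p_cores and hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, p_cores)  # pin this process to P-cores only
```

Manual pinning like this is normally not needed if the scheduler does its job; it is mostly useful for A/B testing whether a given workload is hurt by landing on E-cores.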
Why did Intel abandon HT
They only did so for consumer. Not what you want to discuss, I know, but relevant since the Xeon parts will likely still have HT.
and not E-cores
E-cores give a great boost in MT performance, that's it.
And I'm willing to bet everything I can that slotting just four more thereof would amount to a total and irrecoverable murder of any E-core config in gaming.
My personal bet is that it would make no difference whatsoever.
Or, they got some ideas from looking at Qualcomm.
Qualcomm went the other way around with Oryon, it's just a single core design now lol
 
The E-cores are efficient at taking up space!! Sure a 285K should have better E-cores and will do better than 14th gen, but at the cost of HT?? I'm not buying into that personally.
Depending on the task, HT can be really great or really awful:
Screenshot 2024-10-21 at 20.13.37.png


It really depends on how well a task's instructions can be dispatched to the execution units.
 
Depending on the task, HT can be really great or really awful:
View attachment 368454

It really depends on how well a task's instructions can be dispatched to the execution units.
Without going into technical details, it's mindblowing that HT can have such a negative impact 22 years after introduction.
 
Without going into technical details, it's mindblowing that HT can have such a negative impact 22 years after introduction.
If the application can fully saturate the execution units with a single thread, then SMT is just adding needless overhead and scheduling (both at software and hardware levels) to dispatch those instructions.
SMT is bad for applications that are backend bound, and awesome for applications that are front-end bound.
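A toy model of that argument, just to show the shape of it: assume a single thread fills some fraction of the core's execution slots, a second SMT thread can only use whatever is left, and SMT itself costs a small fixed overhead. Both the `overhead` value and the idea of a single "utilisation" number are simplifying assumptions of mine, not measurements.

```python
def smt_speedup(single_thread_util: float, overhead: float = 0.05) -> float:
    """Throughput of two SMT threads relative to one thread on the same core.

    single_thread_util: fraction (0..1] of execution slots one thread fills.
    overhead: hypothetical fraction of throughput lost to SMT bookkeeping.
    """
    if not 0.0 < single_thread_util <= 1.0:
        raise ValueError("single_thread_util must be in (0, 1]")
    # Two threads cannot use more than 100% of the execution slots.
    combined = min(1.0, 2.0 * single_thread_util) * (1.0 - overhead)
    return combined / single_thread_util


if __name__ == "__main__":
    for util in (0.4, 0.6, 0.8, 1.0):
        print(f"util={util:.1f} -> SMT speedup x{smt_speedup(util):.2f}")
```

In this model a thread that already saturates the back end gets a speedup below 1 (a slowdown), while a thread that leaves most slots idle benefits a lot. Real cores are far messier (shared caches, partitioned queues), so treat it as intuition only.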
 
E-cores are not meant to beat P-cores in performance per watt, but rather performance per area. In the same space of a single P-core from Intel you can fit 4 E-cores.
For AMD, a Zen 5c core is 25% smaller than the regular core, so more cores per CCD as well.
That's really relevant for tasks that can scale across multiple cores; just take a look at Sierra Forest, a Xeon with E-cores only. Great perf/watt for tasks that can scale to many cores.
Thank you. Came here to say this.

But I also heard, although it's just a leak, that Intel may be going in that same direction eventually, where E-cores, rather than being a different architecture, are just smaller and run at lower frequencies. But this is still several generations in the future.

I personally do not care about the loss of HT. It will make scheduling easier for windows, and MT performance is higher. So, it shouldn't matter, in the use cases where it would matter.

Most games still don't need more than 8 cores.... or benefit much from ecores or hyperthreading when you have 8 pcores.
 
If the application can fully saturate the execution units with a single thread, then SMT is just adding needless overhead and scheduling (both at software and hardware levels) to dispatch those instructions.
SMT is bad for applications that are backend bound, and awesome for applications that are front-end bound.
I knew you wouldn't accept the first part of my reply lol.

I do understand why, but it was just a long time since I actually saw some numbers showing worst case situations.
 
When doing P to E comparisons, please look at die shots. An E core cluster, including the L2 cache, is larger than one P core. The ratio is about 4 to 3. However, there are some logic blocks around the E cluster, which I've never seen annotated, and it's not clear where they belong.
 
I knew you wouldn't accept the first part of my reply lol.

I do understand why, but it was just a long time since I actually saw some numbers showing worst case situations.
And here I thought I had managed to not get too much into technical details, oops :laugh:

FWIW, it's not only worst case but also best cases, and also most (if not all) tests in there are not even representative of workloads of the average user in this forum (as in, games since it seems that's what most people here only care about).
 
Thank you. Came here to say this.

But I also heard, although it's just a leak, that Intel may be going in that same direction eventually, where E-cores, rather than being a different architecture, are just smaller and run at lower frequencies. But this is still several generations in the future.

I personally do not care about the loss of HT. It will make scheduling easier for windows, and MT performance is higher. So, it shouldn't matter, in the use cases where it would matter.

Most games still don't need more than 8 cores.... or benefit much from ecores or hyperthreading when you have 8 pcores.
IDK for how much longer this will be true... I'm old enough to have heard this said about 1, 2, 4 cores, 8 cores with no HT, and now 8C/16T -- I think releasing an 8-core with no HT in 2026, which seems to be the current Intel roadmap, is a bit risky tbh, unless you have some ridiculously fast E-cores.
 
When doing P to E comparisons, please look at die shots. An E core cluster, including the L2 cache, is larger than one P core. The ratio is about 4 to 3. However, there are some logic blocks around the E cluster, which I've never seen annotated, and it's not clear where they belong.
That's the thing though. you DON'T compare P-cores to E-cores.

You compare Intel's NEW 8 Performance core package, missing Hyper Threading. Essentially, the new 8 core package is weaker than almost all previous generations when we want to start talking about MULTITASKING, which is the VERY REASON they came out with DUAL core processors longer than 2 decades ago.

Right so, since Area cores, E-cores, whatever, are great in mobile applications and all that, it may not be a great marketing scheme for desktop.

20 Core processor still. Performance uplift isn't great enough to call this a performance processor anymore. It's now a POWER HUNGRY Area Efficient core complex with 8 extra P cores, just in case you need em' (My opinion, not factual) 250w and no HT
 
IDK for how much longer this will be true... I'm old enough to have heard this said about 1, 2, 4 cores, 8 cores with no HT, and now 8C/16T -- I think releasing an 8-core with no HT in 2026, which seems to be the current Intel roadmap, is a bit risky tbh, unless you have some ridiculously fast E-cores.
True, 8 cores for gaming won't be enough forever... it will probably follow whatever consoles end up doing. Hopefully Intel will be able to adapt and stay competitive. Then again, it's hard to imagine a 12-16 core PS6, but you never know. That's still years away.

Like I was mentioning I heard a leak that intel will be moving away from the current form of ecores and more inline with what amd is doing with small but same architecture cores.....but thats just a rumour and still several years away.

Intel might have to do what AMD is doing and make two different lines, one aimed at home workstation, and one aimed at gaming, as the needs are becoming quite different and it will be hard to make a do it all CPU AND be ahead in either one AND be price competitive. We shall see....
 
I like the idea of having multiple types of processors (that excel at different workloads) in the same system. As always, load transition and integration from one proc to another is where the rub is.
 
What do you mean by that?
I mean that anything that runs real-time needs a reliable source of predictive calculations which hybrid systems are not. Gaming, rendering, video editing etc.
Mobile phones are mostly used for one or two actively used pieces of turd like a bank app and a FB client running in the foreground and about 69 various apps idling in the background.
Laptops aren't a common number crunching tool, either. Some niche users are unable to have a desktop/HEDT for that but most laptop users do nothing more complex than "meditating" and playing random video games. And, sometimes, removing their cellulite in Photoshop.
Desktop and especially HEDT is where the fun starts. And the fun is being slowed down a little bit by an architecture that the most common OS has very little idea how to work with. Linux, I remind you, is still a niche product.
if you have another 6~8 P-cores lying around.
But we only have 8 in total. Sans HT, it's too little.
but rather performance per area.
Desktop CPUs have about a square kilometre of unused space so it only comes down to manufacturing costs. Kinda moot since these ain't gonna skyrocket if you enlarge your CPU by, like, 20 percent. I think simplification will also enable less factory defects = cost optimisation. Having CPUs more all-rounded also enables higher retail pricing.
Also do I need to remind you where PhysX ended up at?
E-cores give a great boost in MT performance, that's it.
Only space-wise which I already mentioned before.
My personal bet is that it would make no difference whatsoever.
There's a load of game developers that don't care about E-cores and just do whatever. Even some rich AAA titles exhibit some micro- or nano-stuttering with E-cores enabled. FPS might be great, 1% lows might be improved but having just more P-cores crunching it will amount to smoother experience.

E-cores are very terrible in gaming. I tried playing on E-cores only and, despite theoretically being similar to 3 or 4 P-cores, 16 E-cores managed to lose to 2 P-core configurations in almost every single game. And it wasn't particularly close. One example: Cyberpunk 2077 was laggy but barely playable with 2 P-cores (with HT) at 5.4 GHz (50 FPS average, bad but not terrible lag spikes to about 10 to 20 FPS in most CPU-taxing areas); totally fine with 3 P-cores + HT / 5 P-cores sans HT (almost always 60+ FPS, only some select areas in City Centre and Dogtown spiked below 40 FPS); and it was just constantly below 30 FPS with 0 P-cores and 16 E-cores at 4.3 GHz with some areas just outright crashing. Of course a lot of gamers don't care about this title but it's not the only one showcasing such a massive E-core gaming performance deficit.

If Intel ever plan on targeting gamers (which they obviously don't as of yet) they need a P-core exclusive SKU. Preferably with a thread count exceeding 10.
 
True, 8 cores for gaming wont be true forever... it will probably follow whatever consoles end up doing. Hopefully intel will be able to adapt and stay competitive. Then again its hard to imagine a 12 - 16 core ps6 but you never know. Thats still years away.
Just like GPU advancements, console gaming dictates the mainstream. This is why we have 12GB on a RTX 3060. Consoles have 16GB (4 for the OS). People thought it was overkill and would never be needed...until the PS5/X1 came out.

Sure enough, now games commonly take up 10-12GB. As for CPU cores, we are at 8, and I can see the next gen being a sort of E-cores on the AMD front. M$ and Sony signed up for AMD again. It likely won't be over 8 cores for another 8-10 years, but it could very well be lightweight cores instead. AI driven tiny cores is the future. Goodbye big cores.
 
E cores may very well be a live experiment, too. Someone mentioned the P core is bloated. Its risky to redesign the high performance core - what if you're left with no better result, or a give/take result where you just moved more performance to another task and lost it where it was? Intel may very well develop the E core further to get its performance close as possible to a P core, and add more features to it, while hoping to keep the die space smaller. And until - and if - that succeeds, they can eventually move to an all E core design for another efficiency bump. If it doesn't, they'll just have a big little setup for thread count.

I don't think 'power' is Intel's problem per se, it's a performance issue, and performance is linked to cost and die space. Power is just a way to fix whatever's missing on the other axis. Intel is fighting a chiplet opponent with much better yields and design space with its archaic monolithic 'base core', and is scrambling to find stuff to add to it to keep pace. That's what we've been seeing past Coffee Lake imho. I think for that reason too there's no telling if Intel has a definitive strategy for their cores. It's clear Intel doesn't really dare to start a ground-up project to make a better CPU. They keep iterating on Core.

If Intel ever plan on targeting gamers (which they obviously don't as of yet) they need a P-core exclusive SKU. Preferably with a thread count exceeding 10.
I'm not so sure that'll fix Intel's gaming performance proper; is it the lack of P cores that screws them up? Or is it scheduling? Or is it just the fact they can't feed those P cores enough data. They're fast enough, after all. The single thread performance is stellar, especially at the stupid clocks they can run. AMD however surpasses them in gaming with 'P' Cores on X3Ds at much lower clock speeds. And they do that because they can feed the CPU faster from cache. E cores similarly boost gaming performance as more cores are able to be fed some data from the added cache per core. The side effects however make them unfit for duty and additional scheduling also adds overhead. I'm not so sure more P cores will act differently. There's also still a ring bus, albeit unified.
 
I mean that anything that runs real-time
Just be aware that your regular desktop devices are NOT real-time by any means. Even on a single big core you are not running in real-time, due to OoOE and all the speculative execution stuff.
Linux only recently got support for RT after around 20 years in the works.
Mobile phones are mostly used for one or two actively used pieces of turd like a bank app and a FB client running in the foreground and about 69 various apps idling in the background.
Mobile gaming is a far greater market than desktop or consoles. There are some really nice games on mobile (with some pretty interesting graphics), and those do run fine.
Linux, I remind you, is still a niche product.
I mean, you did mention lots of niche use cases that require a desktop/HEDT. For development, which does fit in one of those scenarios, Linux has a really big market share.
Your fun is being slowed down by the market for such products becoming ever smaller.
But we only have 8 in total. Sans HT, it's too little.
For games? I don't think so. Difference between 6 and 8 core CPUs of the same gen, with SMT on or off is pretty much negligible.
For other tasks the fact that AMD has stuff split in many CCDs is irrelevant, and the E-cores do help.
Desktop CPUs have about a square kilometre of unused space so it only comes down to manufacturing costs. Kinda moot since these ain't gonna skyrocket if you enlarge your CPU by, like, 20 percent. I think simplification will also enable less factory defects = cost optimisation. Having CPUs more all-rounded also enables higher retail pricing.
Area is not about the space in your CPU socket, but rather the actual die area. Do a die that's too big and you get into bigger yield issues, which is indeed a manufacturing cost issue.
But that's a cost that CPU manufacturers apparently don't think is worth to overcome for desktop, whereas on mobile you see chips way bigger than what's on desktop.
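For a rough feel of why die area matters for cost, here is a sketch using the textbook Poisson yield model, yield = exp(-defect_density * area). The defect density and the die sizes below are made-up illustrative numbers, not figures for any real process or CPU, and wafer-edge losses are ignored.

```python
import math


def poisson_yield(area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of defect-free dies under the simple Poisson yield model."""
    return math.exp(-defects_per_cm2 * area_mm2 / 100.0)  # 100 mm^2 per cm^2


def relative_cost_per_good_die(area_mm2: float, defects_per_cm2: float) -> float:
    """Silicon area spent per working die (arbitrary units)."""
    return area_mm2 / poisson_yield(area_mm2, defects_per_cm2)


if __name__ == "__main__":
    d0 = 0.1           # hypothetical defects per cm^2
    base = 250.0       # hypothetical die area in mm^2
    bigger = base * 1.2  # the "20 percent larger" case from the quote
    ratio = (relative_cost_per_good_die(bigger, d0)
             / relative_cost_per_good_die(base, d0))
    print(f"yield at {base:.0f} mm^2:   {poisson_yield(base, d0):.3f}")
    print(f"yield at {bigger:.0f} mm^2: {poisson_yield(bigger, d0):.3f}")
    print(f"cost per good die grows by x{ratio:.3f}")
```

The point is only that cost per good die grows faster than area itself, because a larger die both takes more wafer space and is more likely to catch a defect. How much faster depends entirely on the real defect density, which I don't know.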
Only space-wise which I already mentioned before.
No, they do improve in general. You are just thinking about games.
There's a load of game developers that don't care about E-cores and just do whatever
I guess all your points are only focused on gaming...
If Intel ever plan on targeting gamers (which they obviously don't as of yet) they need a P-core exclusive SKU. Preferably with a thread count exceeding 10.
Let me bring you some sad news: no hardware company is targeting gamers. Not Nvidia, not AMD, not Intel.
All desktop products are an afterthought of other line of products, be it scaling down server stuff or scaling up mobile designs.

Its clear Intel doesn't really dare to start a grounds-up mentality project to make a better CPU. They keep iterating on Core.
Tbh Lion Cove (found on Arrow Lake and Lunar Lake) is a pretty big µarch change from past iterations. It's the first 1T 8-wide decoder on x86 (Zen 5's is an 8-wide but broken down into 2 4-wide clusters for SMT).
The core itself is pretty interesting and efficient (just look at lunar lake), great for mobile but doesn't seem to be that great when scaling up.
 
Linux, I remind you, is still a niche product.
Linux, at the very least, demonstrates what Windows could achieve... If Windows wanted to. On the Win+AMD side, I believe we'll see the realisation of Fine Wine soon enough. I'm less sure about Wintel.
 
Tbh Lion Cove (found on Arrow Lake and Lunar Lake) is a pretty big µarch change from past iterations. It's the first 1T 8-wide decoder on x86 (Zen 5's is an 8-wide but broken down into 2 4-wide clusters for SMT).
The core itself is pretty interesting and efficient (just look at lunar lake), great for mobile but doesn't seem to be that great when scaling up.
Scaling up the performance is exactly where Intel needs to work harder. Clocks won't get them there. Their development is a mess, they're chipping and chiseling here and there, but they're still stuck in their old ways. There are a lot of market entities capable of making low power efficient CPUs today. But who knows, I'm curious to see what the big changes amount to.
 
I've never seen an E-core enabled CPU consuming less power on idle than an equally clocked E-core disabled CPU of the exact same architecture. Was quite the opposite. Why?

E-cores are not power-efficient cores, they are area-efficient cores. Four of these Gracemont cores occupy similar space to one Raptor Cove core. By cramming 4 cores in the space of one, Intel manages to raise multithreaded performance while keeping die area under control.
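A back-of-the-envelope version of the area argument, assuming one P-core is the unit of both area and throughput. The per-E-core throughput of 0.4 is a hypothetical placeholder; the two area figures are the "four in the space of one" claim and the roughly 4:3 cluster-to-P-core ratio from die shots mentioned earlier in the thread.

```python
def throughput_per_area(cores: int, per_core_throughput: float, area: float) -> float:
    """Aggregate multithreaded throughput divided by the silicon area used."""
    return cores * per_core_throughput / area


if __name__ == "__main__":
    e_core_perf = 0.4  # hypothetical: one E-core relative to one P-core

    p_core = throughput_per_area(1, 1.0, 1.0)
    cluster_same_area = throughput_per_area(4, e_core_perf, 1.0)
    cluster_with_l2 = throughput_per_area(4, e_core_perf, 4.0 / 3.0)

    print(f"P-core:                      {p_core:.2f}")
    print(f"E-cluster (area = 1 P-core): {cluster_same_area:.2f}")
    print(f"E-cluster (area = 4/3 P):    {cluster_with_l2:.2f}")
```

With these assumptions the cluster still comes out ahead per unit of area even with the larger 4:3 footprint, but only for work that actually scales across all four cores.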

E-Cores aren't all that glamorous IMO

From 8 cores (16T), the max CBR20 score I got was 10K pts. With ALL 28 threads, that score went up to 15.5K. (14700K)

A whopping 5.5K increase, and that's the OC best-case scenario both ways too. I'm talking about the E-cores running at 4950 MHz (I have screenshots on request)

The E-cores are efficient at taking up space!! Sure a 285K should have better E-cores and will do better than 14th gen, but at the cost of HT?? I'm not buying into that personally.

I mean, it seems that Arrow Lake is largely matching Raptor Lake in multithreaded scores without the help of SMT. That's pretty remarkable. 5.5K points in your case translates to a 55% increase in score, and you're still missing a cluster, an i9 would take you a step further still. That's not so bad, considering E-cores have a resource pool of their own and affect very little on how the P-cores perform.
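The arithmetic behind that 55% figure, plus a rough per-core split, can be sketched as below. It uses the scores quoted above and the 14700K's 8P + 12E layout, and it assumes the P-cores contribute the same 10K whether or not the E-cores are active, which may not hold under a shared power limit.

```python
def percent_increase(old: float, new: float) -> float:
    """Relative gain of new over old, in percent."""
    return 100.0 * (new - old) / old


def per_core_breakdown(score_p_only: float, score_all: float,
                       p_cores: int, e_cores: int) -> dict[str, float]:
    """Split a multithreaded score into average per-core contributions.

    Assumes the P-core contribution is unchanged once E-cores are enabled.
    """
    per_p = score_p_only / p_cores              # includes the HT benefit
    per_e = (score_all - score_p_only) / e_cores
    return {"per_p_core": per_p, "per_e_core": per_e, "e_to_p_ratio": per_e / per_p}


if __name__ == "__main__":
    gain = percent_increase(10_000, 15_500)
    split = per_core_breakdown(10_000, 15_500, p_cores=8, e_cores=12)
    print(f"gain from enabling E-cores: {gain:.1f}%")
    for name, value in split.items():
        print(f"{name}: {value:.2f}")
```

On these numbers each E-core adds a bit over a third of what a hyperthreaded P-core delivers, which is the kind of figure you would then weigh against the area each one occupies.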

So you would trade 4E for 1P? Give up 2 threads (since E doesn't have HT).

Personally yes, I hope Bartlett Lake-S Core non-Ultra chips remain on the table. Gaming-wise, it would make a slick upgrade from the 13900KS that'd fit on my Apex Encore. On the flip side, a 12P/24T processor is all but guaranteed to be hotter (translating to throttle happy under conventional cooling) and slower for multithreaded tasks. Still, if this allows them to re-enable AVX-512 support officially, I would gladly accept it
 
Let's pretend marketing, business and all that economy stuff are completely irrelevant. I'm about to ONLY talk engineering aspects of this phenomenon.

From what I've gathered so far (and I might be totally wrong. Correct me if I am):
• Hybrid structure cries for an impeccable prediction mechanism which can never be invented. At least with our current state of knowledge.
• E-cores are mocked by last gen architectures in gaming even if the game is coded so well E-cores actually improve the experience in all aspects.
• E-cores are mocked by P-cores in terms of performance per watt if you downclock the latters to around 4.3 (Alder Lake) or 4.6 (Raptor Lake) GHz.
• Software development is currently in a state that promotes fast releases but doesn't tolerate actual bug fixing if it takes more than a manhour to deploy. Which means scheduling is virtually thrown outta window.
• It's not impossible to land 16ish properly working P-cores on one die and make them feel at home, likely cutting about a half or two GHz all-core turbo so it actually doesn't go kaboom.
• Average Joes and Janes (and attack helicopters for that matter, too) don't have any idea what these cores are actually good at. They render confused at best.
• There's no evidence that heterogeneous architecture helps alleviate background loads any better than just throwing more P-cores at them.
• It seems it's also more complex and failure prone than a good ol' technique of just having X cores of the same arch.

Why did Intel abandon HT (which I don't mind at all and it's not to be discussed in this thread) and not E-cores since they already implemented segmental layout? Is there anything real engineers can see going wrong that I don't? Once again, if it's all only limited to cash and marketing then I don't even know what to say.
1 - The performance of HT is often overstated. Most people on this forum, e.g., think it's really good when it's really kind of meh, except in certain workloads where it is kind of good, but those very workloads are where E-cores shine as well.
2 - It has security issues.
3 - four e-cores on die for one p-core so e-cores are basically better bang for buck on performance vs sticking an extra logical core on a p-core.
4 - e-cores in my experience can help a ton on dealing with background loads, however I do agree with you if you have excess p-cores they can also do the same thing. both are valid solutions in my view. however HT is inadequate for that.
5 - Intel had seemed to hit a limit of around 8-10 p-cores on a CPU die, so hence e-cores were born. This will ultimately be the main reason, and HT is no substitute for an extra real core even if its an e-core.
 