Monday, July 1st 2024
Intel "Arrow Lake-S" to See a Rearrangement of P-cores and E-cores Along the Ringbus
Intel's first three generations of client processors implementing hybrid CPU cores, namely "Alder Lake," "Raptor Lake," and "Meteor Lake," arrange the cores along a ringbus, sharing an L3 cache. This usually places the larger P-cores in one region of the die and the E-core clusters in the other. From the perspective of the bidirectional ringbus, the ring-stops follow this order: one half of the P-cores, one half of the E-core clusters, the iGPU, the other half of the E-core clusters, the other half of the P-cores, and the Uncore, as shown in the "Raptor Lake" die-shot, below. Intel plans to rearrange the P-cores and E-core clusters in "Arrow Lake-S."
With "Arrow Lake," Intel plans to disperse the E-core clusters between the P-cores. This would see a P-core followed by an E-core cluster, followed by two P-cores, and then another E-core cluster, then a lone P-core, and a repeat of this pattern. Kepler_L2 illustrated what "Raptor Lake" would have looked like had Intel applied this arrangement to it. Dispersing the E-core clusters among the P-cores has two possible advantages. First, the average latency between a P-core ring-stop and an E-core cluster ring-stop would fall; second, there should be certain thermal advantages, particularly when gaming, as it reduces the concentration of heat in one region of the die.
Every P-core would be no more than one ring-stop away from an E-core cluster, which should benefit migration of threads between the two core types. Thread Director prefers E-cores, and when a workload overwhelms an E-core, it is graduated to a P-core. This E-core to P-core migration should see reduced latencies under the new arrangement.
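The ring-stop argument can be checked with a rough hop-count sketch in Python. It lays out both orders for a "Raptor Lake"-sized die (8 P-cores, 4 E-core clusters) and counts ring-stops along the shorter direction of the bidirectional ring. The names `RAPTOR_LAKE_ORDER`, `DISPERSED_ORDER`, `ring_hops`, `nearest_e_hops`, and `mean_pair_hops` are made up for this illustration, and the position of the iGPU and Uncore stops in the dispersed order is an assumption, since the leak does not state it. Hop count is only a proxy for latency, not a measurement.

```python
# Ring-stop labels: "P" = P-core, "E" = E-core cluster, "G" = iGPU, "U" = Uncore.

# Order described for the existing designs, sized like Raptor Lake:
# half the P-cores, half the E-clusters, iGPU, other E-clusters, other P-cores, Uncore.
RAPTOR_LAKE_ORDER = list("PPPPEEGEEPPPPU")

# Dispersed order from the leak: P, E, P, P, E, P, then the same pattern again.
# ASSUMPTION: iGPU sits between the two repeats and Uncore closes the ring.
DISPERSED_ORDER = list("PEPPEPGPEPPEPU")


def ring_hops(a, b, n):
    """Ring-stops between positions a and b on a bidirectional ring of n stops."""
    d = abs(a - b)
    return min(d, n - d)


def nearest_e_hops(layout):
    """For each P-core, hops to its closest E-core cluster."""
    n = len(layout)
    e_stops = [i for i, s in enumerate(layout) if s == "E"]
    return [
        min(ring_hops(p, e, n) for e in e_stops)
        for p, s in enumerate(layout)
        if s == "P"
    ]


def mean_pair_hops(layout):
    """Average hops over every (P-core, E-core cluster) pair."""
    n = len(layout)
    p_stops = [i for i, s in enumerate(layout) if s == "P"]
    e_stops = [i for i, s in enumerate(layout) if s == "E"]
    hops = [ring_hops(p, e, n) for p in p_stops for e in e_stops]
    return sum(hops) / len(hops)


if __name__ == "__main__":
    for name, layout in (("current", RAPTOR_LAKE_ORDER), ("dispersed", DISPERSED_ORDER)):
        print(name, "nearest E-cluster hops per P-core:", nearest_e_hops(layout))
        print(name, "mean P-to-E hops:", mean_pair_hops(layout))
```

Under these assumed layouts, every P-core in the dispersed order sits one hop from an E-core cluster, while the outermost P-cores in the current order are four hops away. The all-pairs average also drops, though by less, because each P-core is still far from the clusters on the opposite side of the ring.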
Source:
Kepler_L2 (Twitter)
100 Comments on Intel "Arrow Lake-S" to See a Rearrangement of P-cores and E-cores Along the Ringbus
I actually now regret not going DDR5, as I think in some workloads it's probably hindering the performance, and it's crazy that the improvement I am noticing would have been even higher. However, I am not going to swap out the board and swap the RAM at this point; I'm not rebuilding again.
I think the two most sensible things Intel can do on Arrow Lake are to ditch HTT, which is outdated and inefficient now, and to ship CPUs lower on the V/F curve.
My comment was about Intel not having HT on Arrow Lake, or did you not get that?
My current specs have nothing to do with my comment
I have a brain, so yes, I disabled HT. But I won't tell you what I found after trying this for 2 weeks... I'll let you put your money where your mouth is first. Please come back and share.
How I run my rig has nothing to do with Arrow Lake or this thread.
A 7950X at around 100 W with an undervolt is just nuts, and it is still going to come out ahead of a 13700K (same number of cores) with the same amount of tuning time dedicated to it, no matter the sample you take. TSMC rose by making mobile chips, therefore their nodes are tuned for low power from the start.
You are comparing 8 cores to 16; that is how multithreaded efficiency shines. Intel needs TWICE the cores to be more efficient. Now take a 12700K against a 7950X and wonder which one is more efficient at 100 W. You are clearly new to multicore processing, efficiency scaling and V/F curves.
You never opened Intel Power Gadget on Intel laptops, I see.
Yeah, on a desktop board where you manually input your limit. But they are talking about mobile CPUs. You know, these things you buy with a screen attached and a battery; they also have a locked BIOS where you can't adjust anything. Vendors are allowed to do whatever they want with the power limit, even when it is clearly set by Intel. Also, anyone can take ThrottleStop and tune it to their liking, and also find out how often these limits go unenforced on the vast majority of consumer products.
Speaking of unknown stuff, have you ever heard of Apple? They used to make laptops with Intel mobile CPUs, you know, like the i9-9880H, which is a 45 W CPU.
Now take a look at this video around 50 seconds in:
But yeah, Intel nominal TDP. They never lie. Intel does not provide any factual information. Actual proven testing does. Welcome to the real world, where I deal with refurbishing and upgrading machines on a daily basis. I look at power draw from the wall and at INTEL-MADE power reporting software.
Yes, Intel has all the tools to have their CPUs respect the limits. They also allow them to be easily bypassed by anyone, from end users to board and laptop manufacturers. Everything you pointed out should lead to the CPU using LESS, not MORE power than the nominal TDP :roll:
And we are talking about almost DOUBLE the nominal value.
Now imagine, this thin and light MacBook does it. Sure, huge beefy gaming notebooks will comply with 45 W :roll:
You are all really joking, right? I can't read this stuff on a tech forum. Have you ever actually tested the stuff you talk about? Now, to be fair, they are working with different, less efficient nodes. But the fact that AMD can beat them with a mobile-oriented node and still be more efficient is rough. They have been almost sitting still all the way from Haswell to Rocket Lake, so much so that Apple had time to beat their absolute performance with mobile CPUs.
The reported consumption is OBVIOUSLY accurate, as that clock and performance would never come with 45w being enforced.
With cooling mods as in the video, that CPU will come close to 9900 non-K 95 W performance, which is another CPU that should be limited to 65 W but will default to unlimited and use more than 100 W on many boards without any prompt, even without XMP enabled: intel/comments/dfkqjq
I have been building almost only on Intel since I do hackintosh, and I have deep tested maybe a hundred different boards (using a wall meter for efficiency evaluation) in my life, because I have to choose which ones go into production on my machines. Every time I have to work with AMD I am impressed. They are such a tiny company, yet they have so many strong points over Intel, like efficiency, being honest in their specs, and making stuff like PCIe bifurcation available on the whole lineup.
What about my 5600h? TDP 45, hits 65.
What about my 5800u? 25w max tdp, hits 45.
That's why they totally don't take anti-competitive measures the second their product takes a clear lead. By which metric is AMD tiny again? Their market cap is more than twice that of Intel. Mine does too. I think it's got something to do with that 1.35*TDP for PPT rule (which effectively makes it a 60W processor).
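The arithmetic behind that "effectively a 60 W processor" remark can be sketched in a couple of lines of Python. The 1.35 multiplier is taken from the comment above at face value; whether it applies to a given mobile part is an assumption, and `PPT_FACTOR` and `ppt_watts` are hypothetical names used only for this illustration.

```python
# ASSUMPTION: package power tracking (PPT) limit = 1.35 x rated TDP,
# as quoted in the comment above; actual limits depend on the platform and vendor.
PPT_FACTOR = 1.35


def ppt_watts(tdp_watts):
    """Return the PPT limit in watts implied by a rated TDP in watts."""
    return PPT_FACTOR * tdp_watts


if __name__ == "__main__":
    # The 45 W part discussed above lands at roughly 60 W under this rule.
    print(round(ppt_watts(45), 2))
```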
The only efficient parts amd has are their mobile chips and their desktop APUs. The monolithic ones. Those are freaking great. The actual desktop chips are just not. Multiple dies cannot and are not efficient. Not for actual desktop usage. An almost idle 7950x (running syncthing, steam and discord) hits 70w. Not a spike, average. Same crap running on an intel part, 15 watts. But yeah, that amd efficiency man....
Sure, let's close all background apps. Now let's go browse youtube and watch a video with no background apps running. Spikes to 60+ watts and averages 45 just scrolling youtube comments!
Whenever people say AMD are efficient, it makes it obvious to me they have never tried them or they've never compared it to an equivalent intel chip.
But yeah, saying that Intel CPUs will always run way beyond PL2, even when it's manually set by the user, is the take of someone who doesn't know what he's talking about. There are lots of people in the SFF community who tune their i9 for low power and cool it with a low-profile air cooler with success.
Power Consumption - The Intel 9th Gen Review: Core i9-9900K, Core i7-9700K and Core i5-9600K Tested (anandtech.com)
Intel Core i9 13900K: Impact of MultiCore Enhancement (MCE) and Long Power Duration Limits on Thermals and Content Creation Performance | Puget Systems
No one said vendors are not allowed to do the same with AMD chips, just that you cannot use TDP to measure efficiency in any shape or form, as most benchmark numbers on the internet are done on unlocked power limits. Also, provide some evidence for your numbers, as i have bought many AMD machines and they were all reaching above average performance (compared to online results in benchmarks like cinebench and geekbench) on platforms with enforced TDP (chinese machines, huawei laptops and minisforum mini pc)
Now, to provide some actual evidence, I use a platform I trust, Notebookcheck. The Minisforum 5600H machine sticks to an apparent 42 W power limit and outperforms the 12900H one by the same vendor in all but synthetic, heavily multithreaded scenarios.
Again, you bring out stuff that doesn't matter in the initial discussion, and keep failing. Yeah, show me some evidence, like european lawsuits for anti competitive behaviour. You are young and never heard of pentium4, i guess.
Hey, take a look at wikipedia!
Number of employees: AMD 26,000 (en.wikipedia.org/wiki/AMD), Intel 124,800 (en.wikipedia.org/wiki/Intel)
Total assets: AMD 68 billion, Intel 190 billion
Using market cap to compare company size? Like what, will I say a Ferrari is bigger than a truck because it costs more? Yeah. Again, the initial point was made by the user who said Intel CPUs stick to their PL2 and TDP out of the box.
I never said anything like "never" or "always" about Intel not enforcing their TDP, as it is not Intel that controls it. I am talking about what actually happens in most consumer products, like very common Apple laptops, and I even stated that some business laptops will actually respect the Intel limits.
Most machines will not, either just ignoring the PL and going with the thermal limit, or enforcing a higher-tier PL2 with no prompt. As for running an i9 9900 non-K with unlimited PL2: I now clearly recall it happening on a Gigabyte H310 in a client build, which resulted in awful efficiency on such a terrible VRM and in hitting the VRM temperature limit. It was the default behavior, OOB, for that combination. I enforced an 80 W limit, which resulted in much better benchmark scores.
I built hundreds of itx systems, thanks. I ran passive i9 and i5, i tested so much stuff in my life, don't worry. And i am very careful and precise with my statements, and answer to actual quotes.
Here is one article I have written on a similar topic. As you might see, English is not my native language, but the points still stand: www.pizzaundervolt.com/choosing-a-computer-for-audio-production-laptop/
Regarding your "actual evidence," your own links prove you wrong. The 5600H, which you claimed never exceeds 42 watts, is hitting 70 W in your link. And of course it's slower than the 12900H, again, according to the links you provided.
Clearly you are not used to seeing that, but 75 is the temperature, not the wattage. Intel users are not used to these numbers, I understand. And no, the 12900H is only faster in Cinebench, which doesn't matter for 99% of users; the PCMark numbers are better for the 5600H, which is older and has a lot fewer cores.
I understand now how you can state Intel has some advantages: you probably misread all the graphs you see and never actually tested these machines.
www.notebookcheck.net/fileadmin/_processed_/2/c/csm_cpu_metrik_r15_loop_7fa5102a3c.jpg
Dude, are you troll posting or is this serious? Uhm, no, the 12900h was faster on the whole test suite they used. If you click the Performance Rating bar it gives you the average across their whole testing suite. The 12900h was faster than the 5900HX. The 5900HX is much faster than the 5600h. Now link the dots and tell me how the hell can the 5600h be faster than the 12900h when even the 5900hx isn't???