Friday, November 15th 2024

Intel Plans to Copy AMD's 3D V-Cache Tech in 2025, Just Not for Desktops

Intel is coming around to the idea of large last-level caches on its processors. In an interview with Der8auer and Bens Hardware, Florian Maislinger, a tech communications manager for Intel, revealed that the company is working on augmenting its processors with large shared L3 caches; however, it will begin doing so only with its server processors. The company is working on a new server/workstation processor for 2025 with cache tiles that augment the shared L3 cache, so that it excels in the kind of workloads that AMD's EPYC "Genoa-X" processors and upcoming "Turin-X" processors excel at: technical computing. On "Genoa-X" processors, each of the up to 12 "Zen 4" CCDs comes with stacked 3D V-Cache, which has been found to have a profound impact on performance in cache-sensitive applications such as the Ansys suite and OpenFOAM.

The interview reveals that the server processor with the large last-level cache should come out in 2025; however, there is no such effort on the horizon for the company's client processors, such as the Core Ultra "Arrow Lake-S," at least not in 2025. The company's recently launched "Arrow Lake-S" desktop processors do not provide a generational gaming performance uplift over the 14th Gen Core "Raptor Lake Refresh." However, Intel claims to have identified certain correctable reasons for gaming performance falling below expectations, and hopes to release updates for the processor (possibly in the form of new microcode, or something at the OS-vendor level). This, the company claims, should improve the gaming performance of "Arrow Lake-S."
Sources: Der8auer (YouTube), VideoCardz, HardwareLuxx.de

77 Comments on Intel Plans to Copy AMD's 3D V-Cache Tech in 2025, Just Not for Desktops

#1
phanbuey
Yeah - intel needs to be careful not to sell too many CPUs or they might not be able to fab to all that demand. Makes sense.
#2
docnorth
Better late than never later.:D
#3
Quicks
GLWS

They still need to figure out how to lower their CPUs' wattage, which is just crazy.

Anyways lost opportunity for Intel.
#4
_roman_
I do not agree with that. Intel already had such a processor with extra "cache". i7-5775C

www.intel.com/content/www/us/en/products/sku/88040/intel-core-i75775c-processor-6m-cache-up-to-3-70-ghz/specifications.html

Again, the CPU includes 6MB of L3 cache and 128MB of eDRAM.

www.tomshardware.com/reviews/intel-core-i7-5775c-i5-5675c-broadwell,4169.html

It's up to discussion. I see the 7800X3d Cache as 4th level one like the EDRAM cache of the i7-5775C
#6
phints
Intel should plan to copy AMD's 3D V-cache performance and efficiency, just for desktops.
#7
Makaveli
Imitation is the best form of flattery.
_roman_: It's up to discussion. I see the 7800X3d Cache as 4th level one like the EDRAM cache of the i7-5775C
Not really the same.

64MB of SRAM for L3 vs 128MB of EDRAM for L4.
#8
human_error
_roman_: I do not agree with that. Intel already had such a processor with extra "cache". i7-5775C

www.intel.com/content/www/us/en/products/sku/88040/intel-core-i75775c-processor-6m-cache-up-to-3-70-ghz/specifications.html

Again, the CPU includes 6MB of L3 cache and 128MB of eDRAM.

www.tomshardware.com/reviews/intel-core-i7-5775c-i5-5675c-broadwell,4169.html

It's up to discussion. I see the 7800X3d Cache as 4th level one like the EDRAM cache of the i7-5775C
It's not really the same though, as the eDRAM in those chips was far higher latency than what you'd get from an L3 cache. It was on-package but not as tightly integrated as the L3 on X3D chips so again not the same. So better than system RAM, but not as good as L3 cache.

Otherwise you may as well argue that systems with soldered RAM are better than those with removable as it's attached to the system. Just being attached doesn't make them better or have higher performance.

I hope intel does provide chips with far higher L3 cache. Should give us some competition in the gaming CPU world, and may encourage more game devs and other workloads to be designed to take advantage of it even more.
#9
ncrs
human_error: It's not really the same though, as the eDRAM in those chips was far higher latency than what you'd get from an L3 cache. It was on-package but not as tightly integrated as the L3 on X3D chips so again not the same. So better than system RAM, but not as good as L3 cache.

Otherwise you may as well argue that systems with soldered RAM are better than those with removable as it's attached to the system. Just being attached doesn't make them better or have higher performance.
I was about to write something along those lines so instead I'll link Chips and Cheese's recent analysis of Broadwell and Skylake L4 implementations ;)
#10
_roman_
Makaveli: 64MB of SRAM for L3 vs 128MB of EDRAM for L4.
Well, I cannot argue with that, as AMD does not really know what to write themselves.

I always try to use the datasheet or specification from the manufacturer:
www.amd.com/en/products/processors/desktops/ryzen/7000-series/amd-ryzen-7-7800x3d.html
L1 Cache: 512 KB
L2 Cache: 8 MB
L3 Cache: 96 MB
The same stupidity for the 9800X3D, see: www.amd.com/en/products/processors/desktops/ryzen/9000-series/amd-ryzen-7-9800x3d.html

I cannot take any manufacturer seriously who is unable to provide proper specifications on the specifications page.
Something like
32MiB CACHE TYPE x
64MiB CACHE type y

Which also gives me the next question: MB or MiB units?

I think the units are wrong too. The units should be MiB. From what I have read about microcontrollers and such, it's always binary, not the human base 10. It's base 2.
simple.wikipedia.org/wiki/Mebibyte

E.g., I have seen in recent months that more and more software on GNU/Linux already uses the correct units, whether base 2 or base 10.
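
To put rough numbers on the MB vs. MiB question, here is a minimal Python sketch. It only compares the two readings of the "96" figure from the spec page quoted above; which reading AMD intends is not stated there, and the names `to_bytes`, `l3_as_mib` and `l3_as_mb` are made up for illustration.

```python
# Illustrative only: two possible readings of the "96 MB" L3 figure.
MB = 10**6    # megabyte, decimal (SI) prefix
MiB = 2**20   # mebibyte, binary (IEC) prefix


def to_bytes(amount, unit):
    """Convert an amount in the given unit to a byte count."""
    return amount * unit


l3_as_mib = to_bytes(96, MiB)       # if "96" is a binary quantity
l3_as_mb = to_bytes(96, MB)         # if "96" is a decimal quantity
difference = l3_as_mib - l3_as_mb   # gap between the two readings, in bytes

print(l3_as_mib, l3_as_mb, difference)
```

The gap between the two readings is a few percent, which is why the choice of unit matters on a spec sheet.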

--

It's up for discussion. What is a cache level? How do you define a Level 1, Level 2, Level 3 and Level 4 cache?
I do not think that the latency of a Level 4 cache is an argument. I do agree that AMD may be the first with a 3D type of cache module for a Level 4 cache, but maybe not the first to use a Level 4 cache.
#11
human_error
It's not an L4 cache, it's not addressed as an L4 cache by any code. It is a low latency L3 cache with significant performance characteristics that distinguish it from eDRAM or any other on or off die memory options.

Cache bandwidth and latency are two key factors in why the L1/2/3 caches are so impactful to performance. So the high latency of the eDRAM is a massive disadvantage, and is why it was referred to as an L4-style memory.

If it was similar then we'd see far better performance impact from that older intel chip, and we'd see issues with worse performance for code needing smaller L3 cache on X3D vs non X3D if the extra L3 cache on X3D didn't perform as well as standard L3 cache.
#12
igormp
If they pair this up with their previous idea for that Xeon Max lineup (which had some HBM on package), it would make for a killer HPC/CFD processor. Throw in as many memory channels as possible and it could beat AMD's current offerings for that use case.
_roman_: What is a cache level? How do you define a Level 1, Level 2, Level 3 and Level 4 cache?
I'd say latency and speed (both of which correlate to the proximity to the cores), as well as how many of those different stackings you have.

Tbh it's just a matter of memory hierarchy, and at this point we can add storage and different memory levels as well. What is a "cache", "memory" or "storage" ends up moot under this view since what matters is their speed/latency.
#13
Fouquin
_roman_: Well, I cannot argue with that, as AMD does not really know what to write themselves.
They write 96MB because while physically it is 32+64MB, due to the cache lines being shared (linked by through-silicon vias) there is no logical difference between the 32MB built into the core die and the 64MB on the 3D cache die. When L2 flushes to L3 cache lines it does not require any extra special loads or stores to interface with the 3D cache, because at the logical level it is identical to the rest of the L3.
#14
efikkan
phints: Intel should plan to copy AMD's 3D V-cache performance and efficiency, just for desktops.
AMD clearly has better efficiency, but it's not due to the large L3 cache.
But it's hard to find something more deserving of the title "waste of sand" than throwing a bunch of L3 cache on a die, as it's only a tiny subset of very poorly optimized code which significantly benefits from it, namely certain outliers in applications and games running at very unrealistically low GPU load. It would be much better to have a CPU with 5% more computational power, especially down the road, as future games are likely to become more demanding so the bottleneck will be computational performance, not "artificial" ones running games at hundreds of frames per second.
For CPUs to advance, they should stop focusing on gimmicks and make actual architectural advancements instead. Large L3 caches are a waste of precious development resources as well as production capacity.
#15
Makaveli
efikkan: For CPUs to advance, they should stop focusing on gimmicks and make actual architectural advancements instead. Large L3 caches are a waste of precious development resources as well as production capacity.
Not sure I see it as a gimmick.

You can only go so far with shrinking nodes, and they are at the mercy of TSMC in that respect. Secondly, I don't think it's a bad idea for Epyc processors, which don't run client workloads, to benefit greatly from cache. Client desktop doesn't really drive anything; it's all enterprise. Better to deal with the low-hanging fruit as you continue to address overall processor improvement than to try to hit the ball out of the park every single launch. I consider that a better use of development and production capacity: as core counts go up and processors get faster, feeding the cores will always be an issue. If you can help that with caches, without having to add additional memory channels etc., I consider that a win.
#16
FoulOnWhite
phints: Intel should plan to copy AMD's 3D V-cache performance and efficiency, just for desktops.
It's not AMD's 3D vcache, it's TSMC's
#17
human_error
efikkan: But it's hard to find something more deserving of the title "waste of sand" than throwing a bunch of L3 cache on a die, as it's only a tiny subset of very poorly optimized code which significantly benefits from it, namely certain outliers in applications and games running at very unrealistically low GPU load.
Even well optimized games and workloads can benefit if the highly utilized code can be contained in the cache, as it is higher bandwidth and lower latency than waiting to go to system RAM. Even Factorio, which is an extremely well optimized game, massively benefits from this, as do many other workloads. You may as well say computers don't need more than 64k of RAM and any applications that do are poorly optimized.

Extra CPU cycles don't do anything if the CPU is waiting for the data from memory. That's why the 1% lows massively benefit, and performance is better even if the frequency is lower. I have a very high end GPU yet I noticed the difference when playing at 4k. Lows, frame pace consistency etc all benefit and games that used to have very periodic stutters have none compared to my 5ghz 9900k.
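
A rough way to see the point about waiting on memory is the textbook average-access-time estimate. The Python sketch below is a simplified model, not a measurement: the hit rates and latencies are made-up illustrative values, and `average_access_time` is a hypothetical helper name.

```python
def average_access_time(levels, memory_latency_ns):
    """Estimate the mean access latency in nanoseconds.

    levels: list of (hit_rate, latency_ns) tuples, closest cache first.
    Simplification: a hit at a level costs only that level's latency.
    """
    total = 0.0
    reach = 1.0  # fraction of accesses that get as far as this level
    for hit_rate, latency_ns in levels:
        total += reach * hit_rate * latency_ns
        reach *= 1.0 - hit_rate
    # Whatever misses every cache level goes to system RAM.
    return total + reach * memory_latency_ns


# Assumed numbers for illustration only (not from any real CPU).
small_l3 = average_access_time([(0.60, 10.0)], memory_latency_ns=80.0)
large_l3 = average_access_time([(0.90, 10.0)], memory_latency_ns=80.0)
print(small_l3, large_l3)
```

Under these assumed numbers the larger cache cuts the estimated average latency by more than half, even though the cores themselves are no faster.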
#18
kapone32
human_error: Even well optimized games and workloads can benefit if the highly utilized code can be contained in the cache, as it is higher bandwidth and lower latency than waiting to go to system RAM. Even Factorio, which is an extremely well optimized game, massively benefits from this, as do many other workloads. You may as well say computers don't need more than 64k of RAM and any applications that do are poorly optimized.

Extra CPU cycles don't do anything if the CPU is waiting for the data from memory. That's why the 1% lows massively benefit, and performance is better even if the frequency is lower. I have a very high end GPU yet I noticed the difference when playing at 4k. Lows, frame pace consistency etc all benefit and games that used to have very periodic stutters have none compared to my 5ghz 9900k.
Thank you. I have been arguing this with people posting 4K Ultra benchmarks. CPU performance matters in games, and X3D has changed the world.
#19
GoldenX
Looks like glueing cores was the way, and glueing cache to them too, in the end.
#20
human_error
kapone32: Thank you. I have been arguing this with people posting 4K Ultra benchmarks. CPU performance matters in games, and X3D has changed the world.
I don't understand those people honestly. Benchmarks don't show periodic stutters which you can get in some games for example, and those are fully eliminated for me. Plus, you do see the better lows and general performance in benchmarks. If CPUs didn't make a difference we'd all have 4090s paired with ancient processors.

I have my 7800X3D at 40-60W providing a much better, much more consistent experience with the same GPU and screen than my 9900k that was eating 150W.
#21
AnarchoPrimitiv
FoulOnWhite: It's not AMD's 3D vcache, it's TSMC's
AMD most certainly played an important part in developing the final product
#22
ZoneDymo
FoulOnWhite: It's not AMD's 3D vcache, it's TSMC's
weird statement
#23
Steevo
ZoneDymo: weird statement
They are concerned for Intel….

Back to the days of minor performance increases, with a new board required for a 2% performance increase….

But good on Intel, doing what they would sue AMD for, copying a good idea. I wonder if they will be paying royalties?
#24
R0H1T
_roman_: Again, the CPU includes 6MB of L3 cache and 128MB of eDRAM.
That's closer to "infinity cache" than what AMD has on x3d chips!
FoulOnWhite: It's not AMD's 3D vcache, it's TSMC's
What :wtf:
#25
lexluthermiester
btarunr: Just Not for Desktops
What a shame! Intel's desktop lineup could really use such a boost.