Tuesday, October 22nd 2024

Die-Shots of Intel Core Ultra "Arrow Lake-S" Surface, Thanks to ASUS

As Intel's Core Ultra "Arrow Lake-S" desktop processors near their launch, ASUS China put out a video presentation about its Z890 chipset motherboards ready for these processors, which included a technical run-down of Intel's first tile-based desktop processor, complete with detailed die-shots of the various tiles. Producing such shots requires not just de-lidding the processor (removing the integrated heat-spreader), but also stripping away the top layers of the die to reveal the various components underneath.

The whole-chip die-shot gives us a bird's eye view of the four key logic tiles (Compute, Graphics, SoC, and I/O) sitting on top of the Foveros base tile. Our article from earlier this week goes into the die areas of the individual tiles, and the base tile. The Compute tile is built on the most advanced foundry node among the four tiles, the 3 nm TSMC N3B. Unlike in the older generation "Raptor Lake-S" and "Alder Lake-S," the P-cores and E-core clusters aren't clumped at the two ends of the CPU complex. In "Arrow Lake-S," they follow a staggered layout, with a row of P-cores, followed by a row of E-core clusters, followed by two rows of P-cores, and then another row of E-core clusters, before the final row of P-cores, to achieve the total core-count of 8P+16E. This arrangement reduces the concentration of heat when the P-cores are loaded (e.g., when gaming), and ensures each E-core cluster is just one ringbus stop away from a P-core, which should improve thread-migration latencies. The central region of the tile has this ringbus, and 36 MB of L3 cache shared among the P-cores and E-core clusters.
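
The row order described above can be checked with a short sketch. This is only an illustrative model, not Intel's floorplan data: the two-units-per-row and four-E-cores-per-cluster figures, the "adjacent row" stand-in for ringbus distance, and the clumped comparison layout are all assumptions made here for the arithmetic.

```python
# Illustrative model of the staggered row order described in the article.
# Assumptions (not from the die-shot): two units per row, four E-cores per
# cluster, and "adjacent row" used as a stand-in for "one ringbus stop away".
STAGGERED = ["P", "E", "P", "P", "E", "P"]
CLUMPED = ["P", "P", "P", "P", "E", "E"]   # hypothetical older-style layout

UNITS_PER_ROW = 2
E_CORES_PER_CLUSTER = 4


def core_counts(rows):
    """Return (P-core count, E-core count) for a list of row types."""
    p_cores = sum(UNITS_PER_ROW for kind in rows if kind == "P")
    e_cores = sum(UNITS_PER_ROW * E_CORES_PER_CLUSTER
                  for kind in rows if kind == "E")
    return p_cores, e_cores


def every_e_row_touches_p(rows):
    """True if each E row has a P row directly before or after it."""
    for i, kind in enumerate(rows):
        if kind != "E":
            continue
        neighbours = rows[max(i - 1, 0):i] + rows[i + 1:i + 2]
        if "P" not in neighbours:
            return False
    return True


def longest_p_run(rows):
    """Length of the longest block of consecutive P rows (a rough heat proxy)."""
    longest = current = 0
    for kind in rows:
        current = current + 1 if kind == "P" else 0
        longest = max(longest, current)
    return longest


if __name__ == "__main__":
    print(core_counts(STAGGERED))            # P and E totals
    print(every_e_row_touches_p(STAGGERED))  # adjacency check
    print(longest_p_run(STAGGERED), longest_p_run(CLUMPED))
```

Under these assumptions the staggered order yields the 8P+16E total, never places more than two P rows back to back, and keeps every E row next to a P row, whereas the clumped comparison stacks four P rows together.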

Next up is the SoC tile. This chiplet is built on the 6 nm DUV TSMC N6 node. Both edges of the tile have PHYs for the various I/O interfaces. One side has the dual-channel DDR5 PHY, while the other has a portion of the chip's PCI-Express PHY. The SoC tile puts out 16 PCIe Gen 5 lanes meant for the PEG interface (the x16 slot on your motherboard). The I/O tile puts out four PCIe Gen 5 lanes, and four PCIe Gen 4 lanes, besides the DMI 4.0 x8 chipset bus. The Gen 4 x4 link from the I/O tile can be reconfigured as Thunderbolt 4 or USB4. The SoC tile also contains the NPU 3 unit, which appears to be carried over from the SoC tile of "Meteor Lake." It has a peak throughput of 13 AI TOPS. The SoC tile also contains the chip's platform security processors, and a few allied components of the iGPU, namely the display engine, the media accelerators, and the display I/O.
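
For a sense of scale, the lane budget above can be turned into theoretical link bandwidth. The sketch below uses the standard PCIe signalling rates (16 GT/s for Gen 4, 32 GT/s for Gen 5) and 128b/130b line encoding; the link names are labels made up for this example, the figures are per direction, they ignore protocol overhead, and the DMI link is left out.

```python
# Rough per-direction bandwidth for the CPU-attached PCIe links listed above.
# Link names are illustrative labels; protocol overhead is ignored.
GT_PER_S = {4: 16.0, 5: 32.0}   # raw transfer rate per lane, by PCIe generation
ENCODING = 128 / 130            # 128b/130b line encoding used by Gen 3 and later

LINKS = [
    ("SoC tile PEG x16", 5, 16),
    ("I/O tile x4", 5, 4),
    ("I/O tile x4 (Thunderbolt 4 / USB4 capable)", 4, 4),
]


def lane_gbytes_per_s(gen):
    """Theoretical GB/s for one lane, one direction."""
    return GT_PER_S[gen] * ENCODING / 8


def link_gbytes_per_s(gen, lanes):
    """Theoretical GB/s for a whole link, one direction."""
    return lane_gbytes_per_s(gen) * lanes


def total_lanes(links=LINKS):
    """Total CPU-attached PCIe lanes, excluding the DMI chipset bus."""
    return sum(lanes for _, _, lanes in links)


if __name__ == "__main__":
    for name, gen, lanes in LINKS:
        print(f"{name}: Gen {gen} x{lanes} ~ {link_gbytes_per_s(gen, lanes):.1f} GB/s")
    print(f"Total lanes: {total_lanes()}")
```

By this estimate the x16 PEG link tops out at roughly 63 GB/s per direction, and the processor exposes 24 PCIe lanes in total before counting the chipset bus.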

Lastly, there's the Graphics tile. Intel built this on the fairly advanced 5 nm TSMC N5 process (the same one that current NVIDIA Ada and AMD RDNA 3 GPUs are built on). This slender tile contains only the four Xe cores available to this iGPU variant, and the graphics rendering machinery.

The filler tiles appear like voids under the microscope.
Sources: ASUS China (Bilibili), Videocardz

9 Comments on Die-Shots of Intel Core Ultra "Arrow Lake-S" Surface, Thanks to ASUS

#1
Wirko
btarunr: "each E-core cluster is just one ringbus stop away from a P-core, which should improve thread-migration latencies"
How often do those thread migrations occur anyway? And when they do, the contents of L2 have to be discarded (simple but adds latency) or transferred to the other core/cluster L2 (complicated and adds latency). Both possibilities seem very suboptimal.
#5
_roman_
"which included a technical run-down of Intel's first tile-based desktop processor"
So Intel is also gluing together their equipment now like AMD did?
#6
btarunr
Editor & Senior Moderator
Wirko: "How often do those thread migrations occur anyway? And when they do, the contents of L2 have to be discarded (simple but adds latency) or transferred to the other core/cluster L2 (complicated and adds latency). Both possibilities seem very suboptimal."
The L3 cache doesn't have a single ringbus stop, but several. To the software, this is one large piece of cache memory, but in hardware, Intel can write L2 victims to the slices of the L3 cache with the fewest ringbus stops to the intended core-type if a thread is migrating between two core types.
#7
RUSerious
Daven: "My intuition tells me that Intel is slowly migrating away from P cores to an E core only design after 1-2 more generations. Power usage will be out of control until then."
IMHO, Intel cannot afford to give up that much performance. There are important apps that need that speed.
#8
trparky
Daven: "My intuition tells me that Intel is slowly migrating away from P cores to an E core only design after 1-2 more generations. Power usage will be out of control until then."
Actually, some reports indicate that somewhere around 2027 E-cores will be gone, replaced by a more unified core architecture similar to that of AMD's Zen and Zen c cores.
#9
Wirko
trparky: "Actually, some reports indicate that somewhere around 2027 E-cores will be gone, replaced by a more unified core architecture similar to that of AMD's Zen and Zen c cores."
The E-core performance seems to be approaching P-core performance - we'll soon see, and if true then what you said is logical to expect. If necessary, many variations are possible within just one type of core. Less L2, slower AVX-512 or AVX10, smaller transistors for lower clock - all of these make sense without deep architectural changes. Use of the same core type both as independent cores and in clusters? Maybe.

Intel now has LPE cores too. Those are much smaller, meaning that they will be developed independently.