
Intel Arc A770 Review - Finally a Third Competitor


Architecture


Intel's Xe-HPG "Alchemist" graphics architecture sees its biggest implementation to date with the ACM-G10 silicon powering the A770 and A750. Built on TSMC's 6 nm process, the ACM-G10 measures 406 mm² in die area and packs 21.7 billion transistors. Much like NVIDIA and AMD, Intel has devised its own hierarchy for the number-crunching machinery of its GPUs, and differentiates SKUs by changing the number of indivisible groups of these units to meet performance targets. The ACM-G10 silicon features a PCI-Express 4.0 x16 host interface, a 256-bit wide GDDR6 memory interface, the Xe Media Engine and Xe Display Engine, along with a global dispatch processor and a memory fabric cushioned by L2 cache. The top-level organization of the SIMD machinery is the Render Slice, a self-contained unit with all the number-crunching and raster-graphics hardware a GPU needs.


The ACM-G10 features eight such Render Slices. Each of these contains four blocks of indivisible processing machinery called Xe Cores; four Ray Tracing Units (RT units); and DirectX 12 Ultimate-capable raster-graphics hardware that includes four Samplers, Tessellation geometry processors, 16 ROPs, and 32 TMUs. With eight Render Slices, the silicon physically has 32 RT units, 128 ROPs, and 256 TMUs.
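For clarity, here's the per-slice math as a trivial sanity check; the counts come straight from the paragraph above:

```python
# Back-of-the-envelope totals for the fully enabled ACM-G10,
# derived from the per-Render-Slice counts quoted above.
render_slices = 8
per_slice = {"Xe Cores": 4, "RT units": 4, "ROPs": 16, "TMUs": 32}

for unit, count in per_slice.items():
    print(f"{unit}: {count * render_slices}")
# Xe Cores: 32, RT units: 32, ROPs: 128, TMUs: 256
```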


The Xe Core is the indivisible computation core, with sixteen 256-bit Vector Engines (execution units), sixteen 1024-bit XMX Matrix Engines, and 192 KB of L1 cache. Each Vector Engine has eight each of FP and INT units, in addition to a register file. Two adjacent Vector Engines share a Thread Control unit that distributes execution waves between them. With 16 VEs per Xe Core, 4 Xe Cores per Render Slice, and 8 Render Slices on the ACM-G10 silicon, we have 512 execution units in all, each with 8 FP/INT execution stages, which logically work out to 4,096 unified shaders. The Arc A770 is configured with all 32 Xe Cores, so it gets 4,096 shaders. The A750 gets 28 Xe Cores, or 7 out of 8 Render Slices, or 448 execution units, amounting to 3,584 unified shaders.
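The same arithmetic gives the shader counts for both launch SKUs; this quick sketch just multiplies the figures quoted above:

```python
# Shader-count arithmetic for the two launch SKUs, following the
# hierarchy described above: VEs per Xe Core x FP/INT lanes per VE.
VE_PER_XE_CORE = 16
LANES_PER_VE = 8  # eight FP/INT execution stages per Vector Engine

def shaders(xe_cores: int) -> int:
    return xe_cores * VE_PER_XE_CORE * LANES_PER_VE

print(shaders(32))  # Arc A770: 4096 unified shaders
print(shaders(28))  # Arc A750: 3584 unified shaders
```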


The XMX Matrix Engine is fixed-function matrix-multiplication hardware that can accelerate building and training AI deep-learning neural nets, and it doubles as a highly capable math accelerator. Intel originally designed it for its Xe-HP AI processors, but it finds client applications in Arc Graphics, where it is leveraged for ray-tracing denoising and to accelerate features such as XeSS. There are 16 XMX units per Xe Core, 64 per Render Slice, and 512 across the ACM-G10 silicon. Each XMX unit can handle 128 FP16 or BF16 operations per clock, up to 256 INT8 ops/clock, and 512 INT4 ops/clock. The XMX-optimized native XeSS codepath is an order of magnitude faster than the industry-standard DP4a codepath, as you'll see in our testing.
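To put those per-clock rates in perspective, here's a rough peak-throughput calculation; the 2,100 MHz figure is the A770's rated boost clock, assumed here purely for illustration:

```python
# Peak XMX throughput from the per-unit rates quoted above.
# The 2,100 MHz boost clock is an assumption for this sketch;
# real workloads won't sustain these theoretical peaks.
xmx_units = 512
clock_hz = 2.1e9

for fmt, ops_per_clock in {"FP16/BF16": 128, "INT8": 256, "INT4": 512}.items():
    tops = xmx_units * ops_per_clock * clock_hz / 1e12
    print(f"{fmt}: {tops:.0f} TOPS")
# FP16/BF16: 138, INT8: 275, INT4: 550
```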


When it comes to real-time ray tracing, Intel's Xe-HPG architecture has technological parity with NVIDIA RTX, thanks to its heavy reliance on fixed-function hardware for ray intersection, BVH traversal, and AI-based denoising.


Several optimizations further reduce the burden of ray tracing on the main SIMD machinery, such as shader execution reordering, which sorts shader work threads for streamlined execution among the SIMD units; NVIDIA is only now implementing a comparable feature with its GeForce "Ada" architecture. A special component in each Xe Core reorders the shader threads; it's essentially a very intelligent dispatch unit, as the sketch below illustrates. Intel refers to its ray tracing architecture as asynchronous.
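To illustrate the idea (this is a conceptual sketch, not Intel's actual hardware logic), reordering amounts to bucketing pending hits by the shader they will invoke, so that every SIMD wave runs a single shader coherently instead of a divergent mix:

```python
# Conceptual sketch of shader-thread reordering. WAVE_SIZE is an
# illustrative wave width, not a documented Xe-HPG figure.
from collections import defaultdict

WAVE_SIZE = 16

def build_waves(pending_hits):
    """pending_hits: list of (shader_id, payload) tuples in arrival order."""
    buckets = defaultdict(list)
    for shader_id, payload in pending_hits:
        buckets[shader_id].append(payload)
    waves = []
    for shader_id, payloads in buckets.items():
        for i in range(0, len(payloads), WAVE_SIZE):
            waves.append((shader_id, payloads[i:i + WAVE_SIZE]))
    return waves  # each wave is coherent: one shader, many threads

waves = build_waves([(2, "rayA"), (7, "rayB"), (2, "rayC")])
# -> [(2, ["rayA", "rayC"]), (7, ["rayB"])]
```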


With Moore's Law tapering (despite Intel claiming otherwise), the writing is on the wall: rendering at native resolution is on its way out, at least in the performance and mainstream GPU segments. High-quality super-resolution features such as DLSS and FSR help NVIDIA and AMD shore up performance by rendering games below the display's native resolution and upscaling them intelligently, with minimal quality loss. Intel's take is XeSS. The company bills it as a second-generation super-resolution technology, on par with DLSS 2 and FSR 2.0. XeSS integrates with a game engine's rendering pipeline as easily as TAA (or AMD FSR). The algorithm executes as XMX-optimized AI on Arc GPUs, and as DP4a-compatible code on other GPU brands. It takes into account low-resolution frame data, motion vectors, and temporal data from previously output high-res frames to reconstruct detail, before handing the high-res output back to the game engine for post-processing and the UI/HUD.
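Conceptually, the data flow looks something like the sketch below. This is a bare-bones temporal upscaler for illustration, not the actual XeSS kernel; the nearest-neighbour upsample and the fixed blend factor are our own simplifications:

```python
# Minimal sketch of a temporal super-resolution step: reproject the
# previous high-res output with motion vectors, then blend it with
# an upscaled version of the current low-res frame.
import numpy as np

def upscale_step(low_res, motion_vectors, history, scale=2, alpha=0.1):
    """low_res: (h, w) frame; motion_vectors: (H, W, 2) in output-space
    pixels; history: (H, W) previous high-res output, H = h * scale."""
    H, W = history.shape
    # Naive nearest-neighbour upsample of the current frame.
    current = np.repeat(np.repeat(low_res, scale, axis=0), scale, axis=1)
    # Reproject history: fetch each output pixel from where it was last frame.
    ys, xs = np.indices((H, W))
    src_y = np.clip(ys - motion_vectors[..., 1].astype(int), 0, H - 1)
    src_x = np.clip(xs - motion_vectors[..., 0].astype(int), 0, W - 1)
    reprojected = history[src_y, src_x]
    # Exponential blend: mostly history, refreshed by the new samples.
    return alpha * current + (1 - alpha) * reprojected
```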


The Xe Display Engine is capable of up to two display outputs at 8K 60 Hz + HDR; up to four outputs at 4K 120 Hz + HDR; and up to four 1440p or 1080p outputs at 360 Hz + HDR. VESA Adaptive Sync and Intel Smooth Sync are supported; more on the latter below. A typical Arc desktop graphics card has two each of DisplayPort 2.0 and HDMI 2.0b connections. The Xe Media Engine provides hardware-accelerated decoding and encoding of AV1, and accelerated decoding of H.265 HEVC and H.264 AVC.


Besides VESA Adaptive Sync, the Xe Display Engine offers a software feature called Smooth Sync. It lets fixed refresh-rate monitors play games without V-Sync, rendering at the highest possible FPS (and the lowest input latency), while attempting to eliminate screen tearing with a shader-based dithering filter pass. It's a remarkably simple way to attack the problem, and we're surprised AMD and NVIDIA haven't tried it.
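Intel hasn't published the exact filter, but the idea can be sketched as follows: near the tear line, pixels are stochastically mixed from the old and new frames so the hard seam dissolves into dither noise. Everything here (band width, probability falloff) is our own guess:

```python
# Conceptual illustration of a Smooth Sync-style tear filter, not
# Intel's published algorithm: dither-blend the two frames around
# the row where scanout switched from the old frame to the new one.
import numpy as np

def soften_tear(old_frame, new_frame, tear_row, band=32, seed=0):
    """Both frames are (H, W) arrays; tear_row is where the tear sits."""
    rng = np.random.default_rng(seed)
    out = np.vstack([old_frame[:tear_row], new_frame[tear_row:]])
    lo, hi = max(tear_row - band, 0), min(tear_row + band, out.shape[0])
    for row in range(lo, hi):
        # Probability of sampling the "other" frame falls off with distance.
        p = 0.5 * (1 - abs(row - tear_row) / band)
        mask = rng.random(out.shape[1]) < p
        other = new_frame if row < tear_row else old_frame
        out[row, mask] = other[row, mask]
    return out
```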