Thursday, September 1st 2022
Intel Details its Ray Tracing Architecture, Posts RT Performance Numbers
Intel on Thursday posted an article that dives deep into the ray tracing architecture of its Arc "Alchemist" GPUs, which is particularly relevant for performance-segment parts such as the Arc A770, which competes with the NVIDIA GeForce RTX 3060. In the article, Intel posted ray tracing performance numbers that put the A770 at par with, or faster than, the RTX 3060, with which it has traditional raster performance parity. In theory, this would make Intel's ray tracing tech superior to that of AMD RDNA2: while the AMD chips achieve raster performance parity, their ray tracing performance does not tend to be at par with NVIDIA parts at a price-segment level.
The Arc "Alchemist" GPUs meet the DirectX 12 Ultimate feature-set, and its ray tracing engine supports DXR 1.0, DXR 1.1, and Vulkan RT APIs. The Xe Core is the indivisible subunit of the GPU, and packs its main number-crunching machinery. Each Xe Core features a Thread Sorting Unit (TSU), and a Ray Tracing Unit (RTU). The TSU is responsible for scheduling work among the Xe Core and RTU, and is the core of Intel's "secret sauce." Each RTU has two ray traversal pipelines (fixed function hardware tasked with calculating ray intersections with intersections/BVH. The RTU can calculate 12 box intersections per cycle, 1 triangle intersection per cycle, and features a dedicated cache for BVH data.The TSU, as we said, is the secret sauce of Intel's ray tracing performance. It's key to achieving what Intel calls "Asynchronous Ray Tracing." The TSU organizes ray tracing instructions and data such that rays with similar hit shaders are optimally allocated unified shader resources of the Xe cores, for the best possible allocation of hardware resources. The slide above details the ray tracing pipeline, where the TSU is shown playing a big role in optimizing things for the hit-shader execution stage.Intel posted performance numbers for the Arc A770 at 1080p, compared with the RTX 3060 at the same resolution, across a selection of 17 games. These include Ghostwire Tokyo, which was earlier found to be extremely sub-optimal on the "Alchemist" architecture, but has since been optimized for in the latest beta drivers. The A770 trades blows with the RTX 3060, even if there are a few cases where the NVIDIA chip is slightly ahead. This is certainly a better showing when compared to a Radeon RX 6650 XT pitted against the RTX 3060 in ray tracing.The A770 isn't meant for 1440p + Ray Tracing (nor is the RTX 3060), but performance enhancements like the XeSS and DLSS make both possible. While Intel didn't compare the A770+XeSS to RTX 3060+DLSS at 1440p, it posted a slide about how XeSS makes gaming with ray tracing more than playable at 1440p, across both its "balanced" and "performance" presets.
Below is the video presentation from Intel:
The Arc "Alchemist" GPUs meet the DirectX 12 Ultimate feature-set, and its ray tracing engine supports DXR 1.0, DXR 1.1, and Vulkan RT APIs. The Xe Core is the indivisible subunit of the GPU, and packs its main number-crunching machinery. Each Xe Core features a Thread Sorting Unit (TSU), and a Ray Tracing Unit (RTU). The TSU is responsible for scheduling work among the Xe Core and RTU, and is the core of Intel's "secret sauce." Each RTU has two ray traversal pipelines (fixed function hardware tasked with calculating ray intersections with intersections/BVH. The RTU can calculate 12 box intersections per cycle, 1 triangle intersection per cycle, and features a dedicated cache for BVH data.The TSU, as we said, is the secret sauce of Intel's ray tracing performance. It's key to achieving what Intel calls "Asynchronous Ray Tracing." The TSU organizes ray tracing instructions and data such that rays with similar hit shaders are optimally allocated unified shader resources of the Xe cores, for the best possible allocation of hardware resources. The slide above details the ray tracing pipeline, where the TSU is shown playing a big role in optimizing things for the hit-shader execution stage.Intel posted performance numbers for the Arc A770 at 1080p, compared with the RTX 3060 at the same resolution, across a selection of 17 games. These include Ghostwire Tokyo, which was earlier found to be extremely sub-optimal on the "Alchemist" architecture, but has since been optimized for in the latest beta drivers. The A770 trades blows with the RTX 3060, even if there are a few cases where the NVIDIA chip is slightly ahead. This is certainly a better showing when compared to a Radeon RX 6650 XT pitted against the RTX 3060 in ray tracing.The A770 isn't meant for 1440p + Ray Tracing (nor is the RTX 3060), but performance enhancements like the XeSS and DLSS make both possible. While Intel didn't compare the A770+XeSS to RTX 3060+DLSS at 1440p, it posted a slide about how XeSS makes gaming with ray tracing more than playable at 1440p, across both its "balanced" and "performance" presets.
Below is the video presentation from Intel:
30 Comments on Intel Details its Ray Tracing Architecture, Posts RT Performance Numbers
That looks promising for their first GPUs.
But it's now the second half of 2022, and the cards still exist only in the form of vaporized thin air.
Even if the A770 is 20% faster than the A750 (it won't be), that puts the A750 at worst at -5% vs the RTX 3060 with ray tracing enabled (taking Intel's roughly 14% average lead for the A770 as the baseline, 1.14 / 1.20 ≈ 0.95), or dead even if the A770/A750 difference is 14% (1.14 / 1.14 = 1.00).
In synthetic tests, Intel claims the difference is a lot bigger, and as shader complexity goes up, so does the difference.
Regarding theoretical throughput, which means absolutely nothing on its own (the TSU and the dedicated BVH cache are the secret sauce): per clock, the A770 has the same ray-triangle peak as the RX 6650 XT and 3X the ray-box peak. (RDNA2 does 1 ray-triangle intersection and 4 ray-box intersections per clock per CU.)
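As a quick sanity check of that 3X/parity claim, here's the arithmetic, assuming 32 RTUs on the A770 (one per Xe Core) and 32 CUs with one Ray Accelerator each on the RX 6650 XT:

```python
# Back-of-the-envelope totals for the per-clock peaks quoted above.
# Unit counts assumed: 32 RTUs (A770), 32 CUs (RX 6650 XT).
a770_rtus = 32
rx6650xt_cus = 32

intel_box, intel_tri = 12, 1   # intersections per RTU per clock (Intel's figures)
rdna2_box, rdna2_tri = 4, 1    # intersections per CU per clock (RDNA2)

print("ray-box per clock:", a770_rtus * intel_box, "vs", rx6650xt_cus * rdna2_box)  # 384 vs 128 -> 3x
print("ray-tri per clock:", a770_rtus * intel_tri, "vs", rx6650xt_cus * rdna2_tri)  # 32 vs 32 -> parity
```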
We aren't far from 2023, and AMD and NVIDIA will soon reveal newer GPUs. Intel needs to do PR for the current generation while it can... Intel probably calls this a win-win scenario, while in reality it's a fail...
Intel GPUs would have had value one or two years ago, when you couldn't find a GPU.
Now the second-hand market is full of powerhouses.
When you have just ray-traced shadows or reflections, you still see baked or incorrect lighting. GI is the most important thing in graphics since the GeForce 3 Ti 200's pixel shading.
AMD has to bring a Ryzen-like GPU to have success. FSR 2 matches DLSS 99% of the way; I'm OK with that.
But the RT performance is unacceptable.
RT is the future and it is super important. We can probably compare it with the 16-bit vs 32-bit color battle 20 years ago. I bet that on the 15" CRT monitors we had back then, there were people who couldn't notice serious graphics differences in the games we were playing, not enough to justify the whole fuss about 32-bit color, especially considering the performance loss.
[Benchmark charts: "16-bit vs 32-bit Performance" for the ATI Radeon 32MB SDR and the 3dfx Voodoo4 4500 AGP]
The company that had the worst performance in 32-bit color didn't survive.
Heck, IMO games have not progressed much at all visually for the last five or so years.
Every brand-new game I have seen looks extremely underwhelming visually, if not downright old: Dying Light 2, the new Saints Row...
But we are all taken aback when something shows up that does move the needle: Far Cry, Doom 3, FEAR, Crysis, just to name a few.
RT is most definitely a feature you see and notice. Screen-space reflections, for example, are glaring in their shortcomings: you thought the lack of AA was distracting with its jittery lines? Try half the image changing drastically just by looking up or down... and yet we see SSR used a lot.
A friend streamed, I think it was "Medium", for me, and that had cubemap reflections, which are still there when looking up and down. If that were just the standard, then indeed RT reflections would be less impressive there (probably; I don't really know the shortcomings of cubemap reflections).
But there's also bounced light brightening dark areas more realistically; it's definitely something rather new in games that you do notice. Lighting up a red wall and seeing that red reflected onto other objects: you can't really miss that.
In The Last of Us PS5 remake they fake this effect, but only in some areas. If it could (and would) be faked always and everywhere, then I'm sure RT again would not be that impressive, but there is a reason they (or anybody, for that matter) don't do that.
You really don't need side-by-side comparisons to notice RT, though there are things that can fake it quite well.
And if you really don't notice, well, that is fine as well; you don't need to play on the highest settings then. Turn that stuff down and enjoy high fps... unless you don't notice that either, of course.
The current hybrid RT solutions actually offer the best of both rasterization and RT, unless you like to play simple-looking path-traced games like Quake 2 RTX or Minecraft RTX.
And if my memory serves me right, the Voodoo 4's 32-bit color performance was probably at the bottom of the barrel of reasons why 3dfx sold most of its assets to NVIDIA in late 2000. They didn't deliver (sadly) on many fronts and were behind NVIDIA in implementing new features (though they had better picture quality in some of the features both implemented).