Thursday, August 19th 2021
Intel's DLSS-rivaling AI-accelerated Supersampling Tech is Named XeSS, Doubles 4K Performance
Intel plans to go full tilt with gaming graphics, with its newly announced Arc line of graphics processors designed for high-performance gaming. The top Arc "Alchemist" part meets all requirements for the DirectX 12 Ultimate logo, including real-time raytracing. During the Arc reveal earlier this week, the company also said that it is working on an AI-accelerated supersampling technology, which it calls XeSS (Xe SuperSampling). It likely went with "Xe" in the name because it possibly plans to extend the technology even to its Xe LP-based iGPUs and the entry-level Iris Xe MAX discrete GPU.
Intel claims that XeSS cuts down 4K frame render-times by half. By all accounts, 1440p appears to be the target use case of the top Arc "Alchemist" SKU. XeSS would make 4K possible (i.e., display resolution set at 4K, rendering at a lower resolution, with AI-accelerated supersampling restoring detail). The company revealed that XeSS will use a neural network-based temporal upscaling technology that incorporates motion vectors. In the rendering pipeline, XeSS sits before most post-processing stages, similar to AMD FSR.
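For context, here is a minimal sketch of the frame ordering that description implies. Every function below is a hypothetical stub used only to show the sequence (the resolutions echo the 1440p/4K figures above); none of these are real engine or XeSS API calls.

```cpp
#include <cstdio>

// Purely illustrative stubs showing the frame ordering described above.
void render_scene_low_res()  { std::puts("1. render scene (incl. raytracing) at e.g. 1440p"); }
void write_motion_vectors()  { std::puts("2. write per-pixel motion vectors"); }
void ai_temporal_upscale()   { std::puts("3. XeSS/FSR-style upscale to the 4K output"); }
void post_processing()       { std::puts("4. post-processing (tone map, bloom, grain) at 4K"); }
void draw_ui()               { std::puts("5. UI composited last, at native resolution"); }

int main()
{
    render_scene_low_res();
    write_motion_vectors();
    ai_temporal_upscale();   // sits before most post-processing, per the article
    post_processing();
    draw_ui();
    return 0;
}
```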
While AMD's FSR technology is purely shader-based, the Intel algorithm can use either XMX hardware units (new in Intel Xe HPG) or DP4a instructions (available on nearly all modern AMD and NVIDIA GPUs). XMX stands for Xe Matrix Extensions and is essentially Intel's counterpart to NVIDIA's Tensor Cores, accelerating the matrix math used in many AI workloads. The Intel XeSS SDK targeting XMX hardware will be available this month in open source; the DP4a version will follow "later this year".
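For readers unfamiliar with DP4a: it is a packed dot-product operation that multiplies four pairs of 8-bit integers and accumulates the result into a 32-bit integer, the kind of arithmetic neural-network inference leans on heavily. Below is a minimal scalar reference sketch, assuming signed 8-bit lanes; the function name is illustrative and not part of any Intel API.

```cpp
#include <cstdint>
#include <cstdio>

// Scalar reference of what a DP4a-style instruction computes.
// On GPUs this is a single instruction; this sketch only shows the math.
int32_t dp4a_reference(uint32_t a_packed, uint32_t b_packed, int32_t acc)
{
    for (int i = 0; i < 4; ++i) {
        // Unpack one signed 8-bit lane from each 32-bit operand.
        int8_t a = static_cast<int8_t>((a_packed >> (8 * i)) & 0xFF);
        int8_t b = static_cast<int8_t>((b_packed >> (8 * i)) & 0xFF);
        acc += static_cast<int32_t>(a) * static_cast<int32_t>(b);
    }
    return acc; // four INT8 multiplies folded into one INT32 accumulate
}

int main()
{
    // Lanes of a: 1,2,3,4; lanes of b: 5,6,7,8 -> 1*5 + 2*6 + 3*7 + 4*8 = 70
    std::printf("%d\n", dp4a_reference(0x04030201u, 0x08070605u, 0));
    return 0;
}
```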
Source: VideoCardz
46 Comments on Intel's DLSS-rivaling AI-accelerated Supersampling Tech is Named XeSS, Doubles 4K Performance
Can't believe I have to explain this. It's not "less proprietary", it's completely open. This isn't about money, it's about time and effort. DLSS takes a lot of time to implement; I have some insight into the game development world and I can tell you that developers have to work for months to get DLSS even remotely close to working properly, and it's always a side project because it's just not that important to them. FSR, on the other hand, is pretty much a couple of days' worth of work. If it weren't for Nvidia's sponsorship campaigns, I bet most studios wouldn't even think about using it.
2) We can already inject FSR using ReShade. There's no reason anyone can't inject FSR; it's very easy and simple to do, since it's just a shader. RDNA2 is the first AMD architecture with INT32 support.
All GPUs have support for INT32, otherwise they wouldn't be able to work at all, and RDNA1 had support for mixed INT32 execution.
Look at PhysX. Nuff said. Where it was implemented, it worked admirably. But it was never everywhere, and it was still at odds with other physics engines even if they didn't work as well - devs won't be happy to support just half the market with a different experience.
The end result is ALWAYS that it's effectively just being used for marketing, appearing in high-profile eye catchers. Look at DLSS's support history for perfect proof of that. And RTX is more of the same, but with an even higher dev investment.
Nothing is free, and time to market is money too. The bill can't ever get paid in full, and believing it will is just setting yourself up for another disappointment.
software.intel.com/content/www/us/en/develop/blogs/conservative-morphological-anti-aliasing-cmaa.html
They say it has nothing to do with MLAA, but who knows?
This is the only option you have in the control panel to insert full-scene post-processing (so, if this is the level of support you can expect on Xe, you'd better get used to bone-dry game override options, and maybe a handful of "only invented here" techs).
And stop showing that DLSS 2 (the TAA derivative) stuff getting good results on scenes that barely move. Amazing post.
You should stop watching videos by the "3080 is 2 times faster than 2080" and "8K gaming with 3090" (totally not shills) folks. Uh, I suspect you have missed the two elephants in the room:
1) Effort matters. Is it low? Well, heck, devs can slap a bunch of various implementations in, no prob.
2) There is NO competition between "some crap that runs only on one manufacturer's HW" and stuff that runs on everything. Like at all. The former is in survival "do I even still make sense to exist" mode. The latter can be inferior, but will still do fine; all it needs is to be better than standard upscaling solutions. (Effort does still matter, though, but with FSR we have seen "hard to distinguish from true 4K" and super low effort to implement too, so, hell, good luck with that.)
Finally something interesting in the PC world after years of "slightly better numbers and more blinking lights" from the same companies over and over again.
That also leads me to think that the AI portion in DLSS isn't that much AI. There is simply not enough time to do real AI processing there. And cut it with the pre-trained crap; they probably just figured out what the best algorithm would be. They probably use the tensor cores and do some calculations in INT8 or another AI data format, but that doesn't mean much.
AI seems to be the new nanotechnology of years past, where everything you stamp with the word gets trending and gets funding. Don't get me wrong, there was good stuff coming out of it, but a lot of things were just not that at all.
To me, a real deep-learning upsampler would be able to completely eliminate those ghosting artefacts. It could also lean less on the temporal data and more on the AI portion to resolve details. But that takes a lot of compute power and you probably can't run it in real time. So what we have is a glorified proprietary temporal upsampler.
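For what it's worth, the "temporal" part being argued about here boils down to reprojecting last frame's result along motion vectors and blending it with the new sample. A toy-scale sketch with a made-up blend weight and no history rejection (real temporal upscalers like DLSS 2 or XeSS add sub-pixel jitter, rejection heuristics and, in the AI variants, a network deciding how to blend):

```cpp
#include <array>
#include <cstdio>

// Toy-scale sketch of the "temporal" part only: reproject last frame's colour
// along a motion vector and blend it with the new low-res sample.
// Resolution, blend weight and names are invented for illustration.
constexpr int W = 8, H = 8;
using Image = std::array<float, W * H>;

float accumulate_pixel(const Image& history, float new_sample,
                       int x, int y, int mvx, int mvy, float blend = 0.9f)
{
    // Follow the motion vector back to where this pixel came from last frame.
    int px = x - mvx, py = y - mvy;
    if (px < 0 || px >= W || py < 0 || py >= H)
        return new_sample;                       // no usable history off-screen
    float reprojected = history[py * W + px];
    // Exponential blend: mostly history, refreshed by the new sample.
    return blend * reprojected + (1.0f - blend) * new_sample;
}

int main()
{
    Image history{};
    history.fill(0.5f);                          // pretend last frame was mid-grey
    float out = accumulate_pixel(history, 1.0f, 4, 4, 1, 0);
    std::printf("%.2f\n", out);                  // 0.9*0.5 + 0.1*1.0 = 0.55
    return 0;
}
```

Ghosting is what you get when that blend keeps trusting history that no longer matches the scene.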
But also, don't get me wrong, the shimmering that FSR can bring is as annoying as temporal upsampling's artefacts, if not more so.