Monday, March 16th 2020

Complete Hardware Specs Sheet of Xbox Series X Revealed
Microsoft just put out the complete hardware spec sheet of its next-generation Xbox Series X entertainment system. The list of hardware can go toe to toe with any modern gaming desktop, and even at Microsoft's production scale we're not sure the company can break even at around $500; it is possibly counting on game and DLC sales to recover some of the costs and turn a profit. To begin with the semi-custom SoC at the heart of the beast: Microsoft partnered with AMD to deploy its current-generation "Zen 2" x86-64 CPU cores. Microsoft confirmed that the SoC will be built on the 7 nm "enhanced" process (very likely TSMC N7P). Its die size is 360.45 mm².
The chip packs 8 "Zen 2" cores, with SMT enabling 16 logical processors, a humongous step up from the 8-core "Jaguar enhanced" CPU driving the Xbox One X. CPU clock speeds are somewhat vague: the spec sheet points to 3.80 GHz nominal and 3.66 GHz with SMT enabled, so perhaps the console can toggle SMT somehow (possibly depending on whether a game requests it). There's no word on the CPU's cache sizes.

The graphics processor is another key component of the SoC, given its lofty design goal of gaming at 4K UHD with real-time ray-tracing. This GPU is based on AMD's upcoming RDNA2 graphics architecture, a step up from "Navi" (RDNA) in featuring real-time ray-tracing hardware optimized for DXR 1.1 and support for variable-rate shading (VRS). The GPU features 52 compute units (3,328 stream processors, provided each CU packs 64 stream processors in RDNA2). The GPU ticks at an engine clock speed of up to 1825 MHz, and has a peak compute throughput of 12 TFLOPs (not counting the CPU). The display engine supports resolutions of up to 8K, even though the console's own performance target is 4K at 60 frames per second, with support for up to 120 FPS. Variable refresh-rate is supported.
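For the curious, the quoted throughput figure checks out with simple arithmetic, assuming 64 stream processors per CU and two FLOPs (one fused multiply-add) per stream processor per clock - a back-of-the-envelope sketch, not an official formula:

def peak_fp32_tflops(compute_units, clock_ghz, sp_per_cu=64, flops_per_sp=2):
    """Peak single-precision throughput in TFLOPS."""
    return compute_units * sp_per_cu * flops_per_sp * clock_ghz / 1000

# Xbox Series X GPU: 52 CUs at up to 1.825 GHz
print(52 * 64)                                # 3328 stream processors
print(round(peak_fp32_tflops(52, 1.825), 2))  # 12.15 TFLOPS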
The memory subsystem is similar to what we reported earlier today - a 320-bit GDDR6 memory interface holding 16 GB of memory (mixed chip densities). It's becoming clear that Microsoft isn't implementing a hUMA common memory pool approach: 10 GB of the 16 GB runs at 560 GB/s of bandwidth, while the other 6 GB runs at 336 GB/s. Storage is another area receiving big hardware uplifts: the Xbox Series X features a 1 TB NVMe SSD with a 2400 MB/s peak sequential transfer rate, plus an option for an additional 1 TB of NVMe storage through an expansion module. External storage devices are supported, too, over 10 Gbps USB 3.2 Gen 2. The console is confirmed to feature a Blu-ray drive that supports 4K UHD Blu-ray playback. All these hardware specs combine toward what Microsoft calls the "Xbox Velocity Architecture." Microsoft is also working toward improving the input latency of its game controllers.
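The two bandwidth figures also fall out of simple arithmetic if one assumes 14 Gbps GDDR6 across ten 32-bit chips (which is what the quoted numbers imply), with the 10 GB pool striped across all ten chips and the 6 GB pool across only six of them - a sketch of the likely layout, not a confirmed memory map:

DATA_RATE_GBPS = 14    # per-pin data rate in Gbit/s (assumed, implied by the figures)
CHIP_BUS_WIDTH = 32    # bits per GDDR6 chip; ten chips make up the 320-bit bus

def pool_bandwidth(chips):
    """Aggregate bandwidth in GB/s for a pool striped across a number of chips."""
    return chips * CHIP_BUS_WIDTH * DATA_RATE_GBPS / 8

print(pool_bandwidth(10))   # fast 10 GB pool  -> 560.0 GB/s
print(pool_bandwidth(6))    # slower 6 GB pool -> 336.0 GB/s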
128 Comments on Complete Hardware Specs Sheet of Xbox Series X Revealed
So we should compare it to custom SFF PCs. And guess what: AsRock DeskMini GTX/RX is less than half the size. :)
Don't get me wrong. The Xbox looks very promising. If they called it "Surface Desktop" and shipped it with Windows, this might have been the first time since Diablo 2 that I preordered anything.
And I may still buy it for gaming when prices go down.
But hardware-wise... this is fantastic, and it means a serious boost is on its way for PC graphics too. A new mainstream norm is what they're obviously shooting for, and that norm has 4K in it. I'm not complaining. The baby steps are finally over, it's about god damn time. After all, on the PC, resolution is just one of the many choices to spend resources on.
RDNA2 though. Shit. 2.23 GHz on the PS5, and here we have a wide GPU doing 1.8. That is good, and it means these things finally boost and clock properly, capable of covering a wide range. It will be very interesting to see what Nvidia is going to pull out of the hat now, and it's clear they need to make a big dent, even despite the Turing headstart. I mean, I'm still not really counting the ridiculous product called the 2080 Ti as a viable thing, and considering that, they've got work to do. Very glad to see AMD return to the proper high end and not trail a gen or two behind. A lot of random thoughts... it's all going to hammer and do this and that, but you really don't know or can't say. And neither do we :)
The spec war however is just not interesting. When in doubt, watch the relevant South Park episode. What really matters is what the majority will do, and that focuses exclusively on the content that the majority can play.
Also, these consoles will hopefully provoke an explosion of RTRT games - something Nvidia is already prepared for, while AMD and Intel have merely mentioned working on it.
So yeah... not much changes in the balance of power.
But we're likely looking at a huge jump in game requirements for games ported from consoles - including the possibility of games that won't run (or won't be playable) without RTRT acceleration...
As for the Xbox - I'm really interested if the non-gaming features will be developed further. If yes, this could be an easy buy.
The PS5 GPU is shaping up to be a lot like the 5700XT.
Memory bandwidth is very important at 4K60.
It'll be interesting either way to see what shakes out between these two. NAND reliability drops like a rock if it runs too hot. Not to mention the type of NAND; there are also other components related to the storage that can very easily die.
If the drive on the board dies, will Sony let the PS5 boot off the USB drive? MS might let you boot off the second drive but I really doubt it.
I guess I am just not a fan of an Apple approach to hardware. If the SSD fails, new MacBook for you! Or at least a motherboard.
Forgive me while I still have my Xbox, 360, PS2, SNES, etc... Sorry if I insult your sensibilities when I buy a console not to have it puke its guts out in 5 to 7 years.
Consumerism is a shitty excuse to drive corporate profits.
I really hope right to repair sneaks past the big money trying to stop it.
MS also showed the GoW 5 in-game benchmark in a build "updated" to run on XSX by a single engineer over two weeks (i.e. not optimized whatsoever) matching the performance of the same game with the same settings (PC Ultra IIRC) on a PC with a 2080. Which is a 215-225W GPU. The GoW 5 PC port is also generally regarded as being a very good port that performs very well and scales well with powerful PC hardware.
Beyond that, saying AMD "merely mentioned working on" RT - days after two consoles based on their hardware were announced with full RT support, performant enough to run an early/unoptimized build of fully path-traced Minecraft - is a serious stretch, no matter that their PC GPUs have yet to be announced. While the details of AMD's implementation are scarce, we know it is bound to shader count and will thus scale up with bigger GPUs too.
I still expect Nvidia to have a small performance advantage with Ampere, but it doesn't look likely to be anything more than small. And given the proliferation of new features in the new consoles based on AMD hardware, it's not all that likely that Nvidia will maintain an advantage in the use or performance of new features, simply because the dominant implementation and development model will be AMD-based (even if this ultimately falls back to open APIs like DXR).
As for non-gaming uses, I don't think that will happen. Consoles are not meant for general-purpose use, and part of why MS lost the previous generation so badly was that it focused too much on other features rather than gaming. They're obviously not making that mistake this time around.
We see this "is gonna" discussion with every new CPU release and every new GPU release... and when you go back and read the "is gonna" predictions, they never quite live up to the early billing. Save the enthusiasm for post-release testing.
An RT core is a specialized compute unit optimized for certain workload types, and "shaders" are specialized compute units optimized for raster graphics workloads.
RDNA 2 and Turing effectively return to DX9-style non-unified shader compute units. Back in 2013, AMD provided multiple SKU levels up to high-end GPUs for the PC market while also supplying GPUs for MS and Sony. Bulldozer-era financial problems and a focus on TFLOPS-biased server GPUs, without scaling the raster hardware, caused a brain drain at AMD's RTG. Reminder for RTG: GPUs are not DSPs. RDNA 2 with 36 CUs at 2230 MHz vs RDNA 2 with 52 CUs at 1825 MHz is like comparing an RTX 2070 (36 CU equivalent) OC'd to 2230 MHz (10.28 TFLOPS) with 448 GB/s of bandwidth against an MSI RTX 2080 Super Gaming X Trio with 12.15 TFLOPS and 496 GB/s of bandwidth.
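For reference, the raw math behind those two TFLOPS figures, assuming 64 stream processors per CU and two FLOPs (one FMA) per stream processor per clock:

def tflops(cu, clock_mhz, sp_per_cu=64, flops_per_sp=2):
    """Peak FP32 throughput in TFLOPS for a given CU count and clock."""
    return cu * sp_per_cu * flops_per_sp * clock_mhz / 1e6

print(round(tflops(36, 2230), 2))   # 10.28 TFLOPS - 36 CU at 2230 MHz
print(round(tflops(52, 1825), 2))   # 12.15 TFLOPS - 52 CU at 1825 MHz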
An RTX 2070 at 2230 MHz wouldn't beat the MSI RTX 2080 Super Gaming X Trio.
RDNA 2 is not GCN; i.e., RDNA (aka Navi) is designed with scalability in mind. Read AMD's roadmap.
As for saying GPU shaders are specialized compute units... well, sure, they're specialized FP32 (and formerly FP64, lately also FP16 and INT8/INT4) units, but FP32 operations are a quite general class of computation with uses far beyond graphics. RT operations are definitely more specialized than this. As such, it's entirely accurate to call one a form of general-purpose compute and the other specialized hardware.
An RT core is less specialized than T&L/TFU (texture filter unit)/ROPs hardware, since RT can accelerate non-graphics workloads such as audio and physics collision logic. BVH search-tree and collision hardware have a wider application than T&L hardware.
Modern ROPs have re-order layers via the ROV (Rasterizer Ordered Views) feature instead of wasting compute shader resources.
Both console chips are highly customized: customized RDNA2 and Zen 2, along with everything else.
Conversions like this are like saying an F1 car is 10 times the car a Honda Civic is because it's 10 times faster, which ignores that the Civic can do a lot more than go fast - it can seat several people, take you grocery shopping, etc. FP32 is general-purpose compute. RT cores do not do general-purpose compute. Nor do tensor cores or any other specialized hardware. If Nvidia is copying a particularly stupid and easily misunderstood marketing point from AMD, that does not in any way make it less stupid or easily misunderstood.
Also, reported. Thanks for keeping the discussion civil, dude.
Meanwhile, NVIDIA PR throws RT cores' TFLOPS into its marketing.
Expect AMD PR to weaponize RT cores TFLOPS when "Big Navi" arrives.
Why debate FP32 general-purpose shader compute (not general-purpose in the way SSE is) when future game titles will have significant RT workloads?
Current shaders accelerate Z-buffer acceleration structures, while RT cores accelerate BVH acceleration structures.
And again, as addressed in my previous post: Nvidia adopting a bad marketing practice does not in any way make it a good marketing practice. You apparently need to be spoon-fed, so let's go through this point by point.
-TFLOPS in GPU performance metrics is generally accepted to mean FP32 TFLOPS, as that is the "baseline" industry-standard operation (single-precision compute) as opposed to higher or lower precisions (FP64, FP16, INT8, INT4, etc.).
-In GPUs these operations are performed by shader cores, which are fundamentally FP32 compute cores (though sometimes with various degrees of FP64 support, either through dedicated hardware or the ability to combine two FP32 cores), and which can also perform lower-precision workloads either natively at the same speed or faster by combining several operations in one core.
-FP32 compute is a very broad category of general compute operations. Some of these operations can be done by various forms of specialized hardware, or can be done in lower precisions at higher speed (through methods like rapid packed math) without sacrificing the quality of the end result.
-Due to FP32 being a broad category a lot of FP32 operations can also be performed more efficiently by making specialized hardware for a subset of operations. This hardware, by virtue of being specialized for a specific subcategory of operations, is not capable of performing general FP32 compute operations.
-As the operations done on the specialized hardware can also be done on FP32 hardware, you can give an approximation of the equivalent FP32 performance necessary to match the performance of the specialized hardware. I.e. you can say things like "to match the performance of our RT cores you would need X number of FP32 FLOPS". These calculations are then dependent on - among other things - how efficient your implementation of said operation through general FP32 compute is. Two different solutions will very likely perform differently, and will thus result in different numbers for the same hardware.
-This is roughly equivalent to how fixed-function video encode/decode blocks can do this specialized subset of work faster and more efficiently than the same work performed on a CPU or GPU. That doesn't mean you can run your OS or games off a video encode/decode block, as this block is only capable of a small set of operations.
-These comparisons can't be expanded to other tasks, as the specialized hardware is not capable of general FP32 compute. FP32 hardware can do RT; RT hardware can't do FP32. I.e. you cannot say that "our RT cores are capable of X FP32 FLOPS" - because that statement is fundamentally untrue - your RT hardware is capable of zero FP32 FLOPS. That your F1 car (specialized hardware) can do some of the things your Civic (general hardware) can do - driving on a flat surface - and is "X times better" at that (i.e. faster around a track) does not mean that this can be transferred to the other things the general hardware can do - your F1 car has nowhere to put your groceries and would get stuck on the first speed bump you encountered, so it is fundamentally incapable of grocery shopping. It would also be fundamentally incapable of driving your friends around, or letting you listen to the radio while commuting. Just because specialized hardware can be compared to general hardware in the task the specialized hardware can do does not mean this comparison can be expanded into the other tasks that general hardware can do - because the specialized hardware is fundamentally incapable of doing these things.
-So, to sum up: AMD made a claim in marketing that, while technically true, needs to be understood in a very specific way, and is very easy to misunderstand and thus misrepresent the capabilities of the hardware in question. The Xbox Series X is capable of 12.1 TFLOPS of FP32 compute. When performing combined rasterization and RT graphics workloads, it is capable of performing an amount of RT compute that would require 13 TFLOPS of FP32 compute to achieve if said workload was run on pure FP32 hardware (which it isn't, it's run on RT hardware). It is not, and will never be, capable of 25 TFLOPS of FP32 compute. Nvidia copying this does not in any way make it less problematic - I would say it makes it a lot more problematic, as there's no way of knowing if the two companies' ways of performing RT workloads on FP32 cores are equally performant, and unless they are, any comparisons are entirely invalid. Especially problematic is the fact that conversions like this make worse performance look better: if your RT-through-FP32 implementation is worse than the competition's, you can claim that your RT hardware is equivalent to more FP32 hardware than theirs is. This tells us nothing of actual performance, only performance relative to something unknown and unknowable.
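To make that concrete, here's a toy calculation. The 12.15 TFLOPS value is the real shader throughput; every other number is made up purely for illustration, because the actual RT throughput and the efficiency of a hypothetical shader-only fallback are exactly the unknowns in question:

FP32_TFLOPS = 12.15   # real peak shader throughput (Xbox Series X)

def rt_fp32_equivalent(rt_ops_per_sec, fp32_flops_per_rt_op):
    """FP32 TFLOPS a pure shader implementation would need to match the RT
    hardware - NOT additional FP32 capability of the chip."""
    return rt_ops_per_sec * fp32_flops_per_rt_op / 1e12

# Same hypothetical RT hardware, two hypothetical shader-fallback efficiencies:
print(round(rt_fp32_equivalent(380e9, 34), 1))   # 12.9 "equivalent" TFLOPS
print(round(rt_fp32_equivalent(380e9, 68), 1))   # 25.8 - a worse fallback yields a "better" number

# The chip's real shader capability is FP32_TFLOPS, full stop; adding an
# equivalence figure on top of it ("12 + 13 = 25 TFLOPS") is meaningless.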
This just boils down to a very clear demonstration of how utterly useless FP32 FLOPS are as a metric of GPU performance. Not only is the translation from FP32 compute (TFLOPS) into gaming performance not 1:1 but dependent on drivers, hardware utilization, and architectural features; this now also adds another stack of abstraction layers, meaning that any numbers made in this way are completely and utterly incomparable. Comparing FLOPS from pure shader hardware across AMD and Nvidia was already comparing apples and oranges, but now it's more like comparing apples and... hedgehogs. Or something.
Btw, I would sincerely like to see you point out what of the above (or my previous posts on this) makes me an AMD fanboy. The ball's in your court on that one.