Tuesday, July 14th 2020
Microsoft Details Technologies at the Soul of Xbox Series X: The Velocity Architecture
When we set out to design the Xbox Series X, we aspired to build our most powerful console ever powered by next generation innovation and delivering consistent, sustained performance never before seen in a console with no compromises. To achieve this goal, we knew we needed to analyze each component of the system, to push beyond the limitations in traditional console performance and design. It was critical in the design of the Xbox Series X to ensure we had a superior balance of power, speed and performance while ensuring no component would constrain the creative ambition of the world's best creators, empowering them to deliver truly transformative next gen gaming experiences not possible in prior console generations.
At the heart of the Xbox Series X is our custom processor, a best in class next generation design that leverages the latest RDNA 2 and Zen 2 architectures from our partners at AMD to deliver more than 12 TFLOPs of GPU power and more than 4 times the CPU processing power of the Xbox One X. Xbox Series X includes the highest memory bandwidth of any next generation console with 16 GB of GDDR6 memory, including 10 GB of GPU optimized memory at 560 GB/s to keep the processor fed with no bottlenecks. As we analyzed the storage subsystem, it became clear that we had reached the upper limits of traditional hard drive technology and that, to deliver on our design aspirations, we would need to radically rethink and revolutionize our approach with the Xbox Series X.
Empowering Next Generation Game Design and Creative Vision
Modern games require a significant amount of data to create the realistic worlds and universes that gamers experience. To enable the processor to work at its optimum performance, all of this data must be loaded from storage into memory. The explosion of massive, dynamic open-world environments and living, persistent worlds with increased density and variety has only increased the amount of data required. Environmental mesh data, high polygon character models, high resolution textures, animation data, audio and video source files and more all combine to deliver the most immersive gameplay environment for the player.
Despite the ability of modern game engines and middleware to stream game assets into memory from local storage, level designers are still often required to create narrow pathways, hallways, or elevators to work around the limitations of a traditional hard drive and I/O pipeline. These in-game elements are often used to mask the need to unload the prior zone's assets from memory while loading in new assets for the next play space. As we discussed developers' aspirations for their next generation titles and the limitations of current generation technology, it became clear that this challenge would continue to increase exponentially and further constrain the ambition for truly transformative games. This feedback influenced the design and development of the Xbox Velocity Architecture.
Introducing the Xbox Velocity Architecture
The Xbox Velocity Architecture was designed as the ultimate solution for game asset streaming in the next generation. This radical reinvention of the traditional I/O subsystem directly influenced all aspects of the Xbox Series X design. If our custom designed processor is at the heart of the Xbox Series X, the Xbox Velocity Architecture is the soul. Through a deep integration of hardware and software innovation, the Xbox Velocity Architecture will power next-gen gaming experiences unlike anything you have seen before.
The Xbox Velocity Architecture comprises four major components: our custom NVMe SSD, hardware accelerated decompression blocks, a brand new DirectStorage API layer and Sampler Feedback Streaming (SFS).
Let's dive deep into each component:
Custom NVMe SSD: The foundation of the Xbox Velocity Architecture is our custom, 1 TB NVMe SSD, delivering 2.4 GB/s of raw I/O throughput, more than 40x the throughput of Xbox One. Traditional SSDs used in PCs often reduce performance as thermals increase or while performing drive maintenance. The custom NVMe SSD in Xbox Series X is designed for consistent, sustained performance as opposed to peak performance. Developers have a guaranteed level of I/O performance at all times, so they can reliably design and optimize their games, removing the barriers and constraints they have to work around today. This same level of consistent, sustained performance also applies to the Seagate Expandable Storage Card, ensuring you have the exact same gameplay experience regardless of where the game resides.
Hardware Accelerated Decompression: Game packages and assets are compressed to minimize download times and the amount of storage required for each individual game. With hardware accelerated support for both the industry standard LZ decompressor and a brand new, proprietary algorithm named BCPack that is specifically designed for texture data, Xbox Series X provides the best of both worlds for developers to achieve massive savings with no loss in quality or performance. Because texture data comprises a significant portion of the total size of a game, a purpose built algorithm optimized for texture data can be used in parallel with the general purpose LZ decompressor to reduce the overall size of a game package. Assuming a 2:1 compression ratio, Xbox Series X delivers an effective 4.8 GB/s in I/O performance to the title, approximately 100x the I/O performance in current generation consoles. Delivering similar levels of decompression performance in software would require more than 4 Zen 2 CPU cores.
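The effective-throughput figure follows from simple arithmetic on the numbers quoted above. The sketch below is only an illustration of that arithmetic: it assumes decompression keeps pace with the drive, and the function names and the 9.6 GB workload size are hypothetical, not taken from Microsoft's material.

```python
# Figures quoted in the article.
RAW_SSD_GBPS = 2.4        # raw NVMe SSD throughput in GB/s
COMPRESSION_RATIO = 2.0   # the article's assumed 2:1 compression ratio


def effective_throughput(raw_gbps: float, compression_ratio: float) -> float:
    """Uncompressed GB delivered per second, assuming decompression keeps pace with the drive."""
    return raw_gbps * compression_ratio


def load_time_seconds(uncompressed_gb: float, raw_gbps: float, compression_ratio: float) -> float:
    """Seconds needed to deliver a given amount of uncompressed asset data."""
    return uncompressed_gb / effective_throughput(raw_gbps, compression_ratio)


effective = effective_throughput(RAW_SSD_GBPS, COMPRESSION_RATIO)
print(f"effective throughput: {effective:.1f} GB/s")

# Hypothetical workload: 9.6 GB of uncompressed assets.
seconds = load_time_seconds(9.6, RAW_SSD_GBPS, COMPRESSION_RATIO)
print(f"time to deliver 9.6 GB: {seconds:.1f} s")
```

Real compression ratios vary by asset type, so the 2:1 ratio should be read as the article's working assumption rather than a guarantee.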
New DirectStorage API: Standard file I/O APIs were developed more than 30 years ago and are virtually unchanged, while storage technology has made significant advancements since then. As we analyzed game data access patterns as well as the latest hardware advancements with SSD technology, we knew we needed to advance the state of the art to put more control in the hands of developers. We added a brand new DirectStorage API to the DirectX family, providing developers with fine grain control of their I/O operations and empowering them to establish multiple I/O queues, prioritize requests and minimize I/O latency. These direct, low level access APIs ensure developers will be able to take full advantage of the raw I/O performance afforded by the hardware, virtually eliminating load times and enabling fast travel systems that are just that... fast.
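The idea of multiple prioritized I/O queues can be pictured with a small toy model. This is not the DirectStorage API: the class `IoScheduler`, its methods, the queue names and the asset names are all hypothetical, and the sketch only shows how requests in a more urgent queue would be serviced before those in a background queue.

```python
from collections import deque


class IoScheduler:
    """Hypothetical scheduler with several named queues, each with a fixed priority."""

    def __init__(self):
        # queue name -> (priority, pending asset names); lower priority value = more urgent
        self._queues = {}

    def create_queue(self, name, priority):
        self._queues[name] = (priority, deque())

    def submit(self, queue_name, asset):
        self._queues[queue_name][1].append(asset)

    def drain(self):
        """Return assets in service order: most urgent queue first, FIFO within a queue."""
        order = []
        for _, pending in sorted(self._queues.values(), key=lambda q: q[0]):
            while pending:
                order.append(pending.popleft())
        return order


scheduler = IoScheduler()
scheduler.create_queue("background", priority=2)
scheduler.create_queue("urgent", priority=0)

scheduler.submit("background", "distant_terrain")
scheduler.submit("urgent", "player_textures")
scheduler.submit("background", "ambient_audio")

order = scheduler.drain()
print(order)  # ['player_textures', 'distant_terrain', 'ambient_audio']
```

A real implementation would interleave requests and issue them asynchronously; the strict drain order here is a simplification chosen to keep the example short.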
Sampler Feedback Streaming (SFS): Sampler Feedback Streaming is a brand-new innovation built on top of all the other advancements of the Xbox Velocity Architecture. Game textures are stored at differing levels of detail and resolution, called mipmaps, which are selected during rendering based on how close or far away an object is from the player. As an object moves closer to the player, the resolution of the texture must increase to provide the crisp detail and visuals that gamers expect. However, these larger mipmaps require a significant amount of memory compared to the lower resolution mips that can be used if the object is further away in the scene. Today, developers must load an entire mip level in memory even in cases where they may only sample a very small portion of the overall texture. Through specialized hardware added to the Xbox One X, we were able to analyze texture memory usage by the GPU and we discovered that the GPU often accesses less than 1/3 of the texture data required to be loaded in memory. A single scene often includes thousands of different textures, resulting in a significant loss in effective memory and I/O bandwidth utilization due to inefficient usage. With this insight, we were able to create and add new capabilities to the Xbox Series X GPU that enable it to load only the sub-portions of a mip level into memory, on demand, just in time for when the GPU requires the data. This innovation results in approximately 2.5x the effective I/O throughput and memory usage above and beyond the raw hardware capabilities on average. SFS provides an effective multiplier on available system memory and I/O bandwidth, resulting in significantly more memory and I/O throughput available to make your game richer and more immersive.
Through the massive increase in I/O throughput, hardware accelerated decompression, DirectStorage, and the significant increases in efficiency provided by Sampler Feedback Streaming, the Xbox Velocity Architecture enables the Xbox Series X to deliver effective performance well beyond the raw hardware specs, providing direct, instant, low level access to more than 100 GB of game data stored on the SSD just in time for when the game requires it. These innovations will unlock new gameplay experiences and a level of depth and immersion unlike anything you have previously experienced in gaming.
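Loading only the sampled sub-portions of a mip level can be illustrated with a toy residency model. This is a sketch under stated assumptions, not the hardware mechanism: the class `TiledMipLevel`, the 64 KB tile size, the 8x8 tile grid and the feedback set are all hypothetical, and the example does not attempt to reproduce the 2.5x figure quoted above.

```python
TILE_BYTES = 64 * 1024  # assumed tile size, for illustration only


class TiledMipLevel:
    """Hypothetical mip level split into a grid of tiles that are loaded on demand."""

    def __init__(self, tiles_x, tiles_y):
        self.tiles_x = tiles_x
        self.tiles_y = tiles_y
        self.resident = set()  # (x, y) coordinates of tiles currently in memory

    def full_load_bytes(self):
        """Bytes read if the whole mip level had to be loaded up front."""
        return self.tiles_x * self.tiles_y * TILE_BYTES

    def stream_in(self, sampled_tiles):
        """Load only sampled tiles that are not yet resident; return the bytes read."""
        valid = {
            (x, y)
            for x, y in sampled_tiles
            if 0 <= x < self.tiles_x and 0 <= y < self.tiles_y
        }
        missing = valid - self.resident
        self.resident |= missing
        return len(missing) * TILE_BYTES


level = TiledMipLevel(8, 8)

# Hypothetical feedback: the GPU sampled only the top-left quarter of the level.
feedback = {(x, y) for x in range(4) for y in range(4)}

first_read = level.stream_in(feedback)
second_read = level.stream_in(feedback)  # same tiles again, nothing new to load

print(first_read, "of", level.full_load_bytes(), "bytes read")
print("repeat request read", second_read, "bytes")
```

In this example a quarter of the tiles are read instead of the whole level, and a repeated request for already resident tiles costs no additional I/O.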
Unlocking Next Generation Experiences
What does this all mean for you as a gamer? As the industry's most creative developers and middleware companies have begun to explore these new capabilities, we expect significant innovation throughout the next generation as this revolutionary new architecture enables entirely new scenarios never before considered possible in gaming. The Xbox Velocity Architecture provides a new level of performance and capabilities well beyond the raw specifications of the hardware itself, and it fundamentally rethinks how a developer can take advantage of the hardware provided by the Xbox Series X. From entirely new rendering techniques to the virtual elimination of loading times, to larger, more dynamic living worlds where, as a gamer, you can choose how you want to explore, we couldn't be more excited by the early results we are already seeing. In addition, the Xbox Velocity Architecture has opened even more opportunities and enabled new innovations at the platform level, such as Quick Resume, which enables you to instantly resume where you left off across multiple games, improving the overall gaming experience for all gamers on Xbox Series X.
We can't wait for gamers around the world to get to experience these new, next generation gaming experiences on Xbox Series X this holiday and beyond.
22 Comments on Microsoft Details Technologies at the Soul of Xbox Series X: The Velocity Architecture
Well, there might be some collaboration between AMD/Sony/MS/Intel etc. that produced a new advancement in storage tech (or method) that they are trying to debut, like saying: no no, consoles != PC, consoles are better for gaming, because there we make much more money than on the latter.
Now that SSDs are common and can saturate the CPU with I/O, it's a good time to revisit the I/O stack and what we can do with it.
If it is fast enough, however, you can stream in the textures and geometry in real time, which will allow some neat things, but GPU processing is still king.
Might work for very small data (depending on the size of the algorithm's window), but for things that really matter, textures, audio and video? Lol... Shorter loading times, less texture popping (though the last games I recall these being an issue in, even on spinners, were Singularity and the first Rage), and perhaps an easier life for whoever develops an engine's texture streamer.
Rendering and disk I/O aren't tightly coupled enough for one to matter to the other. Your assets are either in the fast video memory when the appropriate part of the pipeline runs or they aren't. If that texture isn't there when the pixel shader runs, you'll end up with a pink (or whatever default/error-state colour the engine uses) coloured object.
And few games up to now were made for NVMe.
Xbox controller is the same as the Nintendo, Dreamcast, and PC controllers that we all love, only PS wants their own layout with the terrible location of the left control stick.
If that gives people more performance, these specs will become common. For now there is just no point.
Since the PS3 and especially the 360, which when it launched had a better GPU than anything on PC, no one is pushing the boundaries of technology in consoles. The PS4 and the One were pathetic and still expensive; they want to milk us, that's all. But it's the same everywhere, consumers are dumber these days, they buy phones for 1500€ with fewer features and with a bunch of dead pixels on the screen...
RAM is very different from storage. You can't do the same things. Same thing with the GPU, it does much more than geometry, so the new Unreal 5 technique that relies on I/O is cool, but it is mainly about allowing higher poly models and no LoD; it won't magically solve all the other GPU bottlenecks, and can't be applied to all the other functions of the GPU (lighting, transparencies, etc.).