
NVIDIA RTX IO Detailed: GPU-assisted Storage Stack Here to Stay Until CPU Core-counts Rise

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
47,230 (7.55/day)
Location
Hyderabad, India
System Name RBMK-1000
Processor AMD Ryzen 7 5700G
Motherboard ASUS ROG Strix B450-E Gaming
Cooling DeepCool Gammax L240 V2
Memory 2x 8GB G.Skill Sniper X
Video Card(s) Palit GeForce RTX 2080 SUPER GameRock
Storage Western Digital Black NVMe 512GB
Display(s) BenQ 1440p 60 Hz 27-inch
Case Corsair Carbide 100R
Audio Device(s) ASUS SupremeFX S1220A
Power Supply Cooler Master MWE Gold 650W
Mouse ASUS ROG Strix Impact
Keyboard Gamdias Hermes E2
Software Windows 11 Pro
NVIDIA announced its RTX IO technology at the GeForce "Ampere" launch event. Storage is the weakest link in a modern computer from a performance standpoint, and SSDs have had a transformational impact. With modern SSDs leveraging PCIe, consumer storage speeds are now set to grow with each new PCIe generation, which doubles per-lane IO bandwidth. PCI-Express Gen 4 enables 64 Gbps of bandwidth per direction on M.2 NVMe SSDs (four lanes); AMD has already implemented it across its Ryzen desktop platform, and Intel has it on its latest mobile platforms and is expected to bring it to its desktop platform with "Rocket Lake." While more storage bandwidth is always welcome, the storage processing stack (the task of processing ones and zeroes to the physical layer) is still handled by the CPU. As storage bandwidth rises, the IO load on the CPU rises proportionally, to a point where it can begin to impact performance. Microsoft sought to address this emerging challenge with the DirectStorage API, and NVIDIA wants to build on it.
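
To put the Gen 4 figure in context, here is a rough back-of-the-envelope sketch (not from NVIDIA's presentation) of per-direction PCIe link bandwidth. It assumes the standard per-lane transfer rates and line encodings and ignores packet and protocol overhead, so real-world throughput will be lower; the function names are made up for illustration.

```python
# Per-lane transfer rate (GT/s) and line-encoding efficiency per PCIe generation.
# Gen 1 and 2 use 8b/10b encoding; Gen 3 and 4 use 128b/130b.
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),
    2: (5.0, 8 / 10),
    3: (8.0, 128 / 130),
    4: (16.0, 128 / 130),
}


def link_bandwidth_gbps(gen: int, lanes: int) -> float:
    """Usable bandwidth per direction in gigabits per second (encoding only)."""
    rate_gt_s, efficiency = PCIE_GENERATIONS[gen]
    return rate_gt_s * efficiency * lanes


def link_bandwidth_gbytes(gen: int, lanes: int) -> float:
    """Usable bandwidth per direction in gigabytes per second (encoding only)."""
    return link_bandwidth_gbps(gen, lanes) / 8


if __name__ == "__main__":
    # M.2 NVMe SSDs use a four-lane link.
    for gen in (3, 4):
        gbps = link_bandwidth_gbps(gen, 4)
        gbytes = link_bandwidth_gbytes(gen, 4)
        print(f"PCIe Gen {gen} x4: {gbps:.1f} Gbps, {gbytes:.2f} GB/s per direction")
```

For a Gen 4 x4 link this works out to 64 GT/s raw, roughly 63 Gbps after encoding, or a little under 8 GB/s, which leaves some headroom over the 7 GB/s sequential reads quoted for current drives.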

According to tests by NVIDIA, reading uncompressed data from an SSD at 7 GB/s (the typical maximum sequential read speed of client-segment PCIe Gen 4 M.2 NVMe SSDs) requires the full utilization of two CPU cores. The OS typically spreads this workload across all available CPU cores/threads on a modern multi-core CPU. Things change dramatically when compressed data (such as game resources) is being read in a gaming scenario, with a high number of IO requests. Modern AAA games have hundreds of thousands of individual resources crammed into compressed resource-pack files.



Although at the disk IO level ones and zeroes are still being moved at up to 7 GB/s, the decompressed data stream at the CPU level can be as high as 14 GB/s (best-case compression). In addition, each IO request comes with its own overhead - a set of instructions for the CPU to fetch x piece of resource from y file and deliver it to z buffer, along with instructions to decompress or decrypt the resource. This could take an enormous amount of CPU muscle at high IO throughput, and NVIDIA pegs the number of CPU cores required as high as 24. As we explained earlier, DirectStorage enables a path for devices to directly process the storage stack to access the resources they need. The API was originally developed by Microsoft for the Xbox Series X, and is now making its debut on the PC platform.
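
The core-count claim can be sanity-checked with simple arithmetic. The sketch below takes only the two figures quoted above (14 GB/s of decompressed output, 24 cores) and derives the per-core decompression rate they imply. It assumes perfectly linear scaling across cores, and the function names and the alternative per-core rates are illustrative, not anything published by NVIDIA or Microsoft.

```python
import math

# Figures quoted in the article (NVIDIA's own numbers, not measurements).
DECOMPRESSED_GB_S = 14.0  # best-case decompressed stream from a 7 GB/s SSD
NVIDIA_CORE_COUNT = 24    # CPU cores NVIDIA says this would require


def implied_per_core_rate(total_gb_s: float, cores: int) -> float:
    """Per-core decompression rate implied by a total rate and a core count."""
    return total_gb_s / cores


def cores_needed(total_gb_s: float, per_core_gb_s: float) -> int:
    """Cores needed for a target rate, assuming perfectly linear scaling."""
    # The small epsilon keeps floating-point noise from pushing an exact
    # quotient just past a whole number before rounding up.
    return math.ceil(total_gb_s / per_core_gb_s - 1e-9)


if __name__ == "__main__":
    rate = implied_per_core_rate(DECOMPRESSED_GB_S, NVIDIA_CORE_COUNT)
    print(f"Implied per-core rate: {rate:.2f} GB/s")
    # Hypothetical per-core rates, to show how sensitive the core count is.
    for per_core in (0.25, 0.5, 1.0, 2.0):
        n = cores_needed(DECOMPRESSED_GB_S, per_core)
        print(f"{per_core:.2f} GB/s per core -> {n} cores")
```

The implied rate is a bit under 0.6 GB/s per core; a faster or slower codec moves the core count proportionally, which is why the headline number depends heavily on the compression format assumed.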

NVIDIA RTX IO is a layer built around DirectStorage, further optimized for gaming and for NVIDIA's GPU architecture. RTX IO brings to the table GPU-accelerated lossless data decompression, which means data remains compressed and bunched up with fewer IO headers as it is moved from the disk to the GPU, leveraging DirectStorage. NVIDIA claims that this improves IO performance by a factor of 2. NVIDIA further claims that GeForce RTX GPUs, thanks to their high CUDA core counts, are capable of offloading "dozens" of CPU cores, driving decompression performance beyond even what compressed data loads PCIe Gen 4 SSDs can throw at them.

There is, however, a tiny wrinkle. Games need to be optimized for DirectStorage. Since the API has already been deployed on the Xbox Series X, most AAA games for Xbox that have PC versions already have some awareness of the tech; however, the PC versions will need to be patched to use it. Games will further need NVIDIA RTX IO awareness, and NVIDIA needs to add support on a per-game basis via GeForce driver updates. NVIDIA didn't detail which GPUs will support the tech, but given its wording and the use of "RTX" in the branding of the feature, NVIDIA could release it for the RTX 20-series "Turing" and RTX 30-series "Ampere." The GTX 16-series probably misses out, as what NVIDIA hopes to accomplish with RTX IO is probably too heavy for it, and this may have been a purely performance-based decision for NVIDIA.

 
Joined
Nov 18, 2010
Messages
7,529 (1.47/day)
Location
Rīga, Latvia
System Name HELLSTAR
Processor AMD RYZEN 9 5950X
Motherboard ASUS Strix X570-E
Cooling 2x 360 + 280 rads. 3x Gentle Typhoons, 3x Phanteks T30, 2x TT T140 . EK-Quantum Momentum Monoblock.
Memory 4x8GB G.SKILL Trident Z RGB F4-4133C19D-16GTZR 14-16-12-30-44
Video Card(s) Sapphire Pulse RX 7900XTX. Water block. Crossflashed.
Storage Optane 900P[Fedora] + WD BLACK SN850X 4TB + 750 EVO 500GB + 1TB 980PRO+SN560 1TB(W11)
Display(s) Philips PHL BDM3270 + Acer XV242Y
Case Lian Li O11 Dynamic EVO
Audio Device(s) SMSL RAW-MDA1 DAC
Power Supply Fractal Design Newton R3 1000W
Mouse Razer Basilisk
Keyboard Razer BlackWidow V3 - Yellow Switch
Software FEDORA 41
Basically the same stuff Sony showed in the PS5 presentation.

Just a fancy new name, plus the claim that they did it first and that it's theirs.
 
Joined
Dec 29, 2010
Messages
3,809 (0.75/day)
Processor AMD 5900x
Motherboard Asus x570 Strix-E
Cooling Hardware Labs
Memory G.Skill 4000c17 2x16gb
Video Card(s) RTX 3090
Storage Sabrent
Display(s) Samsung G9
Case Phanteks 719
Audio Device(s) Fiio K5 Pro
Power Supply EVGA 1000 P2
Mouse Logitech G600
Keyboard Corsair K95
This seems redundant with 8, 12 and 16 core cpus.
 

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
47,230 (7.55/day)
Location
Hyderabad, India
System Name RBMK-1000
Processor AMD Ryzen 7 5700G
Motherboard ASUS ROG Strix B450-E Gaming
Cooling DeepCool Gammax L240 V2
Memory 2x 8GB G.Skill Sniper X
Video Card(s) Palit GeForce RTX 2080 SUPER GameRock
Storage Western Digital Black NVMe 512GB
Display(s) BenQ 1440p 60 Hz 27-inch
Case Corsair Carbide 100R
Audio Device(s) ASUS SupremeFX S1220A
Power Supply Cooler Master MWE Gold 650W
Mouse ASUS ROG Strix Impact
Keyboard Gamdias Hermes E2
Software Windows 11 Pro
This seems redundant with 8, 12 and 16 core cpus.
Apparently NVIDIA thinks not. A compressed data stream of game resources can bog down up to 24 cores.
 
Joined
Mar 23, 2016
Messages
4,841 (1.53/day)
Processor Core i7-13700
Motherboard MSI Z790 Gaming Plus WiFi
Cooling Cooler Master RGB something
Memory Corsair DDR5-6000 small OC to 6200
Video Card(s) XFX Speedster SWFT309 AMD Radeon RX 6700 XT CORE Gaming
Storage 970 EVO NVMe M.2 500GB,,WD850N 2TB
Display(s) Samsung 28” 4K monitor
Case Phantek Eclipse P400S
Audio Device(s) EVGA NU Audio
Power Supply EVGA 850 BQ
Mouse Logitech G502 Hero
Keyboard Logitech G G413 Silver
Software Windows 11 Professional v23H2
A compressed data stream of game resources can bog down up to 24 cores.
No benefit from SMT? I’m going off the performance uplift Ryzen gets from SMT.
 
Joined
May 2, 2017
Messages
7,762 (2.81/day)
Location
Back in Norway
System Name Hotbox
Processor AMD Ryzen 7 5800X, 110/95/110, PBO +150Mhz, CO -7,-7,-20(x6),
Motherboard ASRock Phantom Gaming B550 ITX/ax
Cooling LOBO + Laing DDC 1T Plus PWM + Corsair XR5 280mm + 2x Arctic P14
Memory 32GB G.Skill FlareX 3200c14 @3800c15
Video Card(s) PowerColor Radeon 6900XT Liquid Devil Ultimate, UC@2250MHz max @~200W
Storage 2TB Adata SX8200 Pro
Display(s) Dell U2711 main, AOC 24P2C secondary
Case SSUPD Meshlicious
Audio Device(s) Optoma Nuforce μDAC 3
Power Supply Corsair SF750 Platinum
Mouse Logitech G603
Keyboard Keychron K3/Cooler Master MasterKeys Pro M w/DSA profile caps
Software Windows 10 Pro
This seems redundant with 8, 12 and 16 core cpus.
Given that MS and Sony claim that decompressing game data at NVMe bandwidths can consume the equivalent of more than six Zen 2 CPU cores, I would say no. Remember, that is on paper; these workloads don't scale that well in real life, and would inevitably become a bottleneck. There's a reason MS is moving DirectStorage from the XSX to Windows.
 
Joined
May 15, 2020
Messages
697 (0.42/day)
Location
France
System Name Home
Processor Ryzen 3600X
Motherboard MSI Tomahawk 450 MAX
Cooling Noctua NH-U14S
Memory 16GB Crucial Ballistix 3600 MHz DDR4 CAS 16
Video Card(s) MSI RX 5700XT EVOKE OC
Storage Samsung 970 PRO 512 GB
Display(s) ASUS VA326HR + MSI Optix G24C4
Case MSI - MAG Forge 100M
Power Supply Aerocool Lux RGB M 650W
Apparently NVIDIA thinks not. A compressed data stream of game resources can bog down up to 24 cores.
That's most probably a theoretical worst-case scenario that has 0 chance of really happening. The PS5 guys had said 12 cores on the same issue; that's probably exaggerated a bit, too. And it was about requiring 12 cores, not using all of them.
 
Joined
May 2, 2017
Messages
7,762 (2.81/day)
Location
Back in Norway
System Name Hotbox
Processor AMD Ryzen 7 5800X, 110/95/110, PBO +150Mhz, CO -7,-7,-20(x6),
Motherboard ASRock Phantom Gaming B550 ITX/ax
Cooling LOBO + Laing DDC 1T Plus PWM + Corsair XR5 280mm + 2x Arctic P14
Memory 32GB G.Skill FlareX 3200c14 @3800c15
Video Card(s) PowerColor Radeon 6900XT Liquid Devil Ultimate, UC@2250MHz max @~200W
Storage 2TB Adata SX8200 Pro
Display(s) Dell U2711 main, AOC 24P2C secondary
Case SSUPD Meshlicious
Audio Device(s) Optoma Nuforce μDAC 3
Power Supply Corsair SF750 Platinum
Mouse Logitech G603
Keyboard Keychron K3/Cooler Master MasterKeys Pro M w/DSA profile caps
Software Windows 10 Pro
That's most probably a theoretical worst-case scenario that has 0 chance of really happening. The PS5 guys had said 12 cores on the same issue; that's probably exaggerated a bit, too. And it was about requiring 12 cores, not using all of them.
No, it was about requiring the equivalent computational power of 12 Zen 2 cores at PS5 speeds to decompress that amount of data at the required rate. It was theoretical in as much as few games will be streaming in 5.5GB/s of entirely compressed data, but beyond that it's entirely real.
 
Joined
Dec 14, 2011
Messages
1,031 (0.22/day)
Location
South-Africa
Processor AMD Ryzen 9 5900X
Motherboard ASUS ROG STRIX B550-F GAMING (WI-FI)
Cooling Corsair iCUE H115i Elite Capellix 280mm
Memory 32GB G.Skill DDR4 3600Mhz CL18
Video Card(s) ASUS GTX 1650 TUF
Storage Sabrent Rocket 1TB M.2
Display(s) Dell S3220DGF
Case Corsair iCUE 4000X
Audio Device(s) ASUS Xonar D2X
Power Supply Corsair AX760 Platinum
Mouse Razer DeathAdder V2 - Wireless
Keyboard Redragon K618 RGB PRO
Software Microsoft Windows 11 - Enterprise (64-bit)
Shouldn't they be adding more RAM to GPUs?

This is also my concern. When I still used my GTX 1070, I came close to the 8GB limit on several occasions. It was enough for the time, but I am not so sure about the future; with everything getting as big as it does these days, I don't like it, I don't like it one bit.
 
Joined
May 15, 2020
Messages
697 (0.42/day)
Location
France
System Name Home
Processor Ryzen 3600X
Motherboard MSI Tomahawk 450 MAX
Cooling Noctua NH-U14S
Memory 16GB Crucial Ballistix 3600 MHz DDR4 CAS 16
Video Card(s) MSI RX 5700XT EVOKE OC
Storage Samsung 970 PRO 512 GB
Display(s) ASUS VA326HR + MSI Optix G24C4
Case MSI - MAG Forge 100M
Power Supply Aerocool Lux RGB M 650W
No, it was about requiring the equivalent computational power of 12 Zen 2 cores at PS5 speeds to decompress that amount of data at the required rate. It was theoretical in as much as few games will be streaming in 5.5GB/s of entirely compressed data, but beyond that it's entirely real.
I was replying to btarunr, who was quoting Nvidia's 24-core claim; you missed that.

Anyway, besides the marketing quoting worst-case scenarios, that's definitely a much more efficient way of doing these transfers, and AMD will most probably be doing the same thing, with a different name.

Edit to add: To the OP and title, I don't see any reason for this kind of optimisation to disappear even when core counts increase. Doing it this way is much more efficient, just as DMA for disk drives is much more efficient. These techniques will eventually be replaced by other technologies, but it makes no sense to route all this data through the CPU only for decompression.
 
Joined
Dec 26, 2016
Messages
287 (0.10/day)
Processor Ryzen 3900x
Motherboard B550M Steel Legend
Cooling XPX (custom loop)
Memory 32GB 3200MHz cl16
Video Card(s) 3080 with Bykski block (custom loop)
Storage 980 Pro
Case Fractal 804
Power Supply Focus Plus Gold 750FX
Mouse G603
Keyboard G610 brown
Software yes, lots!
SSDs are huge and cheap, why not just put uncompressed (or less compressed) data there? Even if a game were to use maybe 200 or 300 GB, I would prefer that to the load times of Wasteland 3.... I don't have 10 games installed at every moment, so I could allocate SSD space for the 1-3 games that I am actually playing at the moment.

Ages ago every game would let you choose how much of the installation you wanted to put on the HDD and how much would be left on the CD/DVD. Why not add an option to choose the compression level of stored data?
 
Joined
Feb 13, 2017
Messages
143 (0.05/day)
This is also my concern. When I still used my GTX 1070, I came close to the 8GB limit on several occasions. It was enough for the time, but I am not so sure about the future; with everything getting as big as it does these days, I don't like it, I don't like it one bit.
Same here, even at 1080p. 8/10GB is a bad joke, and the great performance these cards have makes it even more stupid.
 
Deleted member 185088

Guest
Would AMD and Intel dedicate some silicon to data decompression in their CPUs? Both the PS5 and Series X use dedicated silicon for that.
 
Joined
Dec 15, 2009
Messages
233 (0.04/day)
Location
Austria
SSDs are huge and cheap, why not just put uncompressed (or less compressed) data there? Even if a game were to use maybe 200 or 300 GB, I would prefer that to the load times of Wasteland 3.... I don't have 10 games installed at every moment, so I could allocate SSD space for the 1-3 games that I am actually playing at the moment.

Ages ago every game would let you choose how much of the installation you wanted to put on the HDD and how much would be left on the CD/DVD. Why not add an option to choose the compression level of stored data?

Short and sweet: server bandwidth, limited GPU RAM, and SSDs aren't cheap :)
 
Joined
May 2, 2017
Messages
7,762 (2.81/day)
Location
Back in Norway
System Name Hotbox
Processor AMD Ryzen 7 5800X, 110/95/110, PBO +150Mhz, CO -7,-7,-20(x6),
Motherboard ASRock Phantom Gaming B550 ITX/ax
Cooling LOBO + Laing DDC 1T Plus PWM + Corsair XR5 280mm + 2x Arctic P14
Memory 32GB G.Skill FlareX 3200c14 @3800c15
Video Card(s) PowerColor Radeon 6900XT Liquid Devil Ultimate, UC@2250MHz max @~200W
Storage 2TB Adata SX8200 Pro
Display(s) Dell U2711 main, AOC 24P2C secondary
Case SSUPD Meshlicious
Audio Device(s) Optoma Nuforce μDAC 3
Power Supply Corsair SF750 Platinum
Mouse Logitech G603
Keyboard Keychron K3/Cooler Master MasterKeys Pro M w/DSA profile caps
Software Windows 10 Pro
I was replying to btarunr, who was quoting Nvidia's 24-core claim; you missed that.

Anyway, besides the marketing quoting worst-case scenarios, that's definitely a much more efficient way of doing these transfers, and AMD will most probably be doing the same thing, with a different name.
I didn't miss that, I was responding both to that and to your specific mention of Sony's claim of the equivalent of 12 cores of decompression for the PS5.

As for these numbers being a worst case scenario, I disagree, mainly as the scaling is most likely calculated with 100% scaling, i.e. 1 core working 100% with decompression = X, 12 cores = 12X, despite scaling never really being 100% in the real world. As such this is a favorable comparison, not a worst-case scenario, and saying "would require the equivalent of n cores" could just as well end up requiring more than this to account for imperfect scaling. I sincerely hope AMD also adds a decompression accelerator to RDNA2, which would make a lot of sense given that they designed those for both MS and Sony in the first place.
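
The point about imperfect scaling can be illustrated with Amdahl's law. This is only a sketch under an assumption the console vendors never stated: that a fixed fraction of the decompression work is serial. The parallel fractions below are hypothetical, not measured values for any real decompressor, and the function names are made up for illustration.

```python
from typing import Optional


def amdahl_speedup(cores: int, parallel_fraction: float) -> float:
    """Speedup over one core when only part of the work runs in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def cores_for_speedup(target: float, parallel_fraction: float,
                      max_cores: int = 1024) -> Optional[int]:
    """Smallest core count reaching the target speedup, or None if unreachable."""
    for cores in range(1, max_cores + 1):
        # Tiny tolerance so rounding error does not reject an exact match.
        if amdahl_speedup(cores, parallel_fraction) >= target - 1e-9:
            return cores
    return None


if __name__ == "__main__":
    # "Equivalent of 12 cores" read as a 12x speedup over a single core.
    for fraction in (1.0, 0.99, 0.95, 0.90):
        needed = cores_for_speedup(12.0, fraction)
        print(f"parallel fraction {fraction:.2f}: {needed} cores")
```

With perfect scaling the answer is 12 cores; at a 99% parallel fraction it is 14, at 95% it is 29, and at 90% a 12x speedup is out of reach no matter how many cores are added. That is the sense in which "equivalent of n cores" is the favorable figure.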

Edit to add: To the OP and title, I don't see any reason for this kind of optimisation to disappear even when core counts increase. Doing it this way is much more efficient, just as DMA for disk drives is much more efficient. These techniques will eventually be replaced by other technologies, but it makes no sense to route all this data through the CPU only for decompression.
Here I entirely agree with you. There's no reason to move this back to the CPU in the future - it's a workload that only really benefits the GPU (nothing but the GPU really uses compressed game assets, and in the edge cases where the CPU might need some it should be able to handle that), thus alleviating load on the PCIe link by bypassing the CPU, and given that GPUs are more frequently replaced than CPUs it also allows for more flexibility in terms of upgrades, adding new compression standards, etc. Keeping this functionality as a dedicated acceleration block on the GPU makes a ton of sense.

SSDs are huge and cheap, why not just put uncompressed (or less compressed) data there? Even if a game were to use maybe 200 or 300 GB, I would prefer that to the load times of Wasteland 3.... I don't have 10 games installed at every moment, so I could allocate SSD space for the 1-3 games that I am actually playing at the moment.

Ages ago every game would let you choose how much of the installation you wanted to put on the HDD and how much would be left on the CD/DVD. Why not add an option to choose the compression level of stored data?
Sorry, but what world do you live in? NVMe SSDs have come down a lot in price, but cheap? No. Especially not in capacities like what would be needed for even three games with your 2-300GB install sizes. And remember, even with compressed assets games are now hitting 150-200GB. Not to mention the effect removing compression would have on download times, or install times if data was downloaded and then decompressed directly. Compressing game assets is the only logical way of moving forward.
 
Joined
Dec 26, 2016
Messages
287 (0.10/day)
Processor Ryzen 3900x
Motherboard B550M Steel Legend
Cooling XPX (custom loop)
Memory 32GB 3200MHz cl16
Video Card(s) 3080 with Bykski block (custom loop)
Storage 980 Pro
Case Fractal 804
Power Supply Focus Plus Gold 750FX
Mouse G603
Keyboard G610 brown
Software yes, lots!
Short and sweet: server bandwidth, limited GPU RAM, and SSDs aren't cheap :)

Server bandwidth would stay the same, since decompressing would take place at installation at the client, the amount of downloaded data during installation would not change. Installation time would rise a bit though.
GPU ram requirements would not change either, because only decompressed data is stored there, so no change there.
How cheap ssds are is for the user to decide. If you wanna save on ssd volume, you can opt for longer loading times, if you have ssd space to spare, you can opt for the uncompressed installation.
 
Joined
Oct 22, 2014
Messages
14,084 (3.82/day)
Location
Sunshine Coast
System Name H7 Flow 2024
Processor AMD 5800X3D
Motherboard Asus X570 Tough Gaming
Cooling Custom liquid
Memory 32 GB DDR4
Video Card(s) Intel ARC A750
Storage Crucial P5 Plus 2TB.
Display(s) AOC 24" Freesync 1m.s. 75Hz
Mouse Lenovo
Keyboard Eweadn Mechanical
Software W11 Pro 64 bit
Here I entirely agree with you. There's no reason to move this back to the CPU in the future - it's a workload that only really benefits the GPU (nothing but the GPU really uses compressed game assets, and in the edge cases where the CPU might need some it should be able to handle that), thus alleviating load on the PCIe link by bypassing the CPU, and given that GPUs are more frequently replaced than CPUs it also allows for more flexibility in terms of upgrades, adding new compression standards, etc. Keeping this functionality as a dedicated acceleration block on the GPU makes a ton of sense.
Why not an independent co-processor using NVLink that offloads directly, negating the need to use the CPU at all?
 
Joined
May 15, 2020
Messages
697 (0.42/day)
Location
France
System Name Home
Processor Ryzen 3600X
Motherboard MSI Tomahawk 450 MAX
Cooling Noctua NH-U14S
Memory 16GB Crucial Ballistix 3600 MHz DDR4 CAS 16
Video Card(s) MSI RX 5700XT EVOKE OC
Storage Samsung 970 PRO 512 GB
Display(s) ASUS VA326HR + MSI Optix G24C4
Case MSI - MAG Forge 100M
Power Supply Aerocool Lux RGB M 650W
I didn't miss that, I was responding both to that and to your specific mention of Sony's claim of the equivalent of 12 cores of decompression for the PS5.
Let me make it plainer then:
Apparently NVIDIA thinks not. A compressed data stream of game resources can bog down up to 24 cores.
That's most probably a theoretical worst-case scenario that has 0 chance of really happening.
The theoretical worst case is about the 24 core Nvidia claim.
The PS5 guys had said 12 cores on the same issue; that's probably exaggerated a bit, too. And it was about requiring 12 cores, not using all of them.
The PS5 figure is probably exaggerated a bit, in that there might be remaining computing capacity on those 12 cores, basically the same thing you are saying.

I sincerely hope AMD also adds a decompression accelerator to RDNA2, which would make a lot of sense given that they designed those for both MS and Sony in the first place.
The trouble here is that the teams working on RDNA2 discrete graphics cards and those working on consoles are different and under NDAs for 2 years. So it's not clear exactly what could trickle down from the consoles to this generation, though I definitely hope the same as you do, otherwise RDNA2 cards will have trouble keeping up with the next-gen games.
 
Joined
Nov 11, 2016
Messages
3,403 (1.16/day)
System Name The de-ploughminator Mk-III
Processor 9800X3D
Motherboard Gigabyte X870E Aorus Master
Cooling DeepCool AK620
Memory 2x32GB G.SKill 6400MT Cas32
Video Card(s) Asus RTX4090 TUF
Storage 4TB Samsung 990 Pro
Display(s) 48" LG OLED C4
Case Corsair 5000D Air
Audio Device(s) KEF LSX II LT speakers + KEF KC62 Subwoofer
Power Supply Corsair HX850
Mouse Razor Death Adder v3
Keyboard Razor Huntsman V3 Pro TKL
Software win11
Server bandwidth would stay the same, since decompressing would take place at installation at the client, the amount of downloaded data during installation would not change. Installation time would rise a bit though.
GPU ram requirements would not change either, because only decompressed data is stored there, so no change there.
How cheap ssds are is for the user to decide. If you wanna save on ssd volume, you can opt for longer loading times, if you have ssd space to spare, you can opt for the uncompressed installation.

The point of compressing and decompressing data is to increase the "effective" bandwidth: if you compress a file to half its size and send it over a network, you have effectively doubled the network bandwidth.
Nvidia is saying they could get 2x the effective bandwidth out of a PCIe Gen 4 x4 NVMe drive, that is, 14 GB/s of effective bandwidth. Imagine no loading times and no texture pop-in in open-world games.
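
As a rough sketch of that reasoning: effective bandwidth is just the raw link rate multiplied by the compression ratio, and load time follows from it. The 28 GB asset size below is a made-up example, the function names are illustrative, and the model assumes decompression itself is never the bottleneck.

```python
def effective_bandwidth(raw_gb_s: float, compression_ratio: float = 1.0) -> float:
    """Decompressed data delivered per second for a given raw transfer rate."""
    return raw_gb_s * compression_ratio


def load_time_seconds(asset_gb: float, raw_gb_s: float,
                      compression_ratio: float = 1.0) -> float:
    """Time to deliver asset_gb of decompressed data over the link."""
    return asset_gb / effective_bandwidth(raw_gb_s, compression_ratio)


if __name__ == "__main__":
    raw = 7.0  # GB/s, the PCIe Gen 4 x4 SSD figure used in the article
    print(f"Effective bandwidth at 2:1: {effective_bandwidth(raw, 2.0):.0f} GB/s")
    # Hypothetical 28 GB of decompressed assets, with and without compression.
    print(f"Uncompressed on disk: {load_time_seconds(28.0, raw):.1f} s")
    print(f"Compressed 2:1 on disk: {load_time_seconds(28.0, raw, 2.0):.1f} s")
```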
 
Joined
Dec 26, 2016
Messages
287 (0.10/day)
Processor Ryzen 3900x
Motherboard B550M Steel Legend
Cooling XPX (custom loop)
Memory 32GB 3200MHz cl16
Video Card(s) 3080 with Bykski block (custom loop)
Storage 980 Pro
Case Fractal 804
Power Supply Focus Plus Gold 750FX
Mouse G603
Keyboard G610 brown
Software yes, lots!
Sorry, but what world do you live in? NVMe SSDs have come down a lot in price, but cheap? No. Especially not in capacities like what would be needed for even three games with your 2-300GB install sizes. And remember, even with compressed assets games are now hitting 150-200GB. Not to mention the effect removing compression would have on download times, or install times if data was downloaded and then decompressed directly. Compressing game assets is the only logical way of moving forward.

What games hit 150GB? Oh yeah, Flight Simulator, so you're right of course, there is one big game now! Most AAA games are still in the 50GB range!
Good NVMe SSDs cost about $150 per TB.
I don't know what the compression factor of these assets is, but let's say it's 1:6, so a 50GB game comes to 300GB of uncompressed data (not counting that not all of the assets are even GPU related, like sound assets or pre-rendered videos). That would mean you could store three uncompressed games on a 1TB drive.
Most games are not AAA games that are even this big, and a lot of games do fine the way they are now, so only a fraction of games would even need an option for an uncompressed install. Which means you could probably store even more games on that 1TB drive.
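
The arithmetic above, as a quick sketch. The 1:6 compression ratio, 50 GB install size and $150 per TB price are the assumptions stated in this post, not measured values, and the helper names are made up for illustration.

```python
def uncompressed_size_gb(installed_gb: float, compression_ratio: float) -> float:
    """Size on disk if assets were stored fully decompressed."""
    return installed_gb * compression_ratio


def games_per_drive(drive_gb: float, game_gb: float) -> int:
    """Whole games that fit on a drive, ignoring filesystem overhead."""
    return int(drive_gb // game_gb)


def storage_cost_usd(game_gb: float, usd_per_tb: float) -> float:
    """Cost of the drive space one game occupies (1 TB taken as 1000 GB)."""
    return game_gb * usd_per_tb / 1000


if __name__ == "__main__":
    size = uncompressed_size_gb(50, 6)
    print(f"Uncompressed install: {size:.0f} GB")
    print(f"Games per 1 TB drive: {games_per_drive(1000, size)}")
    print(f"Storage cost per game: ${storage_cost_usd(size, 150):.2f}")
```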

A lot of enthusiasts spend high three digits or even four digits on GPUs, so why not spend another $150 on an additional SSD to immensely speed up those data-intensive AAA games?
 
Joined
Oct 12, 2019
Messages
128 (0.07/day)
At this point, my confidence in anything Mr. Leather Jacket says is at a historical minimum. There is a steaming pile of... lies about supposed ray-tracing, where he degraded the official term to 'heavily approximated, always partial ray-tracing with destructive compression-like algorithm, which will work on <1% games' (and none that I play - yes, I'm not interested in Cyberpunk until I see it in real life). He will charge it 1400g, it won't be available to buy and I'll be forced to watch ALL benchmarks done on that bloody thing, which nobody except reviewers will have - oh yes, all 5 games that support that fake-RT will become a standard part of the benchmark suite, to my great enjoyment, and provide highly skewed results for, say, CPUs for anyone who doesn't have that card and doesn't play those games. Great news!

To clear things up, I consider 'heavily approximated, always partial ray-tracing with destructive compression-like algorithm, which will work on <1% games' an advancement and generally a nice feature - except it's just a distant cousin of scene or camera ray-tracing and not the 'ultimate dream'. Yeah, RTX looks great on Minecraft and Quake 2 - but please find an easier example for a derivative of ray-tracing and you'll be rewarded tenfold. Low poly-count, flat surfaces... Why not even Quake 3? Why not the new Wolfenstein? Errr...

So, now he is speeding up M.2 storage? Yes, sure, why not?

I'll believe in all those things *WHEN* I see them at work, on a real (and normal) system with more general benchmarks, not a by-NVIDIA-for-NVIDIA set of 2...
 
Joined
Aug 22, 2016
Messages
167 (0.06/day)
They could have just said "Guys, the new GPUs support Microsoft DirectStorage!", but they had to do their own proprietary stuff on top of it, and no one (without being paid for it) will implement it.
 
Joined
May 2, 2017
Messages
7,762 (2.81/day)
Location
Back in Norway
System Name Hotbox
Processor AMD Ryzen 7 5800X, 110/95/110, PBO +150Mhz, CO -7,-7,-20(x6),
Motherboard ASRock Phantom Gaming B550 ITX/ax
Cooling LOBO + Laing DDC 1T Plus PWM + Corsair XR5 280mm + 2x Arctic P14
Memory 32GB G.Skill FlareX 3200c14 @3800c15
Video Card(s) PowerColor Radeon 6900XT Liquid Devil Ultimate, UC@2250MHz max @~200W
Storage 2TB Adata SX8200 Pro
Display(s) Dell U2711 main, AOC 24P2C secondary
Case SSUPD Meshlicious
Audio Device(s) Optoma Nuforce μDAC 3
Power Supply Corsair SF750 Platinum
Mouse Logitech G603
Keyboard Keychron K3/Cooler Master MasterKeys Pro M w/DSA profile caps
Software Windows 10 Pro
What games hit 150GB? Oh yeah, Flight Simulator, so you're right of course, there is one big game now! Most AAA games are still in the 50GB range!
Good NVMe SSDs cost about $150 per TB.
I don't know what the compression factor of these assets is, but let's say it's 1:6, so a 50GB game comes to 300GB of uncompressed data (not counting that not all of the assets are even GPU related, like sound assets or pre-rendered videos). That would mean you could store three uncompressed games on a 1TB drive.
Most games are not AAA games that are even this big, and a lot of games do fine the way they are now, so only a fraction of games would even need an option for an uncompressed install. Which means you could probably store even more games on that 1TB drive.

A lot of enthusiasts spend high three digits or even four digits on GPUs, so why not spend another $150 on an additional SSD to immensely speed up those data-intensive AAA games?
Here's a list from back in January. Since then we've had Flight Simulator, that CoD BR thing, and a handful of others. Sure, most AAA games are still below 100GB, but install sizes are growing at an alarming rate.

Let me make it plainer then:

The theoretical worst case is about the 24 core Nvidia claim.

The PS5 figure is probably exaggerated a bit, in that there might be remaining computing capacity on those 12 cores, basically the same thing you are saying.


The trouble here is that the teams working on RDNA2 discrete graphics cards and those working on consoles are different and under NDAs for 2 years. So it's not clear exactly what could trickle down from the consoles to this generation, though I definitely hope the same as you do, otherwise RDNA2 cards will have trouble keeping up with the next-gen games.
I think you're also misreading "the Nvidia claim" - all they said is that to decompress a theoretical maximum throughput PCIe 4.0 SSD you would "need" a theoretical 24 CPU cores, which is the equivalent level of decompression performance of their RTX IO decompression block. I don't see this as them saying "this is how much performance you will need in the real world", as no game has ever required that kind of throughput, no such SSD exists, and in general nobody would design a game in that way - at least for another decade.

Also, I think your claim about the RDNA teams is fundamentally flawed. AMD post-RTG is a much more integrated company than previously. And while there are obviously things worked on within some parts of the company that the other parts don't know about, a new major cross-platform storage API provided by an outside vendor (Microsoft) is not likely to be one of these things.

Why not an independent co-processor using NVLink that offloads directly, negating the need to use the CPU at all?
Because that would limit support to boards with free PCIe slots, excluding ITX entirely, require an expensive NVLink bridge, limit support to the 3090, etc. This would likely work just as "well" over a PCIe 4.0 x16 slot, but that would of course limit support to HEDT platforms. Besides, we saw how well dedicated coprocessor AICs worked in the market back when PhysX launched. I.e. not at all.

They could have just said "Guys, the new GPUs support Microsoft DirectStorage!", but they had to do their own proprietary stuff on top of it, and no one (without being paid for it) will implement it.
Is there actually anything proprietary here though? Isn't this just a hardware implementation of DirectStorage? Nvidia doesn't like to say "we support standards", after all, they have to give them a new name, presumably to look cooler somehow.
At this point, my confidence in anything Mr. Leather Jacket says is at a historical minimum. There is a steaming pile of... lies about supposed ray-tracing, where he degraded the official term to 'heavily approximated, always partial ray-tracing with destructive compression-like algorithm, which will work on <1% games' (and none that I play - yes, I'm not interested in Cyberpunk until I see it in real life). He will charge it 1400g, it won't be available to buy and I'll be forced to watch ALL benchmarks done on that bloody thing, which nobody except reviewers will have - oh yes, all 5 games that support that fake-RT will become a standard part of the benchmark suite, to my great enjoyment, and provide highly skewed results for, say, CPUs for anyone who doesn't have that card and doesn't play those games. Great news!

To clear things up, I consider 'heavily approximated, always partial ray-tracing with destructive compression-like algorithm, which will work on <1% games' an advancement and generally a nice feature - except it's just a distant cousin of scene or camera ray-tracing and not the 'ultimate dream'. Yeah, RTX looks great on Minecraft and Quake 2 - but please find an easier example for a derivative of ray-tracing and you'll be rewarded tenfold. Low poly-count, flat surfaces... Why not even Quake 3? Why not the new Wolfenstein? Errr...

So, now he is speeding up M.2 storage? Yes, sure, why not?

I'll believe in all those things *WHEN* I see them at work, on a real (and normal) system with more general benchmarks, not a by-NVIDIA-for-NVIDIA set of 2...
What review sites do you know of that systematically only test games in RT mode? Sure, RT benchmarks will become more of a thing this time around, but I would be shocked if that didn't mean additional testing on top of RT-off testing. And comparing RT-on vs. RT-off is obviously not going to happen (that would make the RT-on GPUs look terrible!).
 
Joined
Dec 26, 2016
Messages
287 (0.10/day)
Processor Ryzen 3900x
Motherboard B550M Steel Legend
Cooling XPX (custom loop)
Memory 32GB 3200MHz cl16
Video Card(s) 3080 with Bykski block (custom loop)
Storage 980 Pro
Case Fractal 804
Power Supply Focus Plus Gold 750FX
Mouse G603
Keyboard G610 brown
Software yes, lots!
Death Stranding: 64 GB
Horizon Zero Dawn: 72 GB
Mount & Blade 2: 51 GB
Red Dead Redemption 2: 110 GB
Star Citizen: 60 GB
 