
Is there any way to make use of unused VRAM?

Mussels

Freshwater Moderator
Joined
Oct 6, 2004
Messages
58,413 (7.96/day)
Location
Oystralia
System Name Rainbow Sparkles (Power efficient, <350W gaming load)
Processor Ryzen R7 5800x3D (Undervolted, 4.45GHz all core)
Motherboard Asus x570-F (BIOS Modded)
Cooling Alphacool Apex UV - Alphacool Eisblock XPX Aurora + EK Quantum ARGB 3090 w/ active backplate
Memory 2x32GB DDR4 3600 Corsair Vengeance RGB @3866 C18-22-22-22-42 TRFC704 (1.4V Hynix MJR - SoC 1.15V)
Video Card(s) Galax RTX 3090 SG 24GB: Underclocked to 1700MHz 0.750V (375W down to 250W)
Storage 2TB WD SN850 NVME + 1TB Samsung 970 Pro NVME + 1TB Intel 6000P NVME USB 3.2
Display(s) Phillips 32 32M1N5800A (4k144), LG 32" (4K60) | Gigabyte G32QC (2k165) | Phillips 328m6fjrmb (2K144)
Case Fractal Design R6
Audio Device(s) Logitech G560 | Corsair Void pro RGB |Blue Yeti mic
Power Supply Fractal Ion+ 2 860W (Platinum) (This thing is God-tier. Silent and TINY)
Mouse Logitech G Pro wireless + Steelseries Prisma XL
Keyboard Razer Huntsman TE ( Sexy white keycaps)
VR HMD Oculus Rift S + Quest 2
Software Windows 11 pro x64 (Yes, it's genuinely a good OS) OpenRGB - ditch the branded bloatware!
Benchmark Scores Nyooom.
As you can see, the processor cannot access VRAM directly. Only the GPU can, and any attempt to use this memory as RAM through workarounds is fun but useless. Even the iGPU has its RAM reserved, impossible for the CPU to access.

View attachment 296919
That render you've shown isn't an official diagram of how these parts are actually designed or how they work.

Direct3D 12 Ultimate and DirectStorage are changing how it all works, and much of it is already active.
The CPU can edit the contents of VRAM, and the GPU can read from NVMe drives. They're all directly linked over PCI-E now.


It's just not that fast at doing so; latencies are good, but overall bandwidth isn't.
So loading in a new texture to prevent microstutter - hell yes.

As a write cache or something... maybe?
The CUDA version was definitely GPU-controlled and showed no CPU usage in my testing, but the CPU version of the software would likely have been driven in software by the CPU.
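If you want to see the "CPU can touch VRAM, just not quickly" point in practice, here's a minimal CUDA sketch (buffer size and names are arbitrary, not from any of the tools discussed here) that times a pinned host-to-VRAM copy. On a PCI-E 4.0 x16 card it tops out near the ~32GB/s bus ceiling, nowhere near the VRAM's internal bandwidth:

Code:
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 1ull << 30;   // 1 GiB test buffer (arbitrary size)
    void *host, *dev;
    cudaMallocHost(&host, bytes);      // pinned host memory, needed for full PCI-E speed
    cudaMalloc(&dev, bytes);           // a plain VRAM allocation

    cudaEvent_t t0, t1;
    cudaEventCreate(&t0);
    cudaEventCreate(&t1);

    cudaEventRecord(t0);
    cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);  // CPU -> VRAM over PCI-E
    cudaEventRecord(t1);
    cudaEventSynchronize(t1);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, t0, t1);
    printf("host -> VRAM: %.1f GB/s\n", (bytes / 1e9) / (ms / 1e3));
    return 0;
}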
 
Joined
Jun 6, 2022
Messages
622 (0.70/day)
??? I think you've misunderstood how that all works.
Very old image, but a clear example:
View attachment 296917
I do not believe it.

"Each chip uses two separate 16-bit channels to connect to a single 32-bit"

The idea is that the CPU cannot use the real performance of the VRAM.
To simplify: a defender cannot make up for the lack of a striker, nor vice versa. Each has its role, and that is where it delivers the most.
 
Joined
Feb 1, 2019
Messages
3,526 (1.67/day)
Location
UK, Midlands
System Name Main PC
Processor 13700k
Motherboard Asrock Z690 Steel Legend D4 - Bios 13.02
Cooling Noctua NH-D15S
Memory 32 Gig 3200CL14
Video Card(s) 4080 RTX SUPER FE 16G
Storage 1TB 980 PRO, 2TB SN850X, 2TB DC P4600, 1TB 860 EVO, 2x 3TB WD Red, 2x 4TB WD Red
Display(s) LG 27GL850
Case Fractal Define R4
Audio Device(s) Soundblaster AE-9
Power Supply Antec HCG 750 Gold
Software Windows 10 21H2 LTSC
It might be interesting to use a VRAM disk with PrimoCache -- maybe as a multi-gigabyte write cache (to save on writes to SSDs without using up system RAM).
Why would I do that when system RAM is plentiful and VRAM is scarce?
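For context, the VRAM-drive idea boils down to staging dirty blocks in a device buffer until they can be flushed to disk. A toy sketch of the concept (the names, sizes, and structure here are invented for illustration; this is not how PrimoCache or the CUDA ramdisk tools actually work):

Code:
#include <cuda_runtime.h>

// Toy "VRAM write cache": park a dirty 4 KiB block in device memory
// until a flusher thread writes it out to the SSD. Purely illustrative.
const size_t BLOCK = 4096;
const size_t SLOTS = 1 << 20;   // ~4 GiB of VRAM used as cache space

char *g_cache;                  // device-side slab

void cache_init() { cudaMalloc(&g_cache, BLOCK * SLOTS); }

// CPU pushes a block into VRAM. Note every byte crosses PCI-E twice in
// its lifetime (in now, back out at flush time) -- the objection raised
// in this thread.
void cache_put(size_t slot, const char *data) {
    cudaMemcpy(g_cache + slot * BLOCK, data, BLOCK, cudaMemcpyHostToDevice);
}

void cache_flush(size_t slot, char *out) {
    cudaMemcpy(out, g_cache + slot * BLOCK, BLOCK, cudaMemcpyDeviceToHost);
    // ...then write `out` to the SSD as usual.
}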
 

Mussels

Freshwater Moderator
I do not believe it.

"Each chip uses two separate 16-bit channels to connect to a single 32-bit"


The idea is that the CPU cannot use the real performance of the VRAM.
To simplify: a defender cannot make up for the lack of a striker, nor vice versa. Each has its role, and that is where it delivers the most.
The first quoted line: correct.
The second line... incorrect. Poor analogy.

The CPU can't use the VRAM at full speed because it has no direct access to it, and no high-speed link to it.
The CPU has to tell the GPU what to do; the GPU then uses the VRAM, and then passes the result back.
That's not about anything's role, but about the order of events and the speed between each of them, across multiple metrics: bandwidth, latency, and processing delays.
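That order of events, in miniature - a trivial CUDA round trip where the kernel is just a placeholder for "the GPU does work":

Code:
#include <cstdio>
#include <cuda_runtime.h>

__global__ void bump(int *x) { *x += 1; }   // stand-in for real GPU work

int main() {
    int host = 41, *dev;
    cudaMalloc(&dev, sizeof(int));
    cudaMemcpy(dev, &host, sizeof(int), cudaMemcpyHostToDevice); // 1. CPU -> VRAM
    bump<<<1, 1>>>(dev);                                         // 2. GPU touches VRAM
    cudaMemcpy(&host, dev, sizeof(int), cudaMemcpyDeviceToHost); // 3. result back to CPU
    printf("%d\n", host);   // 42 -- three hops, each with its own delay
    return 0;
}

Each hop adds its own latency, which is why the chain matters more than any single link's bandwidth.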

As stated, DX12 has updates that change this, but they're not in use yet and were definitely not part of that CUDA VRAM-drive code.

I'm still not sure what you think you're saying with regard to the memory.
They use two 1GB or 2GB modules paired together to connect to the GPU's 32-bit-wide memory buses, and then multiply that to reach their desired total - in this case, 128 bits wide (and 192 bits wide on the previous gen).
I can discuss a lot more of that, but stating random facts and numbers doesn't explain what you mean by any of it.
PCI-E 4.0 x16 has a theoretical max of 32GB/s before any overheads, so VRAM transfers could never exceed that, and PCI-E 5.0 GPUs could do 64GB/s... if the GPU architecture could actually provide that level of speed.
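The lane math behind those figures, as a quick sanity check (standard PCI-E signalling rates assumed; the 128b/130b encoding overhead is why the results land slightly under the round numbers):

Code:
#include <cstdio>

int main() {
    // PCI-E 4.0: 16 GT/s per lane with 128b/130b encoding -> ~1.97 GB/s per lane.
    double lane_gen4 = 16e9 * (128.0 / 130.0) / 8.0 / 1e9;    // GB/s per lane
    printf("PCI-E 4.0 x16: ~%.1f GB/s\n", lane_gen4 * 16);    // ~31.5 GB/s
    printf("PCI-E 5.0 x16: ~%.1f GB/s\n", lane_gen4 * 32);    // ~63.0 GB/s (double the rate)

    // The VRAM side for comparison: 288 GB/s on a 128-bit bus implies
    // 18 Gbps per pin, i.e. 16 bytes per transfer * 18e9.
    printf("128-bit @ 18 Gbps: %.0f GB/s\n", 128.0 / 8.0 * 18.0);
    return 0;
}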

This quote sums the thread up nicely:
In testing with a variety of games and synthetic benchmarks, the 32 MB L2 cache reduced memory bus traffic by just over 50% on average compared to the performance of a 2 MB L2 cache. See the reduced VRAM accesses in the Ada Memory Subsystem diagram above.

This 50% traffic reduction allows the GPU to use its memory bandwidth 2X more efficiently. As a result, in this scenario, isolating for memory performance, an Ada GPU with 288 GB/sec of peak memory bandwidth would perform similarly to an Ampere GPU with 554 GB/sec of peak memory bandwidth. Across an array of games and synthetic tests, the greatly increased hit rates improve frame rates by up to 34%.
They care about internal speed, not external.

By using a caching system they managed to keep the same gaming performance despite halving the bandwidth - but that VRAM is now half as fast for something as simple as file transfers on a VRAM drive. The cache would not help that sort of activity, as the GPU as a whole was never designed or optimised for it.


Yay, your 128-bit GPU with 288GB/s of bandwidth can send at best 64GB/s of that to another location of equal speed - which can't exist within the system, as that would fill the DRAM in seconds and overwhelm even the fastest NVMe drives. Then factor in file sizes, file types, overheads, compression, and even simple things like reading and writing at the same time, all of which slow those values down, and it becomes a very poor proposition to use VRAM for anything other than its intended purpose: being where the GPU stores the code it's crunching away at.
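Putting rough numbers on that last point (a sketch of the arithmetic only; the 32 GB DRAM pool is just an example size):

Code:
#include <cstdio>

int main() {
    // NVIDIA's claim from the quote above: halving bus traffic makes the
    // bandwidth behave as if doubled.
    double peak = 288.0;                        // GB/s, the Ada example
    double effective = peak / (1.0 - 0.50);     // ~576 GB/s -- close to Ampere's 554
    printf("288 GB/s with a 50%% traffic cut acts like ~%.0f GB/s\n", effective);

    // The export path is still PCI-E: ~64 GB/s on 5.0 x16, which would fill
    // a 32 GB system-RAM pool in half a second and dwarf any NVMe drive.
    printf("32 GB of DRAM filled in %.1f s at 64 GB/s\n", 32.0 / 64.0);
    return 0;
}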
 

OneMoar

There is Always Moar
Joined
Apr 9, 2010
Messages
8,794 (1.65/day)
Location
Rochester area
System Name RPC MK2.5
Processor Ryzen 5800x
Motherboard Gigabyte Aorus Pro V2
Cooling Thermalright Phantom Spirit SE
Memory CL16 BL2K16G36C16U4RL 3600 1:1 micron e-die
Video Card(s) GIGABYTE RTX 3070 Ti GAMING OC
Storage Nextorage NE1N 2TB ADATA SX8200PRO NVME 512GB, Intel 545s 500GBSSD, ADATA SU800 SSD, 3TB Spinner
Display(s) LG Ultra Gear 32 1440p 165hz Dell 1440p 75hz
Case Phanteks P300 /w 300A front panel conversion
Audio Device(s) onboard
Power Supply SeaSonic Focus+ Platinum 750W
Mouse Kone burst Pro
Keyboard SteelSeries Apex 7
Software Windows 11 +startisallback