AMD "Vega" High Bandwidth Cache Controller Improves Minimum and Average FPS

btarunr · Feb 28, 2017

At its Capsaicin & Cream event today, AMD announced that its High Bandwidth Cache Controller (HBCC), a feature introduced by its "Vega" GPU architecture to improve memory management, will increase game performance tangibly. The company did a side-by-side comparison between two sessions of "Deus Ex: Mankind Divided," in which an HBCC-aware machine purportedly presented 2x better minimum FPS, and 1.5x better average FPS scores, than a non-HBCC-aware system (though the old, trusty frame-rate counter was conspicuously absent from both demos).

AMD also went on to show how HBCC seemingly halves memory requirements, by deliberately capping the amount of addressable memory on the HBCC-aware system to only 2 GB - half of the 4 GB addressable by the non-HBCC-aware system, while claiming that even so, the HBCC-enabled system still showed "the same or better performance" through its better memory management and bandwidth speeds. If these results do hold up to scrutiny, this should benefit implementations of "Vega" with lower amounts of video memory, while simultaneously reducing production costs and overall end-user pricing, since smaller memory pools would be needed for the same effect.

the54thvoid · Feb 28, 2017

What's the implication of throttling the GPU memory to 2GB? It sounds like an artificial 'improvement' that fails after 2GB of memory usage....

acperience7 · Feb 28, 2017

I don't understand how lowering the amount of VRAM would help with FPS. Can anyone explain this?

BiggieShady · Feb 28, 2017

By lowering amount of available VRAM, GPU is forced to actually heavily use High Bandwidth Cache ... that's how I got it, as a comparison of HBC vs. RAM-PCIE-VRAM transfer
Having less VRAM should then have less impact on FPS ... hence the incoming news about Vega's HBCC halving memory requirements

happita · Feb 28, 2017

The article says Human Revolution which is an older game, but the picture shows Mankind Divided. I'm thinking it's the latter, right?

Steevo · Feb 28, 2017

BiggieShady said:
By lowering amount of available VRAM, GPU is forced to actually heavily use High Bandwidth Cache ... that's how I got it

You are implying the GPU is aware of something instead of engineered to use the HBC with logic to move not only instructions but highly used textures, maps, and models into the cache.

I would liken this to when we bought motherboards that had their own cache, the CPU was unaware of the cache other than if it found instructions in it, they were executed much faster than on system memory. I believe their CPU division and engineering of SOC for Sony and MS is paying dividends for GPU tech as well.

Camm · Feb 28, 2017

This is cool, but I'm still somewhat concerned of how much die space this takes up.

BiggieShady · Feb 28, 2017

Steevo said:
You are implying the GPU is aware of something instead of engineered to use the HBC with logic to move not only instructions but highly used textures, maps, and models into the cache.

I would liken this to when we bought motherboards that had their own cache, the CPU was unaware of the cache other than if it found instructions in it, they were executed much faster than on system memory. I believe their CPU division and engineering of SOC for Sony and MS is paying dividends for GPU tech as well.

GPUs already do this with RAM serving as an extension to VRAM and using PCIe for transfer when the VRAM amount isn't enough (less used textures end up in RAM and often used ones in VRAM) ... putting a high bandwidth memory cache buffer in between to battle stutters (this feature will shine on low end parts with less VRAM) ... nice one amd *slow clap*

londiste · Feb 28, 2017

amd is clearly talking about memory-starved situations. this should not have much effect if your gpu actually has access to necessary amount of vram.

RejZoR · Feb 28, 2017

The reason they lowered the VRAM availability is that they wanted to place Vega into the worst possible situation. A situation where HBC really shows its strength, when a game's VRAM usage goes beyond what you actually have on-board.

Only thing that I wonder about is if HBC can do the data management on its own or if it has to be specifically coded to use it. Because if it can be used out of the box with anything, it'll be awesome. But if you have to specifically code for it, then that's a problem by itself.

snakefist · Feb 28, 2017

Yes, this is not-enough-RAM situation, simulated. From that point of view, it's good thing, more future-proof (hoping that it comes down to middle segment, too - high-end users tend to replace GPUs more often, anyway)...

NdMk2o1o · Feb 28, 2017

Looks like hbm won't be limited to their top tier cards in future then perhaps? You could put 2gb hbm on a mid range card and get the same or better performance of it having 4gb gddr5!?...

Steevo · Feb 28, 2017

I wonder where I get my degree in keyboard engineering?

From the posts in this thread at the Nvidia school of fanboy!!

First we had people butthurt about all the AMD news since if you have to buy AMD you are obviously a piss poor peon that shouldn't have a computer, and now we have a lot of posts about a new technology from AMD and lots of hate tossed its way by salad tossers, with no syrup.

RejZoR · Feb 28, 2017

snakefist said:
Yes, this is not-enough-RAM situation, simulated. From that point of view, it's good thing, more future-proof (hoping that it comes down to middle segment, too - high-end users tend to replace GPUs more often, anyway)...

The concept is not new though. There were NVIDIA TurboCache and ATI HyperMemory cards that utilized small VRAM (usually just up to 256MB) while the rest was done via system RAM. It was a super cost-effective solution that delivered basically the same framerate as the one with all that memory on the graphics card itself. And what AMD has done here is just a very refined HyperMemory. And they aren't using fast on-board memory, they are using REALLY fast on-board memory.

Pruny · Feb 28, 2017

RejZoR said:
The concept is not new though. There were NVIDIA TurboCache and ATI HyperMemory cards that utilized small VRAM (usually just up to 256MB) while the rest was done via system RAM. It was a super cost-effective solution that delivered basically the same framerate as the one with all that memory on the graphics card itself. And what AMD has done here is just a very refined HyperMemory. And they aren't using fast on-board memory, they are using REALLY fast on-board memory.

HyperMemory cards were so weak that they could not use that extra memory. It was a scam to sell crappy cards.

the54thvoid · Feb 28, 2017

Ah, I see now - they simulated the card only having 2GB ram, not dealing with 2 GB of texture or video data... Very nice in that case.

RejZoR · Feb 28, 2017

the54thvoid said:
Ah, I see now - they simulated the card only having 2GB ram, not dealing with 2 GB of texture or video data... Very nice in that case.

Well, the big Vega comes with 8GB of VRAM, that's enough even for most hungry games today. They were forced to simulate it to make a point. Otherwise it would just all run in the VRAM anyway. What this means is that you have 8GB on-board. But the game can utilize beyond that and you won't have any performance penalty where currently, if it goes past the on-board memory, performance will just tank like insane.

@Pruny
That's not entirely true. The cards were VRAM starved to begin with, some packing only 32MB of on-board VRAM. And then it was expanded to 128MB which was a common standard in 2004/2005. Meaning the cards could be ridiculously cheap since they hardly had any expensive RAM on them.

W1zzard · Feb 28, 2017

RejZoR said:
The concept is not new though. There were NVIDIA TurboCache and ATI HyperMemory cards that utilized small VRAM (usually just up to 256MB) while the rest was done via system RAM. It was a super cost-effective solution that delivered basically the same framerate as the one with all that memory on the graphics card itself. And what AMD has done here is just a very refined HyperMemory. And they aren't using fast on-board memory, they are using REALLY fast on-board memory.

What is new here is that apparently the GPU has some concept of virtual memory, like your CPU does. Think pagefile. This is completely transparent to the application. Memory pages will be paged out automatically when memory gets low, probably based on some recently-used algorithm. When a page fault is generated by the GPU, the relevant pages are paged in by the GPU, automagically, but with higher latency.
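To make the paging idea above concrete, here is a minimal sketch of a fixed-size local memory backed by a larger, slower store. It assumes a plain least-recently-used eviction policy, which is only a guess at what "some recently used algorithm" might mean. `PagedMemory` and `run` are hypothetical names for illustration, and nothing here reflects how AMD's controller is actually built.

```python
from collections import OrderedDict


class PagedMemory:
    """Toy model: a small, fast memory that pages data in from a slower store."""

    def __init__(self, capacity_pages):
        self.capacity = capacity_pages
        self.resident = OrderedDict()  # page id -> True, least recently used first
        self.hits = 0
        self.faults = 0

    def access(self, page):
        """Touch a page; the caller never needs to know where it lives."""
        if page in self.resident:
            self.hits += 1
            self.resident.move_to_end(page)  # mark as most recently used
            return "hit"
        # Page fault: the page has to come from the slower backing store.
        self.faults += 1
        if len(self.resident) >= self.capacity:
            self.resident.popitem(last=False)  # evict the least recently used page
        self.resident[page] = True
        return "fault"


def run(trace, capacity_pages):
    """Replay a list of page ids and return (hits, faults)."""
    mem = PagedMemory(capacity_pages)
    for page in trace:
        mem.access(page)
    return mem.hits, mem.faults
```

The "application" only ever calls `access`, which is the transparency part: which pages are resident is decided entirely behind its back. Replaying the same trace with a smaller `capacity_pages` is roughly what the 2 GB cap in the demo simulates, and each fault stands in for a higher-latency transfer.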

Pruny · Feb 28, 2017

Cards from the X1050 to the X1550 had HyperMemory, and those were rubbish.

londiste · Feb 28, 2017

RejZoR said:
The concept is not new though. There were NVIDIA TurboCache and ATI HyperMemory cards that utilized small VRAM (usually just up to 256MB) while the rest was done via system RAM. It was super cost effective solution that delivered basically the same framerate as the one with all that memory on the graphic card itself. And what AMD has done here is just a very refined HyperMemory. And they aren't using fast on-board memory, they are using REALLY fast onboard memory.

the interesting part about hbcc is that based on what little has been revealed about this they are going the other way on the memory hierarchy. this seems to be memory controller infused with some level of ability to control l2 cache. it will be interesting to learn how and what exactly has been achieved.

W1zzard said:
What is new here is that apparently the GPU has some concept of virtual memory, like your CPU does. Think pagefile. This is completely transparent to the application. Memory pages will be paged out automatically when memory gets low, probably based on some recently-used algorithm. When a page fault is generated by the GPU, the relevant pages are paged in by the GPU, automagically, but with higher latency.

that is what hypermemory and turbocache (and their successors) already did.

speculation at this point but the new nuance from this side seems to be the 'cache' part that hints at new pieces in the hierarchy. looks like amd might be preparing to equip cards with multiple types of memory, perhaps both hbm and gddr5x and the new improved controller will be able to handle this better.

Steevo · Feb 28, 2017

Pruny said:
Cards from the X1050 to the X1550 had HyperMemory, and those were rubbish.

Are you replying to yourself? Are you a bot? The 1050 was more of a "get Aero on Vista and pretties for $45" card aimed at office computers than anything. I don't think I ever put anything less than a 1600 in a computer.

RejZoR · Feb 28, 2017

W1zzard said:
What is new here is that apparently the GPU has some concept of virtual memory, like your CPU does. Think pagefile. This is completely transparent to the application. Memory pages will be paged out automatically when memory gets low, probably based on some recently-used algorithm. When a page fault is generated by the GPU, the relevant pages are paged in by the GPU, automagically, but with higher latency.

So, in essence, AMD has expanded the cache hierarchy. We have L1 and L2 on the GPU itself, L3 is basically VRAM (I'm not aware of L3 being used on GPUs the way it is on CPUs, or is it?) and now they've added L4, which is system RAM. All this is usually controlled by algorithm/prediction based prefetchers.

I mean, if this will be fully automatic without any need for special game code, it's gonna be nice and it's going to dramatically expand the usability of the graphics card over time as it ages and new demanding games come out that need more memory to work. Sure, it won't be as fast as having that much VRAM available at all times, but it won't be nearly as bad as running out of VRAM entirely. I know Win8/Win10 already does this to a small extent, but I don't think it's anywhere near the extent to which Vega will be doing it.

I mean, with Vega, my 32GB of system RAM will finally find a very good use. Because for games, not even 16GB is really needed. Meaning other 16GB is idling to itself most of the time. But Vega will be able to use that. I like the idea very much.
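A rough way to see why spilling into system RAM "won't be as fast" is to average the time per byte across the two tiers. The sketch below does that with a simple two-tier model. The function name `effective_bandwidth` and the bandwidth figures in the usage comment are illustrative assumptions, not measured or announced Vega numbers.

```python
def effective_bandwidth(local_fraction, local_gb_s, remote_gb_s):
    """Average bandwidth when a fraction of traffic is served from local memory.

    local_fraction: share of bytes found in on-board memory (0.0 to 1.0)
    local_gb_s:     assumed bandwidth of on-board memory, in GB/s
    remote_gb_s:    assumed bandwidth of the path to system RAM, in GB/s
    """
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    # Time to move one GB is the weighted sum of the time spent in each tier.
    seconds_per_gb = local_fraction / local_gb_s + (1.0 - local_fraction) / remote_gb_s
    return 1.0 / seconds_per_gb


# Illustrative only: 400 GB/s on-board, 16 GB/s to system RAM, 95% served locally.
# print(effective_bandwidth(0.95, 400.0, 16.0))
```

Under these made-up numbers even a small share of traffic going to system RAM pulls the average down a lot, which is why keeping the right pages resident matters more than the raw size of the system RAM pool.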

londiste · Feb 28, 2017

aren't you forgetting that ram is at the other side of (likely actively used) pci-e x16?

Steevo · Feb 28, 2017

londiste said:
aren't you forgetting that ram is at the other side of (likely actively used) pci-e x16?

The actual amount of data used is negligible over the PCIe bus, I suggest a read of the PCIe scaling article W1zz did, most graphics cards only use X4 lanes of 2.0 in actual bandwidth, more than that only gives a few percent (not frames per second) more performance, so 60FPS +/- 3% doesn't really mean much.
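For a sense of scale on the PCIe side of this argument, the theoretical per-direction link bandwidth can be worked out from the published per-lane signalling rates and line encodings. This is a back-of-the-envelope sketch: the function names are made up for illustration, real-world throughput is lower because of protocol overhead, and the 256 MB transfer size is an arbitrary example rather than anything AMD stated.

```python
# Per-lane signalling rate in GT/s and line-encoding efficiency.
PCIE_GENERATIONS = {
    "2.0": (5.0, 8 / 10),     # 8b/10b encoding
    "3.0": (8.0, 128 / 130),  # 128b/130b encoding
}


def link_bandwidth_gb_s(generation, lanes):
    """Theoretical peak bandwidth per direction, in GB/s."""
    rate_gt_s, efficiency = PCIE_GENERATIONS[generation]
    return rate_gt_s * efficiency / 8 * lanes  # divide by 8: bits to bytes


def transfer_time_ms(megabytes, generation, lanes):
    """Time to move a block of data across the link at peak rate, in ms."""
    gigabytes = megabytes / 1000
    return gigabytes / link_bandwidth_gb_s(generation, lanes) * 1000


if __name__ == "__main__":
    for gen, lanes in (("2.0", 4), ("2.0", 16), ("3.0", 16)):
        print(f"PCIe {gen} x{lanes}: {link_bandwidth_gb_s(gen, lanes):.2f} GB/s")
    print(f"256 MB over PCIe 3.0 x16: {transfer_time_ms(256, '3.0', 16):.2f} ms")
```

PCIe 2.0 x4 works out to 2 GB/s and 3.0 x16 to a little under 16 GB/s. Moving a hypothetical 256 MB over 3.0 x16 takes about as long as a single frame at 60 FPS, so both posters have a point: normal rendering traffic is small, but paging large chunks of data on demand is not free.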

Nabarun · Feb 28, 2017

Bla bla bla. Do I get to buy something that performs like the 1070 but costs less than 460/950? Otherwise it's just fvcking bla bla bla.
