Saturday, March 23rd 2024
AMD 24.3.1 Drivers Unlock RX 7900 GRE Memory OC Limits, Additional Performance Boost Tested
Without making much noise, AMD lifted the memory overclocking limits of the Radeon RX 7900 GRE graphics card with its latest Adrenalin 24.3.1 WHQL drivers, TechPowerUp found. The changelog is a bit vague and states "The maximum memory tuning limit may be incorrectly reported on AMD Radeon RX 7900 GRE graphics products."—we tested it. The RX 7900 GRE has been around since mid-2023, but gained prominence as the company gave it a global launch in February 2024, to help AMD better compete with the NVIDIA GeForce RTX 4070 Super. Before this, the RX 7900 GRE had started out its lifecycle as a special edition product confined to China, and its designers had ensured that it came with just the right performance positioning that didn't end up disrupting other products in the AMD stack. One of these limitations had to do with the memory overclocking potential, which was probably put in place to ensure that the RX 7900 GRE has a near-identical total board power as the RX 7800 XT.
Shortly after the global launch of the RX 7900 GRE, and responding to drama online, AMD declared the limited memory overclocking range a bug and promised a fix. The overclocking limits are defined in the graphics card VBIOS, so increasing those limits would mean shipping BIOS updates for over a dozen SKUs from all the major vendors, and requiring users to flash the update themselves. Such a solution isn't very practical, so AMD implemented a clock limit override in their new drivers, which reprograms the power limits on the GPU during boot-up. Nicely done, good job AMD!
During the course of our testing of the PowerColor RX 7900 GRE Hellhound graphics card, we were playing around with overclocking using the latest 24.3.1 WHQL drivers, and found that they increased the memory overclocking slider limit in AMD Software, which can now be pushed all the way up to 3000 MHz (24 Gbps GDDR6-effective). Previously the highest possible setting was 2316 MHz. This doesn't necessarily mean that the memory will overclock all the way up to 24 Gbps; you're still limited by what the GDDR6 chips are capable of. Our PowerColor Hellhound ships with Samsung K4ZAF325BC-SC20 memory chips that are rated for 20 Gbps. With our review drivers for the RX 7900 GRE, we had managed a memory overclock of 2316 MHz (18.5 Gbps GDDR6-effective); but with the new drivers, we scored a spectacular 2604 MHz (just over 20 Gbps), which beats the 19.5 Gbps speed that the RX 7800 XT ships with.
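The slider values and data rates quoted above follow a fixed ratio. The short Python sketch below reproduces that arithmetic, assuming the AMD Software slider value multiplied by 8 gives the effective per-pin GDDR6 data rate (the ratio implied by the 3000 MHz and 24 Gbps pairing). The function and constant names are hypothetical, chosen for illustration only.

```python
# Assumption: slider MHz x 8 = effective per-pin data rate, as implied by
# the article's "3000 MHz (24 Gbps GDDR6-effective)" pairing.
RATE_MULTIPLIER = 8  # hypothetical name, not an AMD-defined constant


def effective_gbps(slider_mhz: float) -> float:
    """Convert a memory slider value in MHz to an effective rate in Gbps."""
    return slider_mhz * RATE_MULTIPLIER / 1000.0


if __name__ == "__main__":
    settings = [
        ("old slider limit / review-driver overclock", 2316),
        ("overclock reached on 24.3.1", 2604),
        ("new slider limit", 3000),
    ]
    for label, mhz in settings:
        print(f"{label}: {mhz} MHz -> {effective_gbps(mhz):.1f} Gbps")
```

Under that assumption, 2316 MHz works out to about 18.5 Gbps and 2604 MHz lands a little above 20 Gbps, so the overclocked chips run slightly past their 20 Gbps rating.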
The increased memory speed sees our 3DMark Time Spy GT1 overclocked frame rate jump from 72.6 FPS to 77.1 FPS (GPU frequency was constant between the two runs at 2803 MHz). This brings the card's total overclocking potential to an impressive 15% real-life performance gain. It remains a mystery why AMD chose to go with a slower memory sub-system than the RX 7800 XT for the RX 7900 GRE. It may have to do with achieving an almost identical board power number to the RX 7800 XT, so that board partners could end up with the same cooler noise figures as their RX 7800 XT products; or it was just a product segmentation decision—we'll never know. With the 20 Gbps overclock, the RX 7900 GRE has a hearty 640 GB/s of memory bandwidth at its disposal, which should come in handy to keep the 80 RDNA 3 compute units better-fed.
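The bandwidth and frame-rate numbers in this paragraph can be checked with a few lines of Python. This is a rough sketch: it assumes a 256-bit memory bus, which the article does not state but which is consistent with 640 GB/s at 20 Gbps. The function names are hypothetical.

```python
# Assumption: 256-bit memory bus (not stated in the article, but it matches
# the quoted 640 GB/s at 20 Gbps).
BUS_WIDTH_BITS = 256


def bandwidth_gb_per_s(data_rate_gbps: float, bus_width_bits: int = BUS_WIDTH_BITS) -> float:
    """Peak memory bandwidth in GB/s: per-pin rate times bus width, bits to bytes."""
    return data_rate_gbps * bus_width_bits / 8


def percent_gain(before_fps: float, after_fps: float) -> float:
    """Relative frame-rate improvement in percent."""
    return (after_fps - before_fps) / before_fps * 100.0


if __name__ == "__main__":
    print(f"Bandwidth at 20 Gbps: {bandwidth_gb_per_s(20.0):.0f} GB/s")
    print(f"Time Spy GT1 gain from memory OC: {percent_gain(72.6, 77.1):.1f}%")
```

The move from 72.6 FPS to 77.1 FPS is about a 6% gain from the extra memory clock alone. The 15% figure covers the card's total overclocking gain and cannot be recomputed here, because the stock frame rate is not given in this article.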
Thanks to @Dragokar for letting us know of the driver change.
32 Comments on AMD 24.3.1 Drivers Unlock RX 7900 GRE Memory OC Limits, Additional Performance Boost Tested
Nvidia's pricing is insane with 4070 Ti Super way dearer than 7900XT and 4080 Super starting at $1900 way dearer than XTX.
Insane, in some games it's almost RX 7900 XT performance. We've got a new value king here.
That said, people shouldn't go into GRE thinking every card's VRAM is going to be able to overclock up to 7900 XT spec. AIBs like changing cards up based on GDDR chip availability/pricing without public notice. Stock clocks are all that is guaranteed/warrantied.
It's still not clear to me why AMD is doing this unless consumer overclocks are exempt from the 600 GB/s regulation. It's possible Hack-A-Day misinterpreted the regulation but what we do know is that the regulation mostly targeted AI/high performance computing which often ship with HBM. The regulation effectively made exporting HBM products to China illegal. The language might be fuzzy but the real world application is the 600 GB/s limit on GDDR/HBM performance which have almost entirely vanished from the Chinese market.
Quoting:
"The GDDR6 memory chips are made by Samsung and carry the model number K4ZAF325BC-SC20. They are specified to run at 2500 MHz (20 Gbps effective)."
So you are suggesting that I have received a different memory PN or Samsung is selling overspecced ICs? Given that the TPU BIOS dump for the card is correct, the BIOS supports either Hynix H56G42AS8DX014 or Samsung K4ZAF325BC. Judging by Google, both chip models can be found on GRE/XT/XTX alike, and the Samsung part doesn't come in a variant specced for 18 Gbps: it is available as SC16 for 16 Gbps or SC20 specced for 20 Gbps. Looks like an issue with the memory controller instead, which of course suggests that it is "in the reference spec", given they underclock to 2250 MHz. It is completely baffling they would put 20 Gbps VRAM into a card capable of only 18 Gbps...
Perhaps there is a bigger issue for AMD, given that many people on XT/XTX are suffering from black screen issues - did AMD badly bin RDNA3 with modules incapable of >2400MHz VRAM clocks into XT/XTX and those are suffering from the problems? Overclocking on GRE >2400MHz very similarly causes black screen & system hard reset. Maybe there is a bigger RDNA3 failure at play, not a GRE-related one...
LabRat 891 asked for GPU-Z screenshot, here it is.
I did a lot more digging into what happened. Short version is this:
1) November 2022, the performance export ban was issued that mentions the 600 GB/s rule on non-volatile memory: hackaday.com/2022/11/09/chinese-chips-are-being-artificially-slowed-to-dodge-us-export-regulations/
2) July 2023, 7900 GRE launches: www.tomshardware.com/news/amd-radeon-rx-7900-gre-launch
3) October 2023, Department of Commerce announces change in the export ban that strongly targets tensor cores (TFLOP calculation): www.tomshardware.com/news/no-nvidia-isnt-breaking-gpu-sanctions-analyst
4) November 2023, the rule goes into effect, 4090 disappears because it's not compliant, and 7900 series sales surge in China to fill the demand: hothardware.com/news/amds-flagship-radeon-rx-7900-xtx-and-xt-gpus-flourish-in-china
I believe GRE was created in case the rule was modified to apply the 600 GB/s rule to volatile memory which would limit Chinese sales of 7900 XT and 7900 XTX. Dell apparently misinterpreted the rule and proactively applied it in November 2023: www.techpowerup.com/316044/dell-allegedly-prohibits-sales-of-high-end-radeon-and-instinct-mi-gpus-in-china
AMD wasn't the only company to release a 600 GB/s product preemptively; Biren Technology (a Chinese company) did the same (see the Hackaday article above).
Because the 2023 rule replaced and invalidated the 2022 rule, which mentioned 600 GB/s, AMD felt they were no longer threatened by it, so they enabled overclocking in the driver for the card.
So, all good now, except that Chinese can't get 4090 anymore (hence 4090D www.tomshardware.com/pc-components/gpus/nvidia-launches-china-specific-rtx-4090d-dragon-gpu-sanctions-compliant-model-has-fewer-cores-and-lower-power-draw).
Yeah, 4090 was a no-no due to FP32 (Tflops), I knew that part. Didn't know/remember there was an 8-bit (TOP) performance(/density) update to the law, but I can certainly believe that part (I won't get into it).
I think we used to live in a world that largely relied on full/double precision (FP32/64), whereas now that has morphed into larger use of FP16 and 8-bit calculations (which can be more densely packed).
I really should read the actual current guidelines, as obviously and as you can see, things can be easily misinterpreted and incorrect information passed along (even by the press and/or major companies).
It's actually quite fascinating imho: It used to be these companies were so far ahead of the governments that they would circumvent whatever (eventually) archaic rule by making different design decisions. In this case the government actually appeared to very much understand what they were doing (at least in the updated law, which makes sense) while perhaps some in the supply chain did not (or did not keep current).
That said, I quite dislike talking about this stuff, tbh. Although it is important to understand the what/why, I personally very much prefer to be a unifier rather than discussing limitations in international trade.
Edit: TMW you look at your post later and realize you meant double-precision (64-bit floating point), single precision (FP32), and half-precision (FP16)...but worded it incorrectly.