Introduction
AMD's new "Fiji" GPU debuted just weeks ago in the
AMD Radeon R9 Fury X, a compact watercooled card built exclusively by AMD. Following that, we saw the release of the Radeon R9 Fury, a regular full-sized card that's built not by AMD, but by its partners
ASUS, Sapphire, and PowerColor. Later this year, we expect the Radeon R9 Nano to come out, an air-cooled mini-ITX variant of the Fury Series.
Overclockers had high hopes for these cards, but were quickly disappointed when word spread that neither memory overclocking nor voltage control would be available on the Radeon Fury series. Normal overclocking without voltage increases yields an increase of around 10% in clocks, which is a little on the low side, but not unexpected for a new GPU chip.
Implementing voltage control on Fiji was more difficult than on most other cards. While the voltage controller on these cards is well-known and supports I2C (a protocol that lets software on the host PC talk to the voltage chip), getting I2C to work on Fiji in the first place posed another set of challenges. Unlike NVIDIA, AMD does not provide good API support to developers; their ADL library is outdated and buggy, with updates spaced years apart. So most software utility developers implement hardware access themselves, writing directly to GPU registers that AMD changes with every new GPU. AMD's developer support is pretty much non-existent these days. All my contact has been worried about for four weeks now is that I make sure I use AMD's "new" GPU codenames in GPU-Z (for the R9 300 Series re-brands).
With recent GPU generations, AMD has transitioned GPU management tasks away from the driver and onto a small micro-controller inside the GPU dubbed the SMC, which handles jobs like clock control, power control, and voltage control. On Fiji, this controller dynamically adjusts and monitors voltage, which helps with overall power consumption, but it also makes voltage control more difficult than before. When the voltage is overridden externally, the controller senses a discrepancy between its target and the measured voltage, assumes a fault has occurred, and puts the GPU into its lowest clock state: 300 MHz. The voltage monitoring process also keeps the I2C bus very busy, which causes interference with other transactions, such as those GPU-Z sends for its own monitoring. If two of these transactions overlap, the resulting data will be intermixed or faulty, which the SMC again interprets as a possible fault, at which point it turns off the screen and sets fan speed to 100% to avoid damage to the card.
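To make those two failure modes concrete, here is a small illustrative model of the SMC's protective reactions as described above. This is a sketch, not AMD's actual firmware logic; the function name, the tolerance value, and the action strings are all hypothetical:

```python
# Illustrative model of an SMC-like controller's fault reactions.
# All names and thresholds here are hypothetical, for explanation only.

LOWEST_CLOCK_MHZ = 300  # Fiji's lowest clock state

def smc_reaction(target_mv, measured_mv, i2c_data_ok, tolerance_mv=10):
    """Return the protective action the controller might take."""
    if not i2c_data_ok:
        # Overlapping I2C transactions produced garbled data:
        # assume the worst and protect the card.
        return "screen off, fan 100%"
    if abs(target_mv - measured_mv) > tolerance_mv:
        # Measured voltage disagrees with the controller's own target
        # (e.g. an external override): assume a fault occurred.
        return f"drop to {LOWEST_CLOCK_MHZ} MHz"
    return "normal operation"

# An external +144 mV override the SMC doesn't know about:
print(smc_reaction(1206, 1350, True))   # falls back to the lowest clock state
```

The point of the model is that both reactions are triggered by the SMC's own sanity checks, which is why any working voltage-control tool has to avoid tripping them rather than simply writing to the voltage controller.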
Working around this was no easy task, but it looks like I've finally managed to crack it, so voltage monitoring and software voltage control will soon be available in the software I make; other tool developers will soon follow as well.
On the following page, we will investigate what gains can be achieved from voltage adjustments to the Radeon Fury, how these overclocking results translate into real performance, and what this means for power consumption.
Test System
| Test System | |
|---|---|
| Processor: | Intel Core i7-4770K @ 4.2 GHz (Haswell, 8192 KB Cache) |
| Motherboard: | ASUS Maximus VI Hero, Intel Z87 |
| Memory: | 16 GB DDR3 @ 1600 MHz 9-9-9-24 |
| Harddisk: | WD Caviar Blue WD10EZEX 1 TB |
| Power Supply: | Antec HCP-1200 1200 W |
| Software: | Windows 7 64-bit Service Pack 1 |
| Drivers: | Catalyst 15.7 WHQL |
| Display: | Dell UP2414Q 24" 3840x2160 |
We will be using Battlefield 3 at 4K resolution to verify performance gains. All testing was done at a stock memory clock of 500 MHz, stock fan speeds, and the power limit set to +50% to avoid throttling at higher voltages.
Overvoltage Results
First, I tested stock settings to get a baseline reading, plus an undervolted setting for additional data points. Next, I increased the voltage in steps of 24 mV (the voltage controller's minimum step size is 6 mV). For each setting, I determined the maximum BF3-stable clocks and, once stable, recorded the performance.
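Since the controller works in 6 mV increments, the 24 mV test sweep lands on every fourth step the hardware can express. A quick sketch of that mapping (the function name is my own, not from any real tool):

```python
VID_STEP_MV = 6     # the voltage controller's minimum step size
SWEEP_STEP_MV = 24  # the increment used for this test sweep

def offset_to_vid_steps(offset_mv):
    """Convert a millivolt offset to controller steps; must divide evenly."""
    if offset_mv % VID_STEP_MV:
        raise ValueError(f"{offset_mv} mV is not a multiple of {VID_STEP_MV} mV")
    return offset_mv // VID_STEP_MV

# The sweep used here: +0, +24, ..., +168 mV
for offset_mv in range(0, 168 + 1, SWEEP_STEP_MV):
    print(f"+{offset_mv:3d} mV -> {offset_to_vid_steps(offset_mv):2d} controller steps")
```

Any offset that isn't a multiple of 6 mV simply can't be programmed, which is why sweeps like this one stick to clean multiples of the step size.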
As you can see, Fiji scales nearly linearly with voltage, and performance follows at roughly half the clock increase rate.
Near +96 mV, the power limiter starts to kick in from time to time during games when left at its default setting, which is why we set it to +50% for all these tests.
Once we reach +144 mV, which results in a scorching 1.35 V on the GPU, the maximum stable frequency reaches its peak. At this point, the VRMs are running at temperatures above 95°C even though they are cooled by the watercooling loop via a nearby copper pipe. That much heat on the VRMs is definitely not good for long-term use. I would say a safe permanent voltage increase on an unmodded card is around 40 mV or so.
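As a rough sanity check, the peak figure lets us back out the approximate stock voltage, assuming the +144 mV setting is an offset relative to stock:

```python
# Back out the approximate stock GPU voltage from the numbers above
peak_v = 1.35      # volts measured at the +144 mV setting
offset_v = 0.144   # the applied offset, in volts
stock_v = peak_v - offset_v
print(f"approximate stock voltage: {stock_v:.3f} V")
```

That works out to roughly 1.2 V at stock, which also puts the suggested 40 mV "safe" increase at around a 3% bump.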
Going beyond +144 mV, to +168 mV, just makes the card massively unstable, with maximum stable clocks dropping nearly back to stock-voltage levels.