
[FPS/W] FPS per Watt in GPUs and how it works in practice.

Since GN used this metric in a video about CPUs, I decided to take a crack at it from the GPU side.
Short answer: It's complicated...
Long answer: It depends ;)
(I know, the longer answer is shorter than the short one :D)

But let's start with a performance chart that shows both FPS and power:
Metro Exodus 1080p (stock).png

The watt scale is on the left Y-axis, while the GPU voltage scale is on the right one. PThr = power throttle, TThr = temperature throttle (if either is reached, the GPU decreases its operating frequency).
The result is the average FPS from 3 runs of the built-in benchmark tool, set to the "Extreme" preset (and under DX12).
GPU = "Stock" settings in this case include : Maxed out TDP slider (Radeon Software or Afterburner, and new limit in [W] can be seen as "TDP Limit" bar), custom fan speed curve [where +1C in GPU temp = +1% in fan speed, example 80C = 80%, 90C = 90%] - to avoid temperature and power throttling), and some VRAM OC on few selected cards (Vega 64 [to reach 512GB/s of bandwidth], Titan Xp [to get to 576GB/s], Titan V [640GB/s], 2080 Super [512GB/s], and 3070 Ti [640GB/s]).
The max power draw detected by GPU-Z means either board power (for Nvidia-based cards) or "GPU Chip Power" (for AMD cards).
Not sure why "GPU Chip Power" corresponds to TGP in GPU-Z (cross-checked with HWiNFO64), but whatever - I confirmed what it actually reports regardless.
Lastly, I added vGPU [V] (the maximum GPU voltage detected by GPU-Z), to give a better picture of why power is where it is, and to compare against the extra thing I did next :)

Metro Exodus 1080p (Deep UV).png

^This is the same test, but with what I call "deep undervolt" settings (at least for the Nvidia and RDNA cards).

In short: it's the usual undervolt, except it forces the minimal GPU voltage allowed*/available on each card and locks it in as the maximum value allowed (*example: on NV cards, the requirement is full 3D clocks on the VRAM). This is done regardless of the frequency/performance degradation it causes (the minimum voltage WILL NOT be stable at default frequencies, and may even cause issues at the base frequency, depending on the card/silicon lottery).
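To visualize what this does to a card's voltage/frequency curve, here's a rough Python sketch (the curve points are made-up illustration values, not measurements): locking the minimum allowed voltage as the maximum effectively flattens the whole curve above that point.

Code:
# Illustrative only: a fake V/F curve (voltage in mV -> boost clock in MHz).
vf_curve = {750: 1500, 800: 1650, 850: 1800, 900: 1900, 950: 1980, 1000: 2050}

def deep_undervolt(curve: dict, v_min_mv: int) -> dict:
    """Lock the minimum allowed voltage in as the maximum: every point
    above v_min collapses to the frequency the card can do at v_min."""
    f_at_vmin = curve[min(v for v in curve if v >= v_min_mv)]
    return {v: min(f, f_at_vmin) for v, f in curve.items()}

print(deep_undervolt(vf_curve, 750))
# Every entry is now capped at 1500 MHz: lower performance, much lower power.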

Sadly, the AMD cards I tested - Radeon VII and Vega 64 LC - have too little headroom in the undervolting department for their results to be considered a "deep undervolt".
On the other hand, the RX 6700 XT got the short end of the stick, since it scales much higher on frequency at its minimum allowed voltage level [~920 mV] (set in Radeon Software). However, it can somewhat recover thanks to AMD's boosting algorithm (the actual max frequency can land ±30 MHz higher or lower).
The cause of all this is the ~1620 MHz maximum target frequency I locked for every GPU (still quite high for GCN-class cards and quite low for RDNA2).
Aside from that, the other settings stayed the same as on the previous graph (maxed TDP, fan curve, etc.).

Now that we have some data, we can try to make a fun FPS/W chart:
Metro Exodus 1080p.png

This combines the data from both previous charts (and drops GPU voltage); the FPS/W scale is on the left, while the FPS scale is on the right.
The data on this graph can be used to easily calculate a card's power draw during the test: simply divide the FPS value at the bottom of the bar by the FPS/W value at the top.
I grouped the stock and DUV results together, so that nobody tries to use the wrong numbers for this.
Example: RTX 2080 Super
For "stock" we have 51 [FPS] / 0.18 [FPS/W] = 283.3 [W]
For "deep undervolt": 45 / 0.29 = 155.2 [W]
Based on those numbers: power usage decreased by 128.1 W (a 45.2% drop), while average performance dropped by 13% from stock settings.
As a side effect, the card runs cooler and is way less noisy.
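If you want to redo that math for any card on the chart, it's a one-liner; here's a minimal Python version (using the 2080 Super values from above):

Code:
def watts_from_chart(fps: float, fps_per_watt: float) -> float:
    """Recover average power draw from the two values printed on a bar."""
    return fps / fps_per_watt

stock_w = watts_from_chart(51, 0.18)  # ~283.3 W
duv_w = watts_from_chart(45, 0.29)    # ~155.2 W
saved = stock_w - duv_w               # ~128.1 W
print(f"{saved:.1f} W saved ({saved / stock_w * 100:.1f}% drop)")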
The best card in Metro was the RX 6700 XT (highest FPS/W), with the RTX and RDNA1 cards sharing second place.
This is where we get to some limitations of the FPS/W metric...

Yes, it can show which architecture is the most efficient at generating frames, BUT it has four limitations:
1) It requires the watts and/or the plain FPS it was calculated from to be shown along with it (as I did with my graphs).
Without that additional information, there is no way to tell whether performance is "good enough" (i.e. over 60/120 FPS), or whether power during the test was kept under control.
Example: someone might look at FPS/W to find the best-performing card - but without the FPS shown, they can't know what level of performance the card actually reached.
On the other hand, someone else may want the best-performing card within a 180 W TDP; with only FPS/W shown, they can't be sure how many watts the card consumed to get that score. The graph will just show the "best" card, but that card may not be what you are looking for.
2) FPS/W is easily skewed by incorrect measurement of power and/or frames per second, which makes it quite complex to validate. For example, GPU-Z shows "GPU Chip" as the power metric for Radeons; comparing that to Nvidia's own GPU chip power draw is wrong, as they are NOT the same thing (as mentioned previously). Always validate the power metric with a second program like HWiNFO64 to avoid issues (see the polling sketch after this list).
3) Silicon lottery and binning: if someone tries to test this without limiting frequency or voltage, things will get messy (some cards may not boost as high [and thus not reach max FPS] due to higher power usage at the same voltage, or they will simply crash at a lower voltage level due to instability).
Lowering the frequency by a larger margin (as I did) helps mitigate this problem.
4) Some may think the metric is sort-of independent of power/thermal limitations, since it scales with both performance and power at the same time, but there are limits.
Throttling decreases performance, and it might decrease power as well, but that's not guaranteed; it depends on the cooler/PCB design as well as on how the vBIOS behaves under such circumstances.
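Regarding point 2: one way to sanity-check a power reading is to poll the driver's own value while the benchmark runs and compare it to what GPU-Z logs. A minimal sketch for Nvidia cards using nvidia-smi (assumes a single GPU; AMD would need a different tool, and as noted above, its reported value means something else anyway):

Code:
import subprocess
import time

def poll_power_draw(samples: int = 30, interval_s: float = 1.0) -> list:
    """Sample board power (in W) from the Nvidia driver, once per interval."""
    readings = []
    for _ in range(samples):
        out = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=power.draw",
             "--format=csv,noheader,nounits"], text=True)
        readings.append(float(out.strip()))  # one line of output per GPU
        time.sleep(interval_s)
    return readings

watts = poll_power_draw()
print(f"avg {sum(watts) / len(watts):.1f} W, max {max(watts):.1f} W")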

To test that last point, I tried my luck in Furmark2 :D
Furmark2 1440p (stock).png

Again, vGPU is the max GPU voltage recorded by GPU-Z.

It's very much NOT a game, BUT at "stock" settings there isn't much difference between these and the Metro Exodus power numbers (aside from FPS being lower in the game due to the settings used :p). This goes especially for the Titan Xp/Titan V and RTX 3070 Ti/2080 Super, which ALL reach their power and/or thermal limits both in the game benchmark and here.
I know, some will say "this is different", since actual gameplay has vastly more GPU usage spikes than Furmark2 and "the card will use less power within just a few milliseconds".
OK, but what if it won't?
Then we get the "New World RTX 3090" issue (a game whose load the vBIOS power/thermal protection doesn't recognize as a "virus") :(
Some may claim "undervolting will save your card", and well...
Furmark2 1440p (Deep UV).png

Yes and no... it depends. Based on the numbers above, it's mostly GPU complexity that's at fault (bigger die = bigger problem).
Some cards will be fine (like both RDNA ones I tested, or the 1660 Ti and 2080 Super), others not so much (I don't have a "big die" from either the last or the current generation to test this with). I think newer/bigger GPUs simply run at too high a voltage for any kind of undervolt to "fix" this (as can be seen with the cards that reach the "TDP Limit" even with just 0.775 V going to the GPU die) when another "New World"-style scenario occurs. Such cards will run into a thermal runaway situation and either destroy themselves or crash the PC before anything happens (the GPU driver picking the "lesser evil" for the user). Of course, this is just speculation on my part, since much depends on the cooling used (for example, my Vega 64 LC sat happily at 81 °C on the hot spot [highest temp recorded] during both Furmark2 tests, and at an 83 °C hot spot during the in-game benchmark test).

BUT moving on to FPS/W...
Furmark2 1440p.png

Furmark will "flat" all FPS/W results from cards that run into power/thermal limits (with or without undervolt).
Results for them may be high (in FPS terms), but calculating power used shows much grimmer picture (from thermal and noise perspectives).
However, this graph can also show us which cards can deliver full performance even under "power virus" type workload.
Side note : Cards "ranking" from left to right is up to tester.
I put throttling cards on left (with raising performance, not that it matters much), and then shifted focus to overall power used.

To me, the best card of all the ones I tested is the RX 6700 XT (it reaches 78 FPS while consuming 125 W of power, AND it has more to give, since I nerfed it so much on the frequency side :D). The GTX 1660 Ti gets second place, because it's the only card under the 150 W mark (133 W) during the Furmark2 test (that's what I get for picking the top cards of old generations...). Again, depending on what you are looking for, the RX 5700 XT may be a better option than the 1660 Ti (example: you want a card with higher performance than the 1660 Ti that doesn't draw more than 200 W when undervolted).
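That "depending on what you are looking for" part is easy to mechanize. Here's a small Python sketch that picks the highest-FPS card under a power budget; the 6700 XT row uses the Furmark2 numbers quoted above, while the other FPS/watt values are placeholders for whatever you measure, not my results:

Code:
# (card, fps, watts) - the 6700 XT row is from the Furmark2 DUV run above;
# the other rows are placeholder values, not measured results.
results = [
    ("RX 6700 XT",  78, 125),
    ("GTX 1660 Ti", 55, 133),  # 133 W is from the test; the FPS is a placeholder
    ("RX 5700 XT",  70, 190),  # placeholder row
]

def best_under_budget(cards, max_watts):
    """Highest-FPS card that stays under the given power budget."""
    eligible = [c for c in cards if c[2] <= max_watts]
    return max(eligible, key=lambda c: c[1], default=None)

print(best_under_budget(results, 200))  # best card under a 200 W budget
print(best_under_budget(results, 150))  # tighter 150 W budget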

As I mentioned in the beginning, FPS/W is a complicated topic where everything depends.
I really like it though, since it can show how much undervolting can give to users of all GPUs.
I hope these results, and the topic itself, prove useful to anyone trying to run tests with FPS/W in mind (using GPUs).

EDIT: Forgot to add the platform used to test all this... oops :p
CPU: Core i9 11900K (locked @ 5.1 GHz [253 W power limit])
Cooling: 360 mm AIO cooler
MB: Z490 Dark
RAM: 2x 16 GB (B-Die, tweaked)
OS: Latest Windows 10 (22H2)
I can't edit my post anymore, so I'll just put a disclaimer here:

AMD power numbers are inaccurate and can't be used in direct comparisons to Nvidia cards:
"You simply cannot compare them. AMD uses a formula to estimate a value with unknown mechanics. NVIDIA actually measures the real current and voltage using dedicated circuitry."
Source: LINK.

Wish I had known this before... I wouldn't have bothered with software power measurement on the AMD cards (and would have just dropped them outright).
If this is correct, no monitoring program (not just GPU-Z) can show values accurate enough on AMD cards for direct comparisons like this.

The Nvidia cards' numbers are still correct.

W1zzard

Administrator
I unlocked the editing time limit for the first post, you should be able to edit it now