Excellent.
Looks like the same issue that was affecting the Strix 3080 Ti, except the Strix has "FBVDD" (memory power) exposed in HWiNFO, while your card doesn't. You might want to send a PM to Martin over on the HWiNFO forums and see if he can add it, assuming it's not some vBIOS bug preventing it from being reported.
On the Strix, it was memory power limit that was causing throttling. Notice your TDP Normalized is higher than your TDP?
OK, this is going to be long, so bear with me. It will answer both of your questions: about "GPU Usage" and about what causes "Power Limit" throttling.
===============================
GPU Usage% has nothing to do with a power limit. GPU Usage is simply how much of a GPU is being utilized to push FPS out.
If your GPU is running against a max FPS cap, whether set by Vsync (vertical sync), an in-game FPS limiter, a hardwired engine cap (e.g. Overwatch = 400 FPS), or RivaTuner's FPS limiter or "Scanline Sync" (useful when Vsync is off and the game's own FPS limit slider causes screen tearing), and you sit at that limit steadily, your GPU will be below 100% usage because it's reaching the cap and being prevented from going higher.
If you are NOT using an FPS limiter, your GPU is bouncing around high FPS numbers (say, 325 FPS), and it's still not at 100% utilization, then it's usually the CPU being unable to feed more frames or data to the GPU (a CPU limit). This does NOT mean that the entire CPU is at 100% usage. It can simply mean that an individual thread is at its limit, or that some data path is saturated. This can be tricky to spot on modern multi-core CPUs, because the load often gets spread and scattered across threads rather than hammering an individual thread to 100%. Say one engine data thread is at max load, but it's spread among four CPU threads to prevent saturation, so each of those threads shows 25% usage. Even worse, which thread gets that 25% bounces around like a ping pong ball on the International Space Station. You get the point. So CPU or engine limitations can also prevent the GPU from pushing more frames, leaving GPU utilization below its limit.
A Total Board Power Limit (TDP) makes things tricky, because a power limit causes the GPU to downclock itself. A downclocked GPU naturally has less available horsepower to push frames, so the lower the clock speed goes, the closer you get to 100% utilization if you weren't there already. This should be self-explanatory. But hitting a power limit by itself has nothing to do with how much of the GPU is being utilized. You can hit a power limit even at 60% utilization. (This is pretty lame, not gonna lie. If your GPU is hitting a power limit at 60% utilization, it means there's a lot of theoretical headroom left.)
This is where it gets confusing. Remember there are MULTIPLE power rails in a GPU that deal with rendering. For example, say you were using a 165 FPS cap and a 400W TDP, sitting at 82% utilization, reaching 397W and hitting a power limit, and the board downclocks to 1980 MHz, 0.950v, from a theoretical 2070 MHz @ 1.081v (just a few bins, it doesn't hard-throttle). Then you remove the FPS cap: it reaches 202 FPS, downclocks to 1905 MHz, 0.912v (from a theoretical 2070 MHz @ 1.081v), and is now at 99% usage and showing 401W this time. 4W more for 37 more FPS? This is another example of multiple power rails helping render (it isn't all just TDP). Basically, different parts of a GPU get used in different ways at heavier or lighter loads.
Here's an even more bizarre example.
Path of Exile (uncapped). PL=114% (3090 FE=400W).
401W, 1920 MHz, 0.919v, 99% usage, 456 FPS. Settings: Global Illumination/Shadows Quality=High. TDP Normalized=74%, TDP=99.8%.
401W, 1815 MHz, 0.863v, 99% usage, 207 FPS. Settings: Global Illumination/Shadows Quality=Ultra. TDP Normalized=86%, TDP=99.8%.
Notice the same power draw and same usage, but the core clock has dropped further even though the power draw is identical? (TDP Normalized % being higher is a clue: an internal rail is loaded heavier.)
That's because GI/Shadows=Ultra loads a different part of the GPU core harder than GI/Shadows=High, so FPS is much lower. In both cases the card is at 99% usage because it's pushing out as many frames as it can at that setting. The core itself is just under heavier load.
That means usage reflects the card pushing out as many FPS as it can until it has no more horsepower available, regardless of whether that's 200 FPS or 800 FPS.
Now power limits get tricky.
TDP is what everyone knows about. But total board power is simply the sum of the 8-pin connectors + PCIE slot power. So for a board with a 400W TDP, this might be 165W + 165W + 66W = 396W at a 114% TDP slider (e.g. Founders Edition 3090, etc). So TDP is simply the total max power from any "combination" of all the 8 pins + PCIE slot power added together. It's not a fixed value from the rails themselves, but there is power balancing attempted by design. This is where it gets messy.
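In other words, the board power budget is just addition. A minimal sketch, using the example wattages from above (these are illustrative numbers, not values read from any real API):

```python
def board_power_limit(connector_budgets_w):
    """Total board power (TDP) = sum of the 8-pin budgets + PCIE slot power."""
    return sum(connector_budgets_w)

# Two 8-pins at 165 W each plus 66 W of slot power (at the 114% slider setting):
print(board_power_limit([165, 165, 66]))  # 396
```

The per-connector split is what the firmware's power balancing decides at runtime; only the sum is the fixed budget.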
TDP% is not the only power limit that causes throttling.
There is also something called "TDP Normalized %". And on most 2x8-pin cards, there are actually 8 individual rail power limits that report to TDP Normalized, and they can throttle you long before TDP% reaches its limit! (TDP% also reports to TDP Normalized, by the way, but this only matters if the rail power limits are set so sky-high that they are never reached no matter what. The Kingpin 1000W BIOS does this; in that case, TDP Normalized will equal TDP%.)
These power limits are:
GPU Chip Power
Memory (MVDDC/FBVDD)
8 pin #1 **
8 pin #2 ** (and 8 pin #3 if present)
PCIE Slot Power
Power Plane SRC Power **
NVVDD voltage rail power
MSVDD voltage rail power
SRC is a special case because the individual 8 pins have their own SRC rail,
called SRC1, SRC2 and SRC3 (if present), while the SRC chip itself has its OWN rail.
The SRC1/2/3 rails control the maximum limits of the 8 pin power rails linked to it.
If the 8 pin power rails exceed the linked SRC1/2/3 rails, you throttle.
If the SRC rail exceeds its own power limit, you throttle.
The SRC (power plane source chip) controls power balancing and monitoring on all the other rails.
How this is done, you'd need to ask Nvidia, and they sure won't tell you. Good luck finding a schematic.
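The two SRC throttle conditions above can be sketched as a simple check. This is my reading of the behavior, with made-up illustrative wattages; the real firmware obviously does more than this:

```python
def src_throttles(pin_rails_w, src_limits_w, src_power_w, src_own_limit_w):
    """Throttle if any 8-pin rail exceeds its linked SRC1/2/3 limit,
    or if the SRC chip itself exceeds its own power limit."""
    if any(p > lim for p, lim in zip(pin_rails_w, src_limits_w)):
        return True
    return src_power_w > src_own_limit_w

# 8 pin #2 pulling 170 W against a 165 W SRC2 limit -> throttle:
print(src_throttles([160.0, 170.0], [165.0, 165.0], 30.0, 40.0))  # True
```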
Anyway...
TDP Normalized is simply the rail which reaches CLOSEST to its own TDP limit, compared to any other rail.
It's not a specific wattage or amps draw. TDP Normalized also does NOT respond to a TDP slider value
below 100% with respect to throttling UNLESS TDP% is the highest of any of the other rails (TDP% acts
as its own rail).
Some examples:
Rail 1 has a 100% (default) limit of 10W and a max limit of 20W
Rail 1 is pulling 9W. TDP of Rail 1 is 90%.
Rail 2 has a 100% (default) limit of 223W and a max limit of 270W
Rail 2 is pulling 112W. TDP of rail 2 is 50%.
Which rail gets reported to TDP Normalized %? Rail 1! Because Rail 1 is closest to its own limit.
So if no other rail (including main TDP) were closer than 90% to its limit, your TDP Normalized would be 90%.
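Put as code, the rule is just a max over per-rail percentages. A sketch using the made-up rail values from the example above (not real data from any card):

```python
def tdp_normalized(rails):
    """TDP Normalized % = the single rail that is closest to its own limit."""
    return max(draw / limit * 100.0 for draw, limit in rails)

rails = [
    (9.0, 10.0),     # Rail 1: 9 W of a 10 W limit  -> 90%
    (112.0, 223.0),  # Rail 2: 112 W of a 223 W limit -> ~50%
]
print(tdp_normalized(rails))  # 90.0 -> Rail 1 is what gets reported
```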
Exceeding TDP Normalized throttles you the exact same way that TDP does.
The only difference is that the "sub-rails" will not report a TDP Normalized value that enforces throttling if they are below 100% and your TDP slider is below 100%, while TDP% itself will.
The TDP% slider past 100% allows the TDP Normalized sub-rails to exceed their internal 100% values, up to either their maximum value (if one is stated in the BIOS) or the percentage you set the TDP% slider to.
This does not apply if the BIOS "Default" and "Maximum" sub-rail values are set to the same value.
You would need a hex editor and disassembly skills to determine what these values are.
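Under that reading, the effective sub-rail budget would work something like this. Assumed semantics and illustrative numbers, not values pulled from any real BIOS:

```python
def effective_rail_limit(default_w, maximum_w, slider_pct):
    """Sub-rail budget: the default scaled by the TDP% slider,
    capped at the BIOS maximum. If default == maximum, the slider
    can't raise it at all."""
    return min(maximum_w, default_w * slider_pct / 100.0)

print(effective_rail_limit(10.0, 20.0, 114))  # 11.4 (slider-scaled)
print(effective_rail_limit(10.0, 10.0, 114))  # 10.0 (default == max, no headroom)
```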
Some boards have one or more of the sub-rails set far too low. This will give you a power limit throttle even though your TDP% (total board power) isn't even close to its own limit, because the sub-rail has hit its limit first. From what I've seen on some 3080 Tis, it's usually "memory power" causing it.
Your memory power rail isn't being exposed to NVAPI (at least not in HWiNFO), but it is ALWAYS exposed to TDP Normalized.
So given what I saw on the Strix 3080 Ti boards, which report 162W of memory power draw (MORE THAN an RTX 3090's!!!!), that's probably what is going on.
This could be a flaw in the power balancing hardware, as 3090s with double the VRAM usually report about 120W on this rail...
Since this seems to be occurring on multiple cards, I doubt a RMA would help you.
You can try flashing the "Galax 1000W" 3080 Ti BIOS and see if this increases the memory power rail draw. (If your card is a 2x8-pin card, your total max TDP will be two-thirds of 1000W, or ~667W, due to the missing 8 pin #3, which will be "duplicated" from 8 pin #1.)
It seems like only the Strix and a few other cards report the memory (MVDDC/FBVDD) power limit to Windows, and that seems to be what is causing the throttle: on the 3080 Tis that report it, this rail exceeds 162W. On the 3090 ROG Strix, the Ampere BIOS editor (which won't work on any 3080 Ti; newer builds are private and I don't know how to get them) shows that card's memory power limit to be exactly 162W, which it never reaches at stock...