
WX9100 Power mods - help w/ powerplay tables

toshiaki_12

New Member
Joined
Mar 10, 2024
Messages
6 (0.02/day)
You need to rethink your fan shroud; it's simply not getting enough airflow directed through the heatsink. If you were to extend that plexi shroud along the full length of the card and do away with the tape, it would probably work better. Basically, for fans, right angles and restrictions/reductions are the enemy of airflow, so the fewer of those you have the better.

This is how I have mine set up currently: just a hole cut into the shroud and a piece of cardboard as an air dam to help guide the air.

View attachment 339991, View attachment 339989

Here's how it performs @ 220W without an undervolt (thanks to AMD's janky software I accidentally removed the undervolt while setting the fan speed higher).
Certainly hot, but that's totally normal for Vega. I normally run at 170W undervolted which makes it settle around 75-80C with the fan only at 2000 RPM.

View attachment 339990

I also suggest using the WX 9100 or Vega FE PPTables so the HBM runs at 945MHz; the Vega 56/64 tables make it stay at or strongly favor 800MHz.
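
To put the 945MHz vs 800MHz difference in numbers, here is a rough sketch of the peak theoretical HBM2 bandwidth at each clock. It assumes the 2048-bit memory bus of Vega 10 cards and two transfers per clock; the function name is just for illustration, and real-world throughput will be lower than these peak figures.

```python
def hbm2_bandwidth_gbs(memory_clock_mhz: float, bus_width_bits: int = 2048) -> float:
    """Peak theoretical bandwidth in GB/s for double-data-rate HBM2.

    bus_width_bits=2048 is an assumption matching Vega 10 (two HBM2 stacks).
    """
    transfers_per_second = memory_clock_mhz * 1e6 * 2  # DDR: two transfers per clock
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9


if __name__ == "__main__":
    for clock in (800, 945):
        print(f"{clock} MHz -> {hbm2_bandwidth_gbs(clock):.1f} GB/s")
```

So the higher HBM clock is worth roughly 18% more peak memory bandwidth, which is why the choice of PPTable matters for bandwidth-bound workloads.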

220W! That's amazing.

Thank you very much.
I will try various things.

Ah, I think this is probably due to the hot spot temperature.
I searched and found some old posts about Radeon Vega hot spot temperatures.
It seems that, for some reason, the hot spot temperature of some Radeon Vega cards can reach 105 degrees.
Does it depend on how the thermal grease is applied, or on the gap between the heatsink and the chip?
I can't figure it out.
(I applied ZM-STC8, 8.3W/mK, a little on the thick side.)

I rebuilt the shroud, but the temperature did not improve.
I had the fan running at 4500+ RPM and thought the exhaust airflow was sufficient, but the hot spot temperature still reached 105 degrees.

Sorry for the many images.
They are there to show the VBIOS version, etc.

WattMan was also restored by using whql-amd-software-adrenalin-edition-24.3.1-win10-win11-mar20-vega-polaris, "disableworkstation 1", and the PPTable.

[Images: lol_004.jpg, lol_003.jpg, lol_002.jpg, lol_001.jpg, 2431_disableworkstation+fe pptable_-50% limit3.jpg, 2431_disableworkstation+fe pptable_-50% limit2.jpg, stock_2431_002.jpg, stock_2431_001.jpg]
 

Attachments

  • 2431_disableworkstation+fe pptable_-50% limit.jpg
  • stock_2431_003_fan full 4000rpm+.jpg
  • stock_2431_003_fan default max2000rpm.jpg
Joined
Jul 19, 2015
Messages
996 (0.29/day)
System Name The Banshee
Processor Ryzen 5 5600 @ 4.65GHz CO -30
Motherboard AsRock X370 Taichi
Cooling Asus ROG Strix LC 240
Memory 32GB 4x8 G.SKILL Trident Z 3200 CL14 1.35V
Video Card(s) PCWINMAX RTX 3060 6GB Laptop GPU (80W)
Storage 1TB Kingston NV2
Display(s) LG 25UM57-P @ 75Hz OC
Case Fractal Design Arc XL
Audio Device(s) ATH-M20x
Power Supply Evga SuperNova 1300 G2
Mouse Evga Torq X3
Keyboard Thermaltake Challenger
Software Win 10 Pro 64-Bit
Maybe you can try the thermal paste application method shown here:

This is roughly how I do mine and it seems to work well.
 

toshiaki_12

New Member
Joined
Mar 10, 2024
Messages
6 (0.02/day)
Thank you.
I tried Igor's method three times, but unfortunately it didn't seem to work for me.
Perhaps my MI25 is an extremely poor sample.

It's odd that the difference between the GPU and hotspot temperatures is 30 degrees.
(150W limit, fan at 100%)
150w limit fan max.jpg

After cleaning with isopropyl alcohol and a toothbrush, and the thermal grease after testing:
[Images: clean.jpg, after test1.jpg, after test2.jpg]
 
Joined
Jul 19, 2015
Messages
996 (0.29/day)
Ahh, yours is the version without epoxy in-fill; that's why your hotspot is so high. Looks like you're already using plenty of thermal paste, so that's probably going to be as good as it gets, unfortunately.

I do wonder if something is wrong with the heatsink itself? 73C/150W @ 4800 RPM really looks like the heatsink just isn't doing its job.
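
As a rough sanity check on those numbers, here is a small sketch that turns a temperature and power reading into an effective die-to-air thermal resistance. The 25C ambient is purely an assumption (nobody in the thread states their room temperature), and the function name is made up for illustration. The second reading uses the 170W / 75-80C / 2000 RPM figures quoted earlier in the thread, taking the midpoint of the range.

```python
def thermal_resistance(temp_c: float, power_w: float, ambient_c: float = 25.0) -> float:
    """Effective thermal resistance in degrees C per watt.

    ambient_c defaults to 25.0, which is an assumption, not a measured value.
    """
    if power_w <= 0:
        raise ValueError("power_w must be positive")
    return (temp_c - ambient_c) / power_w


if __name__ == "__main__":
    readings = {
        "73C @ 150W, 4800 RPM": (73.0, 150.0),
        "75-80C @ 170W, 2000 RPM (midpoint)": (77.5, 170.0),
    }
    for label, (temp_c, power_w) in readings.items():
        print(f"{label}: {thermal_resistance(temp_c, power_w):.2f} C/W")
```

If the two cards come out with a similar C/W figure even though one fan is spinning more than twice as fast, that points at the heatsink rather than the airflow. Treat it as a ballpark only, since the result shifts with the real ambient temperature.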
 

toshiaki_12

New Member
Joined
Mar 10, 2024
Messages
6 (0.02/day)
That may be the case: a vapor chamber failure or something similar.
I have no way of verifying that...

The force of the exhaust:
 
Joined
Apr 18, 2019
Messages
2,328 (1.15/day)
Location
Olympia, WA
System Name Sleepy Painter
Processor AMD Ryzen 5 3600
Motherboard Asus TuF Gaming X570-PLUS/WIFI
Cooling FSP Windale 6 - Passive
Memory 2x16GB F4-3600C16-16GVKC @ 16-19-21-36-58-1T
Video Card(s) MSI RX580 8GB
Storage 2x Samsung PM963 960GB nVME RAID0, Crucial BX500 1TB SATA, WD Blue 3D 2TB SATA
Display(s) Microboard 32" Curved 1080P 144hz VA w/ Freesync
Case NZXT Gamma Classic Black
Audio Device(s) Asus Xonar D1
Power Supply Rosewill 1KW on 240V@60hz
Mouse Logitech MX518 Legend
Keyboard Red Dragon K552
Software Windows 10 Enterprise 2019 LTSC 1809 17763.1757
If that's the exhaust out of the card, then there's something wrong w/ the heatsink, etc.
Even w/ a FRACTION of that airflow through the dense fins, I've had no issue keeping 200-225W cool on the stock heatsink.

1711312192155.png

Interesting... Looks like there's quite a 'thick' layer remaining. In my experience, it 'smooshes' out thinner. Then again, I have 'filled' HBM on mine.
I'd consider more mounting pressure, but I'm not 100% sure as to the consequences.

AFAIK no cooler other than a WX9100's will 100% 'bolt on'. Alternatively, you could keep an eye out for a Raijintek cooler.
(~$80 new, but could mount to other GPUs, later)

Edit:
Try to get ahold of conron over PM.

He's a newer member, but he has/had a 'damaged' MI25; he'd probably send it to ya for more or less the cost of shipping.
You could salvage the heatsink for yours.
 

toshiaki_12

New Member
Joined
Mar 10, 2024
Messages
6 (0.02/day)
Thank you for your suggestion.
However, since I live in Japan, I have not contacted conron, because I think the shipping costs would probably be horrendous.

An MI25 was on sale again, so I bought another one.
That MI25 turned out to be completely normal.
[Images: another mi25.jpg, another mi25_temp_001.jpg, another mi25_fan curve_001.jpg, another mi25 setup.jpg]
(Aluminum tape, as usual.)

Then I disassembled it.
The one on the right is the "good heatsink".
Its TIM looks like graphite.
heatsink compare.jpg


I installed the "good heatsink" on the "bad MI25" (hotspot at 105 degrees) and tested it.
(The graphite sheet was removed and 17W/m·K grease was applied.)
The temperature dropped dramatically.
[Images: badmi25 with good heatsink_003 without graphite.jpg, badmi25 with good heatsink_004 without graphite.jpg]

Yeah, I think it's definitely a heatsink failure, and I think it's extremely rare.

"MORPHEUS II CORE EDITION" is also sold in Japan, for approximately $83.
I'll probably buy it someday... or not? Maybe try water cooling...?

Thank you!

The MI25 I bought is for blade servers and does not have the casing with the Instinct logo.
Making a casing is very tedious.
server attachment.jpg
 

Jin

New Member
Joined
Apr 19, 2024
Messages
1 (0.00/day)
I was somewhat surprised that the MI25 thread is still alive in 2024. I thought these cards were not so popular anymore, but since it's actually still active, I figured I'd share my story.

I have a total of three MI25 cards. Two of them are the "server version", where I had to use an angle grinder and a drill to make them fit into a desktop case; the other one is the desktop-mount version (the one with the nice black cover and branding).

I'm using Fedora, currently on ROCm 6.0.0

Now, the "desktop" one was stuck at a maximum power setting of 110W and refused to accept anything higher. I knew that limit was not right, since the "server" versions, which are the same hardware, had a max setting of 220W out of the box. This is, by the way, how I found this thread: I was looking for how to reset the card to get it to 220W. However, I did not want to reflash it with firmware from another model, so I took a slightly different approach: I simply dumped the firmware from one of my "server" cards and flashed it onto the "desktop" card, which did indeed reset the max power setting to 220W, as it should be.
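
For anyone who wants to check the same thing on Linux, here is a minimal sketch that reads the current and maximum power cap the amdgpu driver reports through hwmon. It assumes the card is driven by amdgpu and that the driver exposes `power1_cap` and `power1_cap_max` (values in microwatts) for it; the function names are just for illustration, and this only reads the limits, it does not change them.

```python
from pathlib import Path


def find_amdgpu_hwmon(root="/sys/class/hwmon"):
    """Return hwmon directories whose 'name' file says 'amdgpu'."""
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    found = []
    for hwmon_dir in sorted(root_path.glob("hwmon*")):
        name_file = hwmon_dir / "name"
        if name_file.is_file() and name_file.read_text().strip() == "amdgpu":
            found.append(hwmon_dir)
    return found


def read_power_caps(hwmon_dir):
    """Return (current_cap_w, max_cap_w); the sysfs files hold microwatts."""
    hwmon = Path(hwmon_dir)

    def read_watts(filename):
        return int((hwmon / filename).read_text().strip()) / 1_000_000

    return read_watts("power1_cap"), read_watts("power1_cap_max")


if __name__ == "__main__":
    for hwmon_dir in find_amdgpu_hwmon():
        try:
            cap_w, max_w = read_power_caps(hwmon_dir)
        except OSError:
            continue  # this device does not expose a power cap
        print(f"{hwmon_dir}: cap {cap_w:.0f} W, max {max_w:.0f} W")
```

A card stuck at the low limit would show up here with a lower max than its siblings, which makes it easy to compare several cards in one machine.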

I had my fair share of suffering with cooling. After trying a number of fans and shroud designs, I ended up moving everything out of the desktop case onto a mining rig and designing slightly longer shrouds for an 80mm server fan (Arctic S8038-10K), which blows into the card casing from the back. This seems to keep the card at around 52-54°C during SDXL inference, with the fan below max speed, at a room temperature of ~21°C.

For LLMs, llama.cpp lets me split inference between all three cards, so the model is loaded completely in VRAM across the cards. The fans remain at lower RPMs; it seems the cards do not get stressed too much in such a setup.

I use cheap HW585 fan controllers from AliExpress, which are not ideal but seem to be OK for the job. I placed the thermistor at the front of the card, where the hot air comes out. The issue with the HW585 controller is that once it gets into the initial hotter temperature range, it starts to spin the fans up and down in waves, following the temperature rise and drop, which is quite annoying. It does keep the cards cool, though.

I would have preferred to control the cards from Linux by reading out the temperature sensor in software, but the solutions I found were either too expensive or required too much fiddling.
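
For reference, a minimal sketch of what the software side could look like, under a few assumptions: the card exposes its temperature through an amdgpu hwmon `temp*_input` file (millidegrees C), and the fan hangs off some PWM output you can drive from Linux, which is the part this sketch leaves out. The class name, the curve points, and the 3C dead band are all made up for illustration. The dead band is there to avoid the up-and-down "waves" described above: the fan ramps up immediately but only ramps down after the temperature has fallen by more than the hysteresis.

```python
from pathlib import Path


def read_temp_c(temp_input_path):
    """Read a hwmon temp*_input file (millidegrees C) and return degrees C."""
    return int(Path(temp_input_path).read_text().strip()) / 1000.0


class HysteresisFanCurve:
    """Piecewise-linear fan curve with a dead band on the way down."""

    def __init__(self, points, hysteresis_c=3.0):
        # points: (temperature_c, duty_percent) pairs; values are illustrative.
        self.points = sorted(points)
        self.hysteresis_c = hysteresis_c
        self._held_temp = None  # temperature the current duty is based on

    def _interpolate(self, temp_c):
        first_t, first_d = self.points[0]
        last_t, last_d = self.points[-1]
        if temp_c <= first_t:
            return float(first_d)
        if temp_c >= last_t:
            return float(last_d)
        for (t0, d0), (t1, d1) in zip(self.points, self.points[1:]):
            if t0 <= temp_c <= t1:
                return d0 + (temp_c - t0) / (t1 - t0) * (d1 - d0)
        return float(last_d)

    def duty_for(self, temp_c):
        """Return the duty cycle (percent) for the latest temperature reading."""
        if self._held_temp is None or temp_c > self._held_temp:
            self._held_temp = temp_c  # ramp up right away
        elif temp_c < self._held_temp - self.hysteresis_c:
            # Only follow the temperature down once it leaves the dead band.
            self._held_temp = temp_c + self.hysteresis_c
        return self._interpolate(self._held_temp)


if __name__ == "__main__":
    curve = HysteresisFanCurve([(40, 30), (60, 60), (80, 100)], hysteresis_c=3.0)
    for temp in (50, 48, 46, 70, 30):
        print(f"{temp} C -> {curve.duty_for(temp):.1f}% duty")
```

In a real loop you would call `read_temp_c` every few seconds and write the resulting duty to whatever PWM interface controls the fan; how to do that depends on the motherboard or controller, so it is not shown here.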
 

Attachments

  • DSCF5440.JPG

JimminyCrumpet

New Member
Joined
Nov 30, 2023
Messages
4 (0.01/day)
Also new funky video from Iceberg Tech about HBCC. The TLDR: Don't forget to turn off HBCC guys

Sorry, I don't watch YouTube videos, but that's oddly different from my experience with it, which was a 5-10fps increase in Borderlands 3 at 2560x1600 with customized ultra+ settings (I turned off temporal AA because it made things worse and regular AA because the resolution made it irrelevant, but turned up reflections and everything else that was possible... fog wasn't even on 16GB cards, it used 3D point clouds / textures in that game and destroyed memory).

The bigger problem with it is that it would make all non-game software crash. Adobe Lightroom Classic's internal GPU acceleration would self-destruct if you left it on, and the implementation was a massive hack that involved an AMD-installed / branded driver for the >4GB address space of the PCIe bus on Intel systems. The system I had the Vega in initially only had 32GB of RAM, which wasn't too shabby (people still install less than that for some insane reason), but it caused another issue... The video driver was apparently assuming it could always use the HBCC memory block but never really told the OS it was using it. I guess they figured that since games were limiting themselves to <4 or 8GB of RAM at the time because of consoles, they could get away with it as long as you had enough. I had mine set to 8GB, so if I used a single program that went over 24GB of RAM allocations (3D Studio Max), it would allocate part of the HBCC block, but texture data could and would still end up there and overwrite parts of the software, which usually resulted in a BSOD. I only turned it on for bandwidth-heavy games.

These days Resizable BAR is the more correct way of making the same thing happen, so there's not much reason to use the old HBCC hack at all unless all you have is a Vega and you can't turn on ReBAR. It showed up as an option on mine, but the mobo didn't support it, and I didn't feel like doing the kludge of a BIOS hack to the X99 board to make it work, because it has zero error checking and hardcodes the GPU slot, and I just knew I'd forget about it and end up breaking boot a year later.

I just wanted to chime in and say it actually did something useful at one point. Maybe it wouldn't have on a dual-channel DDR4 board, but quad-channel had plenty of bandwidth to spare.
 