Monday, January 2nd 2023

AMD Radeon RX 7900 XTX May Feature Faulty Coolers, Causing Overheating

AMD's latest GPUs have been reported to be experiencing overheating issues, with many users claiming that the vapor chamber cooler works better in a vertical rather than a horizontal position. Regardless of orientation, vapor chamber coolers should deliver roughly the same heat dissipation performance and move the heat away from the source; however, testing showed that some reference AMD Radeon RX 7900 XTX GPUs feature defective coolers. According to the testing conducted by Roman "der8auer" Hartung, AMD's Radeon RX 7900 XTX RDNA3 GPUs are experiencing problems with overheating caused by a faulty vapor chamber design.

What der8auer found is that these coolers could have a manufacturing defect, where the liquid inside the vapor chamber has trouble circulating after condensation. It could relate to manufacturing issues of the cooler itself, with an inadequate amount of fluid or insufficient pressure inside the chamber. For more in-depth testing and performance benchmarks, see the video below. It is important to note that we have not seen other reports that replicate this behavior, so always take these reports with a grain of salt.
https://www.techpowerup.com/

286 Comments on AMD Radeon RX 7900 XTX May Feature Faulty Coolers, Causing Overheating

#76
Dirt Chip
AusWolf"I'm not sure... possible... I guess... seems to be... I have a feeling..." This is all his own subjective opinion based on nothing (a sample size of 4 known to be faulty cards?), but considering the views and likes on the video, a lot of people seem to be moved by it, which is sad. Giving personal opinion as a conclusion, especially at the end of an investigatory video is unprofessional, and wrong. If you don't have facts, then don't say anything. And yes, "AMD is in BIG trouble" is the textbook definition of clickbait. AMD would be right to sue for reputation damage.

Regardless of his "feelings" and "guesses", the solution is clear as day: AMD has to do a proper investigation on the matter, and then issue a statement on what happens next. There's a proper way of handling a mass fault / recall. There always has been. One person's "feelings" don't change that, especially someone's who has zero clue on how widespread the problem is.
I don't think you understand what I wrote: the sample of 4 is not what he bases the recall on. It was just used to isolate the problem.
He confirmed 40 more cases and thousands reporting the issue; those are his words.

His findings are not nothing. How can you say that if they moved AMD and others to investigate more?
He has facts (abnormally behaving cards), and with those he speculates further according to his experience and knowledge of vapor chambers and the industry. That is a valid thing to do as long as it is clear (and it is to me) that it's your opinion and not truth from god himself.

We are very much in agreement that this issue must be replicated by others before any actual recall takes place.
Posted on Reply
#77
AusWolf
Dirt ChipI don't think you understand what I wrote: the sample of 4 is not what he bases the recall on. It was just used to isolate the problem.
He confirmed 40 more cases and thousands reporting the issue; those are his words.

His findings are not nothing. How can you say that if they moved AMD and others to investigate more?
He has facts (abnormally behaving cards), and with those he speculates further according to his experience and knowledge of vapor chambers and the industry. That is a valid thing to do as long as it is clear (and it is to me) that it's your opinion and not truth from god himself.
I'm not saying that his findings are nothing.

I'm just saying that they're not indicative of the whole picture by far. It requires further investigation.

I'm also saying that skipping the conclusion part without facts to back his "feelings" and "guesses" up would have been wise. You can speculate whatever you wish, but not when you're a public figure.
Dirt ChipWe are very much in agreement that this issue must be replicated by others before any actual recall takes place.
That I agree on.
Posted on Reply
#78
Dirt Chip
AusWolfEdit: I quickly skimmed through the video (20 minutes is too long to watch so early in the morning). He based his entire findings on the delta between hotspot and junction temp. He indicated that at some point, the delta drops, and the cards start to throttle. Earlier in the video, he shows that even though the cards reach max hotspot quite early, the junction temp starts to rise later, and that leads to the delta dropping... So... if the junction temp has room to rise, then where is the throttle? What are the clock speeds and power consumption? Has he run any stress test that gives you an actual indication of the cards throttling? We don't know.

Edit: He claims a voltage and power consumption decrease. Does this manifest in a decrease in performance too? Again, we don't know.
You are right, I also wondered about that slide, and he might be mistaken in his interpretation here. The delta gets lower because the average rises while the hotspot stays the same. But that still can't explain the "flip test" findings.

I really can't wait for others to validate or disprove it.
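
To make the point about the shrinking delta concrete, here is a minimal sketch with made-up numbers (these are not der8auer's measurements, and the 110 °C limit is simply the figure quoted elsewhere in this thread). It only shows that once the hotspot is pinned at its limit, any further rise in the average temperature makes the delta fall, without saying anything about whether the card throttles.

```python
# Hypothetical temperature samples in °C; illustrative only, not measured data.
HOTSPOT_LIMIT = 110  # assumed hotspot limit, taken from the figure quoted in the thread

hotspot = [95, 105, 110, 110, 110, 110]  # reaches the limit early and stays pinned
average = [60, 68, 75, 80, 84, 87]       # keeps climbing after the hotspot is pinned


def deltas(hot, avg):
    """Return the per-sample difference between hotspot and average temperature."""
    return [h - a for h, a in zip(hot, avg)]


delta = deltas(hotspot, average)
for t, (h, a, d) in enumerate(zip(hotspot, average, delta)):
    pinned = "pinned" if h >= HOTSPOT_LIMIT else "rising"
    print(f"t={t}: hotspot={h} ({pinned}) avg={a} delta={d}")
```

A falling delta in this sketch comes purely from the arithmetic, which is why clock speed and frame rate data would still be needed to show actual throttling.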
Posted on Reply
#79
Bwaze
I remember when people were speaking against Der8auer in 2019 when he organized a poll about low boost clocks of Ryzen 3000 processors, saying that his method (a public poll without proof of purchase) would just attract trolls and that the result was just indicative of the activities of Intel fanboys...

Der8auer: Only Small Percentage of 3rd Gen Ryzen CPUs Hit Their Advertised Speeds

And then AMD admitted the fault and promised they would fix the boost clocks in a new AGESA.

Posted on Reply
#80
AusWolf
Dirt ChipYou are right, I also wondered about that slide, and he might be mistaken in his interpretation here. The delta gets lower because the average rises while the hotspot stays the same. But that still can't explain the "flip test" findings.

I really can't wait for others to validate or disprove it.
What I don't understand is why the junction temp starts to rise after a while... he says the GPU starts to receive less voltage and starts to throttle. Shouldn't that decrease temperatures? I mean, if point A has bad vapor chamber contact or whatever, and reaches 110 °C, and point B has good contact and is only 60 °C, then if you apply less voltage, point B should be cooler, shouldn't it? The fans are presumably at max. rev anyway due to the hotspot reaching its safe limit.

What I'd like to see is someone run a 3DMark Time Spy stress test on one of these cards to see how that "throttle" actually manifests in an observable way.
Posted on Reply
#81
Luminescent
They need to fire some people ASAP. For years of work to be wasted just because someone messed up the cooler is unacceptable.
Posted on Reply
#82
TumbleGeorge
AusWolfYou can add together anecdotal stories and do a proper investigation. Gamer's Nexus is famous for doing that. But the investigation has to end with a proper conclusion based on facts, and not "feelings".

Edit: I quickly skimmed through the video (20 minutes is too long to watch so early in the morning). He based his entire findings on the delta between hotspot and junction temp. He indicated that at some point, the delta drops, and the cards start to throttle. Earlier in the video, he shows that even though the cards reach max hotspot quite early, the junction temp starts to rise later, and that leads to the delta dropping... So... if the junction temp has room to rise, then where is the throttle? What are the clock speeds and power consumption? Has he run any stress test that gives you an actual indication of the cards throttling? We don't know.

Edit: He claims a voltage and power consumption decrease. Does this manifest in a decrease in performance too? Again, we don't know.
It has been confirmed that with some manual voltage reduction, these pre-factory overclocked PC components now run at reduced power consumption and temperatures without losing any performance. Such a loss begins again when a certain limit is passed in the reduction, and usually even then the loss of performance cannot really be felt by the average user in everyday life. But it can be measured by running tests.
Posted on Reply
#83
AusWolf
TumbleGeorgeIt has been confirmed that with some manual voltage reduction, these pre-factory overclocked PC components now run at reduced power consumption and temperatures without losing any performance. Such a loss begins again when a certain limit is passed in the reduction, and usually even then the loss of performance cannot really be felt by the average user in everyday life. But it can be measured by running tests.
That's why I'd like to see a Time Spy stress test result. Claiming that the card "throttles" means nothing, especially with an AMD card that doesn't have fixed clock steps.
Posted on Reply
#84
Suspecto
He published a 20 min long video explaining the methodology and how he had come to the conclusion. I usually read a thesis then think about the conclusion instead of complaining that I can't be bothered to watch 20 min long video and then spending even more time in total by writing messages and making screenshots about something I haven't watched yet. Seriously?
Posted on Reply
#85
AusWolf
SuspectoHe published a 20 min long video explaining the methodology and how he had come to the conclusion. I usually read a thesis then think about the conclusion instead of complaining that I can't be bothered to watch 20 min long video and then spending even more time in total by writing messages and making screenshots about something I haven't watched yet. Seriously?
Skimming through the video was enough to conclude that the data I'm looking for is not there. I watched the useful parts, but I'm not interested in personal speculation. My time is more precious than that.

Thanks for commenting.
Posted on Reply
#86
Suspecto
AusWolfSkimming through the video was enough to conclude that the data I'm looking for is not there. I watched the useful parts, but I'm not interested in personal speculation. My time is more precious than that.

Thanks for commenting.
You haven't watched the entire video, so you can't know what is there and what isn't, or which parts are actually useful and which aren't. Everything that you produced is your own speculation: you did not judge the complete data set and did not know what is in it, so you concluded it had NOT been there in the first place. The irony.
Posted on Reply
#87
AusWolf
SuspectoYou haven't watched the entire video, so you can't know what is there and what isn't, or which parts are actually useful and which aren't. Everything that you produced is your own speculation: you did not judge the complete data set and did not know what is in it, so you concluded it had NOT been there in the first place. The irony.
Except that I did not speculate anything. All I ever said was, more data is needed and AMD needs to investigate the issue further before we jump to conclusions.

I know what data is useful TO ME, and it's not there, believe me. I do not need a 20-minute commentary to understand a couple of diagrams. Feel free to show me the diagram with cards' clock speeds, or any stress test result if you disagree.

Nice try, but next time, try reading what I said before you comment.
Posted on Reply
#88
OfficerTux
AusWolfEdit: He claims a voltage and power consumption decrease. Does this manifest in decrease in performance too? Again, we don't know.
It seems highly unlikely to me that the card can have the same performance at 290 W vs. 350 W. I think his claims are valid. It would still be nice to see the achieved FPS and clocks of the card, though.
Posted on Reply
#89
AusWolf
OfficerTuxIt would still be nice to see the achieved FPS and clocks of the card, though.
Exactly. Claiming something and proving it are different things.
Posted on Reply
#90
OfficerTux
AusWolfExactly. Claiming something and proving it are different things.
He did show video evidence of the wattage dropping.

Findings with my Reference Design 7900 XTX:

I also have a very high delta of 25 K between hotspot and average temperature. Hotspot ends up at 98 °C. The card does not throttle, but the fans always run at full speed once it is loaded (~2800 RPM), which is quite noisy.

My card is mounted horizontally. Unfortunately, I cannot lay my case on its side to test vertical orientation, since the water pump for my CPU would run dry.

I will buy a waterblock anyway once it's available, but I am a little bit worried about the resale value.
Posted on Reply
#92
Fluffmeister
Bomby569The hardest materials in the world are diamonds and the unwillingness of AMD fans to admit any problem
Bloody hell, almost spat my coffee out, well played sir.
Posted on Reply
#93
Dirt Chip
Zyll GoliatOk here is the latest der8auer video......
TL;DR
Structure seems fine; there might not be enough H2O inside.
He also shows a survey by "ComputerBase" (in German): so far, ~25% of 223 voters say they have this issue.
Posted on Reply
#94
Bomby569
Zyll GoliatOk here is the latest der8auer video......
I'm a fan of mayhem and destruction in general, but this video doesn't really say much, if anything at all.
Posted on Reply
#95
bobsled
Dirt ChipTL;DR
Structure seems fine; there might not be enough H2O inside.
Thanks for the summary, saved a watch.
Posted on Reply
#96
Vya Domus
zlobbyAMD has bad coolers?
AMD is not making coolers. They design them; the manufacturing is not done by them.
Posted on Reply
#97
zlobby
Vya DomusAMD is not making coolers. They design them; the manufacturing is not done by them.
Yet some of their (AMD's) cards allegedly have bad(ly performing) coolers.

And of course AMD is not manufacturing coolers. Heck, they stopped making the chips themselves a long time ago.

I'd argue even that AMD could be paying for someone to design the coolers for them, just as they do with chipsets.
Posted on Reply
#98
Dirt Chip
bobsledThanks for the summary, saved a watch.
He also shows a survey by "ComputerBase" (in German): so far, ~25% of 223 voters say they have this issue.
It is surely biased towards a higher rate than the true one, but it shows the general scope (which is not sub-1% as with 12VHPWR).
Posted on Reply
#99
Vya Domus
zlobbyYet some of their (AMD's) cards allegedly have bad(ly performing) coolers.

And of course AMD is not manufacturing coolers. Heck, they stopped making the chips themselves a long time ago.

I'd argue even that AMD could be paying for someone to design the coolers for them, just as they do with chipsets.
I don't know if you understood what I said: AMD does not make coolers, they just design them. And the coolers that they design work fine; look at TPU's review of the 7900 XTX, where it reaches only 73 °C hotspot temperature. This is on an open test bench, I think; nonetheless, there is clearly nothing wrong with the design. Why would they pay someone else for something that works just fine? Maybe this was a QA issue that could have been spotted by AMD. I don't know how that process works, and it is still not clear to me how many of these things are faulty or if that's even the issue here. Nonetheless, if that is the problem, sadly AMD might not have had the ability to prevent it, since they are not in charge of the manufacturing.

Dirt Chipso far ~25% of 223 voters say they have this issue.
223 is a very small sample size, not to mention the obvious bias: you are more likely to answer the poll if you do in fact have an issue. It's also not clear what "the issue" even is. Do all of those people have cards that hit 110 °C? Or do they hit other, lower temperatures that the users find problematic?
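
For a rough sense of what a sample of 223 can and cannot tell you, here is a minimal sketch of the usual normal-approximation margin of error for a proportion. It assumes the respondents are a random sample, which a self-selected forum poll is not, so it captures only the sampling noise and none of the response bias described above. The function name is made up for this example.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Normal-approximation margin of error for a proportion p from n responses.

    z=1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(p * (1 - p) / n)


p, n = 0.25, 223  # survey figures quoted in the thread
moe = margin_of_error(p, n)
low, high = p - moe, p + moe
print(f"{p:.0%} +/- {moe:.1%} (about {low:.1%} to {high:.1%})")
```

With these inputs the sampling noise works out to roughly ±5.7 percentage points, so the self-selection bias is likely the bigger source of uncertainty.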
Posted on Reply
#100
TheoneandonlyMrK
AMD 7000 Series Recall - RX 7900 XTX & XT:


And this is the shit that comes of such hyperbolic conclusions, regurgitated Chinese-whispers style: next go-round, all AMD cards need a recall.


Der8auer, with millions of views comes a modicum of responsibility.

That guy's dropping in my estimation. I'll watch this new video of his.
Posted on Reply