Tuesday, August 13th 2019
110°C Hotspot Temps "Expected and Within Spec", AMD on RX 5700-Series Thermals
In a blog post this Monday, AMD demystified the boosting algorithm and thermal management of its new Radeon RX 5700 series "Navi" graphics cards. These cards are beginning to be available in custom designs from AMD's board partners, but for over a month after their July 7 launch they were only available as reference-design cards. The thermal management of these cards spooked many early adopters accustomed to seeing temperatures below 85 °C on competing NVIDIA graphics cards, with the Radeon RX 5700 XT posting GPU "hotspot" temperatures well above 100 °C, regularly hitting 110 °C, and sometimes even touching 113 °C with stress-testing applications such as Furmark. In its blog post, AMD stated that 110 °C hotspot temperatures under "typical gaming usage" are "expected and within spec."
AMD also elaborated on what constitutes "GPU Hotspot," also known as "junction temperature." The "Navi 10" GPU is peppered with an array of temperature sensors spread across the die at different physical locations. The maximum temperature reported by any of those sensors becomes the Hotspot. In that sense, Hotspot isn't a fixed location on the GPU. Legacy "GPU temperature" measurements on past generations of AMD GPUs relied on a thermal diode at a fixed location on the GPU die, which AMD predicted would become the hottest under load. Starting with "Polaris" and "Vega," AMD moved toward picking the hottest temperature value from a network of diodes spread across the GPU and reporting it as the Hotspot.
On Hotspot, AMD writes: "Paired with this array of sensors is the ability to identify the 'hotspot' across the GPU die. Instead of setting a conservative, 'worst case' throttling temperature for the entire die, the Radeon RX 5700 series GPUs will continue to opportunistically and aggressively ramp clocks until any one of the many available sensors hits the 'hotspot' or 'Junction' temperature of 110 degrees Celsius. Operating at up to 110C Junction Temperature during typical gaming usage is expected and within spec. This enables the Radeon RX 5700 series GPUs to offer much higher performance and clocks out of the box, while maintaining acoustic and reliability targets."
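As a rough illustration of the Hotspot definition above, the following Python sketch treats the Hotspot as the maximum over a list of sensor readings and allows clocks to ramp only while that maximum stays below the 110 °C junction limit quoted by AMD. The function names and the sample readings are hypothetical; this is not AMD's firmware logic, only a minimal model of the behavior the blog post describes.

```python
from typing import Sequence

# Junction limit quoted in AMD's blog post (degrees Celsius).
JUNCTION_LIMIT_C = 110.0


def hotspot(sensor_temps_c: Sequence[float]) -> float:
    """Return the hottest reading among all on-die sensors.

    The Hotspot is not tied to one location: whichever sensor
    currently reports the highest value defines it.
    """
    if not sensor_temps_c:
        raise ValueError("at least one sensor reading is required")
    return max(sensor_temps_c)


def may_ramp_clocks(sensor_temps_c: Sequence[float]) -> bool:
    """Hypothetical check: keep ramping clocks until any sensor hits the limit."""
    return hotspot(sensor_temps_c) < JUNCTION_LIMIT_C


# Made-up readings for illustration only.
readings = [71.0, 96.5, 88.0, 102.0]
print(hotspot(readings))
print(may_ramp_clocks(readings))
```

A legacy single-diode scheme would instead read one fixed index of `readings`, which is why it needed a more conservative limit: the monitored spot might not be the hottest one.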
AMD also commented on the significantly increased granularity of clock speeds, which improves the GPU's power management. The company transitioned from fixed DPM states to a highly fine-grained clock-speed management system that takes into account load, temperatures, and power to push out the highest possible clock speeds for each component. "Starting with the AMD Radeon VII, and further optimized and refined with the Radeon RX 5700 series GPUs, AMD has implemented a much more granular 'fine grain DPM' mechanism vs. the fixed, discrete DPM states on previous Radeon RX GPUs. Instead of the small number of fixed DPM states, the Radeon RX 5700 series GPU have hundreds of Vf 'states' between the bookends of the idle clock and the theoretical 'Fmax' frequency defined for each GPU SKU. This more granular and responsive approach to managing GPU Vf states is further paired with a more sophisticated Adaptive Voltage Frequency Scaling (AVFS) architecture on the Radeon RX 5700 series GPUs," the blog post reads.
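To see why more Vf states can translate into higher sustained clocks, consider the toy Python model below. It assumes the GPU may run at any state up to some allowed frequency set by thermal and power headroom, and simply picks the highest state that fits. All numbers (idle clock, Fmax, state counts, allowed frequency) are hypothetical and chosen for round arithmetic; AMD's post only says there are "hundreds" of states and does not describe the selection logic.

```python
from typing import List


def make_states(f_idle_mhz: float, f_max_mhz: float, count: int) -> List[float]:
    """Evenly spaced frequency states between the idle clock and Fmax."""
    step = (f_max_mhz - f_idle_mhz) / (count - 1)
    return [f_idle_mhz + i * step for i in range(count)]


def pick_state(states: List[float], allowed_mhz: float) -> float:
    """Highest state not exceeding the allowed frequency (lowest state as fallback)."""
    candidates = [f for f in states if f <= allowed_mhz]
    return max(candidates) if candidates else min(states)


# Hypothetical bookends, not taken from any AMD specification.
F_IDLE, F_MAX = 300.0, 2100.0

coarse = make_states(F_IDLE, F_MAX, 7)    # a few fixed DPM states, 300 MHz apart
fine = make_states(F_IDLE, F_MAX, 361)    # "hundreds" of states, 5 MHz apart

allowed = 1985.0  # assumed headroom-limited frequency at this instant
print(pick_state(coarse, allowed))  # falls back to the 1800 MHz state
print(pick_state(fine, allowed))    # can sit at 1985 MHz
```

In this model the coarse scheme leaves 185 MHz unused because the next state up would exceed the limit, while the fine-grained scheme lands on the limit itself.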
Source:
AMD
159 Comments on 110°C Hotspot Temps "Expected and Within Spec", AMD on RX 5700-Series Thermals
How about AMD makes attractive, complete products - not just in benchmarks, but also in real life (quiet, cool, easy to setup, tinker-free and well supported by OEMs)?
Maybe then they'll be able to sell more?
They're making products aimed at enthusiasts - willingly focusing on a group that is more enticed to pay "200~400$ more for 5~25% performance". I mean: how much do people on this forum spend on OC? :-)
If AMD lacks money on polishing their GPUs, they can do an FPO like every normal listed company would do. :-)
Sorry, but just because the max temp of the silicon may be 110C does NOT mean it should reach that normally. This would be like if I drove my car at 155 MPH every single day with the heat pegged out. AMD is just making excuses for their ludicrously junk cooler design. Reaching such high temps then cooling off when not gaming is going to prematurely wear out these chips, especially their solder connections.
So silicon 120-130C?
On the other hand, 110°C being expected and in spec is a suspicious statement because we know these GPUs throttle at that exact 110°C point.
It is like saying Ryzen 3000 running at 95°C is expected and in spec. It is technically correct...
"This temperature is fine" but fan noise and throttling isn't...
And I'm aware that AMD partners have fixed this issue; unfortunately they came a little late, a month late.
let's just hope AMD learned their lesson finally and do better coolers for 5800 xt
But where is your evidence of the card actually throttling?
Because if the reference design is really throttling and not boosting to full potential, the Sapphire Pulse wouldn't perform only marginally better even with a factory overclock.
At least that is my line of thought anyway. Still glad I got the 3 fan gigabyte version for only $20 more though :D
So RX 5700 XT is chilling compared to Nvidia blower. And don't forget thermi, unless you were born yesterday.
Here is Nvidia blower temperature : www.guru3d.com/articles_pages/amd_radeon_rx_5700_and_5700_xt_review,8.html
The reason you don't see a significant boost is because the gain from pushing the frequency is poor in Navi (maybe a driver issue?). An overclock of 15% on this chip results in a <4% performance gain.
Nuff said, I would say.
You're not wrong about thermal density, it's been a problem starting with Ivy Bridge's 22nm, I vividly remember Tom's Hardware making remarks on it as an explanation for the crappy heat transfer off die. Yet everyone insisted on complaining about shitty TIM instead. We know better now that Intel solders its high end range and still reaches boiling point. The real temp of Tjunction on Intel has been known for years. I'm not sure what you're trying to say here, other than those K models get really hot, which is absolutely true. But not 110C.
You're also not convincing me that as nodes (and thus transistor size/thickness of materials) get smaller, they can readily handle more heat. I'd say it is quite the opposite.
www.intel.com/content/www/us/en/products/processors/core/i9-processors/i9-9900k.html
forums.intel.com/s/question/0D50P0000490XQPSA2/thermal-management-for-intel-6th7th-generation-tcase-vs-tjunction?language=en_US They call that a loser's strategy, begging for people to keep coming to the rescue. A winner's strategy is what AMD does for CPU right now. They know how it works. Only the hardened AMD fanbase seems to have trouble grasping that.
I hope AMD is right about the "Nvidia Killer" they are working on... I believe it when I see it.. Would be awesome.
Also simply look at and compare vcore. Nvidia readily drops vcore as it reaches higher temps, AMD is much more liberal with that. And when you hit throttle point (84C on an Nvidia card and dropping boost bins won't suffice), you get bumped back rigorously, with vcore dropping to below 0.9V. Yes but one does not exclude the other, and you can run a 9900K in spec at stock and even a little beyond that without needing custom water. Why do you think Intel doesn't deliver a boxed cooler?
Must've missed the P4, Atoms or various Nvidia GPUs then, brand name & market position are just as important if not more than the actual product in many cases!
As someone else said before Nvidia does not expose these hotspot temperatures so we can't compare them and know with certainty that Nvidia does deal with this as well.