
NVidia now HIDING hot spot temperature? A great problem IMO.

Joined
Jan 19, 2023
Messages
326 (0.44/day)
Yeah, like that time when 3090 and 3080 FE memory chips were easily reaching 100C (on a brand new card; with time and some dust it would be worse) while the max junction temperature (Tj) was 105C. That was easily fixable by swapping the thermal pads on the memory chips. Nvidia engineers sure knew what they were doing then.

BTW I had that issue on my 3080 FE as well, and the hotspot reading was useful then too: I had used pads that were too thick, and my hotspot was a lot higher than it needed to be. I easily checked that and bought thinner pads. Another case where hotspot temp could be useful.
 
Joined
Sep 3, 2019
Messages
3,765 (1.91/day)
Location
Thessaloniki, Greece
System Name PC on since Aug 2019, 1st CPU R5 3600 + ASUS ROG RX580 8GB >> MSI Gaming X RX5700XT (Jan 2020)
Processor Ryzen 9 5900X (July 2022), 200W PPT limit, 80C temp limit, CO -6-14, +50MHz (up to 5.0GHz)
Motherboard Gigabyte X570 Aorus Pro (Rev1.0), BIOS F39b, AGESA V2 1.2.0.C
Cooling Arctic Liquid Freezer II 420mm Rev7 (Jan 2024) with off-center mount for Ryzen, TIM: Kryonaut
Memory 2x16GB G.Skill Trident Z Neo GTZN (July 2022) 3600MT/s 1.38V CL16-16-16-16-32-48 1T, tRFC:280, B-die
Video Card(s) Sapphire Nitro+ RX 7900XTX (Dec 2023) 314~467W (382W current) PowerLimit, 1060mV, Adrenalin v24.12.1
Storage Samsung NVMe: 980Pro 1TB(OS 2022), 970Pro 512GB(2019) / SATA-III: 850Pro 1TB(2015) 860Evo 1TB(2020)
Display(s) Dell Alienware AW3423DW 34" QD-OLED curved (1800R), 3440x1440 144Hz (max 175Hz) HDR400/1000, VRR on
Case None... naked on desk
Audio Device(s) Astro A50 headset
Power Supply Corsair HX750i, ATX v2.4, 80+ Platinum, 93% (250~700W), modular, single/dual rail (switch)
Mouse Logitech MX Master (Gen1)
Keyboard Logitech G15 (Gen2) w/ LCDSirReal applet
Software Windows 11 Home 64bit (v24H2, OSBuild 26100.2605), upgraded from Win10 to Win11 on Jan 2024
It’s 99.9% certain that any modern chip, CPU or GPU, uses internal resources and info to regulate its functions that the user is completely unaware of.

So there is only a very slim chance that a GPU of this size and complexity does not monitor its silicon “health” with dozens of temperature, current and voltage sensors.

Nvidia is already getting some criticism about the power increase, the fab node that isn’t new, and the relatively small raw performance increase. Those points cannot be hidden.
They don’t need another point that makes their GPUs even more unappealing to potential buyers. Most likely that is why they decided to hide the hotspot temp, especially on the FE 5090, which has a compact cooler and will probably peak over 100C constantly.

Sure, it’s one less piece of info for those who used to use it. It had its value.

I don’t think this is a matter of GPU health concern, but time will tell, as always.

It may be harder for the user to know if the TIM is going bad, but to be honest that is mostly an issue with traditional TIM, not LM.
If LM sits between nickel-plated surfaces it will almost never dry out, as long as it can’t leak out either (which would reduce the quantity).

A copper surface can absorb “half” of the LM compound and gradually dry it out (within 2-3 months), so the LM needs to be replaced frequently until the copper is saturated.
But that also depends on the LM’s actual ingredients; for example, TG Conductonaut is based on indium-gallium.

I would also guess that AIB variants will have that sensor hidden too, for uniformity across all variants, even though some of them will have huge coolers with much lower temperatures.
 
Joined
Feb 20, 2019
Messages
8,608 (3.97/day)
System Name Bragging Rights
Processor Atom Z3735F 1.33GHz
Motherboard It has no markings but it's green
Cooling No, it's a 2.2W processor
Memory 2GB DDR3L-1333
Video Card(s) Gen7 Intel HD (4EU @ 311MHz)
Storage 32GB eMMC and 128GB Sandisk Extreme U3
Display(s) 10" IPS 1280x800 60Hz
Case Veddha T2
Audio Device(s) Apparently, yes
Power Supply Samsung 18W 5V fast-charger
Mouse MX Anywhere 2
Keyboard Logitech MX Keys (not Cherry MX at all)
VR HMD Samsung Oddyssey, not that I'd plug it into this though....
Software W10 21H1, barely
Benchmark Scores I once clocked a Celeron-300A to 564MHz on an Abit BE6 and it scored over 9000.
I learned that at least on their FE 5090 models, Nvidia is now hiding the chip hot spot temperature. I hope it will not be the rule for all 50 series cards!

Hot spot temperature is an important health status indicator of the graphics card!

For example, my current 4070 started with the hot spot being about 10°C above the whole chip temperature, but then it slowly started drifting higher and now it is 30°C above. My card is probably affected by the crap-paste problem Igor reported on, and now, thanks to the hot spot temperature, I know that I should probably fix it.
Hotspot temperature tells you only about paste/TIM issues. The 20, 30, 40-series GPUs manage their own clocks, voltages, core usage, and temperature across hundreds of locations on the die. Distilling those readings, which are sampled every few milliseconds, down to a "temperature" and a "hotspot temperature" that are sampled every 1-2 seconds was already pretty silly.

For liquid metal there is nothing you can do to change the hotspot temperature, so why even present the user with a number that they can't do anything about, other than worry (due to being ill-educated on how GPU temperatures really work)?
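
For anyone who still wants to track the kind of drift described in the quoted post (hot spot going from about 10°C to about 30°C above the chip temperature), here is a minimal Python sketch. It assumes you already have logged pairs of GPU and hotspot readings taken under similar load. The function names, the sample numbers, and the 15°C threshold are all made up for illustration; they are not values from Nvidia or from any monitoring tool.

```python
from statistics import mean


def hotspot_deltas(samples):
    """Return hotspot-minus-GPU deltas for (gpu_c, hotspot_c) pairs in °C."""
    return [hotspot - gpu for gpu, hotspot in samples]


def delta_drift(baseline_samples, recent_samples):
    """How much the average hotspot delta has grown since the baseline log."""
    return mean(hotspot_deltas(recent_samples)) - mean(hotspot_deltas(baseline_samples))


def paste_looks_degraded(baseline_samples, recent_samples, max_drift_c=15.0):
    """True if the delta grew by more than max_drift_c (an assumed threshold)."""
    return delta_drift(baseline_samples, recent_samples) > max_drift_c


# Hypothetical logged readings (GPU temp, hotspot temp) in °C under similar load.
when_new = [(60.0, 70.0), (62.0, 72.0), (61.0, 71.0)]
now = [(60.0, 90.0), (62.0, 92.0), (61.0, 91.0)]

print(delta_drift(when_new, now))           # 20.0
print(paste_looks_degraded(when_new, now))  # True
```

Comparing averages taken under similar load keeps one odd sample from triggering the check. What counts as "too much" drift is a judgment call, so treat the threshold as a knob and not as a spec.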
 

AsRock

TPU addict
Joined
Jun 23, 2007
Messages
19,150 (2.98/day)
Location
UK\USA
I watched the GN teardown and it seems that the liquid metal breached one or two ribbons. They probably overfilled it, or they have a problem with air pressure in the chip area while mounting the cooler.

As I wrote in the thread about the FE cooler, their cooler has about half the surface area of the larger AIB models, and the liquid metal is a desperate measure to lower the temps the chip is hitting with such a small surface area cooler. And the liquid metal was not enough, so they simply need to hide the evidence of what is going on.

I do not think that making the 600W card so small was a good decision. They insisted on carrying this plan out even though they found out that the card overheated, and the disastrous fallout for everybody is that they decided to hide the hot spot temperature.

Well yeah, and rubber decays over time too. Damage to that seal could be hard to see, never mind replacing it. And one thing is for sure: I don't want liquid metal all over my PC because a seal broke and it splashed everywhere after hitting a fan.

I'll leave the suckers and people with no sense of value to buy them.
 