# Can one trust GPU-Z ‘GPU hot spot’ and ‘memory junction’ figures?



## 4EvrYng (Feb 4, 2022)

I have three EVGA 2080 Super FTW3 Hybrid cards in front of me. When I stress test them with Time Spy Extreme, GPU-Z reports that GPU temperature and all ICX temperatures (GPU, memory, and power) are within 2 C across all three, BUT hot spot and memory (junction?) on one of them are reported almost 10 C higher than on the other two (see table below).


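| Sensor | Card 1 | Card 2 | Card 3 | Max-Min |
|---|---|---|---|---|
| GPU | 53.8 C | 54.9 C | 55.4 C | 1.6 C |
| GPU hotspot | 65.2 C | 67.0 C | *73.7 C* | *8.5 C* |
| Memory | 64.4 C | 66.7 C | *73.7 C* | *9.3 C* |
| ICX GPU 1 | 53 C | 54 C | 55 C | 2 C |
| ICX GPU 2 | 52 C | 52 C | 52 C | 0 C |
| ICX Mem 1 | 52 C | 51 C | 51 C | 1 C |
| ICX Mem 2 | 56 C | 55 C | 55 C | 1 C |
| ICX Mem 3 | 54 C | 53 C | 53 C | 1 C |
| ICX Pwr 1 | 45 C | 45 C | 44 C | 1 C |
| ICX Pwr 2 | 48 C | 47 C | 47 C | 1 C |
| ICX Pwr 3 | 48 C | 47 C | 47 C | 1 C |
| ICX Pwr 4 | 48 C | 47 C | 47 C | 1 C |
| ICX Pwr 5 | 56 C | 54 C | 54 C | 2 C |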

I would think that if the GPU and memory on one of them were really running 10 C hotter, that would be reflected on the rest of the sensors, but I don’t see that. I see it only on hot spot and memory junction.
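The Max-Min column in the table above can be recomputed from the reported readings. Below is a minimal Python sketch using a subset of the rows from the table. The 5 C cut-off used to flag a sensor is an arbitrary assumption for illustration, not a limit taken from GPU-Z or NVIDIA.

```python
# Readings in C copied from the table above, ordered Card 1, Card 2, Card 3.
readings = {
    "GPU": (53.8, 54.9, 55.4),
    "GPU hotspot": (65.2, 67.0, 73.7),
    "Memory": (64.4, 66.7, 73.7),
    "ICX GPU 1": (53, 54, 55),
    "ICX GPU 2": (52, 52, 52),
    "ICX Mem 1": (52, 51, 51),
    "ICX Mem 2": (56, 55, 55),
    "ICX Mem 3": (54, 53, 53),
    "ICX Pwr 1": (45, 45, 44),
    "ICX Pwr 5": (56, 54, 54),
}

# Assumption: an arbitrary cut-off chosen only for this illustration.
THRESHOLD_C = 5.0


def spread(values):
    """Return max minus min, rounded to one decimal like the table."""
    return round(max(values) - min(values), 1)


def outliers(data, threshold=THRESHOLD_C):
    """Return the sensors whose card-to-card spread exceeds the threshold."""
    return {name: spread(v) for name, v in data.items() if spread(v) > threshold}


if __name__ == "__main__":
    for name, values in readings.items():
        print(f"{name:12s} spread = {spread(values):4.1f} C")
    print("Above threshold:", outliers(readings))
```

With these inputs only the two sensors in question, GPU hotspot and Memory, exceed the cut-off. Every ICX row stays at 2 C or less.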

That made me start researching, and I have come across the following (please correct me if I am wrong):

- Allegedly, GPU-Z support for “hot spot” and/or “memory junction” applies only to 30 series cards, and 20 series owners can’t rely on the reported values.
- EVGA’s Jacob Freeman states that temperatures from ICX sensors are closer to the actual running temperatures (see https://twitter.com/i/web/status/1437439632573038594).
- Jacob further points people to the thread https://forums.developer.nvidia.com...erature-via-nvidia-smi-or-nvml-api/168346/160, where Nvidia’s moderator states: “The memory case temperature is not exposed by any third-party tools authorized by NVIDIA on Windows or Linux. Existing third-party tools appear to be reporting numbers that do not represent the relevant case temperature (Tc) specification and it’s normal for other readings to show higher values.”
- That could explain why I can’t find memory junction temperature in HWiNFO64, and it casts doubt on the reliability of the figures GPU-Z reports for those two sensors.
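For anyone who wants to see what NVIDIA's own command-line tool exposes, here is a minimal Python sketch that queries `nvidia-smi` and parses the result. It assumes `nvidia-smi` is on the PATH and that the driver accepts the `temperature.gpu` and `temperature.memory` query fields. On many GeForce cards the memory field may come back as not available, which the parser maps to `None`. The helper names are made up for this example.

```python
import subprocess

# Fields requested from nvidia-smi; temperature.memory may be unsupported
# on a given card, in which case a non-numeric placeholder is returned.
QUERY = "name,temperature.gpu,temperature.memory"


def _to_float(field):
    """Convert a CSV field to float, or None if it is not numeric."""
    try:
        return float(field)
    except ValueError:
        return None


def parse_smi_csv(text):
    """Parse `--format=csv,noheader,nounits` output into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        # Split from the right so a comma in the product name cannot
        # shift the two temperature columns.
        name, gpu_t, mem_t = [f.strip() for f in line.rsplit(",", 2)]
        rows.append({
            "name": name,
            "gpu_c": _to_float(gpu_t),
            "memory_c": _to_float(mem_t),
        })
    return rows


def query_temperatures():
    """Run nvidia-smi once and return the parsed per-GPU readings."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_smi_csv(result.stdout)


if __name__ == "__main__":
    for row in query_temperatures():
        print(row)
```

This only shows what the public tool reports. It says nothing about which internal calls GPU-Z or HWiNFO64 use.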

So can one trust GPU-Z ‘GPU hot spot’ and ‘memory junction’ figures for 20 series cards? Can somebody explain why the values I am seeing on card #3 for those two sensors do not fall in line with the trend on the rest of the sensors?


----------



## R-T-B (Feb 4, 2022)

I dunno re the validity of the figure, but thought I'd point out those temps are fine, regardless.


----------



## lowrider_05 (Feb 4, 2022)

ICX uses external sensor probes, so ICX cannot see the hot spot temps. And yes, the hot spot temp can vary a bit between cards, because it is whichever sensor on the die is hottest that reports the hot spot temp.
As GPU silicon (ASIC) quality is different on each card, there can be differences in which part of the die has the most leakage and gets the hottest.

On the memory side, I am not that sure. I know that in HWiNFO64 the memory junction temp sensor is read directly from the chips themselves, but for GPU-Z I don't know.


----------



## W1zzard (Feb 4, 2022)

GPU-Z reads hotspot and memory temperature through NVIDIA's own APIs

Maybe one of the memory chips doesn't have optimum contact? I think memory temperature is the highest temperature of all chips

iCX has physical sensors that are placed at specific locations across the board. It doesn't measure every single memory chip.
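A small Python sketch of that idea: if the reported memory temperature is the maximum over all chips while board sensors sit next to only some of them, a single badly cooled chip can raise the reported value without moving any board sensor. All numbers, the chip count, the sensor positions and the fixed junction-to-board offset are hypothetical and chosen only for illustration.

```python
# Hypothetical junction temperatures in C for eight memory chips.
# Chip index 5 is given poor cooler contact for the sake of the example.
chip_temps = [64.0, 64.5, 63.8, 64.2, 64.1, 73.7, 64.3, 64.0]

# Hypothetical: board sensors sit next to chips 0, 3 and 7 only.
sensor_positions = [0, 3, 7]


def reported_memory_temp(temps):
    """Model the reported value as the hottest of all chips."""
    return max(temps)


def board_sensor_readings(temps, positions, offset_c=10.0):
    """Model each board sensor as its neighbouring chip minus a fixed offset.

    The fixed offset is a crude assumption; a real board sensor also
    picks up heat from other nearby components.
    """
    return [round(temps[i] - offset_c, 1) for i in positions]


if __name__ == "__main__":
    print("Reported memory temp:", reported_memory_temp(chip_temps))
    print("Board sensors:", board_sensor_readings(chip_temps, sensor_positions))
```

In this toy setup the hot chip is not next to any sensor, so the board readings look the same as on a card where every chip has good contact.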

Do more testing, will be interesting to see your findings


----------



## 4EvrYng (Feb 4, 2022)

W1zzard said:


> GPU-Z reads hotspot and memory temperature through NVIDIA's own APIs



Are you saying that Nvidia's statement I've linked to is incorrect, please?

Also, I've read that sensors on some of the GPUs Nvidia shipped might not be calibrated, and that there is a way to check through the API whether they were. Does GPU-Z check whether sensors were calibrated before reading them out, or does it always assume they were?



W1zzard said:


> Maybe one of the memory chips doesn't have optimum contact? I think memory temperature is the highest temperature of all chips
> 
> iCX has physical sensors that are placed at specific locations across the board. It doesn't measure every single memory chip.



That thought did cross my mind. However, if that were the case, wouldn't a 10 C increase in internal temperature result in at least some upward trend on at least one of the ICX sensors, even though they are external?

I've read that it might be possible to get temperatures of all internal GPU sensors through Nvidia's API, not just the hottest one. If yes, would it be possible to get that whole list through GPU-Z? Having all of them would give users an idea of whether they have a sub-optimal thermal interface application or whether their thermal interface is getting worse over time.



W1zzard said:


> Do more testing, will be interesting to see your findings



Do you mean testing with other software besides 3DMark? If yes, I already did that, and the exact same behavior can be observed regardless of what I use to test. If not, please let me know how you would like me to test.


----------



## W1zzard (Feb 4, 2022)

4EvrYng said:


> Are you saying that Nvidia's statement I've linked to is incorrect, please?


I think so. I'm using their own API and return the untouched value. Of course GPU-Z is not "authorized" to use their internal only methods to report memory temperature

Have you reassembled the cooler of card 3 yet? It's probably just uneven pressure or some other mounting issue


----------



## 4EvrYng (Feb 4, 2022)

W1zzard said:


> I think so. I'm using their own API and return the untouched value. Of course GPU-Z is not "authorized" to use their internal only methods to report memory temperature
> 
> Have you reassembled the cooler of card 3 yet? It's probably just uneven pressure or some other mounting issue



Does their API report whether sensors have been factory calibrated? Does GPU-Z assume sensors have always been calibrated?

Personally, I haven't done anything to the 3rd card. I can't know whether the previous owner did anything, but the EVGA factory stickers on the card are in perfect condition, just the way they come from the factory, with no signs of any tampering or work.


----------



## W1zzard (Feb 4, 2022)

Nobody calibrates their sensors. I still find it highly unlikely that they are off that much.

You're not getting any new answers until you take apart the card and repaste it

Or blame GPU-Z and be done with it, the card is fine either way


----------



## 4EvrYng (Feb 4, 2022)

W1zzard said:


> Nobody calibrates their sensors. I still find it highly unlikely that they are off that much.
> 
> You're not getting any new answers until you take apart the card and repaste it
> 
> Or blame GPU-Z and be done with it, the card is fine either way



I'm not looking to cast blame in any particular direction, I'm just trying to figure out what might be going on as there seems to be a contradiction between values


----------



## W1zzard (Feb 5, 2022)

4EvrYng said:


> I'm not looking to cast blame in any particular direction, I'm just trying to figure out what might be going on as there seems to be a contradiction between values


Rather I'd say it's an indicator of a physical issue with the 3rd card

You realize that if the cooler is tilted, the temperatures on one side of the die will be higher due to lack of contact, and along that same plane the contact with memory chips will be MUCH worse?
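One simplified way to picture the contact argument is a steady-state lumped model in which junction temperature equals cold plate temperature plus dissipated power times the thermal resistance of the interface. The Python sketch below uses entirely hypothetical numbers and ignores heat that leaks into the PCB. It is only an illustration and makes no claim about what is actually happening on card 3.

```python
def junction_temp(coldplate_c, power_w, r_interface_k_per_w):
    """Steady-state lumped model: T_junction = T_coldplate + P * R_interface."""
    return coldplate_c + power_w * r_interface_k_per_w


# All three values below are hypothetical, picked only to make the
# arithmetic easy to follow.
COLDPLATE_C = 45.0      # cold plate temperature, same in both cases
CHIP_POWER_W = 2.5      # power dissipated by one memory chip
R_GOOD_K_PER_W = 8.0    # interface resistance with good contact
R_POOR_K_PER_W = 12.0   # interface resistance with poor contact

good = junction_temp(COLDPLATE_C, CHIP_POWER_W, R_GOOD_K_PER_W)
poor = junction_temp(COLDPLATE_C, CHIP_POWER_W, R_POOR_K_PER_W)

if __name__ == "__main__":
    print(f"Good contact: {good:.1f} C")
    print(f"Poor contact: {poor:.1f} C")
    print(f"Difference:   {poor - good:.1f} C")
```

In this model the chip dissipates the same power in both cases, so the cold plate side is unchanged while the junction rises by the extra drop across the worse interface.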


----------



## 4EvrYng (Feb 5, 2022)

W1zzard said:


> Rather I'd say it's an indicator of a physical issue with the 3rd card
> 
> You realize that if the cooler is tilted, the temperatures on one side of the die will be higher due to lack of contact, and along that same plane the contact with memory chips will be MUCH worse?


While tilting of the cooler didn’t cross my mind (thank you for reminding me of that possibility), I did consider mounting imperfections as a possible cause. It is just that I felt an internal 10 C rise in one spot would likely be reflected in an upward trend on at least one of the ICX sensors, even if by a lesser amount, instead of absolutely no change.

Not having a possible explanation for the lack of an upward trend on the ICX sensors means I can’t rule out with 100% certainty the possibility that something is affecting the accurate readout of the sensors. In turn, that means the only way to answer with certainty what might be going on is to inspect and redo the cooling of the card, like you suggested.

I’ve given that some thought. The third card seems to pass all the tests I am aware of without any obvious issues (OCCT’s GPU and VRAM, 3DMark’s Time Spy Extreme and Fire Strike Ultra … please feel free to suggest any others), and the maximum temperatures during those tests don’t exceed the values I mentioned, which are (it is my understanding) still well within levels these components should be able to handle without issues. Last but not least, the card is still well within its warranty period. So I have decided against it, as I would be opening a can of worms and spending lots of time on it without a need or benefit that would justify it.

Thank you again for your help!


----------



## W1zzard (Feb 5, 2022)

4EvrYng said:


> it is just that I felt an internal 10 C rise in one spot would likely be reflected in an upward trend on at least one of the ICX sensors, even if by a lesser amount, instead of absolutely no change.


This is exactly what will happen with a tilted cooler, that's why hotspot is so useful (to diagnose cooling problems, it's quite useless otherwise unless the card thermally throttles)
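To put a number on that diagnostic use, the gap between hot spot and GPU temperature can be computed per card from the table in the first post. This is a minimal Python sketch. The thread does not state what gap counts as normal, so none is assumed here.

```python
# GPU and hot spot readings in C, copied from the table in the first post.
gpu = {"Card 1": 53.8, "Card 2": 54.9, "Card 3": 55.4}
hotspot = {"Card 1": 65.2, "Card 2": 67.0, "Card 3": 73.7}


def hotspot_delta(gpu_c, hotspot_c):
    """Return hot spot minus GPU temperature, rounded to one decimal."""
    return round(hotspot_c - gpu_c, 1)


deltas = {card: hotspot_delta(gpu[card], hotspot[card]) for card in gpu}

if __name__ == "__main__":
    for card, delta in deltas.items():
        print(f"{card}: hot spot is {delta} C above GPU temperature")
```

Cards 1 and 2 come out within about 1 C of each other, while card 3 shows a gap roughly 6 to 7 C larger.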


----------



## 4EvrYng (Feb 5, 2022)

W1zzard said:


> This is exactly what will happen with a tilted cooler, that's why hotspot is so useful (to diagnose cooling problems, it's quite useless otherwise unless the card thermally throttles)


I trust you. It is just that my logic was "increase in internal heat will spread to area in its vicinity, thus in turn it should spread to external sensors in its vicinity, and in turn those external sensors should show some heat increase too".


----------



## W1zzard (Feb 5, 2022)

4EvrYng said:


> I trust you. It is just that my logic was "increase in internal heat will spread to area in its vicinity, thus in turn it should spread to external sensors in its vicinity, and in turn those external sensors should show some heat increase too".


----------



## 4EvrYng (Feb 5, 2022)

W1zzard said:


>


Yes, but that is a 2 C variance between the best and worst internal GPU sensors. ICX GPU 1 also shows a 2 C variance and ICX GPU 2 shows 0 C, while hot spot shows almost 9 C. Same with memory. So I don't see a trend of spreading.


----------



## W1zzard (Feb 6, 2022)

That's not how physics works, your cards will be fine. Ignore the issue, I'm done here


----------



## 4EvrYng (Feb 6, 2022)

W1zzard said:


> That's not how physics works, your cards will be fine. Ignore the issue, I'm done here


Thank you for your responses and trying to help


----------

