
RTX 2070 artifacting when power limit is set to max

Well, that's the thing: I used the MSI Afterburner OC Scanner without raising the power limit, which set the core clock curve to +135, and when power usage went over 100% white artifacts appeared and the game crashed.

With everything on stock, power usage never goes above ~90% and there are no problems. Now my question is: what could be the cause of this, and how do I find out for certain?
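If you want hard numbers on what the card is doing right before a crash, you can log it yourself. A minimal sketch in Python; the query fields are standard `nvidia-smi` ones, but the helper names are mine:

```python
import subprocess

# Query fields documented by `nvidia-smi --help-query-gpu`.
FIELDS = "timestamp,clocks.gr,clocks.mem,power.draw,temperature.gpu"

def parse_sample(line):
    """Turn one CSV line from nvidia-smi into numbers we can compare."""
    ts, core, mem, power, temp = [p.strip() for p in line.split(",")]
    return {
        "timestamp": ts,
        "core_mhz": float(core),
        "mem_mhz": float(mem),
        "power_w": float(power),
        "temp_c": float(temp),
    }

def read_gpu():
    """Take a single reading; call this in a loop while the game runs."""
    out = subprocess.check_output(
        ["nvidia-smi", f"--query-gpu={FIELDS}",
         "--format=csv,noheader,nounits"],
        text=True)
    return parse_sample(out.strip().splitlines()[0])
```

Dump the samples to a file and look at the last few readings before the artifacts hit: if power draw is pinned at the limit while clocks sag, the card is power-limited at exactly the moment it misbehaves, which narrows things down considerably.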
By raising the power limit, you are raising heat. It shouldn't happen with the Gaming X's cooler design, but the cooler may also be sharing that heat with some of the memory modules.

Just keep that in mind with your new card. Also, please don't use Furmark; there's no reason for that useless power draw and heat. A good, stout game (for those who are gamers) is the best test you can give a card, in my opinion.
 
Turns out it was the overclock on the memory; I had to back it down to +866 MHz. The GPU was unstable when using RTRT, so I backed it down to +76 MHz from +115 MHz.
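For anyone dialing in an offset the same way, backing it down is just a search for the highest stable step, so a binary search saves a lot of reboots. A hedged sketch, assuming stability is roughly monotonic in the offset; `is_stable` stands in for a real stress run (apply the offset, loop a demanding RTRT scene, watch for artifacts):

```python
def max_stable_offset(is_stable, lo=0, hi=1000, step=25):
    """Binary-search the largest offset (in `step` MHz increments)
    that passes `is_stable`.

    Assumes monotonic stability: if +900 fails, everything above fails.
    Returns `lo` if nothing above it passes.
    """
    lo_i, hi_i = lo // step, hi // step
    best = lo_i
    while lo_i <= hi_i:
        mid = (lo_i + hi_i) // 2
        if is_stable(mid * step):   # real test: stress run at this offset
            best = mid
            lo_i = mid + 1
        else:
            hi_i = mid - 1
    return best * step
```

With a true cutoff somewhere around +875 MHz, this homes in after about six trials instead of dozens of manual steps. The monotonicity assumption is the weak point: real cards occasionally have unstable pockets below an otherwise stable offset, so a long confirmation run at the final value is still worthwhile.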
Ahh, I've actually experienced something similar, but with RAM. Seemed to work fine for about 8 months and then suddenly I was getting signs of slight memory instability. Games CTDing, more than anything, but also a handful of memory-related BSODs. Turned out it was jussst unstable enough to never have any appreciable hiccups for that long. Funny how they became more frequent after the first one, but world and its wonders. After passing so many memtest runs in the past, it failed the first one I shot it through o_O Didn't feel like delving in, so I upped the voltage on DRAM and SOC and it has been flawlessly carrying on ever since, as well as passing long memtest runs again. Go figure.

And yeah, RTRT definitely reduces OC headroom on both counts. As far as I can tell, 2.1 GHz is stable on my 2060 for even the most demanding tests... until you throw it up against an RTX-enabled game. Then it just dumps the moment clocks peak during actual gameplay. 2.01 GHz is the most I get. Similar story with memory: it'll run +900 MHz normally, probably a wee bit more, but with RTX enabled it's down to about +550 MHz before artifacts begin to show up.
 
Like I said, it did when playing Resident Evil 2: using the curve from the OC Scanner, power went over 104% while the limit was set to 100%. That's when I got artifacts (white space invaders) and the game crashed.
I also got those same artifacts when setting the power limit to max and running Furmark. After resetting everything to stock, there are no problems.
So what could be the cause? Can insufficient power from the PSU cause those artifacts, since problems only occur when power goes beyond 100%? Maybe it's because I didn't raise voltage along with the power limit, or is the GPU faulty?

This is what I'm trying to figure out: if it's the GPU I want to return it ASAP; if it's the PSU I want to replace it ASAP as well, since it might damage the GPU.

This would be my cue for an RMA.

Space Invaders while not touching VRAM clocks and just using power limits is bad. And no, 'it doesn't happen at stock' is no guarantee this won't get worse eventually. The card is bad. Return it and do not accept an apology from support. This needs to be replaced.

Remember how those 2080ti's with space invaders started crashing after longer periods of use. Something is clearly amiss and at some point something is going to give. Don't be that guy looking at a dead card a few days past warranty...

Besides, if you have to run a Gaming Z (sub-top-end MSI) at stock... that is a head scratcher.

You guys are way too lenient about accepting a shoddy product. You buy a special, higher-priced card with improved cooling and supposed OC potential; having to run it at stock, or running into artifacts (!!) otherwise, can never be 'product works as advertised'. If you had bought a simple blower, sure. But not this.

Here's what really happened... Turing is a new generation with large dies, yields aren't stellar, and profitability is under pressure, so the binning happens more tightly within the advertised spec. End result: you paid a premium (twice over: chip and AIB solution) for a bad bin, and possibly a nearly faulty card.


Degradation at work. That is why any chip you buy usually has sufficient headroom. If you detect a chip that doesn't have that, my story above applies. You got sold a dud in disguise.

It is good to raise awareness of this, because as the nodes get smaller and the optimization tighter (read: we now get OC out of the box from nearly all name brands, including Intel and AMD: XFR, Turbo that far exceeds stated TDP, GPU Boost, etc.), the lower headroom directly eats into product longevity. There is less performance to be gained from shrinks, so things are also clocked outside the optimal curve.


Still king of my stress tests for any CPU or GPU in the real world: Total War: Warhammer (2). It brings anything to its knees, and just clicking end turn gives you a wild ride of voltage spikes and full-load > 0%-load swings for both CPU and GPU. You also get the concurrent heat from CPU and GPU affecting both components. If you can pull a rig through a few hours of a Warhammer campaign map, it's rock solid :)
 
If you use FURMARK, you deserve not to have a working GPU. Wake up! FURMARK IS JUNK!
 
Thank you! That's what I'm saying. I get that overclocks aren't exactly guaranteed, but stock ought to always work.



With the RAM, I do wonder, which is a shame because it's b-die... higher-end g.skill stuff, too. I'll be curious to see if it gets worse at any point. Hard to say what's really happening though... I think this board isn't the best memory OC'er to begin with. These modules could be pushed much further, and I still have plenty of voltage to play with. RAM can be subtle, muddying things. It could just as easily be that there was always wonkiness that I never detected. RAM's pretty notorious for being finicky like that.

CPU on the other hand still runs at the same relatively low voltage for its clocks and will even do 4.3ghz all-core just as reliably as it always has. But then, I don't run it into the ground like that. It's been at 4ghz @ ~1.1v for quite a while. Still running <1v SoC too. If anything was degrading at that point, I'd expect there to have already been an acute failure. Again, we shall see. First thing I'll do in that case is report on it here.

Now with my 2060, I really doubt that there's any real issue. It's a Strix model, and I can tell you from removing the cooler that it has the higher-binned "OC" chip. The card is specced for a 1980 MHz stock boost, which it still does, RTX or not. And that's already about the highest OC you can expect those chips to hold. There simply is no headroom at stock. That's how they're sold, and you buy them knowing what you get out of the box is all there is: it's not marketed as having ANY OC potential. It's sold as something that locks you in at the best possible performance from the jump. I can run it a little higher under certain scenarios, but nobody counts on it. Degradation to me would be it failing to hold its initial performance in either scenario. 2.1 GHz is a lot for a 2060... even with a 125% max power limit. It's more than you can ever ask for... nobody's going around saying "oh yeah, a good 2060 will do 2.1 GHz no problem" because it just doesn't happen. At most they'll say you *may* hit 2 GHz if you're really lucky.

To me it's no wonder that it can't handle 2.1 GHz when it's actually fully loaded down. Even the best ones don't typically clock that high, ever, under any circumstances. As for the memory OC, it's probably tied to the GPU being pushed harder. The OC it'll hold hasn't changed in the entire time I've had it. It's just never been able to handle as much with RTX enabled, probably due to the added complexity when the RTRT cores become active. You can see how much more power is being gulped up in the temperatures alone: they go up another 8-10C. Likely there's just not enough power at that point to give the memory the voltage it's used to while also feeding the whole GPU die at full utilization, not to mention the memory controller itself is working harder. I think it's not that the memory is suddenly demanding more power due to degradation... it's just that, with the power limit imposed, there's no longer enough to go around in that scenario. Just hitting the natural limit of the hardware under max load and max power.

Worth noting that the memory OC will hold up when the GPU isn't being pushed to those clocks in RTRT applications. If the performance under either scenario had ever declined from its separate baseline, I'd agree there was degradation happening. Not saying it absolutely can't become a problem later, just that it hasn't seemed to yet. As of now, I keep the power limit at 95% with no core clock boost and the thing still holds a +500 memory bump and clocks in at 2.01 GHz. To many, that's considered somewhat exceptional. Nobody would expect degraded hardware to ever do that. Even a brand new one can't guarantee you that kind of performance.

Just wanted to clarify a little bit to prevent confusion.


But I do agree, the practice of tuning products to the max out of the box is problematic. Some upsides to it, too. But at the end of the day, it's going to lead to more products falling off under the stress. Higher degradation rates are a given and it is important to recognize both the risk and the signs when buying new hardware.
 
LOL@memtest! Fruit loops!
 
It's my understanding that Nvidia drivers no longer allow Furmark to push the card to crazy clocks like in the past. That's why, in W1zzard's reviews, Furmark's power draw is often the same as or close to the peak gaming power draw. But I don't run Furmark, and I never have.
 
What's wrong with memtest? Other than it failed me a bunch of times :P Nah but really, I wasn't aware it was that ridiculous of a thing to use for testing memory overclocks.
 
Yes and no... Both Nvidia and AMD have called it a power virus and recommended against using it, as it can damage the card. While protections have been put in place, the biggest problem with the app is that it doesn't test at your running clocks. With all the protections meant to prevent damage, the card clocks itself down, sometimes by hundreds of MHz, to stay within the power limit. If you boost to, say, 1900 MHz while gaming, under Furmark you could be running at 1700 MHz, not even testing the clocks or voltage you would actually be at, which, as you likely guessed, makes it pointless for that kind of testing.

With CPUs, running P95 with AVX or similar programs (not the one that Regeneration guy made) gets you into the same heat/load territory; running HandBrake or POV-Ray will get it right up there too. Unlike Furmark, though, no game is a power virus: Furmark instantly throttles the card many, many bins just to stay within power limits.
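The "many, many bins" point can be made concrete: GPU Boost moves clocks in steps of roughly 15 MHz, so the 1900 vs. 1700 MHz example above works out to about 13 bins shed. A small sketch; the 15 MHz bin size is the commonly cited GPU Boost step, and the 5-bin threshold is an arbitrary illustration, not any official cutoff:

```python
def bins_dropped(gaming_mhz, stress_mhz, bin_mhz=15):
    """Estimate how many ~15 MHz boost bins the card sheds under a
    heavier load compared to a normal gaming load."""
    return max(0, round((gaming_mhz - stress_mhz) / bin_mhz))

def looks_like_power_virus(gaming_mhz, stress_mhz, threshold_bins=5):
    """Flag a workload that forces a far larger clock drop than gaming.

    The 5-bin threshold is purely illustrative."""
    return bins_dropped(gaming_mhz, stress_mhz) > threshold_bins
```

By this yardstick, a stress tool that knocks the card down a bin or two is exercising it near gaming clocks; one that sheds a dozen bins is validating a clock/voltage point you will never actually play at.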
 