
An interesting AMD GPU issue.

Status
Not open for further replies.
Joined
Jan 14, 2019
Messages
15,397 (6.78/day)
Location
Midlands, UK
System Name My second and third PCs are Intel + Nvidia
Processor AMD Ryzen 7 7800X3D
Motherboard MSi Pro B650M-A Wifi
Cooling be quiet! Dark Rock 4
Memory 2x 24 GB Corsair Vengeance EXPO DDR5-6000 CL36
Video Card(s) PowerColor Reaper Radeon RX 9070 XT
Storage 2 TB Corsair MP600 GS, 4 TB Seagate Barracuda
Display(s) Dell S3422DWG 34" 1440 UW 144 Hz
Case Kolink Citadel Mesh
Audio Device(s) Logitech Z333 2.1 speakers, AKG Y50 headphones
Power Supply 750 W Seasonic Prime GX
Mouse Logitech MX Master 2S
Keyboard Logitech G413 SE
Software Bazzite (Fedora Linux) KDE Plasma
I think you might be asking the thread starter as opposed to me, but I'll answer since it seemed like we were both in the same boat here (and/or in case you are referring to me).
I was asking the thread starter. If I'd asked you, I would have quoted the post (or part of it) that I wanted to ask about. ;)
 
Joined
Oct 2, 2020
Messages
1,126 (0.68/day)
System Name Laptop ASUS TUF F15 | Desktop 1 | Desktop 2
Processor Intel Core i7-11800H | Intel Core i5-14600K@135W | Intel Core i3-10100
Motherboard ASUS FX506HC | Gigabyte B660M DS3H DDR4 | MSI MAG B560M Bazooka
Cooling Laptop built-in cooling lol | Thermalright Assassin Spirit w/ BeQuiet Shadow Wings fan| Stock Copper
Memory 24 GB @ 3200 | 32 GB @ 3733 | 16 GB @ 3200
Video Card(s) Nvidia RTX 3050 Mobile 4GB | Nvidia GTX 1650 | Nvidia GTX 960 2 GB
Storage Adata XPG SX8200 Pro 512 GB | Samsung M2 SSD 256 GB & 1 TB 2.5" HDD @ 7200| SSD 250 GB & SSD 240 GB
Display(s) Laptop built-in 144 Hz FHD screen | Dell 27" WQHD @ 75 Hz & 49" TV FHD | Samsung 32" TV FHD
Case It's a laptop, it doesn't need case lmfao | Deepcool Mattrexx 55 MESH | Aerocool Cylon PRO
Audio Device(s) laptop built in audio | Logitech 2.1 speakers | Logitech stereo speakers
Power Supply ASUS 180W PSU | SeaSonic Focus GX-550 | SeaSonic M12II EVO 520W
Mouse Logitech G604 | Corsair Harpoon wired mouse| Logitech G305
Keyboard Laptop built-in keyboard |Razer Blackwidow | Steelseries APEX 7 TKL
VR HMD Quest 2 sold out and don't need VR anymore lol
Software Windows 10 Enterprise 20H2 | Windows 10 Enterprise 20H2 | Windows 11 24H2 LTSC
Benchmark Scores good enough
Hello everyone. First of all, please forgive my poor English, I will try to explain the problem I have experienced in detail.

I'm experiencing a black screen followed by a restart issue while playing games and conducting GPU stress tests.
The issue doesn't occur every time, and it's unpredictable; it happens randomly.

For example, when playing Cyberpunk 2077 with the CPU at stock settings, specifically at 3600 MHz, I encounter a black screen and restart issue. However, when I enable CPU Core Performance Boost, I can play for hours without any problems.

Sometimes when I overclock the CPU, the problem is temporarily resolved. And sometimes, when I overclock the VRAM, the problem is also temporarily resolved, but it starts again after a while.
In some cases, when I set the GPU fan speed to 100%, the problem is temporarily resolved, but then it starts again later.

I'm sure this is not a heating or PSU-related issue because during stress tests, the CPU reaches a maximum of 75 degrees Celsius, the graphics card reaches a maximum of 65-70 degrees Celsius (hotspot 85-90 degrees Celsius), VRAM temperature does not exceed 70 degrees Celsius, and socket temperature stays between 60-65 degrees Celsius

I also tried with three different PSUs:

  • Gigabyte P650B
  • Xigmatek X-Power 650W
  • Deepcool PK660D
When I changed the PSU, the problem was temporarily resolved for a few days, but then it started again.

Sometimes when I install pro drivers, the problem is temporarily resolved, but it starts again after a while. Also, when I run an OCCT power test for an hour, there are no issues, but the next day when I run the power test again, the computer shuts down within 1 minute.

I believe this is a software or stability-related issue. I've tried all AMD drivers, and although the problem is sometimes temporarily resolved, it starts again later.

My prebuilt PC Specs:

Processor: AMD Ryzen 7 3700x
GPU: Sapphire Nitro+ RX 6600 XT
RAM: 32 gb 3200 MHZ ( idk brand )
Motherboard: Asus B450m Dragon
SSD: 1 TB Kioxia exceria g2
HDD: 2 TB Seagate 7200 rpm
PSU: 550w Deepcool PK550D Bronze


The solutions I've tried:

* DDU & every driver version
* installed only driver option
* completely disassembled the computer and reassembled it.
* replaced thermal paste & pads
* tried different rams
* removed XMP (d.o.c.p)
* Reset CMOS
* undervolt & underclock gpu

I learned that there is no VRM heatsink on the motherboard, but I can run CPU stress tests for hours without any issues. The problem only occurs when I run GPU 3D tests.
"removed XMP (d.o.c.p)": this does affect gaming, but it doesn't affect the power required from the PSU.
Is there a CPU OC?
Was the GPU bought new or used?
 

Atlas39

New Member
Joined
Apr 10, 2024
Messages
10 (0.03/day)
I think you might be asking the thread starter as opposed to me, but I'll answer since it seemed like we were both in the same boat here (and/or in case you are referring to me).

While it was more unstable at stock, it was still unstable either way.

And I never had a desire to run with certain performance features disabled nor at stock RAM speeds. I only tried changing these things since I first had the instability, and one of the first things you do when having instability issues is to test at stock. I imagine this is why the thread starter changed the things too.


Well, that's wonderful news. I was thinking "I hope they don't come back and say false hope it happened again after all" because that happened to me a lot. I thought I resolved it and then my hopes were crushed. Sometimes it would take upwards of a week to occur so there was a lot of false hope scenarios for me.

In my case it's... sort of been resolved? I think? The issue was gone since the RMA but then one particular use case had it happen again. It's strange as to why that one use case in particular still causes it but so far no others do, but if I decide to investigate more, your findings do have a lot of reasoning behind them so I'm somewhat hopeful.

The funny thing is, there were three things that crossed my mind while I was doing my own five million attempts. One was to try the other PCI Express slot (not enough room), another was to try my old motherboard (never got around to it), and the third was to try with my PC on its side. There's a chance any of those may have resolved my issue if this is what was causing it for me.

And if nothing else, your issue is resolved at least and that's what matters for this thread (and I hope it stays that way!)

If the problem still persists, you can check with a small test.

Just lay the computer on its side and open the case cover.
Remove the screws holding the GPU.
Then slowly and gently push the GPU.
If the GPU is moving up and down in the PCI-E slot, there may be a contact problem.

Because when I tried it with the old GPU, it was firmly placed in the slot and did not move, even if I did not install the screws.

I hope I could explain it with my bad English. If you need more help, I can record a video and upload it here so you can see in more detail exactly how I solved the problem.

Also, even if there is no problem, I think a GPU holder is necessary. You know, the new generation cards are quite large and heavy, so a GPU holder can provide protection against gravity and prevent problems that may develop over a long time.

Your problem may be caused by something different, but we won't know until we try it.

"removed XMP (d.o.c.p)": this does affect gaming, but it doesn't affect the power required from the PSU.
Is there a CPU OC?
Was the GPU bought new or used?

The 3 PSUs are new; the other one (Corsair RM850) I borrowed from a friend.

The CPU OC was making the system more stable: still crashing, but less frequently.
The GPU is also a new one.

So in the end I have solved the problem, but it's still a mystery to me how a CPU/RAM OC made the whole system more stable in this case.
 
Joined
Nov 7, 2017
Messages
2,151 (0.79/day)
Location
Ibiza, Spain.
System Name Main
Processor R9 5950X
Motherboard MSI x570S Unify-X Max
Cooling converted Eisbär 280, two F14 + three F12S intake, two P14S + two P14 + two F14 as exhaust
Memory 16 GB Corsair LPX bdie @3600/16 1.35v
Video Card(s) GB 2080S WaterForce WB
Storage six M.2 pcie gen 4
Display(s) Sony 50X90J
Case Tt Level 20 HT
Audio Device(s) Asus Xonar AE, modded Sennheiser HD 558, Klipsch 2.1 THX
Power Supply Corsair RMx 750w
Mouse Logitech G903
Keyboard GSKILL Ripjaws
VR HMD NA
Software win 10 pro x64
Benchmark Scores TimeSpy score Fire Strike Ultra SuperPosition CB20
I would have suggested changing the slot for the GPU, but since the board only has one full-sized slot, I didn't.
I forgot about possible flexing, as opposed to an outright defect of the slot.
 
Joined
Feb 22, 2022
Messages
631 (0.55/day)
Processor AMD Ryzen 7 5800X3D
Motherboard Asus Crosshair VIII Dark Hero
Cooling Custom Watercooling
Memory G.Skill Trident Z Royal 2x16GB
Video Card(s) MSi RTX 3080ti Suprim X
Storage 2TB Corsair MP600 PRO Hydro X
Display(s) Samsung G7 27" x2
Audio Device(s) Sound Blaster ZxR
Power Supply Be Quiet! Dark Power Pro 12 1500W
Mouse Logitech G903
Keyboard Steelseries Apex Pro
I know the issue has been solved. But there are some posts that deserve comments.

Are you using Display Port?

I had to get a higher-grade DP cable back on the RTX 3070 for the same issue; it would randomly black-screen at 170 Hz at 1440p.
I ended up buying 8K certified DP cables and it's been absolutely solid since and now with my RX 7900 XT too.
You are confusing a momentary black screen caused by DisplayPort sync issues (the black screen being the resync event) with the problem the OP has, which is a black screen followed by the computer crashing. A poor-quality DP cable should never be the cause of the second issue.

I pointed out that his system won't peak at even half of what these PSUs are rated for. Do you really think it's probable that 3 different PSUs, even if they are garbage, all couldn't take even half the transient spikes they're meant to be able to handle?
Your comment is mostly true, but you got some important details wrong. That GPU and CPU combination alone, running at spec, is rated at 268W. And I highly doubt the motherboard will strictly adhere to the stock 88W limit for that CPU. That is ignoring power delivery inefficiency and the power draw of every other component (rough estimate probably around 10-15% of GPU+CPU in this case). A stable load of >300W while gaming is highly probable. And transient load spikes, although order of magnitude(s) less than 30-series Geforce, will obviously occur (which would be what you refer to as peak). That being said, I agree that a good quality 550W PSU should be able to handle this system just fine.
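To make the arithmetic above concrete, here is a minimal sketch using only the figures quoted in this thread (268 W for GPU + CPU at spec, a rough 10-15% allowance for everything else, and the OP's 550 W PSU). The function and constant names are illustrative, and the result is a back-of-the-envelope estimate, not a measurement.

```python
# Figures taken from the posts in this thread; names are illustrative only.
RATED_GPU_PLUS_CPU_W = 268      # GPU + CPU running at spec, as stated above
OVERHEAD_RANGE = (0.10, 0.15)   # rough allowance for losses and other components
PSU_RATING_W = 550              # OP's Deepcool PK550D


def sustained_load_w(base_w, overhead):
    """Base draw plus a fractional allowance for the rest of the system."""
    return base_w * (1 + overhead)


low, high = (sustained_load_w(RATED_GPU_PLUS_CPU_W, o) for o in OVERHEAD_RANGE)
print(f"Estimated sustained load: {low:.0f}-{high:.0f} W")
print(f"Share of a {PSU_RATING_W} W PSU: "
      f"{low / PSU_RATING_W:.0%}-{high / PSU_RATING_W:.0%}")
```

That lands at roughly 295 to 308 W, a bit over half of the 550 W rating, before accounting for a motherboard that lets the CPU exceed its stock limit or for transient spikes.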

Update:

I found out that either the PCI-E slot or the GPU itself is problematic: the PCI-E slot is too wide, or the GPU pins are too thin.

The PCI-E slot is so loose that the GPU can tilt up and down inside it.

So I applied a little pressure on the GPU, leaned it toward the CPU side, and fixed it in that position with the help of the screws.

Looks like the problem is solved!

The GPU pins and the PCI-E slot pins were probably not making solid contact with each other, and this caused a momentary power loss and made the computer restart.

However, I still need to playtest like this for at least a few days.

Those who experience a similar problem should try this method before changing the GPU or motherboard.

This problem can also be solved by using a GPU holder like this:

View attachment 342976
And the pins are spring-loaded. A heavy GPU can move the pins enough, even in a 100% intact slot, to cause a bad connection on one side. Good to hear that you found the problem in the end. Remember to support your GPU, people :)

May I just ask what is so interesting about this or any AMD GPU issue?
I am sorry, but why do you bother commenting if the thread does not interest you?
 
D

Deleted member 239770

Guest
Sorry, but yes it will. Radeons don't recover after the signal loss, mostly because the driver timeout restart isn't defined properly in your Windows registry; instead, Windows will report a hardware failure and simply blue-screen, then restart, all under that black screen.

There is a video on why DP cables cause major issues.


 
Last edited by a moderator:

AsRock

TPU addict
Joined
Jun 23, 2007
Messages
19,226 (2.96/day)
Location
UK\USA
Processor AMD 3900X \ AMD 7700X
Motherboard ASRock AM4 X570 Pro 4 \ ASUS X670Xe TUF
Cooling D15
Memory Patriot 2x16GB PVS432G320C6K \ G.Skill Flare X5 F5-6000J3238F 2x16GB
Video Card(s) eVga GTX1060 SSC \ XFX RX 6950XT RX-695XATBD9
Storage Sammy 860, MX500, Sabrent Rocket 4 Sammy Evo 980 \ 1xSabrent Rocket 4+, Sammy 2x990 Pro
Display(s) Samsung 1080P \ LG 43UN700
Case Fractal Design Pop Air 2x140mm fans from Torrent \ Fractal Design Torrent 2 SilverStone FHP141x2
Audio Device(s) Yamaha RX-V677 \ Yamaha CX-830+Yamaha MX-630 \Paradigm 7se MKII, Paradigm 5SE MK1 , Blue Yeti
Power Supply Seasonic Prime TX-750 \ Corsair RM1000X Shift
Mouse Steelseries Sensei wireless \ Steelseries Sensei wireless
Keyboard Logitech K120 \ Wooting Two HE
Benchmark Scores Meh benchmarks.
I pointed out that his system won't peak at even half of what these PSUs are rated for. Do you really think it's probable that 3 different PSUs, even if they are garbage, all couldn't take even half the transient spikes they're meant to be able to handle?

He also says that when he changed one PSU the problem went away for a few weeks, then returned.

However, he could have been playing different games.
 

HTC

Joined
Apr 1, 2008
Messages
4,692 (0.76/day)
Location
Portugal
System Name HTC's System
Processor Ryzen 7 5800X3D
Motherboard Asrock Taichi X370
Cooling NH-C14, with the AM4 mounting kit
Memory G.Skill Kit 16GB DDR4 F4 - 3200 C16D - 16 GTZB
Video Card(s) Sapphire Pulse 6600 8 GB
Storage 1 Samsung NVMe 960 EVO 250 GB + 1 3.5" Seagate IronWolf Pro 6TB 7200RPM 256MB SATA III
Display(s) LG 27UD58
Case Fractal Design Define R6 USB-C
Audio Device(s) Onboard
Power Supply Corsair TX 850M 80+ Gold
Mouse Razer Deathadder Elite
Software Ubuntu 20.04.6 LTS
@Atlas39, if I may make a suggestion: IF INDEED your problem has been fixed, ask a mod to enable you to EDIT the original post to add [SOLVED] to the TITLE, as well as add [SOLVED] to the post itself, pointing to the actual post where you figured it out, so that others having a similar issue can immediately find the relevant post with a fix for their problem.
 
Joined
Feb 22, 2022
Messages
631 (0.55/day)
Processor AMD Ryzen 7 5800X3D
Motherboard Asus Crosshair VIII Dark Hero
Cooling Custom Watercooling
Memory G.Skill Trident Z Royal 2x16GB
Video Card(s) MSi RTX 3080ti Suprim X
Storage 2TB Corsair MP600 PRO Hydro X
Display(s) Samsung G7 27" x2
Audio Device(s) Sound Blaster ZxR
Power Supply Be Quiet! Dark Power Pro 12 1500W
Mouse Logitech G903
Keyboard Steelseries Apex Pro
I had terrible issues (different from yours) after switching to 4K gaming.
They were completely fixed by just enabling paging file in Windows settings.

I broke my head searching for reason in my RX 6800 Nitro+
I forgot to comment on this in the previous post. STOP disabling the paging file in Windows. This was a sometimes-good hack up to Windows XP. Since then it will only break stuff and basically never do you any good. Surprisingly, Windows is actually smart enough to only use the page file when necessary. BUT some software is programmed to always use the page file, even when there is no point in doing so based on available system resources, and will refuse to run or even crash if you have disabled paging. For the last couple of decades, or thereabouts, there has been zero advantage and potentially aggravating negative consequences to doing it. Automatic size vs setting min/max manually is a more valid discussion imho. But 99.9% of the time that is also a who-cares topic these days.

Sorry, but yes it will. Radeons don't recover after the signal loss, mostly because the driver timeout restart isn't defined properly in your Windows registry; instead, Windows will report a hardware failure and simply blue-screen, then restart, all under that black screen.

There is a video on why DP cables cause major issues.


You specifically said black screen, with no mention of crashing. That is usually a resync event, which is often caused by below-spec/poor-quality cables and is a well-known issue. But sure, use a poor enough cable that causes constant resyncs and, surprise, the driver will crash and usually take down Windows in the process. Or Windows will end up timing out the driver, BSOD and restart. Both Nvidia and AMD have also had driver versions with black screen crash issues; in those cases it is always a question of whether the driver crashed and caused a black screen or a black screen caused a crash. Black screen crashes have an inherent chicken-and-egg problem. Bottom line is that below-spec cables will "only" cause flickering and/or ~1 sec black screen recoveries most of the time. Testing with a short cable, preferably the one that came with the display, is an important troubleshooting step. But it is not very high on my list when there are actual crashes happening.

I also had to purchase better DP cables after purchasing two 1440p 240Hz monitors, because I could not use the included ones since they were too short. I would see momentary black screens while using the computer. And it got worse while running 10-bit colour instead of 8 (for at least two reasons). On the third set of 3m cables I got some that could actually handle the necessary bandwidth at that length.
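As a rough illustration of why 10-bit colour is harder on a cable than 8-bit, the sketch below estimates the uncompressed data rate for the visible pixels alone. It assumes a 2560x1440 panel and a DisplayPort 1.4 (HBR3) link, neither of which is stated in the post, and it ignores blanking intervals, so real requirements are somewhat higher. The function name is illustrative.

```python
# DisplayPort 1.4 (HBR3): 32.4 Gbit/s raw, 25.92 Gbit/s payload after 8b/10b encoding.
HBR3_PAYLOAD_GBPS = 25.92


def active_pixel_rate_gbps(width, height, refresh_hz, bits_per_channel):
    """Uncompressed RGB data rate for visible pixels only (blanking ignored)."""
    bits_per_pixel = 3 * bits_per_channel  # three colour channels per pixel
    return width * height * refresh_hz * bits_per_pixel / 1e9


# Assumed resolution: 2560x1440 at 240 Hz, compared at 8-bit and 10-bit colour.
for bpc in (8, 10):
    rate = active_pixel_rate_gbps(2560, 1440, 240, bpc)
    verdict = "fits within" if rate <= HBR3_PAYLOAD_GBPS else "exceeds"
    print(f"1440p 240 Hz at {bpc}-bit: {rate:.2f} Gbit/s ({verdict} HBR3 payload)")
```

Under these assumptions the 8-bit signal already uses most of the link, and the 10-bit signal needs more than the link can carry uncompressed, so any weakness in a long cable has far less margin to hide in.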

There is a good reason both HDMI and DP cables come with certifications to help you pick the correct quality for your use. (Which of course chinesium manufacturers fake all the time.) But the good cables are also made in China 99% of the time. Finding properly working ones without having test equipment can be a real gamble.
 
Joined
Apr 10, 2024
Messages
2 (0.01/day)
Processor AMD Ryzen 7 5800X3D
Motherboard ASUS CROSSHAIR VIII HERO
Cooling Arctic Liquid Freezer III 420
Memory G.SKILL TridentZ Neo 32GB 3600mt/s RGB F4-3600C16D-32GTZNC
Video Card(s) AORUS RTX 3070 Ti MASTER
Storage 2TB KC3000+1TB SAMSUNG 970 Pro+500GB OCZ Trion 150
Display(s) BENQ XL2430T
Case BE QUIET! SHADOW BASE 800 FX
Power Supply EVGA B5 850W
Mouse Zaopin Z1 PRO
Keyboard SPC GEAR GK630
@Atlas39 Have you tried disabling ULPS like I mentioned? Might seem too simple but I have personally seen two systems that both had 6600 XT ( Nitro+ and Asus Dual variant) and had the exact same symptoms you described. Disabling it fixed the black screen and restart issue for them. They also had kernel power error in event viewer. Both completely new systems (AM4 and AM5), the only common denominator was the graphics card.
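For anyone wanting to check for the same kernel power error, here is a small sketch that builds a `wevtutil` query for the Windows System log. It assumes the error in question is the common Kernel-Power Event ID 41 (the post does not give the ID), and the helper name is made up for illustration. It only runs the query on Windows.

```python
import platform
import subprocess


def kernel_power_query(max_events=5):
    """Build a wevtutil command listing the newest Kernel-Power ID 41 events."""
    # Assumption: the "kernel power error" is Event ID 41 from this provider.
    xpath = ("*[System[Provider[@Name='Microsoft-Windows-Kernel-Power']"
             " and (EventID=41)]]")
    return ["wevtutil", "qe", "System", f"/q:{xpath}",
            f"/c:{max_events}", "/rd:true", "/f:text"]


if __name__ == "__main__":
    cmd = kernel_power_query()
    if platform.system() == "Windows":
        # Newest events first, printed as plain text.
        print(subprocess.run(cmd, capture_output=True, text=True).stdout)
    else:
        print("Windows only; would run:", " ".join(cmd))
```

If the timestamps of those events line up with the black screen restarts, that at least confirms the system lost power or reset uncleanly rather than shutting down normally.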
 
Joined
Nov 7, 2017
Messages
2,151 (0.79/day)
Location
Ibiza, Spain.
System Name Main
Processor R9 5950X
Motherboard MSI x570S Unify-X Max
Cooling converted Eisbär 280, two F14 + three F12S intake, two P14S + two P14 + two F14 as exhaust
Memory 16 GB Corsair LPX bdie @3600/16 1.35v
Video Card(s) GB 2080S WaterForce WB
Storage six M.2 pcie gen 4
Display(s) Sony 50X90J
Case Tt Level 20 HT
Audio Device(s) Asus Xonar AE, modded Sennheiser HD 558, Klipsch 2.1 THX
Power Supply Corsair RMx 750w
Mouse Logitech G903
Keyboard GSKILL Ripjaws
VR HMD NA
Software win 10 pro x64
Benchmark Scores TimeSpy score Fire Strike Ultra SuperPosition CB20
@Calenhad
Unless, like most enthusiasts here, you have non-HDD storage.
Then having min/max set, instead of Windows regularly resizing the file, is a good idea to lower the number of ("new") cells being written to.
 
Joined
Jan 1, 2012
Messages
460 (0.10/day)
I was asking the thread starter. If I'd asked you, I would have quoted the post (or part of it) that I wanted to ask about. ;)
Ah, thank you for the clarification. My initial reaction was to presume that, but it could go either way so I figured I'd ask.

And either way I thought my feedback might be useful even if the question wasn't directed at me, since I imagine the thread starter changed those settings just to troubleshoot as I did (not that I mean to speak for them, though), and I found out by pure coincidence that it was unstable at stock. I'm still not sure what explains that, since it was only the case for me with the new problematic video card and not the older one.
If the problem still persists, you can check with a small test.

Just lay the computer on its side and open the case cover.
Remove the screws holding the GPU.
Then slowly and gently push the GPU.
If the GPU is moving up and down in the PCI-E slot, there may be a contact problem.
I might be mistaken, but my imagination is telling me there's always been a little bit of play with all cards I've ever connected.

But yes, this definitely gives me a few additional things to try if I ever decide to look into it.
Your problem may be caused by something different, but we won't know until we try it.
Absolutely. We may or may not even have the same issue, but given the symptoms being so similar and the fact that it makes sense, it's worth investigating for me I suppose.

But right now I only seem to be having it with one use case (specifically, an older version of Minecraft; current versions do not exhibit the issue, so unlike the original occurrences, which were random and happened across multiple things, I've only seen it in one game on the new card). So this occurrence I'm having now may or may not even be the same issue I had originally (but if it's a different issue, then I suppose it would have to be a driver issue?). I'd like to not ignore it, because my thought process is that hardware that is stable shouldn't be crashing in anything unless the software is known to be unstable, and this one is not as far as I know. I can't find it stated anywhere that "yes, this version of Minecraft is known to be inoperable on AMD video hardware". So my instinct was telling me it was the same old hardware issue just... only showing up there for some reason.

Way too much about this doesn't seem to follow traditional norms which is what makes it such a pain to troubleshoot.
So in the end I have solved the problem, but it's still a mystery to me how a CPU/RAM OC made the whole system more stable in this case.
Yeah that was the mystery confusing me as well.
 

Atlas39

New Member
Joined
Apr 10, 2024
Messages
10 (0.03/day)
Since yesterday I have been testing the system with games and power tests, and I still haven't had a crash.
I think this time we succeeded, guys.

Thanks, everyone, for the help and effort.
And special thanks to @Waldorf for teaching me very detailed and accurate RAM tests, and for his endless support, even late at night.

So my last request: can the moderators edit the title to something like "GPU black screen issue (solved)"?

The topic has a lot of valuable advice and knowledge, so people who have similar problems can find the topic more easily, and maybe these fixes will solve their problem too.
 