# AMD 380x Hard Locks on COVID-19 Workloads



## Operandi (May 18, 2021)

This started when WCG began sending out tons of GPU work units. It's on my main system, which runs a Sapphire Nitro 4GB 380x. I think it also happened when I tried running Milkyway@home a while back, which was the only GPU workload I had tried on this PC. Back then I blamed Milkyway@home, since it would sometimes crash the WCG client, but now I think it's my GPU and I'm just seeing it more now that a lot more GPU workloads are coming from WCG.

Typically it will run for a day or three, but eventually I'll come back to my PC and it's just unresponsive: never any BSOD, driver crashes, or anything.  Also, the work units it completes look good when I look them up, so it seems to be computing OK.  Drivers are updated and the system is otherwise really stable, with no issues in the few games I'm currently playing (Overwatch, Diablo 3, Battletech, random indie stuff).  I've also taken off all the GPU overclocks, but it literally seems to make no difference: stock vs mild OC vs aggressive OC.  I've tried extended runs of Furmark, which pass fine, but I'm not sure if that's even relevant to compute work... not sure what else to try.  Anyone else ever have issues like this?
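Since a hard lock leaves no BSOD or driver crash to look at, one way to at least pin down when the machine froze is a small heartbeat logger. The following is only a minimal sketch, not something used in this thread: the file name `heartbeat.log`, the 60-second interval, and the function names are all hypothetical. It appends a timestamp at a fixed interval and forces it to disk, so after a forced reboot the last line gives the approximate time of the lock.

```python
import os
import time
from datetime import datetime
from pathlib import Path

# Hypothetical defaults; adjust to taste.
HEARTBEAT_FILE = Path("heartbeat.log")
INTERVAL_SECONDS = 60


def write_heartbeat(path=HEARTBEAT_FILE, now=None):
    """Append one ISO timestamp and force it to disk before returning."""
    stamp = (now or datetime.now()).isoformat(timespec="seconds")
    with path.open("a", encoding="utf-8") as f:
        f.write(stamp + "\n")
        f.flush()
        os.fsync(f.fileno())  # make sure the line survives a hard lock


def last_heartbeat(path=HEARTBEAT_FILE):
    """Return the most recent timestamp in the log, or None if it is empty."""
    if not path.exists():
        return None
    lines = [ln.strip() for ln in path.read_text(encoding="utf-8").splitlines()]
    lines = [ln for ln in lines if ln]
    return datetime.fromisoformat(lines[-1]) if lines else None


def run_forever(path=HEARTBEAT_FILE, interval=INTERVAL_SECONDS):
    """Write a heartbeat every `interval` seconds until the process is killed."""
    while True:
        write_heartbeat(path)
        time.sleep(interval)
```

Calling `run_forever()` in a background terminal and checking `last_heartbeat()` after the next forced reboot narrows the freeze down to within one interval, which can then be compared against what was running at the time.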


----------



## phill (May 18, 2021)

Have you tried any games on it at all?  Any issues with anything like Heaven or Valley or 3D Mark??  I'd avoid using Furmark, it's not the kindest of GPU programs....


----------



## Operandi (May 18, 2021)

phill said:


> Have you tried any games on it at all?  Any issues with anything like Heaven or Valley or 3D Mark??  I'd avoid using Furmark, it's not the kindest of GPU programs....


Yeah, per og post.


> (Overwatch, Diablo 3, Battletech, random indie stuff)



I burn in everything with Prime95 and Furmark and set my OCs there because they are unkind to hardware.


----------



## dragontamer5788 (May 18, 2021)

There's always the chance it's your CPU / RAM that's messing things up, and not the GPU itself. In which case...



Operandi said:


> I burn in everything with Prime95 and Furmark and set my OCs there because they are unkind to hardware.



Prime95 is an AVX benchmark, which exercises a relatively niche portion of the CPU (most programs do NOT use AVX). And yes, AVX is the "hottest" section of the CPU, but you're really not testing the typical CPU functions under Prime95.

Maybe hit your computer with 24 hours of Memtest86+ (to burn-in test your RAM), or maybe some other CPU burn-in programs? Maybe it's not your GPU at all that's causing the problem.


----------



## phill (May 18, 2021)

Operandi said:


> Yeah, per og post.
> 
> 
> I burn in everything with Prime95 and Furmark and set my OCs there because they are unkind to hardware.


Apologies, read straight past that...

I think that's half the battle. Never use Furmark to test overclocks, always a game or a benchmark.  That said, it's very rare that I'd run a GPU overclocked; in my past experience a CPU overclock is of more benefit than a GPU one.  Then again, with how tightly wound CPUs can be and the amount of extra performance you can get from overclocking them, it's better to just leave it be.

Have you tried a format or a bios reset at all?  Just wondering if there's a bit of a bug or something that might be throwing it off...


----------



## mstenholm (May 18, 2021)

I run both folding@home and COVID GPU work, sometimes at the same time and on the same GPU, and I have yet to experience a crash, even with 8 COVID jobs and a folding job running side-by-side. I doubt that the COVID GPU units are putting that much strain on your GPU. How many do you run concurrently?


----------



## Operandi (May 21, 2021)

dragontamer5788 said:


> There's always the chance it's your CPU / RAM that's messing things up, and not the GPU itself. In which case...
> 
> 
> 
> ...


I was about to post and rule this out, since up until the last 2 months or so everything has been pretty solid, but I just had a hard lock Wednesday with GPU compute disabled.  So yeah... didn't even get a chance to jinx myself.  I think the uptime was about 3.5 days since the last hard lock, which is about normal, so you may be right that it might not be GPU related at all.



phill said:


> Apologies, read straight past that...
> 
> I think that's half the battle. Never use Furmark to test overclocks, always a game or a benchmark.  That said, it's very rare that I'd run a GPU overclocked; in my past experience a CPU overclock is of more benefit than a GPU one.  Then again, with how tightly wound CPUs can be and the amount of extra performance you can get from overclocking them, it's better to just leave it be.
> 
> Have you tried a format or a bios reset at all?  Just wondering if there's a bit of a bug or something that might be throwing it off...


So to follow on from my other reply: I first set the multipliers back to their stock clocks (this is an old FX8350, FYI) and for whatever reason was getting hard locks at idle on the desktop, which seems super odd.  Loading optimized defaults gave me much better results though, and it seems good.  I still have WCG running on CPU (at 50%) with stock clocks, but it's only been two days, so it's too soon to say it's stable.  If it goes for a week solid I'll turn GPU compute back on and see what happens.

After that I'll re-work my overclock and see what happens, maybe my chip is degraded after years of being overclocked?  Also if not Prime95 what else do people use for CPU stability tests?  I know it uses AVX but I thought it used pretty much "everything" depending on the memory datasets you choose?


----------



## phill (May 22, 2021)

To be fair, I've had such mixed results with Prime and even Orthos back in the day that I just used WCG for the stress testing...  If it was dropping work units like flies, then I knew an overclock was poor.  I once ran Orthos (similar to Prime95, just a different GUI if you will) for about 15/16 hours, no crashes, worked perfectly.
30 seconds into WCG, I had a work unit fail (this was back in the day of my E6600, which wasn't the best overclocker but was clocked to about 3.5GHz at the time).  I just thought, right, well that was a waste of time lol

Most of the overclocking I used to do was for benchmarking, so for everyday use, as long as it gamed, surfed online and such, I never gave a monkey's if it wasn't 1001% stable... I have to plead a little ignorance, but I'm not sure if the FX8350 even has AVX instructions, does it??

My thoughts are, if it's running stock and running fine, then it's OK.  If you're overclocking it and it's failing, then you can either swap out parts to see if they are limiting the performance/overclock, or just not overclock at all.  I know the FX CPUs are very inefficient and very power hungry, and first gen Ryzen rips past them, like a number of CPUs do sadly, but since you're still using one, I think running WCG at 50% CPU with stock clocks is bang on the money.
CPUs can degrade.  Hell, I've had an Intel i3 (a 6320 it was, which we used with the XTU benchmark) that got killed running at stock...  It just never worked properly afterwards and got to the point that you needed to pump more voltage through it than it really should have...
Never ran the benchmark again to be honest. I don't really enjoy killing hardware at all....

To something you mentioned - WCG - are you with team TPU for your contributions or are you with another team?  I had a look in our teams list and couldn't see your name but wondered if you might have been under another/different name??


----------



## Operandi (May 24, 2021)

Just had a hard freeze over the weekend running at stock, so it's not a WCG or an overclock issue anymore...  The only thing I can think of is that I changed out a couple of the SATA cables a while back, so I re-checked those connections.  This might have happened before, not sure.... I've had this system since college, but I kinda remember the right SATA connectors putting slight pressure on the cables as they go through the cable management of my Lian Li case.  We'll see what happens; it's been running for 2 days with the OC restored and CPU and GPU compute back on, since I don't think that's the issue.

As to the FX and the OC: yeah, the FX CPUs have AVX.  The OC is pretty mild, 200MHz all core and 600MHz turbo core, so 4.2 and 4.8 GHz respectively at stock voltage, and it passes 24 hours of Prime95 for stability.  It does more with more voltage, but the heat and the noise become pretty stupid for daily use.
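For reference, the clock figures above can be checked with a bit of arithmetic. This is just an illustrative sketch, assuming the offsets are relative to the FX-8350's stock 4.0 GHz base and 4.2 GHz turbo clocks; the function and variable names are made up for the example.

```python
# FX-8350 stock clocks in MHz (base and max turbo).
STOCK_BASE_MHZ = 4000
STOCK_TURBO_MHZ = 4200


def overclocked_mhz(stock_mhz, offset_mhz):
    """Return the resulting clock after adding an offset to a stock clock."""
    return stock_mhz + offset_mhz


def percent_over_stock(stock_mhz, offset_mhz):
    """Return the overclock as a percentage of the stock clock."""
    return 100.0 * offset_mhz / stock_mhz


# Offsets quoted in the post: +200 MHz all core, +600 MHz turbo.
all_core = overclocked_mhz(STOCK_BASE_MHZ, 200)
turbo = overclocked_mhz(STOCK_TURBO_MHZ, 600)

print(f"All core: {all_core / 1000:.1f} GHz "
      f"(+{percent_over_stock(STOCK_BASE_MHZ, 200):.1f}%)")
print(f"Turbo:    {turbo / 1000:.1f} GHz "
      f"(+{percent_over_stock(STOCK_TURBO_MHZ, 600):.1f}%)")
```

Under that assumption the all-core bump is small relative to stock, while the turbo bump is the larger of the two in percentage terms.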

Never got into the teams thing.  Just putting CPU cycles to projects I find interesting and putting my hardware to a good use.


----------

