Tuesday, January 24th 2023

Samsung 990 PRO Flagship SSD Has an Endurance Problem, Users Notice Rapid Drive-Health Drops

Samsung 990 PRO is the company's flagship client SSD and among the fastest Gen 4 NVMe SSDs you can buy. It also commands a very high price premium, with the 1 TB variant priced at $170 and the 2 TB variant at $290. When you're buying in this segment, you expect the highest endurance figures for your SSD, and client SSD endurance figures are already on the rise as NAND flash technology evolves. Neowin noticed that its 990 PRO isn't meeting this vital expectation and, with a little digging, found other users with the same problem, which suggests Neowin didn't just get a bad drive.

Apparently, the "drive health" reading in Samsung Magician (the utility software for Samsung SSDs) drops rather rapidly for the 990 PRO. After a clean software installation on a new drive, Neowin observed that the drive's health reading was already at 99%, something very unexpected for a new drive. What's worse, even with only regular use in the following days, the reading dropped by 1 percentage point every day. Drive health is a proxy for remaining endurance, as it indicates the number of program-erase (PE) cycles left on the NAND flash memory before regions of the drive's user area become unwritable.
Such a rapid drop in endurance used to be a problem in the very first generations of client SSDs some 15 years ago, but it's highly unusual for a flagship product like the Samsung 990 PRO. One user on Twitter claims that their drive health dropped to 64% at just 2 TB of total bytes written, something you don't expect even from entry-level SSDs. Samsung rejected Neowin's initial RMA request and returned the drive, having found "no defect" with it, but once the company realized that it was dealing with the press, it quickly reached out to replace the drive and try to reproduce the issue.
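
To put the 64%-at-2-TB claim in perspective, here is a rough sketch of what a health reading would look like if wear simply tracked host writes against the drive's rated endurance. The 1,200 TBW rating used for the 2 TB model is an assumption (it is not stated in this article), and real firmware derives health from NAND wear rather than from this simple ratio.

```python
def expected_health_percent(tb_written: float, rated_tbw: float) -> float:
    """Health reading if wear scaled linearly with host writes against the TBW rating."""
    used = min(tb_written / rated_tbw, 1.0)
    return 100.0 * (1.0 - used)


# Assumption: 1,200 TBW rating for the 2 TB model; not given in the article.
RATED_TBW_2TB = 1200.0

expected = expected_health_percent(2.0, RATED_TBW_2TB)
reported = 64.0  # reading claimed by the Twitter user
print(f"expected about {expected:.1f}% health, reported {reported:.0f}%")

# Host writes that would normally be needed to reach the reported reading
implied_tb = (100.0 - reported) / 100.0 * RATED_TBW_2TB
print(f"a {reported:.0f}% reading would normally imply about {implied_tb:.0f} TB written")
```

Under that assumption, 2 TB of writes should cost a small fraction of a percentage point, while a 64% reading corresponds to hundreds of terabytes written.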
Source: Neowin

80 Comments on Samsung 990 PRO Flagship SSD Has an Endurance Problem, Users Notice Rapid Drive-Health Drops

#51
qoonik
Crackong: I avoided Samsung SSD after I had 3x 4TB 870EVO in my NAS failed in 2 weeks time.
So what do you recommend for NAS storage?
Posted on Reply
#52
Crackong
qoonik: So what do you recommend for NAS storage?
I prefer MX500

It is cheap, locally.
MX500 $275 for 4tb , $130 for 2tb
870EVO $330 for 4tb, $160 for 2tb
Consider buying 4 of those for RAID5 and you save money toward another drive, which could be used for RAID6 or as a hot spare

There is no meaningful performance difference vs 870EVO

Very reliable in my use case
Currently running 6x2tb + 1x4tb + 3x500gb MX500s in the house in NAS / EPYC VM server
None of them has failed so far.
Posted on Reply
#53
SkullFox
kaktus1907: And I just ordered 2TB 990 Pro today. I've been very happy with my 500GB 970 Pro MLC drive that I'd wish to get a new Samsung drive.

50TB so far in 3 years 94% health left.


Canceled my order, not gonna risk if it has same problem or not.

Edit: Ordered Kingston KC3000 instead.
what software is this? WHinfo?
Posted on Reply
#54
kaktus1907
SkullFox: what software is this? WHinfo?
HWiNFO64
Posted on Reply
#55
Godrilla
My 990 pro 2 terabyte still has zero percentage used or 100% life left according to Samsung Magician software even after 2 months of use. I am very interested in how this story unfolds.
Posted on Reply
#56
ThrashZone
Godrilla: My 990 pro 2 terabyte still has zero percentage used or 100% life left according to Samsung Magician software even after 2 months of use. I am very interested in how this story unfolds.
Hi,
Samsung Magician is biased software.
You can verify with the above-mentioned HWiNFO64, using sensors-only mode and agreeing to the warning popup message.
www.hwinfo.com/download/

You can also use crystal disk info
www.majorgeeks.com/files/details/crystaldiskinfo.html
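
For anyone who wants a scriptable cross-check instead of a GUI tool, below is a rough sketch that reads the NVMe health log fields from the JSON output of smartmontools. smartctl is not mentioned in this thread, so treat it as a swapped-in alternative, and the sample report is made up for illustration. On a real system the JSON would come from something like `smartctl -j -a /dev/nvme0`, where the device path is also an assumption.

```python
import json

# Hypothetical sample of the relevant part of a smartctl JSON report.
SAMPLE_REPORT = """
{
  "nvme_smart_health_information_log": {
    "percentage_used": 1,
    "available_spare": 100,
    "media_errors": 0,
    "data_units_written": 3906250
  }
}
"""

# NVMe reports data units in thousands of 512-byte blocks.
BYTES_PER_DATA_UNIT = 512_000


def summarize_health(report_json: str) -> dict:
    """Extract the wear-related fields from a smartctl JSON report."""
    log = json.loads(report_json)["nvme_smart_health_information_log"]
    return {
        # percentage_used may exceed 100 on worn drives, so clamp at zero
        "health_percent": max(0, 100 - log["percentage_used"]),
        "tb_written": log["data_units_written"] * BYTES_PER_DATA_UNIT / 1e12,
        "available_spare": log["available_spare"],
        "media_errors": log["media_errors"],
    }


summary = summarize_health(SAMPLE_REPORT)
print(summary)
```

Comparing `health_percent` against `tb_written` over a few days is what makes a drop like the one reported here stand out.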
Posted on Reply
#57
Psychoholic
Strange issue, clearly not all drives are affected.
Will be interesting to see what's causing it. I would think that if it's a reporting/calculation bug, it would be the same across all drives with the same firmware.

Here's my 2TB in HWINFO:
Posted on Reply
#58
ThrashZone
Hi,
99% life on a very old 256gb 850 pro os ssd
Guess these days are long gone.

Funny 990 has three temperatures now
Posted on Reply
#59
Tomorrow
ThrashZone: Hi,
99% life on a very old 256gb 850 pro os ssd
Guess these days are long gone.

Funny 990 has three temperatures now
Two actually: Controller and NAND. Controller temp is usually #2 and is the higher one as NAND does not get as hot unless stressed.
HWInfo64 shows 3 for some reason but Drive Temperature is simply the NAND temp taken from another sensor most likely.
So unless Samsung has placed 3rd sensor on DRAM i doubt it shows more than the Controller and NAND temps.

I still have one 128GB 850 PRO. The thing is a tank. The MLC NAND barely degraded. Before I retired it, it had only about 1% wear after 100TB+ of writes over multiple years, thousands of power-on hours and hundreds of power cycles.
Posted on Reply
#60
Godrilla
ThrashZone: Hi,
Samsung Magician is biased software.
You can verify with above mentioned hwinfo64 using sensors only and agreeing to the warning popup message.
www.hwinfo.com/download/

You can also use crystal disk info
www.majorgeeks.com/files/details/crystaldiskinfo.html
Both my 3-year-old 2 TB 970 EVO Plus and my 990 PRO say 100% life, so I must be a lightweight user compared to some.
photos.app.goo.gl/eCUa9RFDkTNwa3XJA

Normally every drive model has a small batch of bad units. E.g. go to any drive review and look at the often insignificant (although unfortunate) number of bad reviews. How many people have come forward with 990 PRO life degradation so far? I've heard of 2 so far, maybe more. And what's the alternative? PCIe 5.0 NVMe SSDs that have no user reviews and cost 2 to 3 times as much for real-world performance that is probably similar. If anyone does experience performance loss, please post here. I don't want people to go on a testing frenzy, though testing shouldn't cause loss of life that rapid either.

Update: I am running them in Standard Mode; not sure if Full Performance mode is also a culprit?
Posted on Reply
#61
b1k3rdude
Crackong: I avoided Samsung SSD after I had 3x 4TB 870EVO in my NAS failed in 2 weeks time.
The 870 series was garbage from the get go. The last good SSD Samsung released was the 970 series.

I have a 960 PRO that I've had for several years now, and its life is still at 98% (13k on-hours, total reads/writes 86/58 TB), and a used Crucial P1 that is also a few years old, but newer than the Samsung, is sitting at 99%.
Posted on Reply
#62
AsRock
Would not be the 1st time SMART has got it all wrong. Who knows, SMART might be killing other drives because of PC errors.

Funny that there is no way to unlock a drive, at your own risk, once it goes read-only.
Posted on Reply
#63
Bigshrimp
I will never ever buy Samsung products again, as I realized that they do not honor their warranty for their products. I had an in-warranty earbuds purchased directly from Amazon that was listed as a US model and when one of the two earbuds failed, I sent it in for warranty repair. Samsung sent it back unrepaired without a reason or explanation. I had to contact them 3 times to get an answer for why the warranty wasn't honored and they said it was a Caribbean model and not for the US market. I contacted Amazon and they assured me it was a US model and they replaced the defective earbuds for me instead. After that, I realized how shady Samsung was and vowed never to buy any of their products again. Just a shameful company and terrible service all around. They obviously don't give a damn about their customers in the slightest.
Posted on Reply
#64
joesiv
SSDs have much more complicated firmware than HDDs, so there is much more to go wrong. With more layers, and more features to assist in caching and wear leveling, things can definitely go wrong.

At the company I work for, we did extensive testing on SSDs years ago, after one brand failed us due to firmware bugs. This stuff is not hard to detect with a reasonable testing setup if you're looking, especially if you have cooperation from the manufacturer, which reviewers typically have. It was a pet peeve of mine when I used to read SSD reviews that all the reviewer would say about endurance was the marketing terabytes written. I used to comment on reviews, asking them to actually test life expectancy, or at least LOOK at the SMART data after all their testing and get a gut feeling for whether it is OK or not. That would have caught this issue, and it could have been presented to Samsung to look at and hopefully fix before release, or at least have a firmware ready to go soon after release. Or reviewers could have given a warning to buyers.

Something to understand about life expectancy and "terabytes written": the spec that they advertise, and that gets repeated by most people and reviewers, is NAND writes. It is based on the NAND used in the product, which is well tested: take how many writes the NAND can handle on average, mix in over-provisioning, and you have your terabytes written.

The problem is that people (and probably some reviewers) expect that to be "OS bytes written". When you copy 1 TB to your disk, that's an OS write. Unfortunately, unless you look at the SMART data (you may need the manufacturer's help to understand how to decode it, or to get the units they are storing it in) and look at the NAND writes, you really have no idea how much you've written to the disk.

Your "1 TB" written may actually be much greater than 1 TB of NAND writes. Actually, now that I think about it, on DRAM-less cached models it may even be greater than 2 TB, if they are copying the data from some SLC-mode NAND over to standard multi-layer/3D NAND.

The difference between OS writes and NAND writes is called "write amplification". It's a known metric for SSD manufacturers, and different grades of SSD intended for different uses (consumer vs. enterprise, for example) will accept different write amplifications as within spec. That's why it's a faux pas to use a consumer-grade drive in a NAS. Perhaps DRAM-less SSDs have higher write amplification due to caching, I don't know. But I do know that features like wear leveling and maintenance tasks will increase the write amplification, as the SSD maintains the data on the drive and pushes it around. This is all normal and expected, but if something is wrong with the firmware, write amplification can easily get out of control.

We should try to get a full dump of the SMART data and see if Samsung publishes its SMART data documentation. We'll probably find that NAND writes are actually right in line with the life expectancy that's being reported. If the workload is typical, then this is likely a firmware bug.
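
To make the write amplification point concrete, here is a rough sketch of the arithmetic. All numbers are hypothetical, and it follows the framing above that the rated figure is a budget of NAND writes; it is not taken from any Samsung documentation.

```python
def write_amplification(nand_tb_written: float, host_tb_written: float) -> float:
    """Write amplification factor: NAND writes divided by OS (host) writes."""
    if host_tb_written <= 0:
        raise ValueError("host writes must be positive")
    return nand_tb_written / host_tb_written


def host_writes_until_worn(nand_budget_tb: float, waf: float) -> float:
    """OS writes the drive can absorb before the NAND write budget is used up."""
    return nand_budget_tb / waf


# Hypothetical readings: 1 TB copied by the OS ended up as 2.5 TB of NAND writes.
waf = write_amplification(nand_tb_written=2.5, host_tb_written=1.0)
print(f"write amplification factor: {waf}")

# Hypothetical 600 TB NAND write budget under a normal and a runaway factor.
for factor in (waf, 30.0):
    usable = host_writes_until_worn(600.0, factor)
    print(f"WAF {factor}: about {usable:.0f} TB of OS writes before wear-out")
```

With the same NAND budget, a firmware bug that pushes the factor from 2.5 to 30 cuts the usable OS writes from 240 TB to 20 TB, which is the kind of gap a full SMART dump should reveal.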
Posted on Reply
#65
efikkan
No one should be surprised by this news. I've been saying for years that the endurance ratings of TLC/QLC/... are just marketing BS. I'm at least glad I secured myself one of the last 970 Pros for my next build. (I have 3 in total)

I've also noticed some SSDs tend to wear out very quickly (not this model in particular), without much data written at all. My theory is that lots of small writes will cause a lot of wear, at least in certain patterns. But I haven't tested this out in a controlled environment.

I would actually like it if TPU wrote an editorial asking the market to create a proper pro SSD again, like an SLC SSD in 256 GB, 512 GB, 1 TB, 2 TB and 4 TB variants. In a market oversaturated with crappy white-label SSDs, isn't there room for at least one proper enthusiast/prosumer SSD? I would buy several.
Posted on Reply
#66
R-T-B
GabrielLP14: Am I reading this correctly? People freaking out because of 6%
In a few days? You bet your ass it's freakout-worthy.
Posted on Reply
#67
Godrilla
Update: my 2 TB 990 PRO has lost 1% since a few days ago (99% life left), with almost 10x less written than my 2 TB 970 EVO Plus, which is still reading 100% life. Both on the new Magician software update, version 10.0.22621, and CrystalDiskInfo version 8.17.14. I hardly used my PC since the last reading, but I did update my Magician software. FYI.

Not sure if this helps, but Tom's article mentions that the 980 PRO's early deaths might be fixed with new firmware: www.tomshardware.com/news/samsung-980-pro-ssd-failures-firmware-update
Posted on Reply
#68
chrcoluk
Samsung is not doing great with this and the 870 EVO issues, but my question is: how come this got a story but not the 870 EVO problem?
Update 2 - 2023.01.23
Samsung's RMA division, Hanaro, have reached out and offered to A) Replace this SSD, and B) Try to replicate the problem. Quite why both of these options were not on the table before the issue became public is a mystery.
AsRock: I was watching a video about a user having this issue, 93% after no more than 10TB of writes (owned for a month), and apparently Samsung told them it's a non-issue.

We'll see.
Did they hire the MX500 firmware team?
Tomorrow: Two actually: Controller and NAND. Controller temp is usually #2 and is the higher one as NAND does not get as hot unless stressed.
HWInfo64 shows 3 for some reason but Drive Temperature is simply the NAND temp taken from another sensor most likely.
So unless Samsung has placed 3rd sensor on DRAM i doubt it shows more than the Controller and NAND temps.

I still have one 128GB 850 PRO. The thing is a tank. MLC NAND barely degraded. Before i retired it it has only like 1% wear after 100TB+ writes over multiple years, thousands of hours of power on and hundreds of power cycles.
My 512 GB 850 PRO did years in my main PC as the OS drive, followed by PS4 duty (which auto-records game footage as you play), and is now being used as a ZFS host drive in Proxmox. It's the best drive I own, I reckon.
Bigshrimp: I will never ever buy Samsung products again, as I realized that they do not honor their warranty for their products. I had an in-warranty earbuds purchased directly from Amazon that was listed as a US model and when one of the two earbuds failed, I sent it in for warranty repair. Samsung sent it back unrepaired without a reason or explanation. I had to contact them 3 times to get an answer for why the warranty wasn't honored and they said it was a Caribbean model and not for the US market. I contacted Amazon and they assured me it was a US model and they replaced the defective earbuds for me instead. After that, I realized how shady Samsung was and vowed never to buy any of their products again. Just a shameful company and terrible service all around. They obviously don't give a damn about their customers in the slightest.
I think it's a combination of chasing good reviews (making it bench well) and being greedy on margins, e.g. ditching MLC for TLC on pro drives.

The 870 EVO on paper is no better than the 860 EVO, so one wonders why the 870 was even released; the only explanation is that they did some cost cutting somewhere.
Posted on Reply
#69
Tomorrow
Godrilla: Not sure if this helps, but Tom's article mentions that the 980 PRO's early deaths might be fixed with new firmware: www.tomshardware.com/news/samsung-980-pro-ssd-failures-firmware-update
This is the exact same problem that was affecting my PM9A1s, which Tom's quoted in the article from Reddit (that's me they quoted).
My post was made before I discovered the new firmware and found a way to update.
Media errors kept increasing and available spare kept dropping. Updating the firmware did stop the degradation.

At least for now, these values have not changed anymore. It was also way more difficult because OEM versions of Samsung SSDs like my PM9A1 can't be updated through Magician and have to be updated via a command-line utility in a roundabout way.
Even finding this utility took me over a year, as it was not available when I first checked in 2021.
Posted on Reply
#70
RJARRRPCGP
john_: I have a TeamGroup 1TB GX2 that seems to be having problems with TRIM. Speed drops to 5MB/sec in some parts of the drive. Contacted TeamGroup, sent them a number of benchmarks, got some replies from them, including one saying that....

In the end, also because of a busy period for me with my job and stuff, I just went and bought a Kioxia to replace it.

Some companies aren't really helpful when someone finds a problem with their products. And this case proves that even Samsung, which asks the consumer to pay a premium, isn't any better.
Looks like a faulty SSD.
Posted on Reply
#71
Godrilla
Update: another Magician software update, another 1% loss of life, so now my 990 PRO has 98% life. The only thing I did was install Hogwarts and get to 20% game progress. Stay away from Samsung drives until they realize that their lax approach to this will not be tolerated. They are relying on mind share too much, but this arrogance will bite them in the arse!
Posted on Reply
#72
RJARRRPCGP
The 850 Evo has firmware updates, too. IIRC, Magician alerted me about a firmware update.
Posted on Reply
#73
ThrashZone
Godrilla: Update: another Magician software update, another 1% loss of life, so now my 990 PRO has 98% life. The only thing I did was install Hogwarts and get to 20% game progress. Stay away from Samsung drives until they realize that their lax approach to this will not be tolerated. They are relying on mind share too much, but this arrogance will bite them in the arse!
Hi,
They need to clean house in their firmware department or actually test the crapware better before release.
I think they adopted Microsoft-style testing, where the testers are end users, not paid MS personnel.
Posted on Reply
#75
Godrilla
Tomorrow: www.tomshardware.com/news/samsung-990-pro-firmware-update-released-ssd-health
hardware/comments/110x049/_/j8by0s8
semiconductor.samsung.com/consumer-storage/support/tools/

Unfortunately this only stops the degradation. It does not reverse it.
Allegedly, at this point. Remember, their marketing said these are pro-level drives with high durability in the first place, so the efficacy of the update is yet to be determined!
Posted on Reply