# Discussion about BackBlaze as a source for drive reliability data



## lexluthermiester (Jul 24, 2021)

A discussion was going on that was off-topic for the thread it was in and needed a new home. And away we go...



R-T-B said:


> Backblaze also doesn't fare so well if you take a close look at their numbers (or a different year with different conclusions, for that matter). There is a lot wrong with using them for any kind of generalized conclusion.


I've never seen any major issues with their data.



R-T-B said:


> I exclude them from my stats with good rationale.


Because of course you would. But Ok, got any *other* data that is publicly available that shows usage and failure rates of such a large number of drives we can cite? BackBlaze, to my knowledge, is the only source for data like this where sample size is very large and data disclosure is frequent.


----------



## newtekie1 (Jul 24, 2021)

This discussion has been hashed out several times now, basically every time someone brings up Backblaze's data.

The fact is their data is a marketing tool for them, but it has no usefulness for average consumers of hard drives that run them in a single drive configuration.  And their data isn't even really that accurate, as they have said they leave out models that have 100% failure rates in their environment.

Hard drives in general have low failure rates, but they fail. Never trust a hard drive(or any form of data storage), always have a backup. /Thread


----------



## R-T-B (Jul 24, 2021)

lexluthermiester said:


> I've never seen any major issues with their data.


Really?  The fact that they deploy more Seagate than any other brand does not bother you from a sample size perspective?

The fact that they use client drives in environments unsuitable for said drives does not bother you?

That's just scratching the surface.



lexluthermiester said:


> Because of course you would. But Ok, got any *other* data that is publicly available that shows usage and failure rates of such a large number of drives we can cite?


PugetSystems deals with a lot of drives.  I'll dig it up, a moment.









What is the most reliable hardware in our Puget Systems workstations? (www.pugetsystems.com)
				




Mind you, their data seems to be opaque, and it only shows WD, which was the winner in 2019 stats for them.  It's thus also flawed.  Still, it does establish a bit of a trend.

But that wasn't really my initial point in the thread that started this, which was that bad data is not a valid basis on which to make ANY kind of generalized trend or accusation claim.  Just because it's the only data does not make it good.  Anecdotes are also not good.  They can maybe help you guess but you can't make any hard claims on them.


----------



## lexluthermiester (Jul 25, 2021)

R-T-B said:


> Really? The fact that they deploy more Seagate than any other brand does not bother you from a sample size perspective?


No. They show the failure rates based on drive model, not by manufacturer.


R-T-B said:


> The fact that they use client drives in environments unsuitable for said drives does not bother you?


No. Who defines "environmental suitability"? You? In drive racks, the drives are kept cool by way of very powerful and very LOUD fans. Cooling is not an issue, nor is power delivery. So we can rule out those two problems as sources for inducing potential failure. So the statement of "environments unsuitable for said drives" is completely opinion and that's all it is. The merit of use-case suitability is not a factor in the numbers subject to question.


R-T-B said:


> That's just scratching the surface.


Oh please, do continue.


R-T-B said:


> which was that bad data is not a valid basis on which to make ANY kind of generalized trend or accusation claim


But calling it bad data is a matter of opinion. So far that's all that is being offered, opinion. BackBlaze offers data and numbers in good faith to show the performance of the drives they use. The data appears to be factual and as such you have little to no basis to claim the data has no merit or is not valid. Unless you are going to call them liars. Is that what you are doing?


----------



## R-T-B (Jul 25, 2021)

lexluthermiester said:


> They show the failure rates based on drive model, not by manufacturer.


And the quantities listed in their samples?  Certain drives have dozens, others thousands.  This does not bother you?

I suggest you really take a hard look at the tables provided.
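As a rough illustration of why dozens versus thousands of drives matters, here is a minimal sketch using a Wilson score interval. The counts are hypothetical and are not Backblaze figures; `wilson_interval` is a helper written for this example only.

```python
import math

def wilson_interval(failures, drives, z=1.96):
    """Wilson score interval for a failure proportion (z=1.96 is roughly 95%)."""
    if drives <= 0:
        raise ValueError("drives must be positive")
    p = failures / drives
    denom = 1 + z**2 / drives
    centre = (p + z**2 / (2 * drives)) / denom
    half = z * math.sqrt(p * (1 - p) / drives + z**2 / (4 * drives**2)) / denom
    return centre - half, centre + half

# Hypothetical counts (not Backblaze figures): both models show 2% failures.
small_model = wilson_interval(1, 50)
large_model = wilson_interval(400, 20000)

for name, (low, high) in [("50 drives", small_model), ("20000 drives", large_model)]:
    print(f"{name}: {low:.2%} to {high:.2%}")
```

Both models have the same observed rate, but the interval for the small sample is many times wider, so its percentage says much less about the model.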



lexluthermiester said:


> In drive racks, the drives are kept cool by way of very powerful and very LOUD fans.


At Backblaze?  No, not really.  They use consumer drive enclosures IIRC.  Everything about them is designed to minimize costs (hence not enterprise drives).  

Not that it matters, it's far more than temps that differ in this usage case.



lexluthermiester said:


> Unless you are going to call them liars. Is that what you are doing?


I'm not appreciating you putting words in my mouth.  It's their own data claims of methodology that bother me, so I'm obviously trusting their data.


----------



## hat (Jul 25, 2021)

newtekie1 said:


> This discussion has been hashed out several times now, basically every time someone brings up Backblaze's data.
> 
> The fact is their data is a marketing tool for them, but it has no usefulness for average consumers of hard drives that run them in a single drive configuration.  And their data isn't even really that accurate, as they have said they leave out models that have 100% failure rates in their environment.
> 
> Hard drives in general have low failure rates, but they fail. Never trust a hard drive(or any form of data storage), always have a backup. /Thread


You win an internet.

Got data that you don't want to have disappear overnight? Back it up. Even the most reliable things can and will break down on somebody... if you don't want it to be you, have a plan.


----------



## lexluthermiester (Jul 25, 2021)

R-T-B said:


> This does not bother you?


No. Why should it? The individual drive percentages are properly represented within the context of their own performance. How do you fail to understand that given the data shared?


R-T-B said:


> I suggest you really take a hard look at the tables provided.


Thanks for the tip, but I'm not the one missing the finer details that have been clearly displayed for all to see.


R-T-B said:


> At Backblaze? No, not really. They use consumer drive enclosures IIRC.


What? You can't be serious with that nonsense... Are you, R-T-B, flame baiting?


R-T-B said:


> I'm not appreciating you putting words in my mouth.


I didn't. I posed a question. Learn how to properly context.


R-T-B said:


> It's their own data claims of methodology that bother me, so I'm obviously *** trusting their data.


Did we forget a word where I left the asterisk?


----------



## erocker (Jul 25, 2021)

Been using them at work about 3 years now. They've been great as far as I know. Boss is happy with them.


----------



## lexluthermiester (Jul 25, 2021)

erocker said:


> Been using them at work about 3 years now. They've been great as far as I know. Boss is happy with them.


What are you referring to?


----------



## R-T-B (Jul 25, 2021)

lexluthermiester said:


> No. Why should it?


Because they don't even include drives with a high or near 100% failure rate by their own admission?

That's the key piece that drives it all to uselessness.
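To show what leaving out a badly failing model could do to a combined number, here is a minimal sketch with made-up counts. The fleet, the model names, and the 50% cutoff are all assumptions for illustration, not anything Backblaze has published.

```python
# Hypothetical fleet: (model, drives deployed, drives failed).
fleet = [
    ("model_a", 10000, 150),
    ("model_b", 5000, 100),
    ("model_c", 200, 190),  # a model where nearly every drive failed
]

def pooled_failure_rate(rows):
    """Failures divided by drives across every model in rows."""
    drives = sum(n for _, n, _ in rows)
    failed = sum(f for _, _, f in rows)
    return failed / drives

def drop_models_above(rows, max_rate):
    """Keep only models whose own failure rate is at or below max_rate."""
    return [(m, n, f) for m, n, f in rows if f / n <= max_rate]

all_models = pooled_failure_rate(fleet)
reported = pooled_failure_rate(drop_models_above(fleet, 0.5))

print(f"all models: {all_models:.2%}, after exclusion: {reported:.2%}")
```

In this sketch the per-model rates of the remaining models do not change, but any pooled or per-brand figure built from them drops once the failing model is left out.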

Also, a couple tens of thousands of drives is always way more statistically useful than a mere thousand or so, and backblaze spans all over the place in that regard, having a much larger sample of Seagate than anyone else for example.  It's hard to take it for a generalized trend for that reason.



lexluthermiester said:


> Did we forget a word where I left the asterisk?


Not sure what you expect there.  I said what I meant.  I can trust the data and still claim the data is useless for the case it was being used in.  It's data, but it's useless for consumer drive trends.


lexluthermiester said:


> What? You can't be serious with that nonsense... Are you, R-T-B, flame baiting?


No, I am dead serious about what I have heard about their operation.  We're talking full-on drives stacked in shipping containers and the like.  Have not verified it, I will leave that to the posters who are sure to find this to validate their claims.  I won't go so far as to claim that bit is validated, just passing rumor I have heard.  Sorry for misrepresenting it if it seemed as fact.



lexluthermiester said:


> What are you referring to?


Probably their backup service.


----------



## lexluthermiester (Jul 25, 2021)

R-T-B said:


> Because they don't even include drives with a high or near 100% failure rate by their own admission?


Ok, I'll agree, they should disclose that. But what if the reason they do not is because of some unforeseen problem that did not reflect actual drive reliability? Or maybe the sample size was so small that including it would be insignificant? There could be a great number of reasons why that data was withheld. We don't know and we cannot hold the disclosed portion of data invalid because of the absence of particular data. If they have determined a viable reason to not include certain statistics, that is their choice. We cannot hold them adversely responsible for that.


R-T-B said:


> I can trust the data and still claim the data is useless for the case it was being used in. It's data, but it's useless for consumer drive trends.


How can you "trust" the data and at the same time conclude it's useless? I really want to understand your school of thought here. To me, it seems like one of us is missing something. I'm willing to accept that it might be me, but I want to understand your logic.


R-T-B said:


> just passing rumor I have heard. Sorry for misrepresenting it if it seemed as fact.


Fair enough. That one is a bit hard to believe.


----------



## R-T-B (Jul 25, 2021)

lexluthermiester said:


> How can you "trust" the data and at the same time conclude it's useless?


I only said it was useless for making sweeping generalizations about brands (if you follow the comment chain all the way back someone was using it to claim Seagate was "garbage").  I am sure the data is valid for some things, but not for generalized brand recommendations in a home use environment, IMO.  Of course, I'd argue you should never blindly buy brands anyways.

Of course a lot of this comes down to how strict you want to be, statistically speaking.  I will admit there is some data there that is indicative of WD being a quality brand, but to use it to call them "the best" is the point I take contention with.  The numbers are close even if we accept them for that use anyways, almost within the margin of error in some cases.  Of course select models may be avoided based on that data, but that says little about "x brand is good/bad"
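One way to check whether two failure rates are "within the margin of error" is a two-proportion z-test. The sketch below is only an illustration with hypothetical counts; `two_proportion_z` is a helper written for this example, and 1.96 is the usual cutoff for roughly 95% confidence.

```python
import math

def two_proportion_z(fail_a, n_a, fail_b, n_b):
    """z statistic for the difference between two observed failure proportions."""
    p_a, p_b = fail_a / n_a, fail_b / n_b
    pooled = (fail_a + fail_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / std_err

# Hypothetical counts: 1.5% vs 1.0% failures at two fleet sizes.
z_small = two_proportion_z(30, 2000, 20, 2000)
z_large = two_proportion_z(300, 20000, 200, 20000)

print(f"2,000 drives each:  z = {z_small:.2f}")
print(f"20,000 drives each: z = {z_large:.2f}")
```

With these made-up numbers the same half-point gap is not distinguishable from noise at 2,000 drives per group, but it is at 20,000.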


----------



## erocker (Jul 25, 2021)

lexluthermiester said:


> What are you referring to?


The topic of discussion.


----------



## lexluthermiester (Jul 25, 2021)

R-T-B said:


> I only said it was useless for making sweeping generalizations about brands (if you follow the comment chain all the way back someone was using it to claim Seagate was "garbage").


Agreed. I even stated something along those lines in the other thread.


R-T-B said:


> I am sure the data is valid for some things, but not for generalized brand recommendations in a home use environment, IMO.


I'm of the school of thought that while a user is unlikely to have a drive failure due to drives being very reliable generally, it's important to know who is or is not the most reliable brand. While the differences are measured in fractions of a percent in many cases, when we're talking about tens of millions of drives sold per year that number adds up. People don't want to be in that extra few tenths of a percent of failed drives if they can help it. In this context, the data provided by BackBlaze is very valuable as it can give indications of which brand is the most durable currently and in the recent past, which are important trends to understand.
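The "adds up" arithmetic can be sketched like this. The unit count and both failure rates are hypothetical inputs picked for illustration, not market or Backblaze figures.

```python
def extra_failures(units_sold, rate_a, rate_b):
    """Expected additional failures if every unit had rate_a instead of rate_b."""
    return round(units_sold * (rate_a - rate_b))

# Hypothetical inputs: 10 million drives, 1.5% vs 1.2% annual failure rates.
print(extra_failures(10_000_000, 0.015, 0.012))
```

A gap of three tenths of a percent is small for any one buyer, but across that many drives it is a large absolute number of failures.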

Additionally, knowing a drive can withstand tolerances of constant use in a server environment is an excellent indicator of how the drive will perform in the home or office where workloads are reduced.


R-T-B said:


> Of course a lot of this comes down to how strict you want to be, statistically speaking.


True. The data provided act as great, accurate indicators, not hard fact.


----------



## maxfly (Jul 25, 2021)

Hard drives? Are we talkin bout hard drives? Who still uses hard drives? I cant believe we're talkin bout hard drives.


----------



## hat (Jul 25, 2021)

maxfly said:


> Hard drives? Are we talkin bout hard drives? Who still uses hard drives? I cant believe we're talkin bout hard drives.


You go right ahead and build a bulk storage server with SSDs, I'll do it with HDDs and keep my money...


----------



## maxfly (Jul 25, 2021)

hat said:


> You go right ahead and build a bulk storage server with SSDs, I'll do it with HDDs and keep my money...



Nooooo!
Practice!? Are we talkin bout Practice?! 
Cmon man...


----------



## R-T-B (Jul 26, 2021)

maxfly said:


> Nooooo!
> Practice!? Are we talkin bout Practice?!
> Cmon man...


We're talking about the realities of being not incredibly wealthy and wanting a lot of drive space.


----------



## lexluthermiester (Jul 26, 2021)

maxfly said:


> Hard drives? Are we talkin bout hard drives? Who still uses hard drives? I cant believe we're talkin bout hard drives.





maxfly said:


> Nooooo!
> Practice!? Are we talkin bout Practice?!
> Cmon man...


You seem to be stuck firmly in your own world of opinion. MOST people who need mass storage use a combination of SSD for boot/OS drives and mechanical HDDs for bulk/mass storage. If you want to spend $1170 for an 8TB SSD, good for you, have fun. The rest of us will spend $180 on a 2TB SSD for our OS and $180 on an 8TB HDD and spend the difference on other hardware, perhaps a bigger HDD or simply leave it in the bank. It's called being smart and wise.
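For the cost comparison, a minimal sketch using only the prices quoted in this post (a 2021 snapshot, so treat them as assumptions rather than current pricing):

```python
# Prices as quoted in the post above (2021 snapshot); capacities in TB.
options = {
    "8TB SSD": (1170, 8),
    "2TB SSD": (180, 2),
    "8TB HDD": (180, 8),
}

cost_per_tb = {name: price / tb for name, (price, tb) in options.items()}

# Mixed setup: SSD for the OS, HDD for bulk storage.
mixed_cost = options["2TB SSD"][0] + options["8TB HDD"][0]
saved = options["8TB SSD"][0] - mixed_cost

for name, value in cost_per_tb.items():
    print(f"{name}: ${value:.2f} per TB")
print(f"Mixed setup: ${mixed_cost}, ${saved} less than the single 8TB SSD")
```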



R-T-B said:


> We're talking about the realities of being not incredibly wealthy and wanting a lot of drive space.


Exactly

Now we return everyone to the regularly scheduled discussion topic..


----------



## newtekie1 (Jul 26, 2021)

R-T-B said:


> We're talking about the realities of being not incredibly wealthy and wanting a lot of drive space.


Seriously, if I converted all the storage I have in my file server over to SSDs it would cost about $7,000 using the cheapest $670 8TB drives. No thank you.


----------



## Shrek (Jul 26, 2021)

lexluthermiester said:


> The rest of us will spend $180 for a 2TB SSD for our OS



I hate having to stay on topic when I'm wondering if the OS needs 2TB and I don't want to open up a new thread; I'd imagine even 128GB might be enough for the OS, so long as one has another drive for files.

If this is off topic, just leave it unanswered.



lexluthermiester said:


> If you want to spend $1170 for an 8TB SSD



$700
Amazon.com: SAMSUNG 870 QVO SATA III 2.5" SSD 8TB (MZ-77Q8T0B): Electronics

but the point remains valid


----------



## maxfly (Jul 26, 2021)

lexluthermiester said:


> Now we return everyone to the regularly scheduled discussion topic..



You must mean the never ending, you two generally bickering incessantly, about everything under the sun but just happens to be loosely interpreted back blaze data this time around- regularly scheduled topic?  
If even for only a moment...


----------



## Hachi_Roku256563 (Jul 26, 2021)

maxfly said:


> Hard drives? Are we talkin bout hard drives? Who still uses hard drives? I cant believe we're talkin bout hard drives.


5.7tbs of SSD storage is quite a bit more expensive then 5.7tbs of SSD storage
im still of the thought process get ssd for your boot drive and everything else is a harddisk


----------



## lexluthermiester (Jul 26, 2021)

maxfly said:


> You must mean the never ending, you two generally bickering incessantly, about everything under the sun but just happens to be loosely interpreted back blaze data this time around- regularly scheduled topic?
> If even for only a moment...


Aww, that was adorable, you losing the argument you started and instead of bowing out with grace you throw out a personal attack. Nice show of maturity. Anything else?



Andy Shiekh said:


> $700
> Amazon.com: SAMSUNG 870 QVO SATA III 2.5" SSD 8TB (MZ-77Q8T0B): Electronics
> 
> but the point remains valid


Granted, that is an 8TB SSD offering, but I was talking about TLC NAND based drives. QLC is out of the question. I consider it to be garbage regardless of who makes it.



Isaac` said:


> 5.7tbs of SSD storage is quite a bit more expensive then 5.7tbs of *HDD* storage


I'm sure you didn't mean SSD in both parts of that statement, right?


----------



## Hachi_Roku256563 (Jul 26, 2021)

lexluthermiester said:


> I'm sure you didn't mean SSD in both parts of that statement, right?


yeah thats right
i can get a 4th amazing wd purple for 150 aud (and it plays all my games fine)
i would pay at least double for ssd


----------



## R-T-B (Jul 26, 2021)

maxfly said:


> You must mean the never ending, you two generally bickering incessantly


We actually get along pretty well, but thanks for the vote of confidence.


----------



## Steevo (Jul 26, 2021)

Context. Simply put, their data is good if you are going to buy or have purchased one of the drive models listed. If not, the data is useless unless you want to buy a specific drive model they have listed. That being said, if their sample (more drives than you plan on buying) is not qualifier enough, then buyer beware.

Also, YMMV as your case may be cooler than theirs, your drive may have an unknown flaw that will lead to its demise, it may have been damaged in shipping, lightning may strike, an asteroid may strike it, all very valid things with a statistical chance.

Whining about free data is being a whiner, don’t be a whiner.


----------



## lexluthermiester (Jul 26, 2021)

This comment belongs in this thread to keep from derailing the other thread;


Jism said:


> Talk with any regular or large HDD recovery service and they will agree that seagate is among the top dying disks. The quality of consumer disks for some reason is far worse than the rest.


But can they provide tangible data to support that statement? That's what BackBlaze is doing. Granted, they are not testing/using every drive model on the market (it would be awesome if a company did so), but what they do use, they declare. If companies that had the access to such data compiled and declared such data, we the consumers would have a great deal of info to make a decision upon.


Jism said:


> Disks that can run 5 years without issues for a start


To be fair, I have a Seagate drive made in 2006 that still works perfectly. It's in an external USB3 enclosure and is used as a general data transfer drive, but still. That is 15 years of solid use. No bad sectors, no SMART warnings.


----------



## R-T-B (Jul 26, 2021)

Steevo said:


> Whining about free data is being a whiner, don’t be a whiner.


To be clear, I love free data and have no issue with that part.  What I take issue with is people misusing said free data.

Hope that clarifies.



lexluthermiester said:


> But can they provide tangible data to support that statement? That's what BackBlaze is doing.


tbf, even the backblaze data is not showing a massive trend one way or the other.  It leans one way MAYBE, but only by a few percentage points in most instances.


----------



## 95Viper (Jul 26, 2021)

*Stay on topic!*
The topic is "Discussion about BackBlaze as a source for drive reliability data"; and, not each other's posting relationships.


----------

