
Question: test HDD files

Isn't this just fragmentation?


Seriously?
yes

Does 1 year without use cause loss of magnetization, mechanical problems, or oxidation?

Do HDDs also use flash memory to retain firmware and BIOS? Is this a problem since flash memory is not recommended for archiving?

Is it necessary to test annually or every two months to check for any type of corruption?

Is a 2.5" HDD more likely to die when stored or in constant operation?
 
Not for one set of archive zips that are written in one go to an empty drive...
I can't say for sure, but I'd expect ECC action to be transparent, except when it needs to retry reading sectors.
If you see a slowdown in the future, it would be interesting and useful to check the fragmentation there, and compare pre and post the SMART stats for reallocated or pending sectors.
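If you want to script that SMART snapshot, here's a rough sketch with smartmontools - assuming smartctl is installed and the drive shows up as /dev/sda, which you'd adjust for your own setup:

```python
#!/usr/bin/env python3
"""Rough sketch: dump the SMART counters worth watching before/after a long
read pass (reallocated + pending sectors). Assumes smartmontools is installed
and the drive is visible as /dev/sda - adjust the device path for your setup."""

import subprocess

DEVICE = "/dev/sda"                      # hypothetical device path
WATCHED = {"Reallocated_Sector_Ct", "Current_Pending_Sector",
           "Offline_Uncorrectable", "Reported_Uncorrect"}

def smart_snapshot(device: str) -> dict[str, str]:
    """Return {attribute_name: raw_value} for the attributes we care about."""
    out = subprocess.run(["smartctl", "-A", device],
                         capture_output=True, text=True, check=False).stdout
    snapshot = {}
    for line in out.splitlines():
        parts = line.split()
        # ATA attribute rows look like: ID# NAME FLAG VALUE WORST THRESH ... RAW_VALUE
        if len(parts) >= 10 and parts[1] in WATCHED:
            snapshot[parts[1]] = parts[-1]   # raw value is the last column
    return snapshot

if __name__ == "__main__":
    before = smart_snapshot(DEVICE)
    print("Save this, run the full read test, then compare:", before)
```

Run it once before the read test and once after; if Reallocated_Sector_Ct or Current_Pending_Sector moved, the drive is quietly remapping.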

This was the drive needing to take a bit longer to read, and the ECC engine deciding what the proper data output is.
...long-term storage tests on new SSD drives after say 6-12 months
But flash is a different beast than HDDs.
I don't recall seeing anyone doing retention tests. More difficult to test, and requires long term commitment. But that would be interesting indeed.
 
I think it goes together: more reliable = more cycles. Presumably NOR, and maybe SLC rather than MLC.
For example, MX25L6433F might be a common BIOS flash chip (I'm unsure, but I think it's the ballpark): 20 year retention, 100K cycles.

Good to see it's that good - I never realised they had hit such high numbers since the early days. Probably necessary considering the amount of attribute data they need to process / handle and store.
 
@Vincero
I don't think there's much writing going on. I could be wrong, but I assume once per boot, or at worst just every now and then throughout the day.
AFAIK, unlike NAND, NOR can be written to a byte at a time, so there's less "write amplification".
While erasure still happens on a block level, I assume it would be handled intelligently - such as using up all available space and only then erasing.
 
I can't say for sure, but I'd expect ECC action to be transparent, except when it needs to retry reading sectors.
If you see a slowdown in the future, it would be interesting and useful to check the fragmentation there, and compare pre and post the SMART stats for reallocated or pending sectors.


Flash is a different beast.
I don't recall seeing anyone doing retention tests. More difficult to test, and requires long term commitment. But that would be interesting indeed.

That was from a flash drive - with no fragmentation (checked with contig command just to make sure). TLC based.

And yeah, for sure the ECC actions are transparent, but if a retry is needed there is an instant performance hit (obviously more so for a spinning HDD, where the drive electronics may need to wait for the data to come back around again - people forget that modern drives *may* still use interleaving, vs it being set/controlled from the BIOS).

* Corrected with respect to @nageme's comments
 
Right. I thought we're talking about HDDs, which is the thread's topic.
And people were saying stuff like the following, which doesn't seem likely:
the surface of unused HDDs can get demagnetised, leading to data corruption. It's probably enough to power them on every couple of months

SSDs are a different discussion, and I wouldn't use them for archival storage.

people forget that ALL modern drives now use interleaving
As in 1980s floppy style? I don't think so.
Data is stored linearly (except remapped sectors, if areas got physically corrupt over time).
 
Right. I thought we're talking about HDDs, which is the thread's topic.
And people were saying:


SSDs are a different discussion, and I wouldn't use them for archival storage.
That's fair enough - but I wouldn't treat the methodology any differently be it SSD or HDD.

The causes of problems are different but the things to look for are the same, e.g. bit-rot can happen, and performance dips are not a good thing and are a sign of trouble - either way, SMART attributes should also be checked.
I do have an archive 8TB HDD but I literally just updated that so a) am not expecting to see any difference, b) it's slower to go through because HDD, and c) more hassle to pull and deal with.
Just so happen to have an SSD that's been in cold storage for >6 months and to a certain extent it highlights a not ideal scenario.

The issue I've shown can happen (and has happened before for me) with HDDs.... one reason I was not too sad when Samsung exited the HDD business / sold it to Seagate. Although, to their credit, they did RMA the drive (because it reallocated sectors retrieving one file and corrupted it - apart from that one bit the drive was fine so likely a media defect).

I've also seen server-grade (IBM U320) drives lose their contents after years in storage - wasn't an issue (fortunately) as the contents were never needed.
 
Right. I thought we're talking about HDDs, which is the thread's topic.
And people were saying stuff like the following, which doesn't seem likely:


SSDs are a different discussion, and I wouldn't use them for archival storage.
I was talking about HDDs (read my post again).
 
As in 1980s floppy style? I don't think so.
Data is stored linearly (except remapped sectors, if areas got physically corrupt over time).
As in 1980s MFM/RLL style?? Yes, but it's all transparent.
In many cases it's 1:1 (i.e. no interleave), but some drives vary how it's used.
Some datasheets still make reference to it, e.g. Seagate Exos SAS drives (although they only state minimum and not maximum).
I'm making the assumption that if it wasn't needed at all they would state no interleaving or just 1:1, not mention a 'minimum'.

I remember reading that some drives can use different interleaving at different points on the disk to optimise the drive electronics' read speed (as HDDs don't vary rotation speed). I doubt today's drives would operate at anything worse than 1:2.
 
Some datasheets still make reference to it, e.g. Seagate Exos SAS drives (although they only state minimum and not maximum).
I'm making the assumption that if it wasn't needed at all they would state no interleaving or just 1:1, not mention a 'minimum'.
Huh. Curious.
Still, I think there's nothing but 1:1, even in "exotic" Enterprise drives.
Did you see anywhere explicitly mentioning more than 1?
And does "minimum" make sense at all, since how can it be less than 1?

I suspect it's bad wording and a leftover text template from bygone days.
For example, in the Product Manual of a 1996 Seagate 2.1GB SCSI drive, Hawk 2XL:
4.2.3 Generalized performance characteristics
Minimum Sector Interleave (all Hawk 2XL models) 1 to 1

And it also mentions speeds as "MByte/sec divided by (Interleave Factor)".

But even for that old drive it also says:
3.1 Standard features
...
* 1:1 Interleave

I was talking about HDDs (read my post again).
I know.
It wasn't a reply to you, but rather a conversation line with Vincero where I said this thread is about HDDs, and your post was an example.
 
Huh. Curious.
Still, I think there's nothing but 1:1, even in "exotic" Enterprise drives.
Did you see anywhere explicitly mentioning more than 1?
And does "minimum" make sense at all, since how can it be less than 1?

I suspect it's bad wording and a leftover text template from bygone days.
Possibly / Probably... But the ambiguity of the wording leaves the implication that either the drives can implement a varying approach, or it's reserved for extremely data-dense drives and high spin speeds.
The data sheets no longer list this item for each drive model. Or maybe the makers just reserve the right to use it where they want, as part of their bag of tricks to get the most out of a model.

Certainly, I would struggle to imagine interleaving skipping sectors in a reliable way when used with SMR or other potentially overlapping data areas.
 
Does 1 year without use cause loss of magnetization, mechanical problems, or oxidation?

Do HDDs also use flash memory to retain firmware and BIOS? Is this a problem since flash memory is not recommended for archiving?

Is it necessary to test annually or every two months to check for any type of corruption?

Is a 2.5" HDD more likely to die when stored or in constant operation?
 
Bit rot is very real after several years. If you really want to archive important data for a very long time, buy another drive and copy one to the other to write the data fresh, and keep the old one until you overwrite it. I'd do this once a year personally.
I had older HDDs on a shelf (anti-static bag, cardboard box) and a lot of files were corrupted, even on IronWolfs, within 2-3 years.
 
Is it necessary to test annually or every two months to check for any type of corruption?
If you are concerned you could check the integrity of your files by doing a binary compare on each file using a tool such as Beyond Compare with each of your drives.

Depending on how your files are organized, compare from the source, or designate one of your backups as the primary and compare that against each copy. Since they are USB drives, make sure you have enough USB ports on your motherboard to host all the drives you will be comparing at once, to minimize the chance of issues with the USB devices - I found using USB hubs can be problematic with USB drives and a high amount of I/O.

Run multiple compares at the same time to save time. Since you infrequently access the drives, I would say the time investment of letting the compare run for a day or two (depending on how much data you need to validate) would put your mind at ease about the integrity of your files between devices, and exercise your drives sufficiently to know if they are near physical failure, so you won't be surprised later.

If you find an inconsistency with a file, you can then compare it against all 4 of your devices to determine the correct copy, and repair or replace the drive with confidence that you have an undamaged copy.
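If anyone wants to script the same idea instead of using Beyond Compare, here's a rough Python sketch of a full byte-by-byte pass between two drives (the drive letters and folder names are just examples):

```python
"""Rough sketch of a 'binary compare' pass between two backup drives.
Drive letters are hypothetical - point PRIMARY/COPY at your own mounts."""

import filecmp
from pathlib import Path

PRIMARY = Path("E:/archive")   # the copy you treat as the reference
COPY = Path("F:/archive")      # one of the other backup drives

def compare_trees(primary: Path, copy: Path) -> None:
    for src in primary.rglob("*"):
        if not src.is_file():
            continue
        dst = copy / src.relative_to(primary)
        if not dst.exists():
            print(f"MISSING on copy: {dst}")
        elif not filecmp.cmp(src, dst, shallow=False):   # full byte-by-byte read
            print(f"MISMATCH: {src}  <->  {dst}")

if __name__ == "__main__":
    compare_trees(PRIMARY, COPY)
    print("Done - any MISMATCH means check that file against your other copies.")
```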

This is of course only one way to deal with verifying multiple copies, as there are other strategies, but this is simple and cost friendly. I do this with my backups every now and then from my NAS (operating on RAID6 for redundancy, with periodic scrubbing to check for bit rot) to ensure my raw backups (not managed by a system that is designed to validate file integrity) are intact. This of course places a high degree of trust in my NAS to ensure files are not corrupted. In my case, if I find a problem I have several means to determine how to recover a file, since I have both kinds of backups (automatically verified via the NAS backup software, and manually verified with my raw backups and Beyond Compare).
Is a 2.5" HDD more likely to die when stored or in constant operation?
"If you don't use it you lose it" vs. "wear and tear" in either case hardware degrades over time. You have 4 copies (4 drives) of the same files which I assume is your mitigation if you should discover a drive fails or if you discover you have a corrupted file.
 
Your files were corrupted because you didn't use the HDD for 1 year?

I don't have much technical knowledge. I just connect the HDD + USB 3.0 case to the PC and try to perform some test on all the files compressed in rar, 7z and zip to find out if they are intact. When a compressed file is created, a hash or code is generated that remains saved inside the file; if something gets corrupted, the code changes and some software says that the file is corrupted?
 
The last HDD that I had a problem with was an IBM Deskstar: they got labelled "Deathstar" because of how horribly they would fail with the click of doom. Hitachi bought the line and it was a complete 180. Anyways, I notice you have multiple threads relating to basically the same problem and I believe you are overthinking it. Stop it
 
Your files were corrupted because you didn't use the HDD for 1 year?
I've never encountered file corruption from long-term idle drives so I can't really answer your concern. I have the same concern with SSD-based drives as well, but I don't use those for long-term storage due to the cost and capacity restraints. You have 4 HDD devices, so you could reserve one device for such a test, but HDD storage is reliable as long as you don't buy junk drives. The last time I was aware I even had a problem with HDDs was either immediate device failure, or way back in the days of RLL/MFM drives.

Using the Beyond Compare method I described did enable me to detect many years ago when a forced Windows 10 update crippled my RAID6 driver stability on my file server and I found it began writing zeros to my backups during file copy. Just goes to show reliability of your source device is paramount and being able to detect problems is also important so you ensure your backups are reliable in case you do need to perform a recovery.
I don't have much technical knowledge. I just connect the HDD + USB 3.0 case to the PC and try to perform some test on all the files compressed in rar, 7z and zip to find out if they are intact. When a compressed file is created, a hash or code is generated that remains saved inside the file; if something gets corrupted, the code changes and some software says that the file is corrupted?
That is one way to produce backups and check for corruption but relies more heavily on having multiple copies for recovery because if you have a problem, trying to partially recover files from a corrupted compressed and/or encrypted archive might be difficult or impossible. It is much easier and generally faster to copy around larger compressed files to USB than deal with the filesystem overhead of thousands of small files.
 
My concern is to turn on the HDD after an interval (two months, a year) and test if all the files that are compressed in RAR, ZIP and 7Z are still readable without corruption, and also to prevent the HDD from demagnetizing.

Does WinRAR have any function to test these compressed files for corruption?
 
Does WinRAR have any function to test these compressed files for corruption?
Again, you cannot test for corruption without anything to compare WITH. You need either a copy of the files or a checksum to compare against. Nothing will tell you if the files are corrupted or not in a vacuum.
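To make "a checksum to compare against" concrete, here's a minimal sketch that writes a SHA-256 manifest while the drive is known-good and re-checks it later - the manifest filename and paths are just examples:

```python
"""Minimal sketch: write a SHA-256 manifest for a drive while it's known-good,
then re-run later in 'verify' mode to spot silent corruption.
Paths/filenames here are just examples."""

import hashlib
import sys
from pathlib import Path

def sha256_of(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()

def build(root: Path, manifest: Path) -> None:
    with manifest.open("w", encoding="utf-8") as out:
        for p in sorted(root.rglob("*")):
            if p.is_file() and p != manifest:
                out.write(f"{sha256_of(p)}  {p.relative_to(root)}\n")

def verify(root: Path, manifest: Path) -> None:
    for line in manifest.read_text(encoding="utf-8").splitlines():
        digest, rel = line.split("  ", 1)
        p = root / rel
        if not p.exists():
            print(f"MISSING  {rel}")
        elif sha256_of(p) != digest:
            print(f"CHANGED  {rel}")

if __name__ == "__main__":
    # usage: checksums.py build|verify E:/
    mode, root = sys.argv[1], Path(sys.argv[2])
    (build if mode == "build" else verify)(root, root / "manifest.sha256")
```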
 
My concern is to turn on the HDD after an interval (two months, a year) and test if all the files that are compressed in RAR, ZIP and 7Z are still readable without corruption, and also to prevent the HDD from demagnetizing.
Generally I don't think you need to be concerned about demagnetizing as a specific use case.

Minimally:
1) keep a sufficient number of backup copies for recovery
2) test your archives periodically by unzipping them somewhere
Does WinRAR have any function to test these compressed files for corruption?
The best way to test your archives is to unzip your archives. If a problem occurs the archive software will tell you. Some archive software has a function to validate the archives based on their internal checksums.
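For plain .zip files the check can even be scripted with nothing but Python's standard library - a rough sketch that walks a folder and CRC-tests every zip without extracting anything (the folder path is an example; for .rar/.7z you'd lean on the archiver itself):

```python
"""Rough sketch: CRC-test every .zip under a folder without extracting.
zipfile.ZipFile.testzip() reads all members and returns the first bad name (or None)."""

import zipfile
from pathlib import Path

ARCHIVE_ROOT = Path("E:/archive")        # hypothetical backup folder

for zip_path in ARCHIVE_ROOT.rglob("*.zip"):
    try:
        with zipfile.ZipFile(zip_path) as zf:
            bad = zf.testzip()           # None means every CRC matched
        print(f"{'OK ' if bad is None else 'BAD'} {zip_path}" +
              ("" if bad is None else f"  (first bad member: {bad})"))
    except zipfile.BadZipFile:
        print(f"BAD {zip_path}  (not readable as a zip at all)")
```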
 
The last HDD that I had a problem with was an IBM Deskstar: they got labelled "Deathstar" because of how horribly they would fail with the click of doom. Hitachi bought the line and it was a complete 180. Anyways, I notice you have multiple threads relating to basically the same problem and I believe you are overthinking it. Stop it
AHH yes, the (in)famous glass platter drives...

Hard for the drive to read data tracks when they are literally gone...
 
AHH yes, the (in)famous glass platter drives...

Hard for the drive to read data tracks when they are literally gone...
Yup, all over the place in the drive, like brake dust in a drum or from crummy pads on disc brakes. Hitachi was awesome - too bad WD scooped them up...
 
If you are concerned you could check the integrity of your files by doing a binary compare on each file using a tool such as Beyond Compare with each of your drives.
Full binary compare is very important. My primitive backup strategy is to do a full backup every two months. Then I delete the files from the previous backup set if they are binary identical to those in the new backup set. The program I use is CloneSpy. It's a painfully slow process, sure. But I consider it a pretty good mitigation against many causes of data loss, including:
- data transmission errors when copying (if a file exists both in the new and the old backup sets)
- bit rot of the original working files, not just backups, even if the corruption isn't ever noticed.
On top of that, this offers some amount of versioning, as non-identical files are kept forever.
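The pruning step itself is simple enough to script if CloneSpy ever feels too slow - a rough sketch of the idea, matching by content hash rather than a true byte compare (folder names are hypothetical):

```python
"""Rough sketch of the CloneSpy-style pruning step: delete a file from the old
backup set if a byte-identical copy exists anywhere in the new set.
Folder names are hypothetical; matching is by content hash, not by path."""

import hashlib
from pathlib import Path

OLD_SET = Path("E:/backup-2025-01")
NEW_SET = Path("E:/backup-2025-03")

def sha256_of(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()

# Hash everything in the new set once...
new_hashes = {sha256_of(p) for p in NEW_SET.rglob("*") if p.is_file()}

# ...then drop old files whose exact content survived into the new set.
for old_file in list(OLD_SET.rglob("*")):
    if old_file.is_file() and sha256_of(old_file) in new_hashes:
        old_file.unlink()        # identical copy is kept in the new set
        print(f"deleted duplicate: {old_file}")
```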

The best way to test your archives is to unzip your archives. If a problem occurs the archive software will tell you. Some archive software has a function to validate the archives based on their internal checksums.
7zip can validate checksums of zip, 7z and rar files without unzipping. All those formats have at least CRC-32 internal checksums.
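For a mixed pile of rar/7z/zip, the command-line 7-Zip does it in one pass with `7z t` - here's a rough sketch that loops it over a folder, assuming 7z is on your PATH and using an example folder path:

```python
"""Rough sketch: run '7z t' (test) over every archive under a folder and flag
failures. Assumes the 7-Zip command-line tool (7z / 7z.exe) is on your PATH.
'7z t' verifies the archives' internal checksums without extracting to disk."""

import subprocess
from pathlib import Path

ARCHIVE_ROOT = Path("E:/archive")        # hypothetical backup folder

for archive in ARCHIVE_ROOT.rglob("*"):
    if archive.suffix.lower() not in {".zip", ".7z", ".rar"}:
        continue
    result = subprocess.run(["7z", "t", str(archive)],
                            capture_output=True, text=True)
    # 7z exits with 0 when everything tested OK, non-zero on errors
    print(f"{'OK ' if result.returncode == 0 else 'BAD'} {archive}")
```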
 
Full binary compare is very important.
Hashes are enough, and more practical.
If someone's uneasy with just CRC32, then MD5, SHA1, or others.

7zip can validate checksums of zip, 7z and rar files without unzipping. All those formats have at least CRC-32 internal checksums.
As can WinRAR, and practically any other compressor in the last 30-40 years (including DOS ones such as ARJ, PKZIP, LHarc, and even ARC).
 
AHH yes, the (in)famous glass platter drives...

Hard for the drive to read data tracks when they are literally gone...

Intriguing - I would have expected the outer coating to hold up better, as the heads fly higher there. Or maybe the heads were bouncing when de-parking.
 