The bigger problem is bit rot. Even with multiple copies of files, that's still a risk; both spinning drives and SSDs are prone to it.
@nandobadam - I assume it's not possible to connect all the drives to one system at the same time?
If it is, then SnapRAID is an option: you can use it to perform RAID-5-style parity checking across all the disks. The only downside (apart from being slightly complex) is that all drives in the set need to be available for it to work.
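For reference, setting it up is mostly one config file plus a couple of commands. A minimal snapraid.conf sketch (the mount points and disk names below are just placeholders, not anything from the OP's setup):

```
# One parity file per parity level, kept on a dedicated disk
parity /mnt/parity1/snapraid.parity

# Content files hold the checksum/metadata database; keep several copies
content /var/snapraid/snapraid.content
content /mnt/disk1/snapraid.content
content /mnt/disk2/snapraid.content

# The data disks to protect
data d1 /mnt/disk1/
data d2 /mnt/disk2/
data d3 /mnt/disk3/

exclude *.tmp
```

You'd then run `snapraid sync` after adding files and `snapraid scrub` periodically to catch silent corruption; `snapraid fix` repairs anything the scrub flags.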
HDDs don't need to be powered up often (like, every few months). They're not SSDs.
Storage is not a problem. Specs of a random 2.5" HDD are attached (attachment 377972).
The simplest and most effective way to check files: checksums / hashes.
There's a surprising lack of satisfying tools for that, but here's an old one that works well enough and is more comfortable than most.
It supports CRC and MD5 in the common textual formats (SFV and md5sum).
Advanced CheckSum Verifier by Irnis Haliullin
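If you'd rather script it than trust an ageing GUI tool, the same md5sum-style manifests are easy to generate and verify yourself. A rough Python sketch (the paths and file names are just placeholders):

```python
import hashlib
from pathlib import Path

def md5_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files don't need to fit in RAM."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

def write_manifest(root, manifest="checksums.md5"):
    """Write an md5sum-style manifest: '<hash>  <relative path>' per line."""
    root = Path(root)
    with open(manifest, "w", encoding="utf-8") as out:
        for path in sorted(p for p in root.rglob("*") if p.is_file()):
            out.write(f"{md5_of(path)}  {path.relative_to(root).as_posix()}\n")

def verify_manifest(root, manifest="checksums.md5"):
    """Re-hash every listed file and report anything changed or missing."""
    root = Path(root)
    bad = []
    with open(manifest, encoding="utf-8") as f:
        for line in f:
            expected, _, rel = line.rstrip("\n").partition("  ")
            path = root / rel
            if not path.is_file() or md5_of(path) != expected:
                bad.append(rel)
    return bad

# Example (hypothetical paths):
# write_manifest("D:/archive")          # before the drive goes into storage
# print(verify_manifest("D:/archive"))  # after it comes back out
```

The manifest is in the standard md5sum text format, so it can also be checked with `md5sum -c` on a Linux box. Like the tools above, though, this only detects corruption; it can't repair anything.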
(And OP, no need to repeat your question, slightly reworded, every 2nd post.)
This is sort of the approach I'd take, but it offers no help if you do have bit rot issues; there's no inherent recovery data.
At the risk of complicating things, I'd personally go the route of using PAR files to checksum the data and keep recovery data for the files.
The only downside is that there are limits on how many files and folders it can work across (although the limit is 32,000+ files, so there's plenty of scope). It works best with archives of data (i.e. zip/rar/7z files), since those can store any complex folder structure.
It doesn't matter if the archives are split into large chunks; that's actually helpful, because if one chunk is damaged it's easier to recover that chunk than one massive single archive file.
You can use MultiPAR to manage this. It's simpler than SnapRAID and basically lets you create a recovery file set for the contents of each drive separately. The GUI tells you how many files can be recovered if there's an issue, how big the recovery data set will be, and so on.
I'd recommend you keep that recovery data elsewhere.
The added bonus of using MultiPAR is that it performs an integrity check when you open the recovery data set and will highlight and correct any errors as it goes. Even if an entire file is destroyed or missing, it can reconstruct it from the recovery and parity data.
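For anyone who prefers the command line to the MultiPAR GUI, the same PAR2 format can be driven from par2cmdline. A rough Python wrapper as a sketch (it assumes the `par2` executable is on the PATH, and the paths and redundancy figure are just example values):

```python
import subprocess
from pathlib import Path

def create_recovery(folder, redundancy=10):
    """Create PAR2 recovery data covering every file directly inside `folder`.
    Assumes the `par2` executable (par2cmdline) is installed and on the PATH."""
    folder = Path(folder)
    files = [p.name for p in folder.iterdir() if p.is_file()]
    subprocess.run(
        ["par2", "create", f"-r{redundancy}", "recovery.par2", *files],
        cwd=folder, check=True,
    )

def verify_and_repair(folder):
    """Verify the protected files; attempt a repair if anything is damaged."""
    folder = Path(folder)
    result = subprocess.run(["par2", "verify", "recovery.par2"], cwd=folder)
    if result.returncode != 0:
        subprocess.run(["par2", "repair", "recovery.par2"], cwd=folder, check=True)

# Example (hypothetical path): a folder of split archive volumes
# create_recovery("E:/archive/photos-2019", redundancy=10)
# verify_and_repair("E:/archive/photos-2019")
```

As noted above, you'd then copy the generated *.par2 files somewhere other than the drive they protect, and copy them back if a repair is ever needed.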
It would be remiss not to mention that RAR/WinRAR has a similar parity feature built into its format (i.e. parity data across archive volume files to detect and correct bit errors), though it's not widely used. Of course, it only applies to RAR archives (so it's no use for files outside the RAR volume), and you need to pay for it, so in that sense it's severely limited.
EDIT: Of course, you should really be running disk monitoring while doing this, just to see whether uncorrectable errors or sector reallocations are occurring on the drive as it reads the data.
I personally use archive files, which are easier to scan, and I also keep an eye on drive performance so I can see if the drive is having to work harder to read the data (the amount of internal ECC activity in any modern drive is actually quite scary). I track drive activity through Windows performance monitoring. For example (and I've pulled one of my archive drives out of storage a few months early for this):
Red line = transfer rate (x100MB/s), Blue line = Transfers per sec (x1000)
No data errors, so it's looking OK... not amazing though: that dip in the middle shows the drive clearly needing to work a bit harder to pull the data from that part of the storage. Time for a refresh, methinks...
Data re-written to the drive and the read re-scanned... nice and consistent (well, as good as it's gonna get).
NOTE: when scanning at a file level, a mix of tiny files and big files will cause massive swings in read speed; that can't be avoided and makes this method unsuitable in that scenario.
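For what it's worth, the read-speed side of that scan doesn't strictly need Performance Monitor either. A rough sketch (the drive letter is a placeholder) that reads every file sequentially and prints per-file throughput, so a slow patch like the dip above stands out:

```python
import time
from pathlib import Path

CHUNK = 8 * 1024 * 1024  # 8 MiB reads, large enough to approximate sequential speed

def scan_drive(root):
    """Read every file once and report per-file throughput in MB/s.
    Only large files give a meaningful figure (see the note above about
    tiny files skewing the numbers), so small ones are skipped."""
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        size = path.stat().st_size
        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(CHUNK):
                pass
        elapsed = time.perf_counter() - start
        if size >= CHUNK and elapsed > 0:
            print(f"{size / elapsed / 1e6:8.1f} MB/s  {path}")

# Example (hypothetical drive letter):
# scan_drive("F:/")
```

Bear in mind the OS will serve recently touched files from cache, so the numbers are only meaningful when the drive has just been mounted and is being read cold.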