Given how much this costs, 24TB hard disks are a lot cheaper
It's not about that, though. When you get into this territory it's all about use case. SATA is certainly slower than NVMe, and HDDs are bigger and cheaper, but both of them fail this use case. Something like this is used for warm storage: that can be anything from paging, scratch space, or large file operations such as LLM merges. When dealing with workloads like that, you want IOPS and consistency. IOPS gets used interchangeably with "speed", and it usually does scale that way, but what you are really buying is the ability to do a task without slowing down. Let me try to be more clear.
It is common knowledge that transferring large files is faster than transferring lots of small files. There are A LOT of factors we are going to ignore for this explanation, but what you are "visually" seeing there is IOPS in action. When you transfer those small files your data rate is maybe a few MB/s while Windows shits itself, far less than the drive's max speed. That's because the disk also has to perform multiple operations (IOPS) for every one of those small files.
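To put rough numbers on it, here is a toy Python model of that effect. The drive figures (sequential MB/s, IOPS, ops per file) are illustrative assumptions, not benchmarks; the point is the shape of the result, where per-file operations dominate once the files get small.
Code:
# Toy model: why 100k small files crawl while one big file flies.
# All drive numbers below are illustrative assumptions, not measurements.

def transfer_time(total_bytes, file_count, seq_mbps, iops, ops_per_file=4):
    """Estimate seconds to move a workload: raw streaming time plus
    per-file operations (open/stat/write/close and friends)."""
    streaming = total_bytes / (seq_mbps * 1024**2)     # sequential data movement
    op_overhead = (file_count * ops_per_file) / iops   # per-file IO operations
    return streaming + op_overhead

one_big   = transfer_time(10 * 1024**3, file_count=1,       seq_mbps=550, iops=90_000)
many_tiny = transfer_time(10 * 1024**3, file_count=100_000, seq_mbps=550, iops=90_000)
hdd_tiny  = transfer_time(10 * 1024**3, file_count=100_000, seq_mbps=180, iops=120)

print(f"10 GB as 1 file on a SATA SSD   : {one_big:8.1f} s")
print(f"10 GB as 100k files on SATA SSD : {many_tiny:8.1f} s")
print(f"10 GB as 100k files on 7200 rpm : {hdd_tiny:8.1f} s")
On those made-up numbers the single file finishes in well under a minute either way, while the 100k-file copy on the HDD takes the better part of an hour, and it's the per-file operations, not the raw bandwidth, doing the damage.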
IOPS is one of the bigger performance metrics in "big data". Like the file examples above, a balance must be struck. NVMe in the consumer form factor is not that dense, maxing out currently around 8TB for 2280-sized drives. (In the DC it actually is dense, because we use U.2 drives, but more on that later.)
HDDs on the other hand are far cheaper, but they come with a problem, a physical one. The read/write head has to be moved over the right spot before it can touch your data, so every random access is bounded by the seek speed of the arm and the physical RPM of the platters. Lots of capacity to be sure, but almost no IOPS. In fact DCs tend to use HDDs for cold storage, your regular 7200 RPM. Among a myriad of differences from consumer 7200rpm drives, one thing is actually the same... they can't really break 120 or so IOPS. That's right, hundreds, not thousands like SSDs. They are that SLOW.
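You can get to that ~120 number with napkin math: each random IO pays an average seek plus, on average, half a platter rotation. A quick Python sketch, with an assumed typical seek time:
Code:
# Why a 7200 rpm disk tops out around ~120 random IOPS.
# The average seek time is an assumed typical figure, not a datasheet value.

rpm = 7200
avg_seek_ms = 4.2                            # assumed average seek time
rotation_ms = 60_000 / rpm                   # one full revolution: ~8.33 ms
avg_rotational_latency_ms = rotation_ms / 2  # on average, half a turn: ~4.17 ms

service_time_ms = avg_seek_ms + avg_rotational_latency_ms
iops = 1000 / service_time_ms

print(f"~{service_time_ms:.1f} ms per random IO -> ~{iops:.0f} IOPS")
# ~8.4 ms per random IO -> ~120 IOPS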
That brings us to U.2. These can break the density limits of 2280 NVMe and use PCIe to boot, netting pretty much the same speeds! You could even get one used, or maybe it fell off a truck.
4TB for, let's round down, $500 from some random seller.
You might even go big: a 30TB 6500 for $4k from some random seller.
www.newegg.com
However, if you need OEM support, then it gets a bit more expensive. U.2, like NVMe, runs hot hot hot. They need active airflow to cool all those chips and controllers. If you're a normal user, you might get away with a case fan and /flex when you xfer your Plex library at warp 8, but we are talking 24/7 usage here. (Besides, you're not fast until you've bonded 800Gb/s Mellanox NICs to a flash array on a stack of Aristas.)
For that, SATA SSDs are much more cost effective for warm layers. They don't have the same cooling or power requirements, and they are much thinner, so density in a storage chassis is a lot better, almost 2:1!
Most importantly, IOPS. Something like the Mushkin Source, while not in the millions, can push a ton more than a traditional HDD without the extended sunk cost of a U.2 NVMe fleet. Take a look at the tech specs:
Buy a Mushkin Source HC - SSD - 16 TB - SATA 6Gb/s at CDW.com
www.cdw.com
Code:
4KB Random Read 97000 IOPS
4KB Random Write 89000 IOPS
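For a rough sense of scale, you can turn those spec-sheet IOPS into 4K random bandwidth (IOPS times block size). The SSD numbers are straight from the spec above; the HDD line uses the ~120 IOPS ballpark from earlier, so treat it as illustrative:
Code:
# Spec-sheet IOPS expressed as 4 KiB random throughput.
# SSD figures are from the quoted spec; the HDD figure is the ~120 IOPS
# ballpark discussed earlier, not a measured result.

BLOCK = 4 * 1024  # 4 KiB

def random_mbps(iops, block=BLOCK):
    return iops * block / 1024**2

print(f"Mushkin Source HC, 4K random read : {random_mbps(97_000):7.1f} MB/s")
print(f"Mushkin Source HC, 4K random write: {random_mbps(89_000):7.1f} MB/s")
print(f"7200 rpm HDD,      4K random read : {random_mbps(120):7.1f} MB/s")
That is roughly 380 MB/s of random reads against well under 1 MB/s for the spinner, which is the whole argument for SATA flash in a warm tier in one line.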
Of course, let's not forget that U.2 storage fleets do exist, along with the network infra needed to support that speed (cost), and of course SAS/SATA and RAID can all play a part, but this is a rudimentary glimpse into how deep we can get.
Fun fact: most DCs are not running 10k RPM and faster drives anymore. They exist, but they are actually outmoded now. Storage tiering is preferred: automated tiering across an HA setup, or a cache layer that bleeds off to the slower disks. It is actually cheaper and less temperamental than LSI backplanes and old Seagate Cheetahs, etc.
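If the "cache layer that bleeds off" idea sounds abstract, here is a toy Python sketch of the concept: a small hot tier that demotes its least-recently-used entries to the big cheap tier as it fills. Real tiering engines track heat maps, write-back policies, and so on; this is just the shape of it, not how any particular product works.
Code:
# Toy two-tier store: a small fast tier plus a big slow tier.
# Hot data is promoted on access; the LRU entries bleed off to the cold tier.

from collections import OrderedDict

class TieredStore:
    def __init__(self, hot_capacity):
        self.hot = OrderedDict()   # fast tier (think NVMe), limited, LRU-ordered
        self.cold = {}             # big cheap tier (SATA SSD / HDD)
        self.hot_capacity = hot_capacity

    def read(self, key):
        if key in self.hot:                   # hit on the fast tier
            self.hot.move_to_end(key)
            return self.hot[key]
        value = self.cold[key]                # miss: fetch from the slow tier
        self._promote(key, value)             # and promote it to hot
        return value

    def write(self, key, value):
        self._promote(key, value)             # new writes land hot

    def _promote(self, key, value):
        self.cold.pop(key, None)
        self.hot[key] = value
        self.hot.move_to_end(key)
        while len(self.hot) > self.hot_capacity:
            old_key, old_val = self.hot.popitem(last=False)  # bleed off the LRU entry
            self.cold[old_key] = old_val      # demote it to the cheap tier

store = TieredStore(hot_capacity=2)
store.write("vm-disk-1", b"...")
store.write("vm-disk-2", b"...")
store.write("backup-03", b"...")   # pushes vm-disk-1 down to the cold tier
print(sorted(store.hot), sorted(store.cold))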
Anyway, long way of saying that there is in fact a niche for this. Big vendors like AWS/Meta/Google/MS are not grabbing these; they are using NVMe or U.2 with software tiering. Data collectors (Veeam, Acronis, Backblaze) might use them for hot tiers. Conventional datacenters or small local POPs might be using them for SAN or VM clusters, with smaller NVMe as cache and OS layers.
Hope that sheds some insight; I used to be a storage architect once upon a time. It goes hand in hand with cluster and virtualization work, which I was an engineer for, so I learned it, and eventually storage and the clusters became my primary responsibility. After years I was responsible for creation and design. I still do it a lot, but as the years progressed I slowly drifted further away from the hardware. Still love threads like this though. Storage is truly misunderstood and very taken for granted.