# Setting 4k sector size on NVMe SSDs: does performance actually change?



## Solid State Brain (Dec 9, 2021)

NVMe specifications allow the host to send specific low-level commands to the SSD in order to permanently format the drive to a 4096-byte logical sector size (it is possible to go back to a 512-byte size in the same way). Not all NVMe SSDs have this capability.

Most client-oriented storage operates by default in "512-bytes emulation" mode, where although the logical sector size is 512 bytes/sector, internally the firmware uses 4096 bytes/sector. Storage with a 4096-byte size for both logical and physical sectors operates in what is commonly called "4K native" mode or "4Kn". Due to possible software compatibility issues that have still not been completely solved (for instance, cloning partitions from a 512B drive to a 4096B drive is not directly possible), these drives tend to be quite rare in the client space and it is mostly enterprise-class drives that employ this mode.

Some background information on this subject on Wikipedia:








Advanced Format - Wikipedia (en.wikipedia.org)
				




Why change this setting? In theory, the 4K native LBA mode would do away with the "translation" the firmware has to perform to map 512-byte logical sectors to the underlying 4K "physical" arrangement (if a physical/logical distinction makes sense for SSDs) and may offer somewhat higher performance this way.

This is possibly true for fast NVMe SSDs and high-performance (non-Windows) file systems in high-I/O environments, but it is unclear whether Windows performance with ordinary NTFS partitions would improve, and the subject is somewhat obscure and confusing. Some people, for instance, may think that the logical sector size is the same as the partition's cluster size (which defaults to 4 kB on Windows), but the two are unrelated. Furthermore, changing the logical sector size requires deleting everything on the SSD and basically reinstalling the OS from scratch, which makes it even more unlikely for users to attempt it and see if differences arise. This is better tested with brand-new, empty drives.
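On Linux, the two sizes can be read directly from sysfs; a minimal sketch (the device name is an example, and the sysfs base path is parameterized only so the function can be exercised without real hardware):

```shell
# Print a block device's logical vs. physical sector size from sysfs.
sector_sizes() {
    dev="$1"; base="${2:-/sys/block}"
    printf 'logical=%s physical=%s\n' \
        "$(cat "$base/$dev/queue/logical_block_size")" \
        "$(cat "$base/$dev/queue/physical_block_size")"
}

# e.g.: sector_sizes nvme0n1
# A 512e drive prints "logical=512 physical=4096"; a 4Kn drive prints 4096 for both.
# The NTFS cluster size (4 kB by default) is a filesystem-level setting on top of this.
```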

I have a WD SN850 which allows changing this setting, but after doing it and reinstalling the OS I cannot say I have really observed significant differences on Windows 11 (also, Windows 11 currently has performance consistency issues with NVMe SSDs under certain configurations, but I didn't know this at the time).

I did it after following this blogpost where a WD SN850 user on Linux reportedly measured 10% higher performance on EXT4 partitions with basic benchmarks:








Switching your NVME ssd to 4k - Bjonnh.net (www.bjonnh.net): "I recently got a WD SN850. There is a little trick to do when you receive it to switch it to 4k LBA and thus getting better performance by using native block size."
				




Some sources claim that the 4 kB LBA format provides "the best performance and endurance". This also appears to be hinted at by low-level NVMe utilities listing the sector size formats supported by the SSD. The source below, however, did not provide any benchmark to back this claim:





How to format an NVMe drive (filers.blogspot.com): "A blog about computer storage tips."
				




On this website, a user saw very marginal improvements on a Samsung PM1725a 800GB (in Chinese):








教你把 NVMe SSD 切换到原生4K模式 ("Teaching you how to switch an NVMe SSD to native 4K mode") (zhuanlan.zhihu.com): "Although SSDs went mainstream long ago, mechanical hard drives spent even longer emulating 512-byte sectors on top of 4K ones, long enough that disk tools now handle 4K alignment well and nobody asks about 4K alignment checks anymore. Awkwardly, though, to this day most SSDs still use 512-byte sectors for compatibility, for example this W… of mine."
				




Here a Micron 9300 apparently gets better performance on the ZFS filesystem (Linux) with 4K native LBA:








Proxmox VE ZFS Benchmark with NVMe (forum.proxmox.com): "To optimize performance in hyper-converged deployments with Proxmox VE and ZFS storage, the appropriate hardware setup is essential. This benchmark presents a possible setup and its resulting performance, with the intention of supporting Proxmox users in making better decisions. Download PDF..."
				




However, a user with a WD SN550 saw no differences in basic benchmarks:

https://www.reddit.com/r/hardware/comments/m18a5c/_/gqcyjt0

There is some information on the Linux utilities used to change this setting on a blogpost of NVM Express organization:
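In short, the workflow with nvme-cli looks like this (a sketch: the device name and format index are examples; you must check which index on your own drive carries the 4096-byte data size, and the format command erases the entire drive):

```shell
# 1) List the supported LBA formats (look for "Data Size: 4096"):
#      sudo nvme id-ns -H /dev/nvme0n1 | grep 'LBA Format'
# 2) Reformat the namespace to that format index (DESTROYS ALL DATA):
#      sudo nvme format /dev/nvme0n1 --lbaf=1

# Helper: pull the 4096-byte format index out of `nvme id-ns -H` output.
pick_4k_lbaf() {
    awk '/LBA Format/ && /Data Size: 4096/ { print $3; exit }'
}
```

With nvme-cli, `--lbaf` selects the LBA format index, and a secure erase can be requested at the same time with `--ses`.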








Open Source NVMe™ Management Utility – NVMe Command Line Interface (NVMe-CLI) (nvmexpress.org): "By Jonmichael Hands, NVMe MWG Co-Chair, Sr. Strategic Planner / Product Manager, Intel. NVM Express™ (NVMe™) technology has enabled a robust set of industry-standard software, drivers, and management tools that have been developed for storage. The tool to manage NVMe SSDs in Linux is called…"
				




Some vendors may provide tools to change this from Windows effortlessly, for example Sabrent with its "Sector Size Converter" utility for its Rocket 4 SSDs. However, there is no mention of possible performance differences, only of issues pertaining to "data cloning scenarios":








SSC | Sabrent (www.sabrent.com)
				





Has anybody else tried and benchmarked any performance difference (possibly also in mixed read/writes, high-I/O scenarios) in a more controlled environment?


----------



## cfelicio (Jan 16, 2022)

I ended up using Anvil Benchmark, as it checks both IOPS and throughput, and found no measurable difference in performance, at least using Windows / NTFS / Sabrent Rocket NVMe. From reading Seagate's paper, it seems like using 4096 has some advantages on the error correction side of things, as well as potentially avoiding the controller having to emulate / convert from 512 to 4096. I will keep using 512 for my existing boxes, but will certainly use 4096 whenever I provision something new.


----------



## plastiscɧ (Jan 16, 2022)

i have samsung drives; when i secure erase my NVMe it's erased with 512, for sure. i have no other option in BIOS.
after clean installing the OS everything is automatically converted to 4096 because you start with a raw drive. Win11 gives no choice.






but that information about error correction has value for me! thanks 


cfelicio said:


> 4096 has some advantages on the error correction side of things


----------



## cfelicio (Jan 16, 2022)

plastiscɧ said:


> i have samsung drives; when i  -- secure erase -- my NVMe its erased with 512, for sure. i have no other option in BIOS.
> after clean installing the OS everything is automatically converted to 4096 cus u start with a raw drive. Win11 gives no choice.
> 
> ...


To double check, you can run `fsutil fsinfo sectorinfo C:` on the command line, and it should return something like this:
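For reference, this is roughly what the relevant lines look like on a 4Kn-formatted drive (illustrative values, not the original attachment; field names may vary slightly across Windows versions):

```
C:\> fsutil fsinfo sectorinfo C:
LogicalBytesPerSector :                                 4096
PhysicalBytesPerSectorForAtomicity :                    4096
PhysicalBytesPerSectorForPerformance :                  4096
[...]
```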




If both physical and logical sectors are showing 4096, you are running on 4KN.


----------



## eidairaman1 (Jan 16, 2022)

I just recalled that after Windows XP SP3 launched, sectors larger than 4KB didn't work with NTFS. SSDs have made setting a sector size greater than 4KB pointless due to firmware management algorithms.

I do recall 8KB sectors were faster but they used up space quicker.


----------



## Solid State Brain (Jan 16, 2022)

cfelicio said:


> I ended up using Anvil Benchmark, as it checks both IOPS and Throughput, and found no measurable difference in performance, at least using Windows / NTFS / Sabrent Rocket NVME. From reading Seagate's paper, it seems like using 4096 has some advantages on the error correction side of things, as well as potentially avoid the controller having to emulate / convert from 512 to 4096. I will keep using 512 for my existing boxes, but will certainly add 4096 whenever I provision something new.



I didn't find significant performance differences either on Windows with my WD Black SN850, but I haven't yet been able to do better testing on Linux with other filesystems. At some point I also had to revert to 512b sectors due to having to clone partitions to/from a different SATA SSD that does not support changing the physical sector size.



plastiscɧ said:


> i have samsung drives; when i  -- secure erase -- my NVMe its erased with 512, for sure. i have no other option in BIOS.
> after clean installing the OS everything is automatically converted to 4096 cus u start with a raw drive. Win11 gives no choice.





eidairaman1 said:


> I just recalled that after Windows XP SP3 launched, anything larger than 4KB sectors didn't work for NTFS. SSDs have made it where setting sector size greater than 4KB is pointless due to firmware management algorythims.
> 
> I do recall 8KB sectors were faster but they used up space quicker.



Keep in mind that the post is about the "physical" sector size, not the cluster size set when formatting the SSD with Windows. Windows uses a 4096-byte cluster size by default with NTFS regardless of whether the drive has 512-byte or 4096-byte physical sectors.


----------



## AsRock (Jan 16, 2022)

Solid State Brain said:


> I didn't find significant performance differences either on Windows with my WD Black SN850, but I haven't been able yet to to do better testing on Linux with other filesystems. At some point I also had to revert 512b sectors due to having to clone partitions to/from a different SATA SSD that does not support changing the physical sector size.
> 
> 
> 
> ...



You will still lose space; I did when I changed mine to 4k and put the files back on the drive.


----------



## R-T-B (Jan 16, 2022)

You shouldn't be losing any disk space, as the NTFS default cluster size is already 4kB.


----------



## AsRock (Jan 17, 2022)

R-T-B said:


> You shouldn't be losing any disk space as the ntfs default cluster size is already 4kb.



I'd re-check but it takes some time to do.

Just checked and do believe you are 100% right.


----------



## plastiscɧ (Jan 17, 2022)

i was unsure as well. but after research it turned out that my NVMe stays like it is. @eidairaman1 was right regarding the firmware

Samsung 980 pro












eidairaman1 said:


> greater than 4KB is pointless due to firmware management


----------



## chrcoluk (Jan 17, 2022)

I only use 4k now on my OS drive for compatibility reasons; I am in the process of migrating most of my other partitions/drives to 64k clusters.

The loss of space is small compared to the hit on performance, especially for spindles, which benefit a lot from 64k.

My gaming partition on my NVMe SSD might get changed from 4k to 8k or 16k, to reduce IO/sec loads. Samsung SSDs are known to perform very well with 8k, and a new Micron SSD has launched with a native 16k storage block.

512 bytes in my opinion shouldn't even be considered at this point.


----------



## Ferrum Master (Jan 17, 2022)

I do not think it matters for SSDs really, as those do not write in a sequential manner; they write in a rather random pattern based on free space and wear-leveling techniques.

Yes, it applied to spindles; there, 4K is the default, hands down... 512 bytes was for floppy magnetics and super early five-inch HDDs?

Where it matters more is removable media, where you need to decrease the overhead, so large sectors make sense purely for performance reasons.


----------



## gahabana (Apr 7, 2022)

**On a Kingston KC3000, temps are MUCH lower with 4KB vs 512B sector/LBA size**

hi all,
this thread motivated me to experiment. I got a Kingston KC3000, which supports both 512 and 4k LBA (sector?) sizes.
I did not do enough experiments; performance with two apps that scan a music library and read tags (on average 10-120kB of each file must be read, though files are on average 20-40MB each) is similar.
HOWEVER, when loading the SSD (a lot of these files get copied, 1.5TB), with the same performance I get MUCH lower temperatures on the NVMe drive: in the 50C+ range rather than even throttling (72C+), in spite of passive heatsinks... my guess is that translating the amount of operations causes the controller to use a lot of power.
Anyone else care to check?
Will check with a Samsung 980 Pro as well and let you all know!


----------



## Solid State Brain (Apr 7, 2022)

Unfortunately it is not possible to change the LBA size on the Samsung 980 Pro, which I currently have (I moved the previous WD Black 850 onto a different PC). The usual NVMe utilities show 512-bytes as the only supported format.

Thanks for your report on the KC3000, though. Interesting that you get lower temperatures with 4KB sectors; I didn't pay attention to them last time with the WD850.



```
#> sudo smartctl -x /dev/nvme0n1
smartctl 7.2 2021-09-14 r5237 [x86_64-linux-5.17.1-1-default] (SUSE RPM)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number: Samsung SSD 980 PRO 1TB
Serial Number: xxxxxxxxxxxxxxxxxxx
Firmware Version: 5B2QGXA7
PCI Vendor/Subsystem ID: 0x144d
IEEE OUI Identifier: 0x002538
Total NVM Capacity: 1,000,204,886,016 [1.00 TB]

[...]

Namespace 1 Formatted LBA Size: 512

[...]

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0
```

*EDIT*: Or also:


```
#> sudo nvme id-ns -H /dev/nvme0n1

[...]

LBA Format  0 : Metadata Size: 0   bytes - Data Size: 512 bytes - Relative Performance: 0 Best (in use)
```
(no other supported format shown)


----------



## Mussels (Apr 7, 2022)

All mine show the same, but I'm still interested in how this turns out




As long as they're aligned, there should be no performance difference.

When I've seen posts from people who fixed performance issues, that's what's always come to mind (older/uncommon OSes, or not 4K-aligning drives that show as 512b)


----------



## Solid State Brain (Apr 7, 2022)

The above screen shows that your drives are using a 512-byte logical block size, while their underlying physical structure uses 4096-byte (4 kB) blocks. This is common for modern storage and it's called "512e mode" or "512 emulation". It does not tell whether the drives also support setting a 4096-byte logical block size, i.e. "4Kn mode" or "4k native".

In theory, if the logical block size equals the physical block size, there might be performance benefits, since the firmware will not have to "translate" between the two formats anymore. However, they are generally unlikely to be very large, and I certainly didn't notice significant differences with the WD SN850 I previously had. On the other hand, I did have some inconveniences with partition cloning tools and so on.

This is not related to partition alignment, although a 4096-byte logical block size will prevent that problem entirely (since partition alignment will then only be possible in multiples of 4 kB blocks).


----------



## Deleted member 24505 (Apr 7, 2022)

Here's mine. Both my SN850s are on 512.


----------



## mb194dc (Apr 7, 2022)

It's pretty obscure; what we need is a review of two drives, one with 4k and one with 512b, under otherwise controlled conditions, with a variety of disk workloads to see if it makes any difference.


----------



## gahabana (Apr 7, 2022)

Solid State Brain said:


> Unfortunately it is not possible to change the LBA size on the Samsung 980 Pro, which I currently have (I moved the previous WD Black 850 onto a different PC). The usual NVMe utilities show 512-bytes as the only supported format.
> 
> Thanks for your report on the KC3000, though. Interesting that you get lower temperatures with 4KB sectors; I didn't pay attention to them last time with the WD850.
> 
> ...


you're so right. Just installed two (one with the latest 5xxx firmware and one with 3xxx...): 512b is the only option (unlike the Kingston KC3000)


----------



## AsRock (Apr 7, 2022)

mb194dc said:


> It's pretty obscure, what we need is review of two drives,  one with 4k and one with 512b under otherwise controlled conditions. With a variety of disk work loads to see if it makes any difference.



I noticed no real difference


----------



## Mussels (Apr 11, 2022)

Solid State Brain said:


> The above screen shows that your drives are using a 512-bytes logical block size, while their underlying physical structure uses 4096-bytes (4 kB) blocks. This is common for modern storage and it's called "512e mode" or "512 emulation". It does not tell whether the drives also support setting a 4096-bytes logical block size, i.e. "4Kn mode" or "4k native".
> 
> In theory, if logical block size equals physical block size, there might potentially be performance benefits, since the firmware will not have to "translate" between the two formats anymore. However, generally they are unlikely to be very large and I certainly didn't notice significant differences with the WD SN850 I previously had. On the other hand, I had with it some inconveniences with partition cloning tools and so on.
> 
> This is not related with partition alignment, although a 4096-bytes logical block size will prevent the problem entirely (since partition alignment will then be possible only in multiples of 4 kB blocks).


I know they're 512e; my point is that a modern OS will know that and work with a 4K block size, aligned with the drive's hardware.
Older OSes, or just weird esoteric OSes that don't try to compensate, can screw that up, and therefore show improvements when forced to 4K.


We need someone with a proper test setup like @W1zzard, with a drive that can be toggled between the two operating states.


Anything less is just theoretical, because so far the only times this has improved performance for people have been light on details. Look just a few posts up at the claim by @gahabana that it can lower temps.
We don't know the OS, the file system, the cluster size, or any other important information there (no offense to you @gahabana, you did provide what you thought was relevant)


----------



## W1zzard (Apr 11, 2022)

Mussels said:


> with drive that can be toggled between the two operating states


Any suggestions?


----------



## Deleted member 24505 (Apr 11, 2022)

Could you do the tests with a SN850?


----------



## Mussels (Apr 12, 2022)

W1zzard said:


> Any suggestions?





Tigger said:


> Could you do the tests with a SN850?


This article has a guide for the SN850
Switching your NVME ssd to 4k - Bjonnh.net

How to switch your NVME SSD to 4KN Advanced Format - Carlos Felicio
(This is a longer guide, that links to that first guide)

Same person did basic testing themselves, found almost no difference
512E vs 4KN NVME Performance - Carlos Felicio


Edit: can't adjust the setting over USB. Sigh. This will take more effort than expected.

Final edit: My Intel drive is 512 only, my SN730s are 4k out of the box, and my other two drives are kinda critical, so I can't format those for testing easily.


----------



## Athlonite (Apr 12, 2022)

The Adata SX8200 Pro 1TB NVMe SSD uses 512b logical / 4096b physical block sizing; no 4Kn option that I could find out about.


----------



## Mussels (Apr 12, 2022)

I can't help with this; most of my drives are 4K native already, and the ones that aren't are 512e only


----------



## Solid State Brain (Apr 12, 2022)

Mussels said:


> I cant help with this, most of my drives are 4K native already



Did they come like this from the factory or did you configure them that way? I assumed that most if not all consumer-grade SSDs would be pre-configured as 512e to prevent compatibility issues, even those models that allow end-users to change it later on.


----------



## Athlonite (Apr 12, 2022)

Solid State Brain said:


> Did they come like this from the factory or did you configure them that way? I assumed that most if not all consumer-grade SSDs would be pre-configured as 512e to prevent compatibility issues, even those models that allow end-users to change it later on.


It's a factory setting, and I think you'd need good luck changing it unless you know how to frig with the firmware


----------



## OneMoar (Apr 12, 2022)

512b sectors haven't been the default in a decade


----------



## Athlonite (Apr 12, 2022)

OneMoar said:


> 512b sectors haven't been the default in a decade


on HDDs, yeah, it has; so I don't know why they went backwards with NVMe


----------



## OneMoar (Apr 12, 2022)

Athlonite said:


> on HDDs, yeah, it has; so I don't know why they went backwards with NVMe


it's not truly 512b; it's an emulated layer presented to the OS for compatibility with older boards/controllers.
if the motherboard is sane it will initialize at 4K.
remember, SSDs don't have physical sectors in the traditional sense; the sector size is just a layer on the block device, and it's still always 4K internally


----------



## Mussels (Apr 13, 2022)

OneMoar said:


> 512b sectors haven't been the default in a decade


512e is still the default on many modern drives, since it works with older OSes as well as modern 4K-aware OSes.

to my knowledge, if it's formatted with a 4K-aware OS and the partitions are aligned, a 512e drive can still be used in older systems without issue


----------



## R-T-B (Apr 13, 2022)

OneMoar said:


> if the motherboard is sane it will initialize at 4K


That's not how 512e works. It will use 4K clusters, sure, but you can't just change the low-level format or presentation without special tools.

4kn drives are still the exception rather than the rule.

You can also still special order true 512n hard disks, if you really need them (I still don't know who would, but there are part numbers for them).

Anyways, my Crucial SSDs present as 512 sectors and I can't change that, so nothing to show here.


----------



## Mussels (Apr 13, 2022)

Whatever I tried to say didn't come out. Migraine. Words. Bleh.


----------



## oobymach (May 6, 2022)

Not sure if it affects SSD speed (don't have one to test on), but it definitely affects HDD speed, 512b cluster vs 4kb.


----------



## Solid State Brain (May 6, 2022)

The thread is not about NTFS cluster size that you can change when formatting the drive with Windows.


----------



## oobymach (May 6, 2022)

Solid State Brain said:


> The thread is not about NTFS cluster size that you can change when formatting the drive with Windows.
> 
> *Has anybody else tried and benchmarked any performance difference (possibly also in mixed read/writes, high-I/O scenarios) in a more controlled environment?*


I don't format drives with Windows unless I have no other choice; the software I used to format the drive is EaseUS Partition Master. You yourself asked for data to show the difference between 512b and 4k in the first post, and there's some data indicating that drive speed is affected by cluster size. I will test further if/when I can.


----------



## Solid State Brain (May 6, 2022)

To put it in different words: it's not about the cluster size of the filesystem used. It doesn't matter whether you use Windows or EaseUS or other programs to format your partitions.

The 512/4k difference this thread is about concerns the Logical Block Address (LBA) size, sometimes called "disk sector size", which can only be configured, on the few drives that allow it, with low-level or firmware tools.









Advanced Format - Wikipedia (en.wikipedia.org)


----------



## oobymach (May 6, 2022)

My bad, I misunderstood the concept; all my internal drives have a 512b sector size and 4kb clusters, both SSDs and HDDs.





I did find an article on how to do it to an Intel SSD, however.

*Instructions for changing the physical sector size*









Gain Optimal Performance by Changes to SSD Physical Sector Size (www.intel.ca): "How to change to 512 or 4096 Byte Sector Size on Intel® Solid State Drive Data Center S3500 and S3700 to gain optimal performance."


----------



## Solid State Brain (May 6, 2022)

oobymach said:


> [...] I did find an article for how to do it to an Intel ssd however.
> 
> *Instructions for changing the physical sector size*
> 
> ...



Interesting that the Intel link above mentions that with a 4096-byte sector size performance would be "optimal", although things may differ on datacenter-grade SSDs.

It looks like they may only be changing the physical sector size, though. If they changed the *logical* sector size as well, data loss would be inevitable, since the block range addressed by the underlying file system (at the OS level) would completely change.

For example, with 512/512 or 512/4096-byte sector sizes (logical/physical) the OS will see about 1 billion LBAs on a 500 GB SSD.
If one were to change this to 4096/4096 bytes (i.e. a 4k sector size), the maximum LBA would become 1/8 of the original value, i.e. about 125 million (since 4096/512 = 8).
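The arithmetic can be checked in a couple of lines of shell (500,107,862,016 bytes is a typical nominal capacity for a "500 GB" drive; the exact figure varies by model):

```shell
bytes=500107862016           # nominal capacity of a typical "500 GB" SSD
echo $((bytes / 512))        # 976773168 LBAs with 512-byte sectors (~1 billion)
echo $((bytes / 4096))       # 122096646 LBAs with 4096-byte sectors (1/8 as many)
```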


----------



## Mussels (May 7, 2022)

Yesterday I ran into what I think the confusion on this comes from, fixing up an old PC.
The owner had been re-using his 80GB IDE hard drive for a few operating systems (XP to 7, at least), and then he bought a 1TB drive and cloned the OS over.


The PC was brought to me because, despite being old, it shouldn't have run as badly as it did: the HDD was always grinding, Task Manager showed 100% HDD usage at sub-5MB/s, and simple tasks like opening a browser tab could be instant or take a minute.

In the end it turned out to be simple: his source drive was native 512, his destination was 4K/512e,
and the original drive was not 4K aligned, since the partition was made in XP before 4K was even a thing, let alone AF (Advanced Format) 512e drives.


After spending assloads of time finding free software that could do the job without erroring out (the original drive had bad sectors and most tools detected 'corruption' and had fits), I got the job done, and the drive went from taking 45 minutes to complete a HDD benchmark to 200MB/s peak, 150MB/s sustained and 90MB/s minimum.

I get the feeling that any time this actually gives people performance boosts, it's because their partitions were not 4K aligned initially
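That suspicion is easy to check: on a 512-byte-LBA drive, a partition is 4K-aligned only if its starting sector is a multiple of 8 (8 × 512 B = 4096 B). A sketch (the sysfs path in the comment is an example of where to find the start sector on Linux):

```shell
# XP-era partitions classically start at sector 63, which is NOT 4K-aligned;
# modern partitioning tools start at sector 2048.
is_4k_aligned() {
    start="$1"      # partition start in 512-byte sectors,
                    # e.g. from /sys/block/sda/sda1/start
    [ $((start % 8)) -eq 0 ] && echo aligned || echo misaligned
}
```

For example, `is_4k_aligned 63` prints "misaligned", while `is_4k_aligned 2048` prints "aligned".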


----------



## Solid State Brain (May 7, 2022)

A 4kB sector size eliminates partition alignment issues, but on modern systems where NVMe SSDs are used this shouldn't normally be a problem.

Besides, the point of this discussion is also that wherever a selection is possible, manufacturers suggest that performance will be best with a 4k sector size. NVMe specifications allow manufacturers to assign a descriptive relative performance level to the various supported Formatted LBA Sizes (FLBAS). On an enterprise drive supporting not only 512 and 4096 sector sizes but also metadata (for error correction, etc.), one might see an even wider performance level range:


```
$ nvme id-ns /dev/nvme1n1 | grep LBA
LBA Format  0 : Metadata Size: 0   bytes - Data Size: 512 bytes - Relative Performance: 0x1 Better
LBA Format  1 : Metadata Size: 8   bytes - Data Size: 512 bytes - Relative Performance: 0x3 Degraded
LBA Format  2 : Metadata Size: 0   bytes - Data Size: 4096 bytes - Relative Performance: 0 Best  (in use)
LBA Format  3 : Metadata Size: 8   bytes - Data Size: 4096 bytes - Relative Performance: 0x2 Good
```

Note the qualifiers at the end of the lines: "Degraded", "Good", "Better", "Best".
These are the same ones listed in the NVMe specification:







https://nvmexpress.org/wp-content/uploads/NVM-Express-1_4-2019.06.10-Ratified.pdf


----------



## oobymach (May 10, 2022)

I did a before-and-after test with 4k alignment on my 1TB Silicon Power SSD (another feature of EaseUS Partition Master: you can do it without destroying data on the drive).


----------



## Espionage724 (May 10, 2022)

I have a Hynix P31 and by default it's in 512-emulation mode. I didn't see it advertised anywhere for this drive, but it exposes a 4K-sector LBA format.

Switching to it on Windows exposed some kind of incompatibility with Steam that I ran into: https://steamcommunity.com/discussi...76804012/?tscn=1641033020#c320374734291541958

If I plan on a Windows install, I keep it in 512e mode, but if I do a Linux install, I do 4K since it seemingly works fine.


----------



## Mussels (May 12, 2022)

oobymach said:


> I did a before and after test with 4k alignment on my 1tb Silicon Power ssd (another feature of easeus partition master, you can do it without destroying data on the ssd drive).
> 
> View attachment 246907


Yep, that's just from making sure the partitions are lined up in a way that works properly on 512e and 4kn drives, and it's what I think is actually happening: people doing this are aligning their drives and seeing changes, not benefiting from changing between 512e and 4kn (because with a 4k cluster size, they SHOULD perform identically or within margin of error)


----------



## R-T-B (May 21, 2022)

Mussels said:


> Yep, thats just from making sure the partitions are lined up in a way that works properly on 512e and 4kn drives, and what i think is actually happening - people doing this are aligning their drives and seeing changes - not from changing between 512e and 4kn (because with a 4k cluster size, they SHOULD perform identical or within margin of error)


We are aware of disk alignment. I really doubt people here seeing performance issues are misaligning their partitions with WinXP-era tools.


----------



## Mussels (May 23, 2022)

R-T-B said:


> We are aware of disk alignment.  I really doubt people here seeing performance issues are misaligning their partitions with winxp era tools.


I'd think that too, if I hadn't seen it 3 times this year from people cloning drives/OSes over and over and over; and it seems more likely with server-grade OSes in the Linux ecosystem, which is where the only examples of this show up online.

I still don't think it's a coincidence that you clean format as part of this, as that clean format could be the entire reason people are seeing a change.


----------



## R-T-B (May 23, 2022)

Mussels said:


> I'd think that too, if i hadnt seen it 3 times this year from people cloning drives/OS's over and over and over - and it seems more likely with server grade OS's in the linux ecosystem, which is the only examples of this showing up online
> 
> I still dont think its a coincidence you clean format as part of this, as that clean format could be the entire reason people are seeing a change


I mean, in 4k mode you literally can't misalign, so I suppose that could be true; but the vendors themselves are providing the firmware responses that indicate 4k mode is "higher performance", so there still could be something there.

I'm really curious what out-of-date Linux tool will create a misaligned drive though...


----------



## L'Eliminateur (Nov 16, 2022)

my Intel DC P4610 drives report 512 and 4096 data sizes, with the 4K one offering "BEST" performance (0x00), so I changed them to that and then secure erased just in case


----------



## SchumannFrequency (Nov 25, 2022)

The most effective method to increase the IOPS of your SSD is not by changing the block size, but by changing the operating system.

Here are some benchmarks on the potential performance gains you can make:


https://openbenchmarking.org/embed.php?i=1812249-SP-WINSERVER76&sha=4347141&p=2
https://openbenchmarking.org/embed.php?i=1812249-SP-WINSERVER76&sha=0ac3ab0&p=2
		









FreeBSD vs. Linux – Virtualization Showdown with bhyve and KVM (klarasystems.com): "Not too long ago, we walked you through setting up bhyve on FreeBSD 13.1. Today, we're going to take a look specifically at how bhyve stacks up against the Linux Kernel Virtual Machine—but before we can do that, we need to talk about the best performing configurations under bhyve itself."


----------



## L'Eliminateur (Nov 25, 2022)

well, that's not something you can usually do.

also, I'm surprised at the poor results of Clear Linux, which is the most optimized Linux out there, and conversely the big result on FreeBSD, which is a very conservative OS, and a BSD at that


----------



## SchumannFrequency (Nov 25, 2022)

L'Eliminateur said:


> well, that's not something you can usually do.
> 
> also, I'm surprised at the poor results of Clear Linux, which is the most optimized Linux out there, and conversely the big result on FreeBSD, which is a very conservative OS, and a BSD at that


CPU performance and IOPS are two different things. Clear Linux is best known for its CPU performance.

BSD has better performance than Linux (Ubuntu) in many domains:

- PostgreSQL benchmark on FreeBSD, CentOS, Ubuntu, Debian and openSUSE (redbyte.eu)
- FreeBSD/Ubuntu Dual Boot Homelab in The Bedroom by the bed testbed (adventurist.me)
- Linux Distributions vs. BSDs With netperf & iperf3 Network Performance (www.phoronix.com)
- Scalable Event Multiplexing: epoll vs. kqueue
- FreeBSD: A Faster Platform For Linux Gaming Than Linux? (www.phoronix.com)
- Linux – NetBSD web server performance (Valuable Tech Notes)
- Benchmark comparison BSD and Linux (https://www.reddit.com/r/freebsd/comments/rkayfs)
- Performance of Linux vs. FreeBSD NFS clients (www.truenas.com)
- FreeBSD and Linux Nginx benchmark (docs.pritunl.com)
				




I find it very easy to switch operating systems. Especially if you start with something like Nobara Project, Mint, Artix Linux or MX Linux.

If you use the above systems for several years, the step to Void Linux, NetBSD, FreeBSD, NixOS, Clear Linux as a daily driver becomes relatively simple, you don't have to be gifted or anything like that.

I know many people who started using a 'more advanced Unix-like system' as a daily driver straight away and were successful at this. Obviously these are the people who have a certain level of talent and interest in this sort of thing, but they are quite numerous.


----------



## Ferrum Master (Nov 25, 2022)

SchumannFrequency said:


> 'more advanced Unix-like system' as a daily driver straight away and were successful at this. Obviously these are the people who have a certain level of talent and interest in this sort of thing, but they are quite numerous.



It ain't that bad, there are enough Linux users. I just don't experiment much and don't switch to anything other than Fedora, plus plain Debian for core tasks out of habit from using it on the RPi. We are actually pretty conservative, and these experiments are like sports news about drag racing: unless it hits second stable, no one will touch it.

But... the point is... I was waiting for you to say "btw I use Arch"


----------



## SchumannFrequency (Nov 25, 2022)

Ferrum Master said:


> It ain't that bad, there are enough Linux users. I just don't experiment much and don't switch to anything other than Fedora, plus plain Debian for core tasks out of habit from using it on the RPi. We are actually pretty conservative, and these experiments are like sports news about drag racing: unless it hits second stable, no one will touch it.
> 
> But... the point is... I was waiting for you to say "btw I use Arch"


I've used Arch Linux for quite some time, and I've also maintained and supported it on my dad's laptop for five years. I've now installed Void Linux on that laptop because it's getting old, and with Void it boots 17 seconds faster than with Arch, which is a noticeable difference.

On my desktop, I've been using FreeBSD for about four years, and I've never compiled anything; I've always just used pkg to install stuff. I have Alpine Linux and MX Linux (Fluxbox) installed in a VM, but I hardly ever use them because FreeBSD has good support for Linux binaries and even for RPM packages. On an old netbook I have FreeBSD and use it as a music player. On a laptop I have MX Linux with XFCE.

I'm actually quite happy with this general setup. Related to this topic, I can say that ZFS on FreeBSD 12.3 delivers quite a lot of IOPS. App launches are always very fast: 0 A.D. starts in a second, GIMP in 4 seconds, and Firefox and Chromium in about 2 seconds (the first time I start them). For my old hardware that is very good.


----------



## Ferrum Master (Nov 26, 2022)

SchumannFrequency said:


> For my old hardware that is very good.



Well, that's the main point. The HW is often the decisive factor when picking distro branches. If you look up the tree for the same driver, it really behaves differently on each distro. For example, the RTL8125 was a toothache on BSD while it worked fine in Fedora/Debian; it's kinda fixed now, but pfSense people are still sore about it. Not that I can change what kind of NIC IC is soldered onto the boards lately; it is what it is. I had a much better experience on Fedora than anything else, with Slackware maybe being the second best and any Ubuntu flavor being the brown worst, stability-wise, for me.

Boot times are a really irrelevant factor for me. It just depends on how your HW is configured, how much of it you have, and how straight the hands of the BIOS coder were. I am really hoping BTRFS will pick up the pace, but the recent kernel speedups for EXT4 are also very good and welcome as healthy competition goes on.

Filesystem-wise, at 4K Linux should lose to Windows due to less aggressive CPU frequency drivers, or maybe even their absence in general; it needs tweaks. The sad story is that the boost-algorithm drivers come very, very late, and 4K benchies are very sensitive to single-thread Turbo. On AMD the Turbo sticks to the wrong CCX as usual, and on Intel it doesn't kick in at all; it is too lazy (conservative).

There could be one more cheat, though. On Intel you can properly disable all mitigations at GRUB, and those really influence SSD 4K benchmarks; I've seen numbers degrading around 30% depending on the arch. On Windows some are baked into the kernel, and you cannot turn them off anymore on recent Win 10/11 builds. Maybe we are digging the wrong way for NVMe performance at all. I am also not sure whether the benchmarks we see are made with mitigations on or off. Knowing those are Linux peeps, my vote goes on OFF.

Many thanks to the crap BIOS makers. I'm currently at war with Gigabyte, which refuses to fix the BIOS for my Z590I Vision D that constantly produces an MCE error on each boot no matter what distro or kernel I use. A circus: on Windows you don't get the error, so no wonder no one complains. Maybe I will post the full outcome of the support ticket to shame them in public; it has been in QA for two months already, and yesterday I was asked yet again how to reproduce the bug, so it looped (I said: just install any Linux distro and see the error pop up, or do journalctl -b). By contrast, my tickets for an Asus motherboard were answered and the BIOS was fixed without much complaint; better still, they fixed all my complaints within a few weeks.


----------



## R-T-B (Nov 26, 2022)

This OS talk is somewhat off-topic though.  This thread is about the hardware-based benefits of 4k LBA sectors vs 512 LBA sectors.


----------



## AsRock (Nov 26, 2022)

Tried it and it made no difference whatsoever


----------



## R-T-B (Nov 26, 2022)

AsRock said:


> Tried it and it made no difference whatsoever


I have doubts about the performance benefits myself, but maybe some drives have some and some don't.  At any rate, the thread's subject is still the LBA size.


----------



## Solid State Brain (Nov 26, 2022)

R-T-B said:


> I have doubts of the performance benefits myself, but maybe some drives have some and some don't.



Or maybe there are other potential benefits like lower temperatures as another user suggested earlier.

I started this thread because SSD manufacturers generally suggest that the larger LBA size is the "best" option for performance, but there are no benchmarks or reports showing this clearly, in particular on Windows.

I still don't have other SSDs with configurable LBA size that I can play with, though.


----------



## SchumannFrequency (Nov 26, 2022)

This topic has also been discussed in the FreeBSD forums related to ZFS: https://forums.freebsd.org/threads/best-ashift-for-samsung-mzvlb512hajq-pm981.75333/#post-529225

_OTOH using 4k filesystem sectors on a 512n disk would only mean that the drive firmware has to translate that one request into 8 distinct requests to the storage. You needed to access all of those 512-byte sectors anyway, so the performance impact is minuscule if there even is any.

So it is always the safer option to go for 4k sectors on the filesystem side, especially if your pools are usually "here to stay". I've used ashift=13 during pool creation for several years now, and even on pools that have migrated from spinning rust to SSD and now to NVMe I've never seen any performance issues.

Except that the disk we're talking about here is not spinning rust, but flash cells. It has a real block size, which is likely to be much larger than either 512 or 4K. There are interestingly complex performance implications of adjusting the block size that the file system uses (ashift for ZFS), and bigger is not always better. The biggest problem is that a larger ashift value may waste space. In my personal opinion, that is much less relevant than people typically think (since most space in a file system is typically used by a small fraction of large files), but depending on the usage and workload, the correct answer can matter significantly.

Is there a measurable performance improvement using ashift=13 vs ashift=12? Personally I don't think so.

I am sure NVMe drives have some flash translation layer. Do you think they use pages too?
Deeper question: why do we need software TRIM when the drives handle garbage collection at the controller level?_
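For context on the ashift values in that quote: ZFS allocates in units of 2^ashift bytes, so ashift=9 is 512B, 12 is 4K, and 13 is 8K, and the space-waste concern is just that every allocation gets rounded up to that unit. A rough sketch of the padding overhead, using a hypothetical mix of file sizes chosen to match the quote's point that large files dominate:

```python
# ZFS allocates in 2**ashift byte units; smaller allocations get padded up.
def allocated(size: int, ashift: int) -> int:
    unit = 1 << ashift
    return -(-size // unit) * unit  # ceiling division, then scale back up

# Hypothetical workload: many small files plus a few large ones.
files = [300, 1_500, 6_000] * 1000 + [50_000_000] * 10
total = sum(files)

for ashift in (9, 12, 13):
    used = sum(allocated(s, ashift) for s in files)
    print(f"ashift={ashift}: padding overhead {used / total - 1:.1%}")
```

With large files dominating, the overhead stays in the low single-digit percent range even at ashift=13, which is consistent with the quoted opinion that the waste matters less than people think.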

Then there is also someone who has done a benchmark:

*4096*            # sectorsize
4 kbytes:     17.6 usec/IO =    222.2 Mbytes/s
8 kbytes:     19.3 usec/IO =    405.6 Mbytes/s
16 kbytes:     23.6 usec/IO =    661.4 Mbytes/s
32 kbytes:     29.7 usec/IO =   1050.8 Mbytes/s
64 kbytes:     42.4 usec/IO =   1473.2 Mbytes/s
128 kbytes:     68.6 usec/IO =   1823.1 Mbytes/s
256 kbytes:    118.8 usec/IO =   2104.9 Mbytes/s
512 kbytes:    235.2 usec/IO =   2125.7 Mbytes/s
1024 kbytes:    468.8 usec/IO =   2133.2 Mbytes/s
2048 kbytes:    938.9 usec/IO =   2130.2 Mbytes/s
4096 kbytes:   1870.4 usec/IO =   2138.6 Mbytes/s
8192 kbytes:   3758.2 usec/IO =   2128.7 Mbytes/s

*512*             # sectorsize
4 kbytes:     17.5 usec/IO =    223.3 Mbytes/s
8 kbytes:     19.1 usec/IO =    409.5 Mbytes/s
16 kbytes:     23.8 usec/IO =    657.9 Mbytes/s
32 kbytes:     29.7 usec/IO =   1052.2 Mbytes/s
64 kbytes:     42.5 usec/IO =   1471.1 Mbytes/s
128 kbytes:     68.1 usec/IO =   1834.2 Mbytes/s
256 kbytes:    130.2 usec/IO =   1920.9 Mbytes/s
512 kbytes:    252.6 usec/IO =   1979.4 Mbytes/s
1024 kbytes:    497.4 usec/IO =   2010.6 Mbytes/s
2048 kbytes:    958.2 usec/IO =   2087.2 Mbytes/s
4096 kbytes:   1861.7 usec/IO =   2148.6 Mbytes/s
8192 kbytes:   3718.1 usec/IO =   2151.6 Mbytes/s

No increase in speed at any level.
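As a sanity check on those tables: each row is just the transfer size divided by the per-I/O latency, with "Mbytes" meaning MiB. Recomputing a couple of rows from the quoted latencies reproduces the quoted throughputs to within rounding:

```python
# Reconstruct MiB/s from the quoted per-I/O latencies (size / time).
def mib_per_s(size_kib: float, usec_per_io: float) -> float:
    bytes_per_s = (size_kib * 1024) / (usec_per_io * 1e-6)
    return bytes_per_s / 2**20

print(round(mib_per_s(4, 17.6), 1))    # ~222, matches the 4-kbyte row
print(round(mib_per_s(128, 68.6), 1))  # ~1822, matches the 128-kbyte row
```

Both tables plateau at the same ceiling of roughly 2.1 GiB/s, which is why the poster concludes the sector size made no difference.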













He achieves 1834.2 Mbytes/s at 128 kbytes, but he achieves this for _random_ writes, while according to Samsung this should be about the drive's maximum _sequential_ write speed.

This is striking, because at 128 kbytes there are normally huge differences between sequential and random writes:








I/Os Are Not Created Equal – Random I/O Versus Sequential I/O (condusiv.com):
"Sequential I/Os always outperform Random I/Os. Accessing data randomly is slower & less efficient than accessing it sequentially. True for HDDs and SSDs."

https://i0.wp.com/codecapsule.com/wp-content/uploads/2014/02/ssd-presentation-05.jpg?w=720&ssl=1
		


My guess is that Windows 11 with ZFS would only get around 600 Mbytes/s with this drive for (128 kbytes) random writes, or most likely even lower.

Linux will get around 1000 Mbytes/s for (128 kbytes) random writes on ZFS, or lower.

ZFS is mainly designed for features, and it's the most robust form of data storage that exists. It is not designed for speed.
These are remarkable results.
We should not forget that NTFS is far behind ZFS in terms of technological sophistication.

We can clearly see that the operating system is by far the single biggest determinant of read/write and IOPS performance.


----------



## Calenhad (Nov 26, 2022)

One theoretical benefit of having a drive with a reported 4K "physical" sector size is that a file system with a 4K allocation unit size will perform one write or read operation, instead of eight, per 4K unit. But due to how modern SSD controllers work, this is most likely not a performance issue anyway. The worst-case scenario is higher temps for some controllers. And depending on the NAND used, the actual read and (especially) write operations cannot directly address 4K-sized groups of cells anyway.


----------



## Mussels (Nov 28, 2022)

R-T-B said:


> I have doubts about the performance benefits myself, but maybe some drives have some and some don't. At any rate, the thread's subject is still the LBA size.


I haven't been able to find any reports of this actually doing anything, other than on OSes that were originally 4K-unaware and potentially had misaligned partitions in the first place.


Calenhad said:


> is that a file system with 4K allocation unit size will perform one write or read operation, instead of 8, per 4K unit.


They buffer these things; the drives are 4K natively with 512 emulated, not the other way around. The only time they'd do those extra writes is on a really old OS, like formatting the drive in DOS and running XP on it.

Everything is buffered in the DRAM or HMB, so writes this small should never happen one at a time, regardless.


----------



## chrcoluk (Nov 28, 2022)

It's just a firmware change, in that it makes the drive report itself as native 4K instead of 512-emulated.

If you're already using a filesystem with 4K blocks, then changing this firmware setting won't affect performance.

Its main benefit is for filesystems like ZFS, which is prone to automatically configuring a 512-byte ashift on 512e drives.

Personally I don't know why drive vendors are still producing 512e drives.  Is there really that much demand for it?

A 4K-formatted FS should reduce erase cycles vs 512-byte, which is probably the bigger reason to use 4K, not performance.
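The ashift pitfall described above can be sketched like this: ZFS derives ashift from the sector sizes the drive reports, so a 512e drive that only advertises a 512-byte size can end up with ashift=9 unless the 4K physical size is also exposed or ashift is forced at pool creation (`zpool create -o ashift=12 ...`). A simplified model of that selection, not the actual OpenZFS code:

```python
import math

# Simplified model: derive ashift from the larger of the reported
# logical and physical sector sizes (OpenZFS consults both).
def auto_ashift(logical: int, physical: int) -> int:
    return int(math.log2(max(logical, physical)))

print(auto_ashift(512, 512))    # 9  -> 512e drive hiding its 4K pages: bad
print(auto_ashift(512, 4096))   # 12 -> 512e drive reporting 4K physical: fine
print(auto_ashift(4096, 4096))  # 12 -> 4Kn drive: can't get it wrong
```

Reformatting the drive to 4K LBAs makes the first case impossible, which is why the setting matters most to ZFS users even when raw throughput doesn't change.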


----------



## Veseleil (Nov 28, 2022)

chrcoluk said:


> Personally I dont know why drive vendors are still producing 512e drives. Is there really that much demand for it?


Many companies still use Win XP, along with hospitals, civil services like the police, and governments worldwide.


----------



## R-T-B (Nov 28, 2022)

Veseleil said:


> Many companies still use Win XP, along with hospitals, civil services like police etc., and governments worldwide.


This is probably a factor too.  It's been surprisingly hard to kill XP, especially in small integrated machines like POS terminals.


----------



## Assimilator (Nov 28, 2022)

Smells like snake oil to me.


----------



## Athlonite (Nov 29, 2022)

Veseleil said:


> Many companies still use Win XP, along with hospitals, civil services like police etc., and governments worldwide.


And most likely on old hardware that at most may have had a SATA SSD swapped in for the original HDD, but definitely not an NVMe SSD


----------

