
Optane 1600X 118GB - Lots of CDM benching and some thoughts

Unfortunately, Intel NAND and Optane drives become read-only once they reach 100-105% of their rated TBW.

So trying to speed-run the 1300 TBW rating isn't recommended, even if the media itself can last far beyond that rating.
o.O
Any way to 'spoof' the recorded writes? (Like those 'generic Chinese' NVMe drives do?)
 
I found out today that I'm not even using the "optimal" configuration of the drive, because I have the sector size set to 512 B, which is the default, rather than the reported "best" setting of 4096 B/4 KiB.

Unfortunately, changing that setting would be rather involved, as doing so deletes all data on the drive. I probably won't go to the trouble until I have a new system.

I'll console myself with the thought that it "probably" doesn't make much of a difference in performance in benchmarks, let alone in real-world use.
 
Best setting for Optane specifically, or SSDs generally? I seem to remember regular NVMe drives can't go lower than 4 KiB, whereas Optane can. I'd be curious to see the real-world difference, if any, but can't really make an educated guess about it. I'm thinking the default sector size is chosen for compatibility but might not be optimal in Windows.

I do remember back in the HDD/Win 7 era that it was optimal to force 4 KiB alignment when partitioning; the theory is long lost to me now, except that it was supposed to give optimal performance.
 
I can already tell you what's going to happen: random performance (the metric you'd care about) will actually get worse. Don't do RAID 0 with SSDs. Remember: RAID 0 = 0 brain.
What about soft RAID, like from Windows Disk Management? My understanding was that, at the very least, there would be some OS/CPU overhead, but the OS would access the drives at their native speed, probably better than some cheap consumer firmware RAID.
 
Your random access speed/latency will be orders of magnitude slower, even with VROC-backed mdraid on Linux (semi-hardware).

Enterprise-adjacent storage has been 4K-capable for about a decade. Capable Optane and NAND drives can easily be converted with the "nvme" command on Linux. Notably, the Optane 905P that is reasonably common has no 4K support.
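For drives that do support it, the conversion is a low-level reformat of the NVMe namespace to a different LBA format. A sketch with nvme-cli on Linux, assuming a placeholder device node /dev/nvme0n1 and that the 4096-byte format sits at index 1 (check your drive's own listing first); note that `nvme format` destroys all data on the namespace:

```shell
# List the LBA formats the namespace supports; the entry showing
# "lbads:12" is the 4096-byte one (2^12 = 4096). Note its index.
nvme id-ns /dev/nvme0n1 | grep lbaf

# Reformat the namespace to that LBA format index (assumed to be 1 here).
# WARNING: this erases everything on the drive.
nvme format /dev/nvme0n1 --lbaf=1
```

After the format, re-partition and verify the new sector size with `nvme id-ns` or `lsblk -o NAME,LOG-SEC,PHY-SEC`.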
 
What would be the best (free) windows tool for measuring the speed/latency?
 
How much are you guys paying for these drives? I noticed an upturn in people saying they're buying them.

No stock on Amazon, some used on eBay. I checked where I got my DC P4600 from, and it's silly money.

I *almost* pulled the trigger on a 1.9TB 905P for <$400. Realized that I have 'enough' to 'play with' as-is, and if I'm lucky can score something better on the used market, later.
Here, 240 GB Optanes cost more than that. Guessing it's one of those things where it's cheaper to import and pay tariffs on.
 
For both Optane and NAND drives, 4096B is supposed to be preferable, but 512B is the default. The P1600X, at least, supports both.

I paid $65 for the 118GB back when it was available on Amazon. Only other place I saw them was Newegg.

The enterprise-level 905P/P4800X/P5800X and so on are much more expensive (and usually not available as M.2). The 3.2 TB version is even available on Mouser if you have $6500 to spare.
 
Unfortunately, Intel NAND and Optane drives become read-only once they reach 100-105% of their rated TBW.

So trying to speed-run the 1300 TBW rating isn't recommended, even if the media itself can last far beyond that rating.
I thought that was only on enterprise Optane, and that even there, later models just speed-throttled when you hit this threshold?

The entreprise-level 905P/P4800X/P5800X
Small correction. 905p is considered a consumer part.

The 3.2 TB version is even available on Mouser if you have $6500 to spare
Actually, no. It's non-stocked, but they can get one for you (in 6 weeks).

Only other place I saw them was Newegg.
There was a sale going on there with a glut of 905Ps with oddly recent manufacture dates on them, yeah. I got one.
 
Look at the research paper: Towards an Unwritten Contract of Intel Optane SSD

"...to exploit the bandwidth of Optane SSD, clients should never issue requests less than 4KB (Avoid Tiny Accesses rule).
Sixth, to get the best latency, requests issued to Optane SSD should align to eight sectors (Issue 4KB Aligned Requests rule).
Finally, when serving sustained workloads, there is no cost of garbage collection in Optane SSD (Forget Garbage Collection rule)..."

So use a properly aligned exFAT partition with a 4096-byte file allocation unit size (NOT the standard 32K).
There's no TRIM on exFAT and way less overhead than NTFS etc. = 10+ MB/s more Q1T1 R4K on an 800p.
You know Q1T1 R4K: the reason you need RAM... :)
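The paper's "Issue 4KB Aligned Requests" rule can be illustrated in Python: pad any request out to 4 KiB boundaries before issuing it, so the device never sees a tiny or misaligned access. This is a sketch against an ordinary file standing in for the raw device; the helper name `aligned_pread` is made up for illustration.

```python
import os
import tempfile

ALIGN = 4096  # eight 512-byte sectors, per the paper's rule

def aligned_pread(fd, offset, length):
    """Issue one 4 KiB-aligned read covering [offset, offset+length),
    then slice out just the bytes the caller asked for."""
    start = (offset // ALIGN) * ALIGN                        # round down
    end = ((offset + length + ALIGN - 1) // ALIGN) * ALIGN   # round up
    data = os.pread(fd, end - start, start)
    return data[offset - start : offset - start + length]

# Demo on a temp file standing in for the block device.
with tempfile.TemporaryFile() as f:
    f.write(bytes(range(256)) * 64)  # 16 KiB of data
    f.flush()
    fd = f.fileno()
    # A "tiny" 10-byte request at an odd offset becomes one aligned 4 KiB read.
    assert aligned_pread(fd, 513, 10) == os.pread(fd, 10, 513)
```

On a real device you would additionally open with O_DIRECT and use an aligned buffer, which this sketch omits.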

If you just have to use NTF(ukt up)S, look up:
command-line switches for NTFS format
and disable features you don't need.
Disable TRIM.

For any FS besides NTFS, MS only pretends to enable write caching.
i.e.: "Write caching is on", but if the FS isn't NTFS, it's a LIE!

Look up the registry mod:
WriteCacheEnableOverride
to make honest men of them.
Like here, under USB-WriteCache V0.2:

(While you're there: that and the next app, MaximumTransferLength, will make it so you won't recognise your old USB flash drives; they're so fast!
Not true for newer flash drives that use UASP.)

This also gets around the "you can't write-cache this type of drive" BS.
I don't know that this makes the system snappier, but it's worth testing: infrequent, large (= faster) writes give more time for uninterrupted (random) reads.

While you're in the registry, also experiment with:
CacheIsPowerProtected
to make it so MS does less of its de facto, er... pretending.

"But I'll lose data..."
Yes... maybe... but you have a backup policy in place for these once-every-six-months issues anyway, right..?
 
"...to exploit the bandwidth of Optane SSD, clients should never issue requests less than 4KB (Avoid Tiny Accesses rule).
Sixth, to get the best latency, requests issued to Optane SSD should align to eight sectors (Issue 4KB Aligned Requests rule).
Finally, when serving sustained workloads, there is no cost of garbage collection in Optane SSD (Forget Garbage Collection rule)..."
I feel like this might partially explain why AMD-RAID + Optane stumbles all over itself, even in JBOD;
there's no easy way to configure these options at the firmware level (that I'm aware of).
 
66% of Windows I/O is random 4K at a queue depth of 1 (R4K Q1T1).
Less than 1% of Windows I/O is the large sequential numbers
advertisers like to wave around like burning flags!

Newer games are doing more 16KB and 32KB reads:
(But also NB; lots of 4KB writes)

More proof:
People spend a lot of money and time upgrading to more and faster RAM and OCing and cooling it and tweaking the latencies.
That's RANDOM Access Memory; RAM
NOT
LSAM or LARGE SEQUENTIAL Access Memory.
It works because computers thrive on Q1T1 R4K..!

In short:
You should multiply your large sequential MB/s numbers by 0.01
and your random I/O MB/s numbers by 0.66.
Then decide if RAID is worth your trouble.

IMHO you do not want RAID (unless you have a specific large-file use case), as it slows down Q1T1 R4K.
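That rule of thumb can be written out in a few lines. The 0.66/0.01 weights come from the post above and are rough assumptions, and the drive numbers below are made-up illustrations, not measurements:

```python
def effective_mbps(seq_mbps, rnd4k_mbps, seq_weight=0.01, rnd_weight=0.66):
    """Weight advertised throughput by a rough real-world I/O mix:
    ~66% random 4K Q1T1, ~1% large sequential (weights are assumptions)."""
    return seq_mbps * seq_weight + rnd4k_mbps * rnd_weight

# A NAND drive advertising 7000 MB/s sequential but ~60 MB/s Q1T1 R4K,
# vs. an Optane doing 2500 MB/s sequential but ~300 MB/s Q1T1 R4K:
nand = effective_mbps(7000, 60)     # ~109.6 "effective" MB/s
optane = effective_mbps(2500, 300)  # ~223.0 "effective" MB/s
print(nand, optane)
```

By this weighting, the drive with the smaller headline number comes out ahead, which is the post's point: sequential figures barely move the needle.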
You want your Optane attached directly to the CPU, not through the controller chip, for lower latency:

Bifurcating your PCIe in the BIOS and only giving 8 lanes to the GPU drops graphics performance negligibly:
(It's a shame one can't do more 'fine-grained' bifurcation. 14-2 and 12-4 would be great!)

For AMD:
The MOST IMPORTANT thing to do is install the Win-Raid mod-signed Intel drivers for your Optane!
NB that only the .inf file (IIRC) is changed, NOT the driver itself,
and it can be checked with a text editor.
On an 800p I saw a ~100 MB/s increase in Q1T1 R4K, from ~200 MB/s to 300 MB/s!!!

You can find the driver I use, and my benchmarks here:
NB the simple process for signing the driver on that excellent forum.
(An aligned exFAT partition with a 4096-byte file allocation unit size got me to over 300 MB/s of Q1T1 R4K on my 800p.)

If I had 2 Optanes:
I would try to divide where the OS, software and games are installed to try and get as much simultaneous I/O going as I could.
ie: Load Windows on one and games on the other so both Windows I/O and Game I/O can happen at the same time.

Use 2 (or more) Pagefiles (only 1 per physical drive!), each with its own partition:
Windows will automatically use the least-in-use drive at the time for paging.
Windows will also use both simultaneously, giving you a kind of RAID 0 for paging.

File fragmentation can and does negatively affect any non-volatile storage:
It's said that SSDs have X latency per file.
That's incorrect; they have that X latency per (file) fragment, as the address of each fragment needs to be looked up and passed down the I/O stack, etc.
e.g.: 1 millisecond per fragment × 1000 fragments = 1 second...

So keeping the Pagefile (and any file) contiguous is a good idea.
Hence the dedicated, aligned, low-overhead-FS partition with a 4K file allocation unit size (same as the DRAM block size) for the Pagefiles.
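The per-fragment overhead argument can be sketched: given a file's extent map, a single logical read has to be split into one I/O request per extent it crosses, no matter how fast the medium is. The extent tuples below are invented for illustration, not taken from a real filesystem:

```python
def split_read(extents, offset, length):
    """extents: list of (logical_start, physical_start, length) tuples,
    as a filesystem would record them. Returns one (physical_offset, length)
    request per extent the logical read [offset, offset+length) touches."""
    requests = []
    end = offset + length
    for lstart, pstart, elen in extents:
        lo, hi = max(offset, lstart), min(end, lstart + elen)
        if lo < hi:  # this extent overlaps the requested range
            requests.append((pstart + (lo - lstart), hi - lo))
    return requests

contiguous = [(0, 40960, 1_048_576)]  # one 1 MiB extent
# 256 scattered 4 KiB extents covering the same 1 MiB of logical space:
fragmented = [(i * 4096, 4096 * (1000 - i), 4096) for i in range(256)]

print(len(split_read(contiguous, 0, 1_048_576)))  # 1 request
print(len(split_read(fragmented, 0, 1_048_576)))  # 256 requests
```

Each of those 256 requests is built, queued, and completed separately by the OS, which is the software overhead being described; the sketch says nothing about what the controller does internally.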

More detail in my posts here:
 
The amount of misunderstanding in the previous comment is staggering.

First, the claim that Windows uses two page files in a “RAID 0,” while simultaneously only using the least busy drive, contradicts itself. Windows paging operates on a first come, first served basis and will place pages on whichever drive is least active. It will in no case intelligently stripe pages across two or more page files.

Paging to disk, even bit-addressable storage like Optane, is so detrimental to performance that it should be avoided at all costs.

Second, file fragmentation as you describe it does not exist in flash storage. Having a file in a contiguous physical location was key to performance with hard disks because it minimized the time spent seeking between locations by the read head. Less movement across the disk platter meant more time was spent transferring data. None of this exists in flash or 3D Xpoint.

The Intel SLL3D or SLM58 controller IC in your Optane drive has 7 channels, with a certain number of Optane memory packages connected to each channel. High performance is obtained by interleaving between these seven channels so that commands do not pile up. If files were not “fragmented,” as you say, and all IO commands were being sent to a single Optane package on a single channel, performance would suffer catastrophically.

Luckily the controller is smart enough to never do this! All writes are spread across channels and memory packages optimally. Write commands also take into account wear leveling, and so the “fragmentedness” of a file is completely outside the user’s control. No file is contiguous.

Optane does not store data in addressable “fragments,” whatever that means to you, but in bits. Optane is bit addressable, which distinguishes it from common NAND flash SSDs. Latency is based on access to these bits, and neither the NVME driver, nor the file system, nor the controller has any concept of “fragments.”
 
The amount of misunderstanding in the previous comment is staggering.

The amount of misunderstanding in THIS comment is staggering.
You don't seem to have a comprehension problem, so I must assume you either did NOT read my comment properly, or you just love a good argument!? :)
First, the claim that Windows uses two page files in a “RAID 0,” while simultaneously only using the least busy drive, contradicts itself. Windows paging operates on a first come, first served basis and will place pages on whichever drive is least active. It will in no case intelligently stripe pages across two or more page files.

Basically we agree, however I said:
"Windows will automatically use the least-in-use drive at the time for paging.
Windows will also use both simultaneously, giving you a kind of RAID 0 for paging."
ie:
As long as the Pagefiles are on 2 or more separate physical drives, Windows is able to access them both (or more) at the same time.

That's kind of like RAID 0, but file blocks are NOT being striped over multiple drives, and nowhere is that implied.

Paging to disk, even bit-addressable storage like Optane, is so detrimental to performance that it should be avoided at all costs.

I quite agree!
Nowhere do I advocate extra Paging.
Paging simply happens when it happens, which is more often the less RAM you have.
But even with a terabyte of RAM, some paging will happen.

Second, file fragmentation as you describe it does not exist in flash storage. Having a file in a contiguous physical location was key to performance with hard disks because it minimized the time spent seeking between locations by the read head. Less movement across the disk platter meant more time was spent transferring data. None of this exists in flash or 3D Xpoint.

:)

Article dated 11/01/2024:

"...When run from the scheduled task, defrag uses the below policy guidelines for SSDs:
Traditional optimization processes. Includes traditional defragmentation, for example moving files to make them reasonably contiguous and retrim. This is done once per month...."

Memory block fragmentation, filesystem fragmentation, and TRIM

There are 2 kinds of fragmentation that concern SSD disks. The first kind of fragmentation is memory block fragmentation. SSD disks are written in pages (generally 4KB in size) but can only be erased in larger groups called blocks (generally 128 pages or 512KB).
This causes fragmentation and results in severe performance loss after the disk has been used for a while. Speed can easily drop by 50% or more. The SSD manufacturers have developed a solution called the TRIM instruction; for more information see this Wikipedia article. It is a hardware solution that needs support in the operating system, and only applies when files are being deleted. MyDefrag knows nothing about memory block fragmentation because MyDefrag operates at the filesystem level, not the hardware level. However, the MyDefrag script for Flash memory disks will consolidate free space, and this reduces the problems caused by this kind of fragmentation.
The second kind of fragmentation is filesystem fragmentation. Files can be split into parts that are placed anywhere on the disk, just like on harddisks. Many users think that this kind of fragmentation does not matter for SSD disks, because the disks have a very low latency (no harddisk heads that have to move about). But Windows still has to do more work when a file is fragmented, to gather all the fragments. There is significant overhead inside Windows, nothing to do with the hardware, and it is all the more noticeable because SSD is so fast. MyDefrag deals with this kind of fragmentation.

Each fragment or extent of a file requires its own separate I/O request. If a file is in 1000 fragments according to Windows, that means 1000 I/O operations to be processed by the CPU, RAM, etc., for a file that could have taken a single I/O.
The issue here is I/O overhead.

The Intel SLL3D or SLM58 controller IC in your Optane drive has 7 channels, with a certain number of Optane memory packages connected to each channel. High performance is obtained by interleaving between these seven channels so that commands do not pile up. If files were not “fragmented,” as you say, and all IO commands were being sent to a single Optane package on a single channel, performance would suffer catastrophically.

Luckily the controller is smart enough to never do this! All writes are spread across channels and memory packages optimally. Write commands also take into account wear leveling, and so the “fragmentedness” of a file is completely outside the user’s control. No file is contiguous.

Optane does not store data in addressable “fragments,” whatever that means to you, but in bits. Optane is bit addressable, which distinguishes it from common NAND flash SSDs. Latency is based on access to these bits, and neither the NVME driver, nor the file system, nor the controller has any concept of “fragments.”

Any drive, Optane included, has the above-discussed overhead for each file fragment/extent.
Latency is not zero in benchmarks...
Pagefiles grow and shrink automatically unless the min and max sizes are set to the same number.
The only way to let the Pagefile shrink and grow as needed, without the issues a fixed size might cause if more virtual RAM is required, is to set up a sizable, dedicated (aligned) partition for the Pagefile/s.

I do like a good debate/argument as much as you do, however!
Let's pick another subject!
It has to be a subject I know about fully, or all you'll get from me is "I don't know", and that's no fun at all! :)
 
The amount of misunderstanding in the previous comment is staggering.

“Paging simply happens when it happens,” unless the user simply clicks “no pagefile” and reboots Windows :laugh:

An even better solution to prevent detrimental paging (for Windows users) is to lock pages in memory for their user account.

Regarding the bizarre MyDefrag advertisement block quote, the types of performance regression described apply to (very old) NAND flash, not 3D Xpoint.

I mention “very old” because that site was last updated in 2010. The first NVMe specification was not released until 2011.
 
:laugh::laugh:The amount of misunderstanding in the previous comment is staggering.
:roll: Wha-Ha-Ha! So lead in by pissing off your opponent, "because then he's likely to say something stupid".
So...
This is no longer about who's wrong or right.
It's all about winning an argument at all costs. Right?
So there's no longer any reason to read the opponent's reply with the intent to understand/learn, just with the intent to reply..?

This IS going to be fun! :)
Everybody: bring the popcorn and don't forget to vote! :)

“Paging simply happens when it happens,” unless the user simply clicks “no pagefile” and reboots Windows :laugh:

LOL! :laugh: That's a fail:
Why:
Because I completely agree!
But good try at derailing the point/s.

And the point iiiiis...!?
WHEN it happens, Windows will auto-use the pagefile on the least-in-use drive.
IF Windows has 2 or more pagefiles, each on its own physical drive, Windows can use them both/all at the same time to write (or read) the 4K blocks of data in RAM to more than one pagefile simultaneously.

eg:
You click on your fav game.
Windows needs space in RAM to load it, so it decides to write some least-recently/frequently-used data in RAM out to the pagefile.
IF the only pagefile is on the same drive the game is on, Windows has to stop* reading the game into RAM to write data out to the same drive.
IF the pagefile is on a different drive, Windows can read game data into RAM and write data out to the other drive at the same time.

* NVMe drives have 4 lanes and can use them to read and write data simultaneously.
Smaller, older Optanes have only 2.
But even NVMe SSDs, NAND or Optane, don't like mixed reads/writes:

Now, Optane is fast, but can Windows read and write to one SIMULTANEOUSLY faster than Windows can read from it and write to some other SSD SIMULTANEOUSLY..?
(Now there's a real point to argue :) But do NB NAND SSDs' 100% write speeds first.)

i.e.: Advertisers quote the read numbers on the left of that graph and the write numbers on the right, but conveniently forget to mention the low mixed-I/O numbers Windows etc. actually does all the time when using 1 drive, shown in the middle of the graph.

An even better solution to prevent detrimental paging (for Windows users) is to lock pages in memory for their user account.

Sure. :)
And when you can't avoid it..?
Not everyone has huge amounts of RAM.

Regarding the bizarre MyDefrag advertisement block quote, the types of performance regression described apply to (very old) NAND flash, not 3D Xpoint.

I mention “very old” because that site was last updated in 2010. The NVME specification was not ratified until 2013.

:laugh: And when was the last time the NTFS file system was changed..? Hmmm? :)
What the Optane (or any other NAND SSD) controller does is a black box and outside our control.
All we CAN do to optimize I/O is manipulate the part of the I/O stack that takes place within the OS.

There, if Windows sees a file that's fragmented into 1000 pieces, it needs to process and send out 1000 I/O requests.
Not that 1 (to Windows) potentially large sequential file, which is then handled as 1000 small random files would be.

Pick a drive.
Any drive. Including Optane:
Whats faster: Random 4K, or large sequential..?


You may want to go read the "bizarre MyDefrag advertisement block quote" again. (It's freeware that uses the Windows API, btw.)
Especially the bit where it states that "MyDefrag (and by implication the OS) knows nothing about memory block fragmentation because [the OS] operates at the filesystem level..."

And the last bit, which implies that, while we can't do anything about the memory blocks in SSDs, we can optimize the filesystem.
 