Cache Size Effects - The Best NVMe SSD for Laptops and Notebooks: SK hynix Gold P31 1TB SSD Reviewed

Publish date: 2024-05-26

Whole-Drive Fill

This test starts with a freshly-erased drive and fills it with 128kB sequential writes at queue depth 32, recording the write speed for each 1GB segment. This test is not representative of any ordinary client/consumer usage pattern, but it does allow us to observe transitions in the drive's behavior as it fills up. This can allow us to estimate the size of any SLC write cache, and get a sense for how much performance remains on the rare occasions where real-world usage keeps writing data after filling the cache.
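To make the methodology concrete, here is a minimal Python sketch of a whole-drive fill on Linux. It is deliberately simplified: it issues synchronous writes at queue depth 1 rather than the QD32 asynchronous I/O the test above uses, and the device path is a placeholder. Running it destroys all data on the target drive.

    import os, time, mmap

    DEVICE = "/dev/nvme0n1"   # placeholder target -- this ERASES the drive
    BLOCK = 128 * 1024        # 128kB sequential writes, as in the test above
    SEGMENT = 1 << 30         # report throughput once per 1GB segment

    # O_DIRECT requires an aligned buffer; anonymous mmap is page-aligned.
    buf = mmap.mmap(-1, BLOCK)
    buf.write(os.urandom(BLOCK))

    fd = os.open(DEVICE, os.O_WRONLY | os.O_DIRECT)
    capacity = os.lseek(fd, 0, os.SEEK_END)   # block device size in bytes
    os.lseek(fd, 0, os.SEEK_SET)

    total = seg_bytes = 0
    seg_start = time.monotonic()
    while total + BLOCK <= capacity:
        os.write(fd, buf)                     # QD1 simplification of QD32 I/O
        total += BLOCK
        seg_bytes += BLOCK
        if seg_bytes >= SEGMENT:
            elapsed = time.monotonic() - seg_start
            print(f"{total >> 30:4d} GiB written: {seg_bytes / elapsed / 1e9:.2f} GB/s")
            seg_bytes, seg_start = 0, time.monotonic()
    os.close(fd)

Plotting the per-segment speeds from a run like this is what makes the SLC-to-TLC transition point visible.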

The SLC write cache in the 1TB SK hynix Gold P31 runs out after just over 100GB of writes. After the SLC cache fills up, the Gold P31's sequential write performance becomes highly variable, ranging from about 1.4 to 2.3 GB/s with little change in character across the entire TLC filling phase. There are no obvious patterns of periodic garbage collection cycles visible at this scale.

Despite the variability, the P31's long-term sustained write performance is excellent: it averages out to the best overall write throughput we've measured from a 1TB TLC drive, and even at its lowest points it never drops to a disappointing level.

Working Set Size

Most mainstream SSDs have enough DRAM to store the entire mapping table that translates logical block addresses into physical flash memory addresses. DRAMless drives only have small buffers to cache a portion of this mapping information. Some NVMe SSDs support the Host Memory Buffer feature and can borrow a piece of the host system's DRAM for this cache rather than needing lots of on-controller memory.
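Whether a drive requests a host memory buffer at all can be read from its NVMe Identify Controller data. The sketch below shells out to nvme-cli (assuming it is installed, that /dev/nvme0 is the target controller, and that its JSON output exposes the spec's HMPRE/HMMIN fields under those names):

    import json, subprocess

    # Identify Controller via nvme-cli; HMPRE/HMMIN are the preferred and
    # minimum host memory buffer sizes the drive asks for, in 4KiB units.
    ctrl = json.loads(subprocess.check_output(
        ["nvme", "id-ctrl", "/dev/nvme0", "-o", "json"]))
    hmpre, hmmin = ctrl.get("hmpre", 0), ctrl.get("hmmin", 0)
    if hmpre == 0 and hmmin == 0:
        print("No HMB requested -- likely a drive with its own DRAM")
    else:
        print(f"HMB preferred: {hmpre * 4} KiB, minimum: {hmmin * 4} KiB")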

When accessing a logical block whose mapping is not cached, the drive needs to read the mapping from the full table stored on the flash memory before it can read the user data stored at that logical block. Each such access therefore costs two flash reads instead of one, which adds extra latency and in the worst case can roughly double random read latency.

We can observe the effects of the mapping cache's size by performing random reads from different-sized portions of the drive. When performing random reads from a small slice of the drive, we expect the mappings to all fit in the cache; when performing random reads from the entire drive, we expect mostly cache misses.
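A simplified version of that sweep can be scripted directly against the block device. The sketch below uses synchronous QD1 reads on Linux, with a placeholder device path and an arbitrary sample count, so absolute latencies will not match our test rig, but the shape of the curve should:

    import os, random, time, mmap

    DEVICE = "/dev/nvme0n1"    # placeholder target; reads only, non-destructive
    BLOCK = 4096               # 4kB random reads
    SAMPLES = 20000            # reads measured per working set size

    buf = mmap.mmap(-1, BLOCK) # page-aligned buffer for O_DIRECT
    fd = os.open(DEVICE, os.O_RDONLY | os.O_DIRECT)
    capacity = os.lseek(fd, 0, os.SEEK_END)

    # Double the working set from 1GB toward full capacity; if the mapping
    # cache is too small, mean latency climbs once a slice's mappings no
    # longer fit in it.
    ws = 1 << 30
    while ws <= capacity:
        start = time.monotonic()
        for _ in range(SAMPLES):
            os.preadv(fd, [buf], random.randrange(ws // BLOCK) * BLOCK)
        mean_us = (time.monotonic() - start) / SAMPLES * 1e6
        print(f"working set {ws >> 30:4d} GiB: mean read latency {mean_us:.1f} us")
        ws *= 2
    os.close(fd)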

When performing this test on mainstream drives with a full-sized DRAM cache, we expect performance to be generally constant regardless of the working set size, or to drop only slightly as the working set grows.

As expected for a drive with a full-size DRAM buffer, the P31's random read latency is unaffected by spatial locality: reading across the whole drive is just as fast as reading from a narrow range. The only other TLC drives that can match this read latency are the two Toshiba/Kioxia SSDs with 96L BiCS4 TLC NAND, and even they can't maintain that performance across the entire test.
