ORCID Identifier(s)

ORCID 0000-0002-4098-6260

Graduation Semester and Year

Summer 2025

Language

English

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Computer Science

Department

Computer Science and Engineering

First Advisor

Song Jiang

Abstract

Modern storage hierarchies exhibit performance differences spanning eight orders of magnitude. Each tier in the hierarchy presents fundamentally different optimal access sizes and patterns. Most systems read and write data in fixed-size units across different tiers, typically power-of-2 sizes. It simplifies data management but also suffers from large write amplification when handling small accesses in many real-world workloads. This fundamental disconnect between fixed-size I/O design assumption and real-world usage patterns represents the core challenge.

The storage community has recognized the limitations of fixed-size I/O abstractions and is exploring alternatives, like the new emerging NVMe-KV interface and object-based storage, which aim to provide more flexible data access models. These approaches aim to offload services to dedicated devices or distributed service, reducing host CPU usage and device read/write amplifications. However, this approach introduces a new challenge - the protocol overhead. The individual NVMe command processing costs become prohibitively expensive for small KV operations, creating a new performance bottleneck.

This dissertation directly/indirectly addresses both categories of limitations to improve overall storage systems performance. (1) To overcome fixed-size I/O limitations, the Buffered Hash Table utilizes in-DRAM buffering to align access patterns with the specific properties of persistent memory. This approach transforms small random writes into larger sequential operations, which better match PMem's internal 256-byte access characteristics and mitigate potential write amplification from read-modify-write operations. Moreover, the IndeXY framework unifies in-memory and on-disk indexes into a single extensible system with optimized policies, allowing for the selection of the most suitable index structure for each distinct storage tier, rather than imposing a single data structure across vastly different storage technologies. (2) In terms of variable-size I/O limitations, SAKER, Software Accelerated Key-value Service via NVMe, introduces innovative command batching and processing mechanisms that substantially reduce per-operation overhead while preserving the semantic benefits of the KV model, optimizing communication patterns for small operations where command overhead would otherwise dominate total cost.

Beyond I/O abstraction, we also address critical cache management challenges. This includes LIRS2, an improved LIRS replacement algorithm that incorporates a new data locality measure with low time and space overheads. By improving cache hit ratios compared to existing approaches, LIRS2 enhances cache efficiency and better utilizes available memory resources, which in turn contributes to reducing unnecessary I/O and improving overall system performance.

Keywords

NVMe-KV, persistent memory, write amplification, index structures, storage hierarchy, cache replacement

Disciplines

Computer and Systems Architecture | Data Storage Systems

License

Creative Commons Attribution 4.0 International License
This work is licensed under a Creative Commons Attribution 4.0 International License.

Share

COinS
 
 

To view the content in your browser, please download Adobe Reader or, alternately,
you may Download the file to your hard drive.

NOTE: The latest versions of Adobe Reader do not support viewing PDF files within Firefox on Mac OS and if you are using a modern (Intel) Mac, there is no official plugin for viewing PDF files within the browser window.