Computer Science and Engineering Dissertations - Archive

Enhancing the performance of disk-based key-value stores: From learned index acceleration to I/O-efficient hybrid caching

Sujit Maharjan, University of Texas at Arlington

ORCID Identifier(s)

0009-0006-6155-4527

Graduation Semester and Year

Spring 2026

Language

English

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Computer Science

Department

Computer Science and Engineering

First Advisor

Song Jiang

Second Advisor

Jia Rao

Third Advisor

Mohammad Atiqul Islam

Fourth Advisor

Dianqi Han

Abstract

The exponential growth of data in modern computing environments has made efficient retrieval of information from massive datasets a critical system requirement. Key-value (KV) storage systems serve as the backbone for these operations; however, their performance is consistently bottlenecked by two tasks: locating the data and paying the physical cost of accessing the storage device. Data locations are typically identified via an index, while disk I/O is minimized through caching. This dissertation presents LearnedStore, TurboIndex, and ReadBooster, which address these bottlenecks through architectural modifications to the index and the cache. LearnedStore accelerates lookups by adapting the learned index to jump directly to the leaf node. Using machine learning models to predict the physical location of the leaf node significantly increases search throughput while preserving a block-device-friendly, tree-based design. TurboIndex and ReadBooster target the inefficiencies inherent in the disk-to-memory transition. Since disks operate as block devices, existing KV stores typically use page-based caches to bridge the gap between block-addressable storage and byte-addressable memory. However, we demonstrate that page-level granularity often results in suboptimal memory utilization, because "cold" data is cached alongside adjacent "hot" records, and that it increases disk I/O when writing to a cold page. TurboIndex and ReadBooster address this memory-efficiency problem with a dual-granularity caching architecture. TurboIndex accumulates insertions destined for cold pages to reduce disk I/O, while ReadBooster minimizes I/O by caching specific hot keys from evicted pages. Experimental results indicate that this unified approach substantially increases system throughput and reduces I/O, providing a scalable framework for next-generation, high-performance database systems.
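
The idea of caching hot keys from evicted pages can be illustrated with a small sketch. This is not the dissertation's implementation: the class name `HybridCache`, the LRU policy, the per-key hit counter, the `hot_threshold` parameter, and the integer-key-to-page mapping are all assumptions chosen for illustration, and writes are omitted entirely.

```python
from collections import OrderedDict

PAGE_SIZE = 4  # records per page (hypothetical; tiny for illustration)


class HybridCache:
    """Toy dual-granularity cache: an LRU page cache plus an LRU record cache.

    Illustrative assumption only, not the design described in the dissertation.
    """

    def __init__(self, disk, page_slots, record_slots, hot_threshold=2):
        self.disk = disk                  # simulated disk: page_id -> {key: value}
        self.page_slots = page_slots      # max pages held in memory
        self.record_slots = record_slots  # max individual records held in memory
        self.hot_threshold = hot_threshold
        self.pages = OrderedDict()        # page cache, oldest entry first
        self.records = OrderedDict()      # record cache, oldest entry first
        self.hits = {}                    # key -> hits served from the page cache
        self.disk_reads = 0

    def get(self, key):
        # 1. Record cache: serves hot keys whose page is no longer cached.
        if key in self.records:
            self.records.move_to_end(key)
            return self.records[key]
        # 2. Page cache: fall back to whole-page granularity.
        page_id = key // PAGE_SIZE
        if page_id in self.pages:
            self.pages.move_to_end(page_id)
        else:
            self._load_page(page_id)
        self.hits[key] = self.hits.get(key, 0) + 1
        return self.pages[page_id][key]

    def _load_page(self, page_id):
        self.disk_reads += 1
        if len(self.pages) >= self.page_slots:
            self._evict_page()
        self.pages[page_id] = self.disk[page_id]

    def _evict_page(self):
        # Drop the least recently used page, but keep its hot records.
        _, victim = self.pages.popitem(last=False)
        for key, value in victim.items():
            if self.hits.pop(key, 0) >= self.hot_threshold:
                self.records[key] = value
                self.records.move_to_end(key)
                if len(self.records) > self.record_slots:
                    self.records.popitem(last=False)


if __name__ == "__main__":
    disk = {p: {k: f"v{k}" for k in range(p * PAGE_SIZE, (p + 1) * PAGE_SIZE)}
            for p in range(4)}
    cache = HybridCache(disk, page_slots=1, record_slots=2)
    for k in (0, 0, 4, 0):
        cache.get(k)
    print(cache.disk_reads)  # 2: the final get(0) is served by the record cache
```

In this sketch, key 0 is read twice before its page is evicted, so it is promoted to the record cache, while the cold keys 1 to 3 on the same page are dropped. A page-only cache with one slot would need a third disk read to serve the final lookup.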

Keywords

Index, learned index, cache, database, storage system, key value storage system, page cache, record cache, hybrid cache

Disciplines

Data Storage Systems

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Recommended Citation

Maharjan, Sujit, "Enhancing the performance of disk-based key-value stores: From learned index acceleration to I/O-efficient hybrid caching" (2026). Computer Science and Engineering Dissertations - Archive. 435.
https://mavmatrix.uta.edu/cse_dissertations/435

Enhancing the Performance of Disk-Based Key-Value Stores.pptx (3116 kB)
revision

Included in

Data Storage Systems Commons
