Efficient and Scalable External Sort Framework for NVMe SSD

Kihyeon Myung, Sunggon Kim, Heon Young Yeom, Jiwoong Park

Research output: Contribution to journalArticlepeer-review

3 Scopus citations

Abstract

As the size of data grows in modern applications, the efficient usage of limited resources is becoming crucial. In order to reorganize large data under memory limitations, many data-intensive applications utilize external sort as a critical component. To streamline external sort, a storage framework is especially needed since the entire dataset must be loaded and flushed a couple of times during the sorting process. Most existing frameworks have attempted to simplify the storage access pattern by associating each thread with a separate storage device. This prevents randomized and concurrent I/O requests, which impose a huge overhead for legacy drives in order to enable the parallelism needed for external sort. However, such regulations excessively restrain the capabilities of NVMe-based SSDs that deliver high throughput with abundant parallelism. In this article, we present a new framework for external sort that exploits both external and internal parallelism. Externally, any number of threads are mobilized to parallel external sort in a scalable way, even with one NVMe SSD. Meanwhile, some arbitration schemes, such as adaptive resource allocation and fairness control, are adopted to preserve the internal efficiency of storage devices. Our evaluation results demonstrate that our scheme can greatly improve both the I/O efficiency and scalability compared to the existing frameworks.

Original languageEnglish
Pages (from-to)2211-2217
Number of pages7
JournalIEEE Transactions on Computers
Volume70
Issue number12
DOIs
StatePublished - 1 Dec 2021

Keywords

  • External sort
  • multi-cores
  • NVMe SSD
  • partitioning
  • storage optimization

Fingerprint

Dive into the research topics of 'Efficient and Scalable External Sort Framework for NVMe SSD'. Together they form a unique fingerprint.

Cite this