DL-Sort: A Hybrid Approach to Scalable Hardware-Accelerated Fully-Streaming Sorting

Hyun Woo Oh, Joungmin Park, Seung Eun Lee

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

Designing a high-performance hardware sorter for resource-constrained systems is challenging because of physical limitations and the need to balance streaming bandwidth with memory throughput. This brief introduces a novel, scalable hardware sorter architecture with fully-streaming support, together with an accompanying RTL generator, to provide versatile, energy-efficient hardware acceleration. Our solution employs a dual-layer architecture consisting of a parallel one-way linear insertion sorter (OLIS) for bandwidth optimization and a cyclic bitonic merge network (CBMN) for a compact, high-throughput implementation. Furthermore, we developed an RTL generator written in Chisel to enable agile implementation of the scalable architecture. Experimental results targeting the Xilinx XVU37P-FSVH2892-2L-E FPGA show that, compared to a state-of-the-art streaming sorter, our design achieves 126.26% higher throughput and 68.46% lower latency, with an increase in LUT usage of no more than 132.94% and a 79.84% reduction in flip-flops. The source code is available at https://github.com/hyun-woo-oh/DL-Sort-Generator.
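The two-layer idea in the abstract can be illustrated with a small software model. The sketch below is not the authors' RTL and does not reproduce the internals of OLIS or CBMN, which the abstract does not describe. It assumes, purely for illustration, that the first layer produces fixed-length sorted runs by linear insertion and that the second layer merges pairs of runs with a bitonic merge, applied pass after pass. All function names and the run_len parameter are hypothetical.

```python
from typing import List


def insertion_sort_run(chunk: List[int]) -> List[int]:
    """Layer 1 (illustrative): build a sorted run by inserting one value at a time."""
    run: List[int] = []
    for value in chunk:
        pos = len(run)
        # Walk left past every stored element larger than the incoming value.
        while pos > 0 and run[pos - 1] > value:
            pos -= 1
        run.insert(pos, value)
    return run


def bitonic_merge(seq: List[int]) -> List[int]:
    """Sort a bitonic sequence of power-of-two length into ascending order."""
    n = len(seq)
    if n <= 1:
        return list(seq)
    half = n // 2
    # Half-cleaner: compare-exchange elements that are `half` positions apart.
    lo = [min(seq[i], seq[i + half]) for i in range(half)]
    hi = [max(seq[i], seq[i + half]) for i in range(half)]
    return bitonic_merge(lo) + bitonic_merge(hi)


def merge_sorted_runs(a: List[int], b: List[int]) -> List[int]:
    """Layer 2 (illustrative): merge two ascending runs of equal length.

    Reversing the second run makes the concatenation bitonic.
    """
    return bitonic_merge(a + b[::-1])


def two_layer_sort(data: List[int], run_len: int = 4) -> List[int]:
    """Hypothetical two-layer flow: insertion-sorted runs, then bitonic merge passes."""
    n = len(data)
    # Assumption of this sketch: run length and run count are powers of two.
    assert run_len > 0 and run_len & (run_len - 1) == 0
    assert n > 0 and n % run_len == 0
    runs = [insertion_sort_run(data[i:i + run_len]) for i in range(0, n, run_len)]
    assert len(runs) & (len(runs) - 1) == 0
    # Apply the same merge step repeatedly until a single run remains.
    while len(runs) > 1:
        runs = [merge_sorted_runs(runs[i], runs[i + 1])
                for i in range(0, len(runs), 2)]
    return runs[0]
```

For example, two_layer_sort([5, 2, 7, 0, 3, 6, 1, 4], run_len=4) first forms the runs [0, 2, 5, 7] and [1, 3, 4, 6], then merges them into [0, 1, 2, 3, 4, 5, 6, 7]. Repeating one merge step over successive passes is only loosely analogous to reusing a merge network cyclically. The actual hardware is streaming and parallel, which this sequential model does not capture.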

Original language: English
Pages (from-to): 2549-2553
Number of pages: 5
Journal: IEEE Transactions on Circuits and Systems II: Express Briefs
Volume: 71
Issue number: 5
DOIs
State: Published - 1 May 2024

Keywords

  • bitonic sort
  • energy-efficient computing
  • hardware acceleration
  • scalable architecture
  • sorting network
