Auto-accel: a SW/HW co-design framework with adaptive unified buffer mapping and power-of-two quantization for FPGA-based object detection accelerators

  • Junha Ko
  • Dongjun Lee
  • Youngchan Kim
  • Hyun Kim

Research output: Contribution to journal › Article › peer-review

Abstract

Recent advancements in convolutional neural network (CNN)-based object detection models have significantly improved detection accuracy; however, these improvements have come at the cost of increased computational and memory demands, posing challenges for efficient deployment in resource-constrained edge environments. To address these challenges, existing CNN accelerators have primarily focused on enhancing computational efficiency through the adoption of lightweight models such as MobileNetV1. Nevertheless, systematic analyses of layer-wise memory requirements have been relatively lacking, often resulting in inefficient utilization of on-chip memory (OCM) resources during deployment. To overcome these limitations, this paper proposes Auto-Accel, a SW/HW co-design solution for CNN accelerators that simultaneously enables energy-efficient computation and improves memory resource utilization. On the SW side, Auto-Accel effectively reduces hardware resource utilization by proposing a fused quantization technique based on the sum-of-power-of-two scaling factor approximation. On the HW side, Auto-Accel includes an adaptive unified buffer mapping that efficiently reallocates buffer resources according to the layer-wise memory requirements of activations and weights. Furthermore, to maximize data reuse and reduce off-chip memory accesses, we propose a tile-based adaptive pipelined dataflow, which maximizes computational and energy efficiency. The MobileNetV1-SSD lite accelerator equipped with the proposed Auto-Accel achieves an energy efficiency of 11.5 FPS/W when implemented on a ZCU102 board, representing an improvement of approximately 1.33× to 4.41× in energy efficiency compared to prior studies.
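The sum-of-power-of-two idea mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the details of the fused quantization scheme, so everything below is an assumption made for illustration: the greedy choice of exponents, the function names `sum_of_pow2` and `apply_shift_add`, and the `frac_bits` parameter are hypothetical and are not taken from the paper. The sketch only shows why approximating a scaling factor by a few powers of two lets the rescaling step use shifts and adds instead of a multiplier.

```python
import math


def sum_of_pow2(scale: float, terms: int = 2) -> list[int]:
    """Greedily pick up to `terms` exponents e_i so that sum(2**e_i) ~= scale.

    Illustrative assumption: the paper may select its terms differently.
    """
    exponents = []
    residual = scale
    for _ in range(terms):
        if residual <= 0:
            break
        # Largest power of two that does not exceed the remaining residual.
        e = math.floor(math.log2(residual))
        exponents.append(e)
        residual -= 2.0 ** e
    return exponents


def apply_shift_add(acc: int, exponents: list[int], frac_bits: int = 16) -> int:
    """Scale an integer accumulator by sum(2**e) using only shifts and adds."""
    total = 0
    for e in exponents:
        s = e + frac_bits  # keep frac_bits of fractional precision
        total += acc << s if s >= 0 else acc >> -s
    return total >> frac_bits


if __name__ == "__main__":
    exps = sum_of_pow2(0.3, terms=2)
    approx = sum(2.0 ** e for e in exps)
    print(exps, approx, apply_shift_add(1000, exps))
```

With two terms, a scaling factor of 0.75 is represented exactly as 2^-1 + 2^-2, while 0.3 is approximated as 2^-2 + 2^-5 = 0.28125. Adding terms reduces the approximation error at the cost of extra adders, which is the trade-off a hardware design of this kind would have to balance.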

Original language: English
Article number: 7
Journal: Journal of Real-Time Image Processing
Volume: 23
Issue number: 1
State: Published - Jan 2026

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 7 - Affordable and Clean Energy

Keywords

  • Convolutional neural network
  • Dataflow
  • Design optimization
  • Field-programmable gate array (FPGA)
  • Hardware accelerator
