Hardware-friendly Activation Functions for HybridViT Models

Beom Jin Kang, Nam Joon Kim, Jong Ho Lee, Hyun Kim

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

In recent years, CNN+ViT hybrid models have shown promising performance in computer vision tasks. To deploy CNN+ViT hybrid models on resource-limited devices, various studies have addressed parameter size and computational complexity through quantization, aiming to enable hardware-friendly low-bit integer operations. However, activation functions commonly used in ViTs (e.g., GeLU, Swish) inevitably require floating-point operations. To address this problem, some studies have approximated these functions with alternatives that allow integer operations. Inspired by the Shift-GeLU approach, which approximates the GeLU function to enable integer operations, we propose the Shift-Swish function and evaluate it on the MobileViT model at both the software and hardware levels. Experimental results show that the hardware-level RTL design of the proposed method reduces LUT usage by 63.25%, FF usage by 87.69%, and power consumption by 46.57%, with an accuracy drop as small as 0.6% compared to the baseline.
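The abstract does not spell out how Shift-Swish is computed, so the following is only a minimal sketch of the general idea, not the authors' design. It assumes Swish(x) = x·sigmoid(x), a fixed-point input with `FRAC_BITS` fractional bits, and a shift-based base-2 approximation of the exponential (multiplying by log2(e) ≈ 1.4375 via shifts, then a linear approximation of 2^(-r) on the fractional part). The names `FRAC_BITS`, `shift_exp_neg`, and `shift_swish` are hypothetical, and the integer division used for the sigmoid is a simplification that a real RTL design might handle differently.

```python
import math

FRAC_BITS = 10            # assumed fixed-point format: value = integer / 2**FRAC_BITS
ONE = 1 << FRAC_BITS      # fixed-point representation of 1.0


def shift_exp_neg(a_fx):
    """Approximate exp(-a) for a >= 0 using only shifts, adds, and masks."""
    # a * log2(e) ~= a * (1 + 1/2 - 1/16) = 1.4375 * a
    p = a_fx + (a_fx >> 1) - (a_fx >> 4)
    q = p >> FRAC_BITS        # integer part of the base-2 exponent
    r = p & (ONE - 1)         # fractional part, in [0, 1)
    # 2**(-r) ~= 1 - r/2 on [0, 1); 2**(-q) is a right shift by q
    return (ONE - (r >> 1)) >> q


def shift_swish(x_fx):
    """Integer-only approximation of x * sigmoid(x) on fixed-point input."""
    e = shift_exp_neg(abs(x_fx))          # exp(-|x|), always in (0, 1]
    denom = ONE + e
    # sigmoid(x) = 1/(1+e) for x >= 0, and e/(1+e) for x < 0
    num = ONE if x_fx >= 0 else e
    sig = (num << FRAC_BITS) // denom     # integer division (simplifying assumption)
    return (x_fx * sig) >> FRAC_BITS


def swish_float(x):
    """Floating-point reference used only to measure the approximation error."""
    return x / (1.0 + math.exp(-x))


def max_abs_error(lo=-8.0, hi=8.0, steps=1601):
    """Largest absolute error of shift_swish against the float reference."""
    worst = 0.0
    for i in range(steps):
        x = lo + (hi - lo) * i / (steps - 1)
        approx = shift_swish(int(round(x * ONE))) / ONE
        worst = max(worst, abs(approx - swish_float(x)))
    return worst
```

Calling `max_abs_error()` gives a quick software-level check of how far this particular approximation drifts from floating-point Swish; it says nothing about the accuracy or hardware figures reported in the paper.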

Original language: English
Title of host publication: Proceedings - International SoC Design Conference 2023, ISOCC 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 147-148
Number of pages: 2
ISBN (Electronic): 9798350327038
State: Published - 2023
Event: 20th International SoC Design Conference, ISOCC 2023 - Jeju, Korea, Republic of
Duration: 25 Oct 2023 to 28 Oct 2023

Publication series

Name: Proceedings - International SoC Design Conference 2023, ISOCC 2023

Conference

Conference: 20th International SoC Design Conference, ISOCC 2023
Country/Territory: Korea, Republic of
City: Jeju
Period: 25/10/23 to 28/10/23

Keywords

  • Activation function
  • Convolutional neural network
  • Quantization
  • Vision Transformer
