FPGA Based Approximate Vector Operation Accelerator for VLMs

  • Raehyeong Kim
  • Chaebin Lee
  • Dayoung Lee
  • Yue Ri Jeong
  • Seung Eun Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

As artificial intelligence continues to advance, autonomous robotic systems and human-robot interaction are growing in importance, particularly in language-based control and behavior prediction. Vision-language models (VLMs), which process images and text simultaneously, are essential for interpreting commands and analyzing environments. However, their high computational demands, particularly for vector operations such as softmax and layer normalization, pose significant challenges. These operations involve complex functions such as exponentials and square roots and consume substantial resources as model sizes grow. This paper proposes a vector operation accelerator for VLMs based on approximation techniques, specifically Newton-Raphson and piecewise linear approximations, to improve speed while reducing resource usage. The optimized architecture reuses resources by targeting redundant operations. Implemented on an FPGA, the accelerator achieved up to 54% faster performance than an RTX 3070 GPU, with minimal approximation error.
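The two approximation techniques named in the abstract can be illustrated with a short sketch. This is a floating-point Python model for intuition only, not the paper's fixed-point FPGA design; the iteration count, input range, and segment count are arbitrary assumptions:

```python
import math

def nr_rsqrt(x, iters=6):
    """Newton-Raphson approximation of 1/sqrt(x), the core of layer norm.

    Seeds y0 from the binary exponent of x, then refines with
    y <- y * (1.5 - 0.5 * x * y^2), avoiding a hardware square root.
    """
    _, e = math.frexp(x)        # x = m * 2**e with 0.5 <= m < 1
    y = 2.0 ** (-e // 2)        # crude seed near 1/sqrt(x)
    for _ in range(iters):
        y = y * (1.5 - 0.5 * x * y * y)
    return y

def pwl_exp(x, lo=-8.0, hi=0.0, segments=32):
    """Piecewise linear approximation of exp(x) on [lo, hi].

    Softmax inputs are shifted by the row maximum, so x <= 0; inputs
    below `lo` are clamped, since exp there is already near zero.
    In hardware the breakpoint values would come from a small LUT.
    """
    x = max(lo, min(hi, x))
    step = (hi - lo) / segments
    i = min(int((x - lo) / step), segments - 1)
    x0 = lo + i * step
    y0, y1 = math.exp(x0), math.exp(x0 + step)
    return y0 + (y1 - y0) * (x - x0) / step

def approx_softmax(v):
    """Softmax using the piecewise linear exponential."""
    m = max(v)
    exps = [pwl_exp(x - m) for x in v]
    s = sum(exps)
    return [e / s for e in exps]
```

Normalizing by the sum of the approximate exponentials cancels much of the per-element error, which is one reason piecewise linear softmax tolerates coarse segmentation.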

Original language: English
Title of host publication: 2025 International Conference on Electronics, Information, and Communication, ICEIC 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331510756
DOIs
State: Published - 2025
Event: 2025 International Conference on Electronics, Information, and Communication, ICEIC 2025 - Osaka, Japan
Duration: 19 Jan 2025 - 22 Jan 2025

Publication series

Name: 2025 International Conference on Electronics, Information, and Communication, ICEIC 2025

Conference

Conference: 2025 International Conference on Electronics, Information, and Communication, ICEIC 2025
Country/Territory: Japan
City: Osaka
Period: 19/01/25 - 22/01/25

Keywords

  • accelerator
  • approximation
  • transformer
  • vision-language models
