ANNA: Specialized Architecture for Approximate Nearest Neighbor Search

Yejin Lee, Hyunji Choi, Sunhong Min, Hyunseung Lee, Sangwon Beak, Dawoon Jeong, Jae W. Lee, Tae Jun Ham

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

20 Scopus citations

Abstract

Similarity search, or nearest neighbor search, is the task of retrieving the set of vectors in a (vector) database that are most similar to a provided query vector. It has long been a key kernel for many applications, and it has become even more important recently, as modern neural networks and machine learning models represent the semantics of images, videos, and documents as high-dimensional vectors called embeddings. Finding a set of similar embeddings for a given query embedding is now a critical operation for modern recommender systems and semantic search engines. Because exhaustively searching for the most similar vectors among billions of vectors is prohibitively expensive, approximate nearest neighbor search (ANNS) is used in many real-world use cases. Unfortunately, we find that using server-class CPUs and GPUs for the ANNS task leads to suboptimal performance and energy efficiency. To address these limitations, we propose a specialized architecture named ANNA (Approximate Nearest Neighbor search Accelerator), which is compatible with state-of-the-art ANNS algorithms such as Google ScaNN and Facebook Faiss. By combining the benefits of a specialized dataflow pipeline and efficient data reuse, ANNA achieves multiple orders of magnitude higher energy efficiency, 2.3-61.6× higher throughput, and 4.3-82.1× lower latency than the conventional CPU or GPU for both million- and billion-scale datasets.
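
For readers unfamiliar with the task, the exhaustive baseline that the abstract calls prohibitive at billion scale can be sketched as below. This is only an illustration of brute-force nearest neighbor search, not ANNA's design or the ScaNN/Faiss algorithms; the function names, the squared Euclidean metric, and the toy data are assumptions made for this example.

```python
import heapq


def squared_l2(a, b):
    # Squared Euclidean distance; the choice of metric is an assumption.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exhaustive_top_k(database, query, k):
    """Return the indices of the k database vectors closest to the query.

    Every vector is scored against the query, so the cost grows linearly
    with the database size. ANNS methods trade some accuracy to avoid this.
    """
    scored = ((squared_l2(vec, query), idx) for idx, vec in enumerate(database))
    return [idx for _, idx in heapq.nsmallest(k, scored)]


# Hypothetical toy "embeddings" (2-dimensional for readability).
database = [
    [0.0, 0.0],
    [1.0, 1.0],
    [0.9, 1.2],
    [5.0, 5.0],
]

print(exhaustive_top_k(database, [1.0, 1.0], k=2))  # prints [1, 2]
```

Approximate methods avoid scoring every vector, for example by searching only part of the database or by comparing compressed codes, at the cost of occasionally missing a true nearest neighbor.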

Original language: English
Title of host publication: Proceedings - 2022 IEEE International Symposium on High-Performance Computer Architecture, HPCA 2022
Publisher: IEEE Computer Society
Pages: 169-183
Number of pages: 15
ISBN (Electronic): 9781665420273
DOIs
State: Published - 2022
Event: 28th Annual IEEE International Symposium on High-Performance Computer Architecture, HPCA 2022 - Virtual, Online, Korea, Republic of
Duration: 2 Apr 2022 to 6 Apr 2022

Publication series

Name: Proceedings - International Symposium on High-Performance Computer Architecture
Volume: 2022-April
ISSN (Print): 1530-0897

Conference

Conference: 28th Annual IEEE International Symposium on High-Performance Computer Architecture, HPCA 2022
Country/Territory: Korea, Republic of
City: Virtual, Online
Period: 2/04/22 to 6/04/22

Keywords

  • Approximate Nearest Neighbor Search
  • Hardware Accelerator
  • Product Quantization
  • Similarity Search
