Content-Based Video Retrieval With Prototypes of Deep Features

Hyeok Yoon, Ji Hyeong Han

Research output: Contribution to journal > Article > peer-review

19 Scopus citations

Abstract

Rapid advances in information and communication technologies have enabled the transfer of large, high-resolution videos, and video applications have evolved along with data quality. Content-based video retrieval (CBVR) is an essential video application because it can be applied to various domains, such as surveillance, education, sports, and medicine. In this paper, we propose a CBVR method based on prototypical category approximation (PCA-CBVR), which calculates prototypes of deep features for each category to predict the category of the user's query video without a classifier. We then perform a fine search to retrieve the video most similar to the user's query video from the database of videos in the predicted category. The proposed PCA-CBVR approach is computationally efficient and preserves the meaningful information in the videos: it does not need to train a classifier even when the database is updated, and it uses all deep features without a dimension reduction step such as those used in other CBVR studies. Moreover, we fine-tune the 3D CNN feature extractor with a few-shot learning approach for better domain adaptation, and we apply salient frame sampling instead of uniform frame sampling to improve the performance of the PCA-CBVR method. We demonstrate the performance of the proposed PCA-CBVR approach through experiments on the UCF101, HMDB51, and ActivityNet benchmark video datasets.
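The two-stage idea in the abstract (predict the category from per-category prototypes, then search only within that category) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how a prototype is computed or which distance is used, so the sketch assumes the per-category mean of the feature vectors and Euclidean distance, and it uses toy 2-D vectors in place of 3D CNN features. All function names and data are hypothetical.

```python
import math
from collections import defaultdict


def build_prototypes(database):
    """Return {category: prototype}.

    ASSUMPTION: a prototype is the element-wise mean of the category's
    feature vectors; the abstract does not specify the formula.
    """
    by_category = defaultdict(list)
    for _video_id, category, feature in database:
        by_category[category].append(feature)
    return {
        category: [sum(column) / len(features) for column in zip(*features)]
        for category, features in by_category.items()
    }


def retrieve(query, database, prototypes):
    """Return (predicted_category, best_video_id) for a query feature.

    ASSUMPTION: Euclidean distance is used in both stages.
    """
    # Coarse stage: the nearest prototype gives the category, no classifier.
    predicted = min(prototypes, key=lambda c: math.dist(query, prototypes[c]))
    # Fine stage: compare the query only with videos of that category.
    candidates = [(vid, feat) for vid, cat, feat in database if cat == predicted]
    best_id, _ = min(candidates, key=lambda item: math.dist(query, item[1]))
    return predicted, best_id


# Toy database of (video_id, category, feature) entries (hypothetical data).
database = [
    ("v1", "archery", [0.9, 0.1]),
    ("v2", "archery", [0.7, 0.3]),
    ("v3", "diving", [0.1, 0.9]),
    ("v4", "diving", [0.2, 0.6]),
]
prototypes = build_prototypes(database)
print(retrieve([0.75, 0.25], database, prototypes))  # ('archery', 'v2')
```

In this sketch, adding a video to the database only requires recomputing the mean for its category, which mirrors the abstract's point that no classifier has to be retrained when the database is updated.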

Original language: English
Pages (from-to): 30730-30742
Number of pages: 13
Journal: IEEE Access
Volume: 10
State: Published - 2022

Keywords

  • cross-domain evaluation
  • deep learning
  • few-shot learning
  • prototypes
  • video analytics
  • video retrieval
