Random Swin Transformer

Keong Hun Choi, Jong Eun Ha

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Since the advent of deep learning, convolutional neural networks (CNNs) have dominated applications such as image classification, object detection, and semantic segmentation. More recently, transformers built on various attention mechanisms have outperformed CNNs. However, a transformer requires a large amount of memory to compute full attention among all tokens. The Swin transformer was proposed to address this memory issue: it applies attention within sub-regions (windows) of an image, and it mitigates the loss of global context caused by foregoing full attention by shifting the windows between layers, which guarantees that more tokens participate in attention. In this paper, we investigate randomly selecting tokens in the Swin transformer: rather than using a fixed shift value, we randomly choose the shift within a certain range. Experimental results show the feasibility of the proposed method.
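To make the idea concrete, the sketch below illustrates the random-shift step described in the abstract. This is not the authors' implementation: the function name, the PyTorch style, and the sampling range [1, window_size - 1] are illustrative assumptions. It replaces the original Swin transformer's fixed shift of window_size // 2 with a uniformly sampled offset before the cyclic roll that precedes window partitioning.

```python
# A minimal sketch (not the authors' code) of randomizing the
# shifted-window step in a Swin transformer. The sampling range
# and function signature are assumptions for illustration.
import random
import torch

def shift_windows(x: torch.Tensor, window_size: int,
                  random_shift: bool = True) -> torch.Tensor:
    """Cyclically shift a feature map of shape (B, H, W, C)
    before window partitioning, as in Swin's shifted-window step."""
    if random_shift:
        # Sample a shift uniformly from [1, window_size - 1]
        # (assumed range) instead of using a fixed value.
        shift = random.randint(1, window_size - 1)
    else:
        # Fixed shift used by the original Swin transformer.
        shift = window_size // 2
    # torch.roll performs the cyclic shift along height and width;
    # window attention is then computed on the rolled feature map.
    return torch.roll(x, shifts=(-shift, -shift), dims=(1, 2))

# Usage: a dummy 56x56 feature map with 96 channels.
feat = torch.randn(1, 56, 56, 96)
shifted = shift_windows(feat, window_size=7)
print(shifted.shape)  # torch.Size([1, 56, 56, 96])
```

Because the shift varies per forward pass, different token groupings fall into the same attention window across training iterations, which is the mechanism the paper explores.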

Original language: English
Title of host publication: 2022 22nd International Conference on Control, Automation and Systems, ICCAS 2022
Publisher: IEEE Computer Society
Pages: 1611-1614
Number of pages: 4
ISBN (Electronic): 9788993215243
DOIs
State: Published - 2022
Event: 22nd International Conference on Control, Automation and Systems, ICCAS 2022 - Busan, Korea, Republic of
Duration: 27 Nov 2022 – 1 Dec 2022

Publication series

Name: International Conference on Control, Automation and Systems
Volume: 2022-November
ISSN (Print): 1598-7833

Conference

Conference: 22nd International Conference on Control, Automation and Systems, ICCAS 2022
Country/Territory: Korea, Republic of
City: Busan
Period: 27/11/22 – 1/12/22

Keywords

  • Classification
  • Deep learning
  • Swin transformer
  • Transformer
