Scene Text Recognition with Multi-decoders

Yao Wang, Jong Eun Ha

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citations

Abstract

In this article, we focus on the scene text recognition problem, which is one of the challenging subfields of computer vision because of the unpredictable way text appears in natural scenes. Recently, scene text recognition has achieved state-of-the-art performance thanks to advances in deep learning. At present, encoder-decoder architectures, which consist of a feature extractor and a sequence module, are widely used for scene text recognition tasks. Specifically, for the decoder part, connectionist temporal classification (CTC), the attention mechanism, and the transformer (self-attention) are the three main approaches used in recent research. The CTC decoder is flexible and can handle sequences with large variations in length because it aligns sequence features with labels in a frame-wise manner. The attention decoder can learn a better and deeper feature representation and obtain better position information for each character, which gives more robust and accurate performance on both regular and irregular scene text. Moreover, a novel decoder mechanism is introduced in our study. The proposed architecture has several advantages: the model can be trained end to end with multiple decoders, and it can deal with sequences of arbitrary length and images of arbitrary shape. Extensive experiments on standard benchmarks demonstrate that our model improves performance on both regular and irregular text recognition.
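As a rough illustration of the frame-wise alignment that a CTC decoder relies on, the sketch below shows greedy CTC decoding in plain Python. It is not the authors' implementation: the function names, the choice of index 0 for the blank symbol, and the toy character set are all assumptions made for this example.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (not specified in the abstract)


def ctc_greedy_decode(frame_scores, charset, blank=BLANK):
    """Collapse per-frame class scores into a text string.

    frame_scores: list of per-frame score lists, one score per class.
    charset: string where charset[i - 1] is the character for class i.
    """
    # Pick the highest-scoring class for every frame.
    best = [max(range(len(s)), key=s.__getitem__) for s in frame_scores]
    # Merge consecutive repeats, then drop blank symbols.
    collapsed = [k for k, _ in groupby(best)]
    return "".join(charset[k - 1] for k in collapsed if k != blank)


def one_hot(classes, num_classes):
    """Hypothetical helper: build toy per-frame scores from class indices."""
    return [[1.0 if c == k else 0.0 for c in range(num_classes)] for k in classes]


# Seven frames decode to three characters; the blank between the two
# runs of class 1 keeps the repeated "a" from being merged.
frames = one_hot([1, 1, 0, 1, 2, 2, 0], num_classes=4)
print(ctc_greedy_decode(frames, "abc"))  # prints "aab"
```

Because the output length is determined by collapsing frames rather than by a fixed number of decoding steps, this kind of decoder copes naturally with label sequences of very different lengths.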

Original language: English
Title of host publication: 2021 21st International Conference on Control, Automation and Systems, ICCAS 2021
Publisher: IEEE Computer Society
Pages: 1523-1528
Number of pages: 6
ISBN (Electronic): 9788993215212
State: Published - 2021
Event: 21st International Conference on Control, Automation and Systems, ICCAS 2021 - Jeju, Korea, Republic of
Duration: 12 Oct 2021 to 15 Oct 2021

Publication series

Name: International Conference on Control, Automation and Systems
Volume: 2021-October
ISSN (Print): 1598-7833

Conference

Conference: 21st International Conference on Control, Automation and Systems, ICCAS 2021
Country/Territory: Korea, Republic of
City: Jeju
Period: 12/10/21 to 15/10/21

Keywords

  • Attention decoder module
  • CTC decoder module
  • End to end frame
  • Scene text recognition
