Natural scene text recognition using convolutional recurrent neural network

Yao Wang, Jongeun Ha

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

In this article, we explore scene text recognition, one of the challenging sub-fields of computer vision. Recently, deep learning has achieved state-of-the-art performance on recognition tasks. We explore the convolutional recurrent neural network (CRNN) architecture for this task, which consists of feature extraction and sequence modeling. Moreover, an attention mechanism is introduced in our study. Unlike many previous scene text recognition systems, the proposed architecture has several advantages: the model can be trained end to end, and the CRNN can handle sequences of arbitrary length. A comparison against the detection results of several mainstream CNN network structures shows that accuracy is improved and false positives are reduced, which demonstrates the effectiveness of the proposed architecture.
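The abstract does not say which form of attention is used, so the following is only a minimal sketch of the general idea, assuming plain dot-product scoring over a sequence of per-column feature vectors. The names `softmax`, `attend`, `query`, `short_seq`, and `long_seq` are hypothetical and do not come from the paper. The sketch shows why the same attention step applies to sequences of any length: the weights are normalized over however many time steps the sequence contains.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, features):
    """Dot-product attention (an assumed scoring function, not the paper's).

    Scores each time step of `features` against `query`, normalizes the
    scores with softmax, and returns (weights, context), where `context`
    is the weighted sum of the feature vectors.
    """
    scores = [sum(q * f for q, f in zip(query, step)) for step in features]
    weights = softmax(scores)
    dim = len(features[0])
    context = [
        sum(w * step[d] for w, step in zip(weights, features))
        for d in range(dim)
    ]
    return weights, context


# Hypothetical feature sequences: a wider input image would yield more steps.
short_seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
long_seq = short_seq + [[0.5, 0.5], [0.0, 0.0]]
query = [1.0, 0.0]

for seq in (short_seq, long_seq):
    weights, context = attend(query, seq)
    print(len(seq), [round(w, 3) for w in weights], [round(c, 3) for c in context])
```

In a full CRNN the feature vectors would come from the convolutional layers and the query from a recurrent decoder state; both are replaced here by hand-written lists to keep the sketch self-contained.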

Original language: English
Title of host publication: ICCSE 2021 - IEEE 16th International Conference on Computer Science and Education
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 789-793
Number of pages: 5
ISBN (Electronic): 9781665414685
DOIs
State: Published - 17 Aug 2021
Event: 16th IEEE International Conference on Computer Science and Education, ICCSE 2021 - Lancaster, United Kingdom
Duration: 17 Aug 2021 to 21 Aug 2021

Publication series

Name: ICCSE 2021 - IEEE 16th International Conference on Computer Science and Education

Conference

Conference: 16th IEEE International Conference on Computer Science and Education, ICCSE 2021
Country/Territory: United Kingdom
City: Lancaster
Period: 17/08/21 to 21/08/21

Keywords

  • Attention Mechanism
  • Convolutional recurrent neural network
  • Scene text recognition
