New RNN Activation Technique for Deeper Networks: LSTCM Cells

Soo Han Kang, Ji Hyeong Han

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Long short-term memory (LSTM) has shown good performance on sequential data, but the gradient vanishing or exploding problem can arise, especially when deeper layers are used to solve complex problems. Thus, in this paper we propose a new LSTM cell, termed long short-time complex memory (LSTCM), that applies an activation function to the cell state instead of the hidden state for better convergence in deep layers. Moreover, we propose a sinusoidal function as the activation function for LSTM and the proposed LSTCM, in place of the hyperbolic tangent. The performance of the proposed LSTCM cell and the sinusoidal activation function is demonstrated through experiments on several natural language benchmark datasets: the Penn Treebank, IWSLT 2015 English-Vietnamese, and WMT 2014 English-German datasets.
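To make the abstract's core idea concrete, below is a minimal NumPy sketch of one plausible reading of an LSTCM-style step: the nonlinearity (sin, replacing tanh) is applied to the cell state itself rather than to the hidden-state output. The gate layout, weight shapes, and the exact placement of the sinusoid are assumptions for illustration, not the authors' published equations.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstcm_step(x, h_prev, c_prev, W, U, b):
    """One step of a hypothetical LSTCM-style cell (illustrative only).

    W: (4*H, D) input weights, U: (4*H, H) recurrent weights, b: (4*H,) bias,
    stacked in the order [input gate, forget gate, candidate, output gate].
    """
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0*H:1*H])          # input gate
    f = sigmoid(z[1*H:2*H])          # forget gate
    g = np.sin(z[2*H:3*H])           # candidate: sinusoid in place of tanh
    o = sigmoid(z[3*H:4*H])          # output gate
    c = np.sin(f * c_prev + i * g)   # activation on the cell state (assumed)
    h = o * c                        # hidden state uses c directly, unsquashed
    return h, c
```

In this reading, the hidden state passes the already-bounded cell state to the next layer without a second nonlinearity, consistent with the abstract's claim that the activation moves from the hidden state to the cell state.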

Original language: English
Article number: 9269980
Pages (from-to): 214625-214632
Number of pages: 8
Journal: IEEE Access
Volume: 8
DOIs
State: Published - 2020

Keywords

  • language modeling
  • long short-term memory
  • neural machine translation
