Abstract
Long short-term memory (LSTM) networks perform well on sequential data, but the vanishing or exploding gradient problem can arise, especially when deeper layers are used to solve complex problems. Thus, in this paper, we propose a new LSTM cell, termed long short-time complex memory (LSTCM), that applies an activation function to the cell state instead of the hidden state for better convergence in deep layers. Moreover, we propose a sinusoidal function as the activation function for LSTM and the proposed LSTCM in place of the hyperbolic tangent. The performance capabilities of the proposed LSTCM cell and the sinusoidal activation function are demonstrated through experiments on several natural language benchmark datasets, in this case the Penn Treebank, IWSLT 2015 English-Vietnamese, and WMT 2014 English-German datasets.
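The abstract describes the modification only at a high level. The following is a minimal sketch of one plausible reading: the sinusoidal activation is applied when updating the cell state itself, and the hidden state is the gated cell state with no further nonlinearity. The function name `lstcm_step`, the stacked-weight layout, and the placement of `sin` are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstcm_step(x, h_prev, c_prev, W, U, b):
    """One step of a hypothetical LSTCM-style cell (sketch only).

    Gates follow the standard LSTM layout; the sinusoidal activation is
    applied to the cell state, and the hidden state is the gated cell
    state with no further nonlinearity (one reading of the abstract).
    """
    z = W @ x + U @ h_prev + b      # stacked pre-activations, shape (4H,)
    H = h_prev.shape[0]
    i = sigmoid(z[0:H])             # input gate
    f = sigmoid(z[H:2*H])           # forget gate
    o = sigmoid(z[2*H:3*H])         # output gate
    g = np.sin(z[3*H:4*H])          # candidate, sinusoidal instead of tanh
    c = np.sin(f * c_prev + i * g)  # activation on the cell state ...
    h = o * c                       # ... instead of on the hidden state
    return h, c

# Toy usage: random weights, a short input sequence.
rng = np.random.default_rng(0)
D, H = 3, 4
W = rng.normal(scale=0.1, size=(4 * H, D))
U = rng.normal(scale=0.1, size=(4 * H, H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(5, D)):
    h, c = lstcm_step(x, h, c, W, U, b)
print(h)
```

Because `sin` is bounded like `tanh` but has bounded, non-saturating-everywhere derivatives, the intent (per the abstract) is better gradient flow through deep stacks; the sketch above only illustrates where the activation would sit.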
| Original language | English |
|---|---|
| Article number | 9269980 |
| Pages (from-to) | 214625-214632 |
| Number of pages | 8 |
| Journal | IEEE Access |
| Volume | 8 |
| DOIs | |
| State | Published - 2020 |
Keywords
- language modeling
- long short-term memory
- neural machine translation