Applying convolution filter to matrix of word-clustering based document representation

Research output: Contribution to journal > Article > peer-review

7 Scopus citations

Abstract

Word-clustering based document representation approaches have been suggested recently to overcome previous limitations such as high dimensionality or loss of innate interpretation; they show higher classification performance than other recent methods. Building on this, we present a novel way to combine the advantages of various word-clustering based representation approaches. Whereas previous approaches represent documents in vector form, we represent documents in matrix form by concatenating the results of various representations. We also propose a novel way to apply a convolution filter to this matrix representation while rearranging its elements so that semantic distance is preserved. To verify the representation performance of the proposed methods, we used three datasets: customer-voice data from LG Electronics, the public Reuters news dataset, and the 20 Newsgroups dataset. The results demonstrate that the proposed method outperforms all other methods, achieving classification accuracies of 88.73%, 89.16%, and 88.06% on the three datasets, respectively.
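The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy vocabulary, the cluster centroids, the two weighting schemes (raw count and binary presence), and the helper names are all hypothetical, and a projection onto the first principal axis stands in for the t-SNE-based rearrangement named in the keywords. The sketch stacks two cluster-based document vectors into a matrix, reorders the columns so that semantically close clusters sit next to each other, and then applies a small convolution filter along each row.

```python
import numpy as np

def cluster_histogram(doc_tokens, word_to_cluster, n_clusters):
    """Count how many tokens of the document fall into each word cluster."""
    hist = np.zeros(n_clusters)
    for tok in doc_tokens:
        c = word_to_cluster.get(tok)
        if c is not None:  # skip out-of-vocabulary tokens
            hist[c] += 1
    return hist

def order_by_first_component(centroids):
    """Order clusters along the first principal axis of their centroids.

    Assumption: a simple stand-in for a t-SNE-based ordering.
    """
    centered = centroids - centroids.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return np.argsort(centered @ vt[0])

def smooth_rows(matrix, kernel):
    """Apply a 1-D convolution filter along each row, keeping the row length."""
    return np.vstack([np.convolve(row, kernel, mode="same") for row in matrix])

# Hypothetical word clusters and 2-D cluster centroids (illustration only).
word_to_cluster = {"battery": 0, "charger": 0, "screen": 1,
                   "display": 1, "refund": 2, "return": 2}
centroids = np.array([[0.0, 1.0], [5.0, 1.2], [2.0, 0.9]])

doc = ["battery", "charger", "screen", "refund", "battery"]
counts = cluster_histogram(doc, word_to_cluster, n_clusters=3)
presence = (counts > 0).astype(float)

# Matrix form: one row per representation, one column per word cluster.
doc_matrix = np.vstack([counts, presence])

# Rearrange columns so neighbouring columns are semantically close clusters.
order = order_by_first_component(centroids)
rearranged = doc_matrix[:, order]

# Convolution filter: each entry is mixed with its neighbouring clusters.
kernel = np.array([0.25, 0.5, 0.25])
smoothed = smooth_rows(rearranged, kernel)
print(smoothed)
```

Reordering before filtering matters in this sketch because the filter mixes adjacent columns; without a semantically meaningful column order, the mixing would combine unrelated clusters.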

Original language: English
Pages (from-to): 210-220
Number of pages: 11
Journal: Neurocomputing
Volume: 315
State: Published - 13 Nov 2018

Keywords

  • Convolution filter
  • Document representation
  • Linear transformation
  • Matrix representation
  • t-SNE
  • Word clustering
