TY - JOUR
T1 - An improved semi-supervised dimensionality reduction using feature weighting
T2 - Application to sentiment analysis
AU - Kim, Kyoungok
N1 - Publisher Copyright: © 2018 Elsevier Ltd
PY - 2018/11/1
Y1 - 2018/11/1
AB - Analyzing a large number of documents for sentiment analysis entails considerable complexity and cost. To alleviate this burden, dimensionality reduction has been applied to documents as a preprocessing step. Among dimensionality reduction approaches, feature extraction can reduce information loss and achieve higher discriminating power in sentiment classification than feature selection. However, feature extraction suffers from a lack of interpretability, and many nonlinear extraction methods, which generally outperform linear methods, are not applicable to sentiment classification because they provide only the corresponding low-dimensional coordinates without a mapping. Therefore, this research proposes an improved semi-supervised dimensionality reduction framework that preserves the advantages of feature extraction while addressing its drawbacks for sentiment classification. The proposed framework is based mainly on linear feature extraction, which provides a mapping, and applies feature weighting before feature extraction. Feature weighting and extraction are both conducted in a semi-supervised manner so that both label information and structural information of the data can be considered. The superiority of both feature weighting and feature extraction was verified through extensive experiments on six benchmark datasets.
KW - Feature extraction
KW - Feature weighting
KW - Natural language processing (NLP)
KW - Semi-supervised dimensionality reduction
KW - Sentiment analysis
UR - http://www.scopus.com/inward/record.url?scp=85047330895&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2018.05.023
DO - 10.1016/j.eswa.2018.05.023
M3 - Article
AN - SCOPUS:85047330895
SN - 0957-4174
VL - 109
SP - 49
EP - 65
JO - Expert Systems with Applications
JF - Expert Systems with Applications
ER -