TY - GEN
T1 - Semi-Supervised Learning based on Auto-generated Lexicon using XAI in Sentiment Analysis
AU - Hwang, Hohyun
AU - Lee, Younghoon
N1 - Publisher Copyright: © 2021 Incoma Ltd. All rights reserved.
PY - 2021
Y1 - 2021
AB - In this study, we propose a novel lexicon-based pseudo-labeling method that uses an explainable AI (XAI) approach. Existing approaches have a fundamental limitation in robustness: a poor classifier produces inaccurate soft labels, which in turn lead to a poor classifier, and the cycle repeats. In contrast, we generate a lexicon consisting of sentiment words based on explainability scores. We then calculate the confidence of unlabeled data with the lexicon and add the data to the labeled dataset for robust pseudo-labeling. Our proposed method makes three contributions. First, it automatically generates a lexicon based on XAI and performs independent pseudo-labeling, thereby guaranteeing higher performance and robustness than the existing approach. Second, since lexicon-based pseudo-labeling is performed without retraining in most models, time efficiency increases considerably. Third, the generated high-quality lexicon can be used for sentiment analysis of data from similar domains. The effectiveness and efficiency of the proposed method were verified through a quantitative comparison with the existing pseudo-labeling method and a qualitative review of the generated lexicon.
UR - http://www.scopus.com/inward/record.url?scp=85123633203&partnerID=8YFLogxK
U2 - 10.26615/978-954-452-072-4_067
DO - 10.26615/978-954-452-072-4_067
M3 - Conference contribution
AN - SCOPUS:85123633203
T3 - International Conference Recent Advances in Natural Language Processing, RANLP
SP - 593
EP - 600
BT - International Conference Recent Advances in Natural Language Processing, RANLP 2021
A2 - Angelova, Galia
A2 - Kunilovskaya, Maria
A2 - Mitkov, Ruslan
A2 - Nikolova-Koleva, Ivelina
PB - Incoma Ltd
T2 - International Conference on Recent Advances in Natural Language Processing: Deep Learning for Natural Language Processing Methods and Applications, RANLP 2021
Y2 - 1 September 2021 through 3 September 2021
ER -