TY - JOUR
T1 - Improving the Summarization Effectiveness of Abstractive Datasets through Contrastive Learning
AU - Shin, Junho
AU - Lee, Younghoon
N1 - Publisher Copyright:
© 2025 Copyright held by the owner/author(s).
PY - 2025/4/15
Y1 - 2025/4/15
AB - Most studies on abstractive summarization are conducted in a supervised learning framework, aiming to generate a golden summary from the original document. In this process, the model focuses on portions of the document that closely resemble the golden summary to produce a coherent output. Consequently, current methodologies tend to achieve higher performance on extractive datasets compared to abstractive datasets, indicating diminished effectiveness on more abstracted content. To address this, our study proposes a methodology that maintains high effectiveness on abstractive datasets. Specifically, we introduce a multi-task learning approach that incorporates both salient and non-salient information during training. This is implemented by adding a contrastive objective to the fine-tuning phase of an encoder-decoder language model. Salient and non-salient parts are selected based on ROUGE-L F1 scores, and their relationships are learned through a triplet loss function. The proposed method is evaluated on five benchmark summarization datasets, including two extractive and three abstractive datasets. Experimental results demonstrate significant performance improvements on abstractive datasets, particularly those with high levels of abstraction, compared to existing abstractive summarization methods.
KW - abstractive dataset
KW - abstractive summarization
KW - contrastive attention
KW - Text summarization
UR - https://www.scopus.com/pages/publications/105009042137
U2 - 10.1145/3716851
DO - 10.1145/3716851
M3 - Article
AN - SCOPUS:105009042137
SN - 2157-6904
VL - 16
JO - ACM Transactions on Intelligent Systems and Technology
JF - ACM Transactions on Intelligent Systems and Technology
IS - 3
M1 - 52
ER -