Normalized class coherence change-based kNN for classification of imbalanced data

Research output: Contribution to journal › Article › peer-review

27 Scopus citations

Abstract

kNN is widely used across many domains because of its simplicity and its fairly good performance in practice. This study aims to enhance the performance of kNN on imbalanced datasets, a topic that has received relatively little attention in kNN research. The proposed algorithm, called the normalized class coherence change-based k-nearest neighbor (NCC-kNN) algorithm, determines the label of a test sample by computing the normalized class coherence changes at the class and sample levels for every possible class and assigning the sample to the class with the maximum value. It accounts for the tendency of minority classes to show lower class coherence than the majority class. NCC-kNN also uses an adaptive k for the class coherence, which is calculated in a weighted manner to reduce sensitivity to the choice of k. NCC-kNN was applied to 20 benchmark datasets with varying class imbalance and coherence, and its performance was compared with that of five kNN algorithms and with SMOTE and MetaCost using standard kNN as the base classifier. The proposed NCC-kNN outperformed the other kNN algorithms in the classification of imbalanced data, especially imbalanced data with low positive-class coherence.
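
The abstract gives no formulas, so the following Python sketch only illustrates the general idea of labeling a sample by the relative change in class coherence. It is not the published NCC-kNN. The coherence definition (mean fraction of same-class points among each member's k nearest neighbors), the normalization by the baseline coherence, and every function name are assumptions made for illustration. The sample-level term and the adaptive, weighted k mentioned above are omitted.

```python
import math


def knn_indices(points, i, k):
    """Indices of the k nearest neighbors of points[i], excluding i itself."""
    dists = sorted(
        (math.dist(points[i], p), j) for j, p in enumerate(points) if j != i
    )
    return [j for _, j in dists[:k]]


def class_coherence(points, labels, cls, k):
    """Assumed definition: average, over the members of `cls`, of the
    fraction of their k nearest neighbors that also belong to `cls`."""
    members = [i for i, y in enumerate(labels) if y == cls]
    if not members:
        return 0.0
    total = 0.0
    for i in members:
        nbrs = knn_indices(points, i, k)
        total += sum(labels[j] == cls for j in nbrs) / len(nbrs)
    return total / len(members)


def predict_by_coherence_change(points, labels, x, k=3, eps=1e-9):
    """Tentatively assign x to each class and return the class whose
    coherence shows the largest relative (normalized) change.
    The normalization by the baseline coherence is an assumption."""
    best_cls, best_score = None, -math.inf
    for cls in sorted(set(labels)):
        before = class_coherence(points, labels, cls, k)
        after = class_coherence(points + [x], labels + [cls], cls, k)
        score = (after - before) / (before + eps)
        if score > best_score:
            best_cls, best_score = cls, score
    return best_cls


# Toy imbalanced data: six majority points (class 0), three minority (class 1).
train_points = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0], [10.0], [11.0], [12.0]]
train_labels = [0, 0, 0, 0, 0, 0, 1, 1, 1]

near_minority = predict_by_coherence_change(train_points, train_labels, [11.5])
near_majority = predict_by_coherence_change(train_points, train_labels, [2.5])
```

Dividing by the baseline coherence means that a class with low coherence, as the abstract says minority classes tend to have, registers a larger relative gain from a well-placed sample than an already coherent majority class does.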

Original language: English
Article number: 108126
Journal: Pattern Recognition
Volume: 120
State: Published - Dec 2021

Keywords

  • Class coherence
  • Imbalanced data
  • Nearest neighbor classification
  • kNN
