Improved classification for imbalanced data using ensemble
clustering
Dublin Core
Title
Improved classification for imbalanced data using ensemble
clustering
Subject
Auxiliary features
Classification
Ensemble clustering
Imbalanced data
Minority class
Description
Imbalanced datasets frequently occur in fields like fraud detection and medical
diagnosis, where the number of instances in the majority class vastly exceeds
those in the minority class. Traditional classification algorithms often become
biased towards the majority class in these scenarios. To address this challenge,
we introduce a novel method for imbalanced datasets called improved classification
using ensemble clustering (ICEC). ICEC merges classification
with the strengths of consensus clustering to improve the classifier’s generalization
ability. This approach utilizes a cluster ensemble to capture the structural
characteristics of both the majority and minority classes, and the stable clustering
scheme thus delivered is used to generate new auxiliary features. These
features enhance the existing feature set, helping classifiers develop a more robust
predictive model. Extensive testing on fifteen imbalanced datasets from the
knowledge extraction based on evolutionary learning (KEEL) repository demonstrates
the effectiveness of our proposed method. The approach was evaluated
with random forest (RF) and linear support vector machine (SVM) classifiers on
these datasets. Results indicate that ICEC is effective for both classifiers,
with an observed F1-score improvement of more than 10% for SVM and
3% for RF.
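
The feature-augmentation idea described above can be illustrated with a minimal sketch. It is not the authors' ICEC implementation: the record does not specify the base clusterers, the consensus function, or how cluster membership is encoded. The sketch assumes repeated k-means runs as the ensemble, a co-association matrix cut at a 0.5 threshold as the consensus step, and a one-hot encoding of the consensus cluster as the auxiliary features. All function names, parameters, and the toy data are hypothetical.

```python
import random


def kmeans_labels(points, k, rng, n_iter=20):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign every point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; leave it if the cluster is empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def consensus_labels(points, n_runs=15, k_choices=(2, 3, 4), threshold=0.5, seed=0):
    """Assumed consensus step: link points that co-cluster in more than
    `threshold` of the runs, then take connected components."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    for _ in range(n_runs):
        labels = kmeans_labels(points, rng.choice(k_choices), rng)
        for i in range(n):
            for j in range(i + 1, n):
                together[i][j] += labels[i] == labels[j]

    # Union-find over pairs whose co-association exceeds the threshold.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if together[i][j] / n_runs > threshold:
                parent[find(i)] = find(j)
    roots = sorted({find(i) for i in range(n)})
    return [roots.index(find(i)) for i in range(n)]


def add_auxiliary_features(points, labels):
    """Append a one-hot encoding of the consensus cluster to each feature vector."""
    n_clusters = max(labels) + 1
    return [
        tuple(p) + tuple(1.0 if lab == c else 0.0 for c in range(n_clusters))
        for p, lab in zip(points, labels)
    ]


# Toy imbalanced data (hypothetical): 54 majority points and 6 minority points.
rng = random.Random(1)
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(54)]
minority = [(rng.gauss(6, 0.5), rng.gauss(6, 0.5)) for _ in range(6)]
X = majority + minority

labels = consensus_labels(X)
X_aug = add_auxiliary_features(X, labels)  # original features plus cluster indicators
```

A classifier such as RF or a linear SVM would then be trained on `X_aug` instead of `X`. The sketch omits how unseen test points are mapped to consensus clusters, which a full pipeline would need.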
Creator
Sharanjit Kaur, Manju Bhardwaj, Adi Maqsood, Aditya Maurya, Mayank Kumar, Nishant
Pratap Singh
Source
Journal homepage: http://journal.uad.ac.id/index.php/TELKOMNIKA
Date
Aug 1, 2025
Contributor
PERI IRAWAN
Format
PDF
Language
ENGLISH
Type
TEXT
Citation
Sharanjit Kaur, Manju Bhardwaj, Adi Maqsood, Aditya Maurya, Mayank Kumar, Nishant
Pratap Singh, “Improved classification for imbalanced data using ensemble
clustering,” Repository Horizon University Indonesia, accessed January 12, 2026, https://repository.horizon.ac.id/items/show/10331.