Performance Analysis of Hybrid Machine Learning Methods on
Imbalanced Data (Rainfall Classification)
Dublin Core
Title
Performance Analysis of Hybrid Machine Learning Methods on
Imbalanced Data (Rainfall Classification)
Subject
Rainfall, Machine Learning, Hybrid Methods, Classification, SMOTE
Description
This study analyzes the performance of two hybrid machine learning methods, Voting and Stacking, on rainfall
classification. Each hybrid method combines five classification methods: Logistic Regression, Support Vector Machine,
Random Forest, Artificial Neural Network, and eXtreme Gradient Boosting. The data are Bandung City rainfall records
from 2005 to 2021. A hybrid method is an ensemble: it combines several individual classification models to improve the
performance of the resulting model. The Voting algorithm has weaknesses on imbalanced data, while Stacking does not.
The results show that, when the five machine learning methods are combined on the imbalanced dataset, the Stacking
algorithm achieves an accuracy of 99.60%. With the addition of the SMOTE technique, accuracy rises to 99.71%. Stacking
performs better because it takes the best classification value from each individual model and can overcome the
imbalance. Model evaluation covers not only accuracy but also precision, recall, and F1-score. The contribution of this
research is to identify which hybrid method, Voting or Stacking, gives the better model performance on rainfall
classification.
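The description notes that evaluation looks beyond accuracy, and that Voting is weak on imbalanced data. A minimal Python sketch of why that matters is given here. It is an illustration only, not the study's method: the labels, the three toy "models", and the helper names `hard_vote` and `binary_metrics` are all invented for this example, and the study itself combines five classifiers and also evaluates Stacking and SMOTE, which are not reproduced here.

```python
from collections import Counter


def hard_vote(predictions):
    """Majority (hard) vote; `predictions` is a list of per-model label lists."""
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical imbalanced labels: 95 "no rain" (0) days and 5 "rain" (1) days.
y_true = [0] * 95 + [1] * 5

# Three hypothetical models; the first always predicts the majority class.
always_dry = [0] * 100
model_a = [0] * 95 + [1, 1, 1, 0, 0]
model_b = [0] * 95 + [1, 1, 0, 1, 0]

baseline = binary_metrics(y_true, always_dry)
voted = hard_vote([always_dry, model_a, model_b])
ensemble = binary_metrics(y_true, voted)

print(baseline)  # accuracy 0.95, but recall 0.0 for the rain class
print(ensemble)  # accuracy 0.97, recall only 0.4 for the rain class
```

In this toy case the majority-class model reaches 95% accuracy without detecting a single rain day, and the voted ensemble reaches 97% accuracy while recovering only two of the five rain days, which is why precision, recall, and F1-score are reported alongside accuracy.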
Creator
Aditya Gumilar, Sri Suryani Prasetiyowati, Yuliant Sibaroni
Publisher
Telkom University
Date
15-07-2022
Contributor
Fajar Bagus W
Format
PDF
Language
Indonesia
Type
Text
Files
Collection
Citation
Aditya Gumilar, Sri Suryani Prasetiyowati, Yuliant Sibaroni, “Performance Analysis of Hybrid Machine Learning Methods on Imbalanced Data (Rainfall Classification),” Repository Horizon University Indonesia, accessed June 27, 2025, https://repository.horizon.ac.id/items/show/9192.