A Gaussian Naive Bayes and SMOTE-Based Approach for Predicting
Breast Cancer Aggressiveness in Imbalanced Datasets

Dublin Core

Title

Subject

Breast Cancer, Gaussian Naive Bayes, Classification, SMOTE, Medical Diagnosis, Machine Learning.

Description

Breast cancer remains one of the leading causes of death among women worldwide, making early and accurate detection essential to improving
patient outcomes. This study aims to develop a predictive model for breast cancer aggressiveness using the Gaussian Naive Bayes algorithm on
the Breast Cancer Wisconsin Diagnostic Dataset. The dataset contains 569 instances with 30 numerical features representing various cell
characteristics. Preprocessing steps included data cleaning, label encoding, and Min-Max normalization. The model was evaluated using
accuracy, precision, recall, F1-score, and a confusion matrix. Initially, the model achieved an accuracy of 78.88%; however, the recall for
malignant cases was relatively low at 45.5%, highlighting a critical limitation in detecting aggressive cancer. To address class imbalance and
improve model sensitivity, the Synthetic Minority Oversampling Technique (SMOTE) was applied. While detailed post-SMOTE metrics were
not reported in this version, the approach is expected to enhance recall and F1-score for the malignant class. This research demonstrates the
potential of Gaussian Naive Bayes, combined with data balancing techniques, as a fast and interpretable tool for early breast cancer diagnosis.
Future work will focus on model comparison, cross-validation, and statistical evaluation to improve robustness and reliability.
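
The pipeline summarized above (Min-Max normalization, Gaussian Naive Bayes, SMOTE oversampling, recall for the malignant class) can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, the two-feature toy data is invented for illustration and is not the Wisconsin dataset, and the SMOTE step is a simplified version of the technique. In practice, library implementations such as scikit-learn's GaussianNB and imbalanced-learn's SMOTE would normally be used.

```python
import math
import random

def min_max_scale(rows):
    """Rescale every feature column to the range [0, 1]."""
    cols = list(zip(*rows))
    lo, hi = [min(c) for c in cols], [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(r, lo, hi)] for r in rows]

def fit_gnb(X, y, eps=1e-9):
    """Estimate the prior, feature means and feature variances per class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        means = [sum(c) / n for c in zip(*rows)]
        variances = [sum((v - m) ** 2 for v in c) / n + eps
                     for c, m in zip(zip(*rows), means)]
        model[label] = (n / len(X), means, variances)
    return model

def predict_gnb(model, x):
    """Return the class with the largest Gaussian log posterior."""
    def log_post(prior, means, variances):
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances))
    return max(model, key=lambda c: log_post(*model[c]))

def smote(minority, n_new, k=3, seed=0):
    """Simplified SMOTE: interpolate between a minority sample and one of
    its k nearest minority neighbours (needs at least two minority rows)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

def recall(y_true, y_pred, positive):
    """Fraction of truly positive samples that were predicted positive."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if tp + fn else 0.0

# Invented toy data: 12 "benign" (label 0) and 4 "malignant" (label 1) rows.
benign = [[1.0 + 0.1 * i, 2.0 + 0.05 * (i % 4)] for i in range(12)]
malignant = [[4.0 + 0.2 * i, 5.0 + 0.1 * (i % 3)] for i in range(4)]

X = min_max_scale(benign + malignant)
y = [0] * len(benign) + [1] * len(malignant)

# Oversample the minority class until both classes have the same size.
minority = [x for x, t in zip(X, y) if t == 1]
synthetic = smote(minority, n_new=len(benign) - len(malignant))
X_bal, y_bal = X + synthetic, y + [1] * len(synthetic)

model = fit_gnb(X_bal, y_bal)
y_pred = [predict_gnb(model, x) for x in X]
r = recall(y, y_pred, positive=1)
print("malignant recall on the toy data:", r)
```

For brevity the sketch scores the model on the same rows it was trained on. In a real evaluation, oversampling should be applied to the training split only, so that synthetic samples never leak into the test data.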

Creator

Deshinta Arrova Dewi, Tri Basuki Kurniawan

Source

https://ijiis.org/index.php/IJIIS/article/view/250/158

Publisher

INTI International University, Malaysia

Date

January 2025

Contributor

Fajar Bagus W

Format

PDF

Language

English

Type

Text

Files

250-766-1-PB.pdf

Collection

VOL 8, NO 1
JANUARY 2025

Citation

Deshinta Arrova Dewi and Tri Basuki Kurniawan, “A Gaussian Naive Bayes and SMOTE-Based Approach for Predicting
Breast Cancer Aggressiveness in Imbalanced Datasets,” Repository Horizon University Indonesia, accessed January 26, 2026, https://repository.horizon.ac.id/items/show/9726.
