XGBoost Algorithm for Cervical Cancer Risk Prediction: Multi-dimensional Feature Analysis
Dublin Core
Title
XGBoost Algorithm for Cervical Cancer Risk Prediction: Multi-dimensional Feature Analysis
Subject
cervical cancer screening; computational oncology; machine learning; risk stratification; XGBoost
Description
Cervical cancer remains a significant global health challenge, and early detection is the cornerstone of effective intervention. This study sits at the intersection of clinical oncology and computational intelligence and explores whether gradient-boosting algorithms can overcome the limitations of conventional screening methods. An XGBoost model incorporating demographic, behavioral, and clinical parameters was developed to predict cervical cancer risk, using data from 858 patients at the Hospital Universitario de Caracas. The preprocessing pipeline addressed the complexities inherent in medical data, including the handling of missing values and the standardization of heterogeneous features. The model achieved an overall accuracy of 96.3%, with a sensitivity of 66.7% and a specificity of 97.6%. This performance profile reflects the trade-off between missed diagnoses and unnecessary interventions. Feature importance analysis revealed a multifaceted risk landscape in which screening test results contributed substantial predictive power (approximately 60%), complemented by demographic and behavioral factors including age, reproductive history, and contraceptive use. Confusion matrix analysis clarified the clinical implications of the model's predictions, showing a promising positive predictive value of 55.0% despite pronounced class imbalance. These findings suggest that ensemble learning approaches can synthesize diverse patient data into meaningful risk assessments, potentially improving screening efficiency through personalized stratification. Future research directions include prospective validation across diverse populations, integration of longitudinal data, and further exploration of explainable AI techniques to bridge the gap between algorithmic predictions and clinical implementation.
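The four figures quoted above (accuracy, sensitivity, specificity, positive predictive value) are all ratios of confusion-matrix cells. The minimal Python sketch below shows how they relate. The function name `screening_metrics` and the cell counts are assumptions made for illustration: the counts are hypothetical, chosen only so that the ratios round to the percentages in the description, and they are not taken from the study.

```python
def screening_metrics(tp, fn, fp, tn):
    """Return accuracy, sensitivity, specificity, and PPV as fractions.

    tp/fn: positive cases that were flagged / missed
    fp/tn: negative cases that were flagged / correctly cleared
    """
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,     # all correct calls over all cases
        "sensitivity": tp / (tp + fn),     # share of true cases detected
        "specificity": tn / (tn + fp),     # share of non-cases cleared
        "ppv": tp / (tp + fp),             # share of flagged cases that are real
    }


# Hypothetical counts (NOT the study's confusion matrix), picked so the
# ratios round to the reported 96.3%, 66.7%, 97.6%, and 55.0%.
metrics = screening_metrics(tp=22, fn=11, fp=18, tn=732)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With so few positive cases, even a small number of false positives pulls the positive predictive value far below the specificity, which is why a 97.6% specificity can coexist with a 55.0% positive predictive value under class imbalance.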
Creator
Sudi Suryadi, Masrizal
Source
https://jurnal.iaii.or.id/index.php/RESTI/article/view/6587/1085
Publisher
Information System, Faculty of Science and Technology, Universitas Labuhanbatu, Rantauprapat, Indonesia
Date
June 21, 2025
Contributor
FAJAR BAGUS W
Format
PDF
Language
ENGLISH
Type
TEXT
Citation
Sudi Suryadi and Masrizal, “XGBoost Algorithm for Cervical Cancer Risk Prediction: Multi-dimensional Feature Analysis,” Repository Horizon University Indonesia, accessed January 27, 2026, https://repository.horizon.ac.id/items/show/10514.