Optimized multi correlation-based feature selection in software defect prediction
Dublin Core
Title
Optimized multi correlation-based feature selection in software defect prediction
Subject
Correlation-based
Feature selection
High dimensional
Noisy attribute
Software defect
Description
In software defect prediction, noisy attributes and high-dimensional data remain critical challenges. This paper introduces a novel approach, multi correlation-based feature selection (MCFS), to address them. MCFS integrates two feature selection techniques, correlation-based feature selection (CFS) and correlation matrix-based feature selection (CMFS), with the aim of reducing data dimensionality and eliminating noisy attributes. CFS and CMFS are applied independently to filter the datasets, and a weighted average of their outcomes is computed to determine the optimal feature selection. This approach both reduces data dimensionality and mitigates the impact of noisy attributes. To further enhance predictive performance, the paper uses the particle swarm optimization (PSO) algorithm as a feature selection mechanism, specifically targeting improvements in the area under the curve (AUC). The proposed method is evaluated on 12 benchmark datasets from the NASA metrics data program (MDP) corpus, which are known for their noisy attributes, high dimensionality, and imbalanced class records. The findings show that MCFS outperforms CFS and CMFS, yielding an average AUC of 0.891 and demonstrating its efficacy in improving classification performance for software defect prediction with k-nearest neighbors (KNN) classification.
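The weighted-average step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: this record does not say how the CFS and CMFS outcomes are represented, how they are normalized, which weights are used, or how many features are kept. The sketch assumes each technique yields one relevance score per feature, rescales both score lists to [0, 1], and keeps the top k features by combined score. The function names, the `weight_a` parameter, and the example scores are all hypothetical, and the PSO and KNN stages are not shown.

```python
from typing import List, Sequence


def min_max_scale(scores: Sequence[float]) -> List[float]:
    """Rescale scores to [0, 1] so the two methods are comparable (an assumption)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores equal: no feature is preferred by this method.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def combine_scores(
    scores_a: Sequence[float],
    scores_b: Sequence[float],
    weight_a: float = 0.5,
) -> List[float]:
    """Weighted average of two per-feature score lists (hypothetical weighting)."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both score lists must cover the same features")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    a = min_max_scale(scores_a)
    b = min_max_scale(scores_b)
    return [weight_a * x + (1.0 - weight_a) * y for x, y in zip(a, b)]


def select_top_k(combined: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest combined scores, in ascending index order."""
    ranked = sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)
    return sorted(ranked[:k])


# Made-up scores for four features; real values would come from CFS and CMFS.
cfs_like_scores = [0.9, 0.1, 0.5, 0.3]
cmfs_like_scores = [0.4, 0.8, 0.6, 0.2]
selected = select_top_k(combine_scores(cfs_like_scores, cmfs_like_scores), k=2)
```

With equal weights, a feature ranked highly by only one technique can still be displaced by a feature that both techniques rate moderately well, which is one plausible reading of how averaging dampens the effect of noisy attributes.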
Creator
Muhammad Nabil Muyassar Rahman, Radityo Adi Nugroho, Mohammad Reza Faisal, Friska Abadi, Rudy Herteno
Source
Journal homepage: http://telkomnika.uad.ac.id
Date
Jan 19, 2024
Contributor
PERI IRAWAN
Format
PDF
Language
ENGLISH
Type
TEXT
Citation
Muhammad Nabil Muyassar Rahman, Radityo Adi Nugroho, Mohammad Reza Faisal, Friska Abadi, Rudy Herteno, “Optimized multi correlation-based feature selection in software defect prediction,” Repository Horizon University Indonesia, accessed February 3, 2026, https://repository.horizon.ac.id/items/show/10109.