Applied Sciences, Vol. 9, No. 13 (2019)
Title

Performance Analysis of Feature Selection Methods in Software Defect Prediction: A Search Method Approach

Abdullateef Oluwagbemiga Balogun, Shuib Basri, Said Jadid Abdulkadir and Ahmad Sobri Hashim

Abstract

Software Defect Prediction (SDP) models are built using software metrics derived from software systems. The quality of an SDP model depends largely on the quality of the software metrics (the dataset) used to build it. High dimensionality is one of the data quality problems that affect the performance of SDP models. Feature selection (FS) is a proven method for addressing the dimensionality problem. However, the choice of FS method for SDP remains an open problem, as most empirical studies on FS methods for SDP report contradictory and inconsistent outcomes. These FS methods behave differently because of their different underlying computational characteristics. This could be due to the search methods used in FS, because the impact of FS depends on the choice of search method. It is therefore necessary to compare the performance of FS methods across different search methods in SDP. In this paper, four filter feature ranking (FFR) and fourteen filter feature subset selection (FSS) methods were evaluated using four different classifiers over five software defect datasets obtained from the National Aeronautics and Space Administration (NASA) repository. The experimental analysis showed that applying FS improves the predictive performance of classifiers, and that the performance of FS methods can vary across datasets and classifiers. Among the FFR methods, Information Gain demonstrated the greatest improvements in the performance of the prediction models. Among the FSS methods, Consistency Feature Subset Selection based on Best First Search had the best influence on the prediction models. However, prediction models based on FFR proved to be more stable than those based on FSS methods. Hence, we conclude that FS methods improve the performance of SDP models, and that there is no single best FS method, as their performance varied according to the dataset and the choice of prediction model. However, we recommend the use of FFR methods, as prediction models based on FFR are more stable in terms of predictive performance.
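
To make the idea of filter feature ranking concrete, the following is a minimal sketch of ranking features by Information Gain, the FFR method the abstract singles out. It is an illustration only, not the authors' implementation: the abstract does not state which tooling was used, and the function names (`entropy`, `information_gain`, `rank_features`), the feature names, and the toy data are all hypothetical. It assumes features have already been discretized.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels, top_k):
    """Return the names of the top_k features, ordered by information gain."""
    scores = {name: information_gain(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Made-up toy data: discretized metrics for six modules (1 = defective).
labels = [1, 1, 1, 0, 0, 0]
columns = {
    "loc_bucket": ["high", "high", "high", "low", "low", "low"],
    "complexity_bucket": ["high", "high", "low", "high", "low", "low"],
    "comment_bucket": ["a", "b", "a", "a", "b", "a"],
}

ranking = rank_features(columns, labels, top_k=2)
print(ranking)
```

A ranking method of this kind scores each feature independently of the others and keeps the top-scoring ones, whereas subset selection methods such as those in the FSS group search over combinations of features, which is where the choice of search method comes in.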
