A Framework for Software Defect Prediction and Metric Selection

Shamsul Huda, Sultan Alyahya, Md Mohsin Ali, Shafiq Ahmad*, Jemal Abawajy, Hmood Al-Dossari, John Yearwood

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

56 Citations (Scopus)

Abstract

Automated software defect prediction is an important and fundamental activity in the domain of software development. However, modern software systems are inherently large and complex, with numerous correlated metrics that capture different aspects of the software components. This large number of correlated metrics makes building a software defect prediction model very complex. Thus, identifying and selecting a subset of metrics that enhances the performance of the software defect prediction method is an important but challenging problem that has received little attention in the literature. The main objective of this paper is to identify significant software metrics and to build and evaluate an automated software defect prediction model. We propose two novel hybrid software defect prediction models to identify the significant attributes (metrics) using a combination of wrapper and filter techniques. The novelty of our approach is that it embeds the metric selection and training processes of software defect prediction as a single process while reducing the measurement overhead significantly. Different wrapper approaches, including SVM and ANN, were combined with a maximum relevance filter approach to find the significant metrics. A filter score was injected into the wrapper selection process in the proposed approaches to direct the search process efficiently toward the significant metrics. Experimental results with real defect-prone software data sets show that the proposed hybrid approaches select a significantly more compact set of metrics (i.e., the most significant metrics) with high prediction accuracy compared with conventional wrapper or filter approaches. The performance of the proposed framework has also been verified using a statistical multivariate quality control process based on the multivariate exponentially weighted moving average (MEWMA). The proposed framework demonstrates that the hybrid heuristic can guide the metric selection process in a computationally efficient way by integrating the intrinsic characteristics from the filters into the wrapper and using the advantages of both the filter and wrapper approaches.
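The idea of injecting a filter score into a wrapper search can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not give the exact scoring rule, so the example below assumes a greedy forward search, uses absolute Pearson correlation as a stand-in for the maximum relevance filter, uses a leave-one-out nearest-centroid classifier as a stand-in for the SVM or ANN wrapper, and combines the two with a hypothetical weight `alpha`. All function names and the toy data are illustrative only.

```python
from statistics import mean


def pearson_abs(xs, ys):
    """Absolute Pearson correlation; returns 0.0 for a constant input."""
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / (vx * vy) ** 0.5


def filter_scores(X, y):
    """Filter step (assumed): relevance of each metric to the defect label."""
    return [pearson_abs([row[j] for row in X], y) for j in range(len(X[0]))]


def loo_accuracy(X, y, subset):
    """Wrapper step (assumed): leave-one-out accuracy of a nearest-centroid
    classifier that only sees the metrics listed in `subset`."""
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in set(y):
            rows = [X[r] for r in range(len(X)) if r != i and y[r] == label]
            if rows:
                centroids[label] = [mean(row[j] for row in rows) for j in subset]
        point = [X[i][j] for j in subset]
        pred = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += pred == y[i]
    return correct / len(X)


def hybrid_select(X, y, k, alpha=0.5):
    """Greedy forward selection of k metrics. Each candidate is ranked by
    wrapper accuracy plus alpha times its filter score (hypothetical rule)."""
    relevance = filter_scores(X, y)
    selected, remaining = [], list(range(len(X[0])))
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda j: loo_accuracy(X, y, selected + [j]) + alpha * relevance[j],
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: metric 0 separates the classes; metrics 1 and 2 carry no signal.
X_toy = [
    [1, 5, 3], [2, 1, 4], [1, 4, 3], [2, 2, 4],
    [8, 5, 4], [9, 1, 3], [8, 4, 4], [9, 2, 3],
]
y_toy = [0, 0, 0, 0, 1, 1, 1, 1]

if __name__ == "__main__":
    print(hybrid_select(X_toy, y_toy, k=2))  # metric 0 is picked first
```

In this sketch the filter term mainly breaks ties and steers the search toward relevant metrics when several candidates give similar wrapper accuracy, which is one plausible reading of how a filter score can direct a wrapper search.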

Original language: English
Pages (from-to): 2844-2858
Number of pages: 15
Journal: IEEE Access
Volume: 6
Publication status: Published - 27 Dec 2017
Externally published: Yes
