Loading…

An Improved SDA Based Defect Prediction Framework for Both Within-Project and Cross-Project Class-Imbalance Problems

Background. Solving the class-imbalance problem of within-project software defect prediction (SDP) is an important research topic. Although some class-imbalance learning methods have been presented, there exists room for improvement. For cross-project SDP, we found that the class-imbalanced source u...

Full description

Saved in:
Bibliographic Details
Published in:IEEE transactions on software engineering 2017-04, Vol.43 (4), p.321-339
Main Authors: Jing, Xiao-Yuan, Wu, Fei, Dong, Xiwei, Xu, Baowen
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Background. Solving the class-imbalance problem of within-project software defect prediction (SDP) is an important research topic. Although some class-imbalance learning methods have been presented, there exists room for improvement. For cross-project SDP, we found that the class-imbalanced source usually leads to misclassification of defective instances. However, only one work has paid attention to this cross-project class-imbalance problem. Objective. We aim to provide effective solutions for both within-project and cross-project class-imbalance problems. Method. Subclass discriminant analysis (SDA), an effective feature learning method, is introduced to solve the problems. It can learn features with more powerful classification ability from original metrics. For within-project prediction, we improve SDA for achieving balanced subclasses and propose the improved SDA (ISDA) approach. For cross-project prediction, we employ the semi-supervised transfer component analysis (SSTCA) method to make the distributions of source and target data consistent, and propose the SSTCA+ISDA prediction approach. Results. Extensive experiments on four widely used datasets indicate that: 1) ISDA-based solution performs better than other state-of-the-art methods for within-project class-imbalance problem; 2) SSTCA+ISDA proposed for cross-project class-imbalance problem significantly outperforms related methods. Conclusion. Within-project and cross-project class-imbalance problems greatly affect prediction performance, and we provide a unified and effective prediction framework for both problems.
ISSN:0098-5589
1939-3520
DOI:10.1109/TSE.2016.2597849