An Empirical Study Of Ranking-oriented Cross-project Software Defect Prediction

Posted on:2018-01-05

Degree:Master

Type:Thesis

Country:China

Candidate:G A You

Full Text:PDF

GTID:2428330512483577

Subject:Computer software and theory

Abstract/Summary:

In recent years, software defect prediction has become popular in the fields of software quality assurance and software maintenance. Because within-project defect prediction depends heavily on a project's own historical data, it is difficult to train an effective prediction model for a new software project. A feasible solution is to train the prediction model on the historical data of other projects, which is known as cross-project defect prediction (CPDP). Many previous studies treat CPDP as a binary classification or regression problem. Such approaches only predict the defect proneness of a given software entity (such as a class or module), which is a considerable limitation, especially when the development schedule is tight and human resources are scarce. For software developers and testers, the ranking of software entities is particularly important, because it gives them an objective basis for improving and repairing those entities. However, few studies have been reported in this area. Based on an analysis and summary of machine learning and statistical methods, this thesis presents a systematic study of cross-project software defect prediction. The main work and contributions are summarized as follows:

(1) We formulate cross-project software defect prediction as a ranking problem. Inspired by the pointwise approach of Learning to Rank (LTR), we propose a ranking-oriented cross-project defect prediction method called ROCPDP. The method ranks software entities according to the number of defects they contain. To obtain accurate ranking results, we train a multiple linear regression model using gradient descent optimization. Because software defect data is high-dimensional, which increases the training cost of the model and may cause over-fitting, the method uses PCA to reduce the feature dimensionality. To make the gradient descent process converge quickly, we apply Z-score normalization to the features (software metrics) before model training.

(2) To verify the effectiveness of ROCPDP for ranking, we carried out several experiments. A case study on data sets collected from AEEEM and PROMISE shows that ROCPDP is superior to eight other baseline methods in one-to-one and many-to-one CPDP scenarios. In addition, in the many-to-one scenarios, ROCPDP is by and large comparable to the best baseline method applied in a specific within-project defect prediction setting.
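The pipeline outlined in contribution (1) can be illustrated with a rough sketch: Z-score the software metrics, project them with PCA, fit a multiple linear regression by batch gradient descent, and rank the target entities by predicted defect count. This is only an illustration under stated assumptions, not the thesis's implementation. The function names, the order of the Z-score and PCA steps, the reuse of source-project statistics on the target project, and the hyperparameters (`n_components`, `lr`, `epochs`) are all hypothetical choices made here.

```python
import numpy as np


def fit_rocpdp_sketch(X_src, y_src, n_components=2, lr=0.01, epochs=5000):
    """Fit Z-score -> PCA -> linear regression on source-project data.

    All names and defaults are illustrative assumptions, not taken
    from the thesis.
    """
    X_src = np.asarray(X_src, dtype=float)
    y_src = np.asarray(y_src, dtype=float)

    # Z-score each software metric using source-project statistics.
    mu = X_src.mean(axis=0)
    sigma = X_src.std(axis=0)
    sigma[sigma == 0] = 1.0  # guard against constant metrics
    Z = (X_src - mu) / sigma

    # PCA via SVD: rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    W = Vt[:n_components].T
    P = Z @ W

    # Multiple linear regression trained by batch gradient descent
    # on the mean squared error.
    w = np.zeros(P.shape[1])
    b = 0.0
    n = len(y_src)
    for _ in range(epochs):
        err = P @ w + b - y_src
        w -= lr * (2.0 / n) * (P.T @ err)
        b -= lr * (2.0 / n) * err.sum()

    return {"mu": mu, "sigma": sigma, "W": W, "w": w, "b": b}


def rank_entities(model, X_tgt):
    """Return target-entity indices, most defect-prone first, and scores."""
    X_tgt = np.asarray(X_tgt, dtype=float)
    Z = (X_tgt - model["mu"]) / model["sigma"]
    scores = Z @ model["W"] @ model["w"] + model["b"]
    order = np.argsort(-scores, kind="stable")
    return order, scores
```

In a cross-project setting, `fit_rocpdp_sketch` would be called on the metrics and defect counts of one or more source projects, and `rank_entities` on the metrics of the target project. The returned order is what a tester would follow when deciding which entities to inspect first.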

Keywords/Search Tags:

Ranking, Single Objective Optimization, Gradient Descent, Multiple Linear Regression

Related items

1	The Research And Application Of Stochastic Gradient Descent And Dual Coordinate Descent Algorithm
2	Application Of Gradient Descent Method In Machine Learning
3	Dynamic Regret Of Online Gradient Descent:Analyses And Applications
4	A Research Of Stochastic Gradient Descent Algorithm
5	Research On Big Data Multiple Regression Prediction Method Based On Hadoop
6	A Ranking Algorithm ListNet Based On Stochastic Gradient Descent
7	Research On Optimization And Application Technology Of Gradient Descent Algorithm In Deep Learning
8	Adaptive Stochastic Ranking Constraint Handling Methods For Evolutionary Optimization
9	Research On Human Face Recognition Under Complex Lighting Conditions
10	Optimization Algorithms Of Neural Networks Weights Based On Stochastic Gradient Descent