Effort-aware Software Defect Prediction Methods Based On Learning To Rank And Clustering

Posted on:2023-03-28

Degree:Master

Type:Thesis

Country:China

Candidate:J Q Rao

GTID:2558307118499334

Subject:Software engineering

Abstract/Summary:

Software defect prediction aims to find defective modules in advance and thereby help assure software quality. However, traditional defect prediction models only predict whether a software module contains defects and do not consider the effort required to inspect it, i.e., inspecting modules with more lines of code requires more effort. Effort-Aware Software Defect Prediction (EASDP) allows testers to find more defects while reviewing a fixed amount of code (i.e., 20% of the lines of code in the dataset) and thus allocates testing resources more effectively. An EASDP model based on a classification algorithm first predicts the probability that a software module contains defects, then divides that probability by the module's lines of code to obtain a defect density, and finally ranks the modules in descending order of defect density. The resulting ranking is poor, however, so the proportion of inspected bugs is low when 20% of the lines of code in the dataset are inspected. In addition, in real software testing scenarios, newly developed projects often lack the historical defect data needed to train a defect prediction model, so how to build an effective EASDP model in this case is also a problem worth studying. To address these problems, this thesis first proposes an EASDP model based on an improved learning to rank algorithm, which is suitable for EASDP tasks in cross-version scenarios. It then proposes an unsupervised EASDP model based on clustering and feature ranking, which is suitable for EASDP tasks when historical defect data is lacking. The specific research work is as follows:

1) To address the low proportion of inspected bugs achieved by models based on classification algorithms, this thesis proposes an EASDP model based on an improved learning to rank algorithm, called the Effort-Aware Learning to Rank Algorithm (EALTR). The model uses the composite differential evolution algorithm to apply crossover and mutation to the model parameters, directly optimizes the model's performance on the effort-aware evaluation metric (i.e., the proportion of inspected bugs), and finally builds the model from the parameters that yield the highest proportion of inspected bugs. The experimental results show that, compared with the recently proposed CBS+ and EASC models, this model improves the proportion of inspected bugs by 32.97% and 54.47%, respectively, when inspecting 20% of the lines of code in the dataset.

2) To address the problem that newly developed projects often lack the historical defect data needed to train defect prediction models, this thesis proposes an unsupervised EASDP model based on Affinity Propagation (AP) clustering and ascending ranking by Average Method Complexity (AMC), referred to as AP-AMC. The model first uses the affinity propagation clustering algorithm to cluster the data of the target project and divide it into two clusters: defective and non-defective. The software modules in each cluster are then ranked in ascending order of the AMC feature, and the ranking of the modules classified as non-defective is appended to the tail of the ranking of the modules classified as defective to form the overall ranking. The experimental results show that, compared with 17 unsupervised models, AP-AMC improves the proportion of inspected bugs by 71.96% on average. Compared with the EALTR model combined with the data filtering method in the cross-project scenario, AP-AMC improves the proportion of inspected bugs by 26.97%. Compared with the EALTR model combined with the transfer learning method in the cross-project scenario, AP-AMC improves the proportion of inspected bugs by 14.20%. However, compared with the supervised EALTR model in the cross-version scenario, AP-AMC shows no significant difference in the proportion of inspected bugs, and its precision is reduced by 19.23%.
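
The classification-based ranking and the effort-aware metric described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names (`rank_by_density`, `pofb_at_effort`), the toy data, and the rule that inspection stops at the first module that no longer fits the line-of-code budget are all assumptions made for illustration.

```python
from typing import List, Sequence


def rank_by_density(prob: Sequence[float], loc: Sequence[int]) -> List[int]:
    """Return module indices sorted by predicted defect density, highest first.

    Density is the predicted defect probability divided by lines of code,
    as in the classification-based EASDP approach described in the abstract.
    """
    density = [p / max(n, 1) for p, n in zip(prob, loc)]  # guard against LOC = 0
    return sorted(range(len(prob)), key=lambda i: density[i], reverse=True)


def pofb_at_effort(order: Sequence[int], loc: Sequence[int],
                   bugs: Sequence[int], effort: float = 0.2) -> float:
    """Proportion of all bugs found when inspecting modules in `order`
    until the budget of `effort` * total LOC is exhausted.

    Assumption: a module counts only if it fits entirely within the budget,
    and inspection stops at the first module that does not fit.
    """
    budget = effort * sum(loc)
    total_bugs = sum(bugs)
    inspected_loc = 0
    found = 0
    for i in order:
        if inspected_loc + loc[i] > budget:
            break
        inspected_loc += loc[i]
        found += bugs[i]
    return found / total_bugs if total_bugs else 0.0


# Hypothetical toy data: four modules with predicted probabilities,
# sizes in lines of code, and actual bug counts.
prob = [0.6, 0.8, 0.5, 0.45]
loc = [100, 400, 50, 450]
bugs = [1, 2, 1, 0]

density_order = rank_by_density(prob, loc)
probability_order = sorted(range(len(prob)), key=lambda i: prob[i], reverse=True)

print(pofb_at_effort(density_order, loc, bugs))      # 0.5
print(pofb_at_effort(probability_order, loc, bugs))  # 0.0
```

On this toy data, ranking by probability alone puts the 400-line module first, which already exceeds the 20% budget of 200 lines, so no bugs are found; ranking by density inspects the two small modules first and finds half of the bugs. A learning to rank approach such as the EALTR model described above would instead search for model parameters that maximize a metric like `pofb_at_effort` directly.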

Keywords/Search Tags:

Effort-Aware, Defect Prediction, Learning to Rank, Clustering Algorithm, Unsupervised Learning

Related items

1	Research And Application Of File-Level Effort-Aware Software Defect Prediction
2	Research And Application Of Effort-aware Software Defect Prediction Based On Approximate Density
3	Research On Cross-Project Software Defect Prediction
4	Research On The Methods Of Cross Project Software Defect Prediction Based On Learning To Rank Approach
5	Research On Just-in-time Software Defect Prediction Method Based On Learning To Rank
6	Fine-Grained Fault-Proneness Prediction
7	Incomplete Supervision Of Software Defect Prediction Technology Research
8	Research On Methods Of Ranking-Oriented Software Defect Prediction
9	Software Defect Prediction Based On Machine Learning
10	Research And Implementation Of Product Surface Defect Detection Method Based On Unsupervised Learning