| Software defect prediction aims to find defective modules in advance and thereby help assure software quality. However, traditional software defect prediction models only predict whether a software module contains defects and do not consider the effort required to inspect it; for example, inspecting a module with more lines of code requires more effort. Effort-Aware Software Defect Prediction (EASDP) allows testers to find more defects while reviewing a fixed amount of code (e.g., 20% of the lines of code in the dataset) and so allocates testing resources more effectively. An EASDP model based on a classification algorithm first predicts the probability that each software module contains defects. It then divides this probability by the module's lines of code to obtain a defect density, and finally ranks the software modules in descending order of defect density. However, this approach ranks the modules poorly, so only a low proportion of the bugs is found when inspecting 20% of the lines of code in the dataset. In addition, in actual software testing scenarios, newly developed projects often lack the historical defect data needed to train a defect prediction model, so how to build an effective EASDP model in this case is also a problem worth studying. To address these problems, this thesis first proposes an EASDP model based on an improved learning-to-rank algorithm, which is suitable for EASDP tasks in cross-version scenarios. It then proposes an unsupervised EASDP model based on clustering and feature ranking, which is suitable for EASDP tasks when historical defect data is lacking. The specific research work is as follows: 1) To address the problem that models based on classification algorithms yield a low proportion of inspected bugs, this thesis proposes an EASDP model based on an improved learning-to-rank algorithm, called the Effort-Aware Learning to Rank Algorithm (EALTR). The model uses the composite differential evolution algorithm to cross and mutate the model parameters, directly optimizes the model's performance on the effort-aware evaluation metric (i.e., the proportion of inspected bugs), and finally builds the model with the parameters that achieve the highest proportion of inspected bugs. The experimental results show that, compared with the recently proposed CBS+ and EASC models, this model improves the proportion of inspected bugs by 32.97% and 54.47%, respectively, when inspecting 20% of the lines of code in the dataset. 2) To address the problem that newly developed projects often lack historical defect data for training defect prediction models, this thesis proposes an unsupervised EASDP model based on Affinity Propagation (AP) clustering and ascending ranking by Average Method Complexity (AMC), referred to as AP-AMC. The model first uses the affinity propagation clustering algorithm to cluster the data of the target project and classify it into two clusters: defective and non-defective. Then, the software modules in each of the two clusters are ranked in ascending order of AMC, and the ranking sequence of the modules classified as non-defective is appended to the tail of the ranking sequence of the modules classified as defective to form the overall ranking. The experimental results show that, compared with 17 unsupervised models, AP-AMC improves the proportion of inspected bugs by 71.96% on average. Compared with the EALTR model combined with a data filtering method in the cross-project scenario, AP-AMC improves the proportion of inspected bugs by 26.97%. Compared with the EALTR model combined with a transfer learning method in the cross-project scenario, AP-AMC improves the proportion of inspected bugs by 14.20%. However, compared with the supervised EALTR model in the cross-version scenario, AP-AMC shows no significant difference in the proportion of inspected bugs, and its precision is 19.23% lower. |
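
The two ranking schemes and the evaluation metric described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the toy data, and the cutoff rule (inspection stops before the first module that would exceed the 20% budget) are assumptions made for illustration. The cluster labels are taken as given, since the abstract does not state how a cluster is designated defective or non-defective.

```python
from typing import List, Sequence


def rank_by_defect_density(prob: Sequence[float], loc: Sequence[int]) -> List[int]:
    """Classification-based baseline: order modules by probability / LOC, descending."""
    density = [p / max(n, 1) for p, n in zip(prob, loc)]  # guard against LOC == 0
    return sorted(range(len(prob)), key=lambda i: density[i], reverse=True)


def proportion_of_inspected_bugs(order: Sequence[int], loc: Sequence[int],
                                 bugs: Sequence[int], budget: float = 0.2) -> float:
    """Share of all bugs found when inspecting modules in `order` within the LOC budget.

    Assumption: inspection stops before the first module that would exceed
    the budget (no partial inspection of a module).
    """
    limit = budget * sum(loc)
    inspected, found = 0, 0
    for i in order:
        if inspected + loc[i] > limit:
            break
        inspected += loc[i]
        found += bugs[i]
    total = sum(bugs)
    return found / total if total else 0.0


def ap_amc_order(in_defective_cluster: Sequence[bool], amc: Sequence[float]) -> List[int]:
    """AP-AMC style ordering: each cluster sorted by ascending AMC,
    with the non-defective cluster appended after the defective one.

    Assumption: cluster membership is already known (the clustering step
    itself is not shown here).
    """
    idx = range(len(amc))
    defective = sorted((i for i in idx if in_defective_cluster[i]), key=lambda i: amc[i])
    clean = sorted((i for i in idx if not in_defective_cluster[i]), key=lambda i: amc[i])
    return defective + clean


# Toy data (hypothetical): four modules.
loc = [100, 400, 50, 450]
prob = [0.6, 0.8, 0.5, 0.45]
bugs = [1, 2, 1, 0]

density_order = rank_by_defect_density(prob, loc)
score = proportion_of_inspected_bugs(density_order, loc, bugs)
amc_order = ap_amc_order([True, False, True, False], [3.0, 1.0, 2.0, 0.5])
```

With this toy data, the density ranking inspects the two smallest modules (150 of the 200 lines allowed) and finds 2 of the 4 bugs. A learning-to-rank approach such as the one described for EALTR would instead search for model parameters that maximize the value returned by `proportion_of_inspected_bugs` directly.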