Research On Multi-source Heterogeneous Software Defect Prediction Based On Data Selection

Posted on:2023-02-13

Degree:Master

Type:Thesis

Country:China

Candidate:J H Deng

GTID:2558306620455214

Subject:Software engineering

Abstract/Summary:

Software defects have a significant impact on software quality and even on software economics. To reduce the losses caused by software defects, one of the most active problems in software engineering is how to find defects efficiently and accurately. In the 1990s, it was discovered that defects are not distributed randomly in software, and a series of prediction models for defect proneness, quantity, severity, and distribution were subsequently proposed. However, in real software development scenarios, it is impossible to guarantee that every software system has rich change log data. Especially for newly developed and small-scale software systems, the lack of training data leads to a "cold start" problem in defect prediction modeling, which limits the applicability of research results. Multi-source Cross-Project Defect Prediction (MCPDP) uses historical data from multiple other projects (source projects) to predict the likelihood that software modules in the target project are defective. It addresses the cold start problem and provides a way to build defect prediction models for new software or for software systems that lack historical data. However, because projects differ in development language, programming style, and design patterns, their data are heterogeneous, so the source data and target data follow different distributions. This thesis proposes a solution to the heterogeneity of cross-project data. First, the source data and target data are mapped into a common space, and the feature spaces of the source and target data are then made to overlap by rotating the projection matrix, thereby achieving feature alignment. Second, to further improve the accuracy of heterogeneous cross-project defect prediction, a source data selection method is designed. On this basis, a cross-project defect prediction model is constructed. To demonstrate the effectiveness of the proposed method, experiments were carried out on four open data sets, SOFTLAB, NASA, Relink, and AEEEM. The results show that the proposed method improves the F-measure by 4% and 5%, respectively, compared with the baseline method, indicating that the proposed method performs well...
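
The abstract reports its results in terms of the F-measure but does not say how it was computed. As a minimal sketch, assuming the standard definition (the harmonic mean of precision and recall, with defective modules as the positive class), it could be calculated as follows. The function name `f_measure`, the `beta` parameter, and the sample labels are illustrative assumptions, not taken from the thesis.

```python
def f_measure(actual, predicted, beta=1.0):
    """F-measure for binary defect labels (1 = defective, 0 = clean).

    Assumes the standard definition; the thesis may use a variant.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # defects found
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false alarms
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # defects missed

    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical labels for eight modules of a target project.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
score = f_measure(actual, predicted)
print(f"{score:.2f}")  # prints 0.75
```

With `beta=1.0` this is the usual F1 score. Defect data sets are typically imbalanced, with far fewer defective modules than clean ones, which is one reason an F-measure is often preferred over plain accuracy in this setting.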

Keywords/Search Tags:

Multi-source domain, Heterogeneous, Defect prediction, Data selection, Feature alignment

Related items

1	Research On Software Defect Prediction Technology For Few-sample Data
2	Research And Application Of Heterogeneous Software Defect Prediction Based On Multi-source Transfer Learning
3	Research And Application Of Cross Domain Identification Method For Multi-source Heterogeneous Data
4	Research On Domain Adaptation Methods For Modeling Multi-source Heterogeneous Data
5	Research On Prediction Of Multi-source Data Based On Time And Space
6	Research On Anomaly Accident Prediction Via Multi-Source Heterogeneous Spatio-Temporal Data
7	Research On Cross-project Software Defect Prediction Method Based On Feature Selection And Instance Transfer
8	A Research On Popularity Prediction Of Tourist Attractions Based On Multi-source Heterogeneous User-Generated Data
9	Research On Deep Multi-Source Domain Adaption Recognition Method Under Complex Data Conditions
10	Research On Software Defect Prediction Method Based On Training Data Selection