Source File Consecutive Change Impact Analysis On Software Quality

Posted on: 2016-05-06
Degree: Master
Type: Thesis
Country: China
Candidate: M X Dai
Full Text: PDF
GTID: 2308330476953498
Subject: Software engineering
Abstract/Summary:
During software development, different developers submit various changes, including new features, bug fixes, and refactorings. However, these changes often introduce bugs, and these bugs cause losses if they are not reported and fixed in time. By observing patterns in the software code change history, this paper proposes a concept called file consecutive changes, and then studies the phenomenon of file consecutive changes and their impact on software quality. A set of consecutive change features is designed to predict defects.

First, starting from four research questions about source file consecutive changes, this paper puts forward a method for analyzing the impact of source file consecutive changes. We use a fixed interval and an adaptive interval to extract change chains from the version control repository. There are two types of change chain: the source file consecutive change chain and the source file frequent change chain. We then study the distribution of change chains from two aspects: time and revision number. We analyze the impact of change chains on software quality by comparing changes in change chains with other changes.

Next, this paper selects several open source projects from Github and conducts a series of experiments using empirical software engineering techniques. The experimental results show that: ① change chains are sufficiently numerous; ② most change chains occur at an early stage; ③ change chains have a strong negative impact on quality in the short term, and the impact weakens as the time range increases; the impact is strongest when the length of the change chain is 4 or 5; ④ compared with the consecutive developer change chain and the frequent change chain, the consecutive bug fixing change chain has the strongest impact.

Then, this paper designs new features based on source file consecutive changes, and conducts experiments to measure the correlation between these features and software defects.
The experimental results show that these features correlate more strongly with software defects than change features and source code features do.

Finally, several defect prediction experiments are conducted based on the above features. Using projects from Github as the data set, we select three defect prediction algorithms (logistic regression, naive Bayes, and decision tree) to build various defect prediction models. The experimental results show that consecutive change features improve the performance of defect prediction models, which demonstrates the effectiveness of source file consecutive change features.
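
The fixed-interval extraction of change chains mentioned in the abstract might look roughly like the sketch below. It is an illustration only, not the thesis's implementation: the abstract does not state the interval, the minimum chain length, or the exact chaining rule, so the function name `extract_change_chains`, the parameters `max_gap` and `min_length`, and the sample commit times are all assumptions made for this example.

```python
from datetime import datetime, timedelta
from typing import List


def extract_change_chains(
    timestamps: List[datetime],
    max_gap: timedelta,
    min_length: int = 2,
) -> List[List[datetime]]:
    """Group one file's commit times into chains of consecutive changes.

    Assumption (not from the thesis): two changes belong to the same chain
    when they are at most `max_gap` apart, and a chain must contain at
    least `min_length` changes.
    """
    chains: List[List[datetime]] = []
    current: List[datetime] = []
    for ts in sorted(timestamps):
        # A gap larger than the fixed interval closes the current chain.
        if current and ts - current[-1] > max_gap:
            if len(current) >= min_length:
                chains.append(current)
            current = []
        current.append(ts)
    # Flush the last open chain.
    if len(current) >= min_length:
        chains.append(current)
    return chains


# Hypothetical commit times for a single source file.
commits = [
    datetime(2015, 3, 1, 9, 0),
    datetime(2015, 3, 1, 15, 0),
    datetime(2015, 3, 2, 10, 0),
    datetime(2015, 4, 20, 11, 0),
    datetime(2015, 6, 5, 8, 0),
    datetime(2015, 6, 5, 20, 0),
]
chains = extract_change_chains(commits, max_gap=timedelta(days=1))
chain_lengths = [len(chain) for chain in chains]
```

With a one-day interval, the isolated change on 2015-04-20 forms no chain, while the March and June changes form chains of length 3 and 2. An adaptive interval would replace the constant `max_gap` with a value derived from each project's or file's own change history.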
Keywords/Search Tags: consecutive code changes, quality impact analysis, software repository mining, defect prediction