| During software development, developers introduce a large number of code clones by reusing existing code fragments. As code clones evolve over time and across software updates, the software becomes increasingly bloated and difficult to maintain, which harms its quality, comprehensibility, and maintainability. This has sparked a great deal of research on code clones, including clone detection, clone analysis, and clone maintenance. Clone detection helps developers find the code clones in software, clone analysis helps them understand the presence of code clones, and clone maintenance helps them resolve the issues that code clones cause. Consequently, code clone research has theoretical significance and practical value for improving the quality, comprehensibility, and maintainability of software. During code clone evolution, developers may modify code clones, and the resulting changes exacerbate the problems that clones cause. Changes to evolving code clones make them difficult to understand and decrease the comprehensibility of the software. Meanwhile, because the code clones in a clone group are similar, a change to one code clone may require related changes to the other code clones in the group. We term such a change a clone consistent change to the clone group. Consistent changes incur additional maintenance cost, and failing to make them leads to clone consistency-defects, which reduce the quality and maintainability of the software. Therefore, this thesis studies the analysis and consistency maintenance of code clones based on software evolution. Analyzing and exploring the evolutionary characteristics of code clones can help developers understand the code clones. Predicting the consistent change of code clones can help developers solve the 
issues of code clone consistency maintenance. Thus, this thesis can help developers improve software quality and reduce the cost of software maintenance by maintaining code clones synchronously at development time. To address the difficulty of analyzing and understanding evolving code clones, this thesis proposes a clustering-based approach for exploring code clone evolutionary characteristics, which lays the foundation for predicting clone consistency-requirements. We first detect all code clones in the software's repositories with a detection tool and build the clone genealogies that describe the evolution of the code clones. Next, the corresponding attribute sets are extracted from three perspectives on the code clones and their evolution: clone fragment, clone group, and clone genealogy. Finally, a clustering method is employed to mine and analyze the clone evolutionary characteristics of all the code clones and their evolution. The experimental results show that most code clones are stable during evolution; nevertheless, a significant number of code clones are changed, and more than half of these undergo consistent changes. To address the additional maintenance cost caused by clone consistent changes during the evolution that follows a clone creating operation, we propose an approach for predicting the clone creating consistency-requirement, which helps developers avoid the extra maintenance cost of code clones from the perspective of clone prevention. We call the earliest code clone in the software a "clone creating instance", and we call the question of whether this creating instance will lead to a consistent change during its evolution the "clone creating consistency-requirement". First, we collect all clone creating instances by detecting all code clones and building all clone genealogies from the software's repositories. Then we extract a code attribute set to 
represent the copied code clone and a context attribute set for the pasted code clone. Finally, machine-learning models are trained on the collected clone creating instances and employed to predict the clone creating consistency-requirement. The experimental results show that our approach can effectively predict the consistency-requirement for clone creating instances, which helps developers reduce the consistency maintenance cost of code clones. To address the consistency-defects caused by clone consistent changes during evolution, this thesis proposes an approach for predicting the clone changing consistency-requirement, which helps developers synchronize the development and maintenance of code clones. We call a changed code clone a "clone changing instance", and we call the question of whether its change will lead to a consistent change to the changed clone group the "clone changing consistency-requirement". First, all clone changing instances are collected from the software's repositories by detecting all code clones and building all clone genealogies. Next, three attribute sets are extracted from the perspective of the clone group to represent each clone changing instance: the code attribute set, the context attribute set, and the evolutionary attribute set. Lastly, machine-learning models are trained on the collected clone changing instances and employed to predict the clone changing consistency-requirement. The experimental results show that the proposed approach can reasonably predict the consistency-requirement for clone changing instances, which helps developers avoid clone consistency-defects. To address the problem that too few clone instances are available to predict the clone consistency-requirement in the early phase of software development, this thesis conducts an empirical study on cross-project clone consistency-requirement prediction. We unify 
the clone creating instances and clone changing instances as "clone instances", and unify the corresponding consistency-requirements as the "clone consistency-requirement". First, we collect the clone instances of each software repository by detecting code clones and building clone genealogies, and represent all clone instances with different attribute sets. Then the software projects are divided into training projects and testing projects. We use the data from the training projects to train the machine-learning models and evaluate cross-project clone consistency-requirement prediction on the testing projects. The experimental results show that cross-project prediction can be used to predict the clone consistency-requirement at the early stage of software development. Our approach helps maintain code clones during the development phase, thereby reducing the cost of software maintenance. To integrate the approach with software development, we develop and implement an Eclipse plug-in for predicting the clone consistency-requirement, which helps maintain code clones at development time and thereby reduces the cost of software maintenance. In summary, this thesis presents a method for code clone analysis and consistency maintenance based on software evolution, providing new ideas and approaches for analyzing and understanding code clones, maintaining code clone consistency, avoiding code clone consistency-defects, reducing code clone consistency maintenance costs, and improving software quality and maintainability during the software development phase. |
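
The notion of a consistency-requirement used throughout this abstract can be illustrated with a minimal Python sketch. It rests on simplifying assumptions that do not come from the thesis: a clone genealogy is modeled as a list of snapshots of one clone group, each snapshot maps a hypothetical fragment id to that fragment's source text, and a change counts as consistent when at least two fragments of the group change in the same version step. The thesis may use a stricter criterion (for example, requiring the changes themselves to be similar), and all names below are illustrative.

```python
from typing import Dict, List

# Hypothetical representation: one snapshot of a clone group maps a fragment
# id to the source text of that fragment in a given software version.
Snapshot = Dict[str, str]


def changed_fragments(old: Snapshot, new: Snapshot) -> List[str]:
    """Return the ids of fragments present in both snapshots whose text differs."""
    return sorted(fid for fid in old.keys() & new.keys() if old[fid] != new[fid])


def is_consistent_change(old: Snapshot, new: Snapshot) -> bool:
    """Assumed criterion: at least two fragments of the group changed together."""
    return len(changed_fragments(old, new)) >= 2


def consistency_requirement(genealogy: List[Snapshot]) -> bool:
    """True if any version-to-version step in the genealogy is a consistent change."""
    return any(
        is_consistent_change(old, new)
        for old, new in zip(genealogy, genealogy[1:])
    )


if __name__ == "__main__":
    genealogy = [
        {"a": "x = 1", "b": "x = 1"},  # clone group is created
        {"a": "x = 1", "b": "x = 1"},  # stable version
        {"a": "x = 2", "b": "x = 2"},  # both fragments change together
    ]
    print(consistency_requirement(genealogy))
```

Under these assumptions, the boolean returned for a genealogy plays the role of the label that a prediction model would be trained to output from the attribute sets of a clone instance.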