
Research On Identification And Handling Methods For Coincidental Correct Test Cases In Multi-Fault Localization

Posted on: 2024-12-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y H Wu
Full Text: PDF
GTID: 1528307334450474
Subject: Control Science and Engineering
Abstract/Summary:
Software testing and debugging are crucial stages in software development, encompassing steps such as fault localization, failure analysis, and repair. To meet the demands of rapid iteration and continuous integration in modern software development, many automated fault localization methods have been proposed. However, coincidentally correct (CC) test cases reduce the accuracy of fault localization, which makes identifying and handling them a critical problem in the field. Current research on CC generally assumes that a program contains only a single fault, whereas programs in real industrial settings often contain several. In that case a failing test case is no longer tied to a single fault, so existing methods for identifying and handling CC test cases no longer apply, and the multi-fault scenario raises new challenges. This dissertation studies CC test cases in multi-fault scenarios. Its main contents and contributions are as follows:

(1) To address the lack of analysis of how CC test cases affect multi-fault localization, a classification method for CC test cases in multi-fault scenarios is proposed, and the effect of each category on fault localization accuracy is analyzed. The research begins with an analysis of the relationship between CC test cases and multiple faults, and defines three categories of CC test cases: specific, unspecific, and irrelevant. It then examines, through both theoretical analysis and empirical study, how each category affects fault localization accuracy. The findings indicate that applying a handling strategy matched to each category can significantly improve fault localization accuracy. This analysis lays the theoretical foundation for the identification method and handling strategies for multi-fault programs proposed in the rest of the study.

(2) In multi-fault scenarios, the one-to-one correlation between CC test cases and faulty program entities no longer holds, which renders similarity-based CC identification methods ineffective. To address this, a high-precision CC test case identification method is proposed. Because a CC test case may be associated with several faults in complex ways, traditional methods that rely on the similarity of coverage information cannot identify CC test cases accurately. The research first strengthens the representation of test cases by extracting three kinds of features from the test execution process: suspicion factors, coverage factors, and similarity factors. From these features, a 126-dimensional test case feature vector is constructed. It then proposes a machine learning-based identification method named MLCCI (Machine Learning-based Coincidental Correct Identification). MLCCI uses ensemble learning, a machine learning approach that handles high-dimensional data effectively, to classify test cases, thereby identifying CC test cases precisely and improving fault localization accuracy in multi-fault environments. Experimental results show that, in identifying CC test cases, MLCCI improves Recall and F-measure by 23.69% and 13.01%, respectively, over FCCI, and by 50.03% and 35.01%, respectively, over FW-KNN.

(3) Because different categories of CC test cases affect fault localization differently in multi-fault scenarios, a category-based method for handling CC test cases is proposed. Building on the theoretical analysis in (1), and given the diversity of CC test cases in multi-fault scenarios, this study proposes handling strategies for CC test cases in multi-fault programs. A deep clustering algorithm separates specific CC test cases from the other CC test cases within the pool of identified CC test cases, and a different handling strategy is then applied to each group, which further improves the effectiveness of multi-fault localization techniques. Experimental results show that, with the Ochiai, Dstar, and Jaccard fault localization formulas, the handling strategy improves TOP-1 by a further 11.11%, 9.09%, and 9.52%, respectively, over the MLCCI method. In TOP-1, TOP-3, and TOP-5, the strategy improves on the original fault localization method by 15.00%, 9.94%, and 11.06%, respectively; on the FCCI method by 34.62%, 20.51%, and 9.48%, respectively; and on the FW-KNN method by 80.00%, 37.68%, and 11.59%, respectively.

In summary, this study focuses on identification methods and handling strategies for coincidentally correct test cases in multi-fault localization, and verifies them through experiments on 539 real multi-fault programs. The experimental results demonstrate that the proposed methods can accurately identify coincidentally correct test cases in multi-fault scenarios, and that the corresponding category-based handling strategies effectively improve the accuracy of multi-fault localization.
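
To make the effect of a CC test case concrete, the following minimal sketch computes the Ochiai, Jaccard, and Dstar suspiciousness scores named in the abstract for a toy coverage matrix. The coverage data, the function names, the choice of exponent 2 for Dstar, and the "drop the CC test" step are illustrative assumptions; they are not the dissertation's MLCCI method or its category-based handling strategy.

```python
import math

# Hypothetical toy data: each test is (set of covered statements, passed?).
# Statement 3 is assumed to be faulty. Test index 2 executes statement 3 but
# still passes, so it plays the role of a coincidentally correct (CC) test.
TESTS = [
    ({1, 2, 3}, False),
    ({1, 3, 4}, False),
    ({1, 2, 3}, True),   # CC test case
    ({1, 2}, True),
    ({1, 4}, True),
]

def spectrum(tests, stmt):
    """Count (ef, ep, nf): failing/passing tests that execute stmt,
    and failing tests that do not execute it."""
    ef = ep = nf = 0
    for covered, passed in tests:
        if stmt in covered:
            if passed:
                ep += 1
            else:
                ef += 1
        elif not passed:
            nf += 1
    return ef, ep, nf

def ochiai(ef, ep, nf):
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

def jaccard(ef, ep, nf):
    denom = ef + nf + ep
    return ef / denom if denom else 0.0

def dstar(ef, ep, nf, star=2):
    denom = ep + nf
    if denom == 0:
        # No passing coverage and no uncovered failures: maximally suspicious.
        return float("inf") if ef else 0.0
    return ef ** star / denom

def scores(tests, stmt):
    counts = spectrum(tests, stmt)
    return {
        "ochiai": ochiai(*counts),
        "jaccard": jaccard(*counts),
        "dstar": dstar(*counts),
    }

# Suspiciousness of the faulty statement with the CC test counted as passing,
# and again after removing it (one simple handling strategy, assumed here).
before = scores(TESTS, 3)
after = scores([t for i, t in enumerate(TESTS) if i != 2], 3)
print("with CC test:   ", before)
print("without CC test:", after)
```

In this toy setting the CC test adds to the passing-and-executed count of the faulty statement, which lowers its score under all three formulas; removing it restores the score. In a multi-fault program a passing test may be coincidentally correct with respect to one fault and irrelevant to another, which is why a single uniform treatment like the one sketched here can be inadequate.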
Keywords/Search Tags: multi-fault localization, coincidentally correct test cases, test case clustering, ensemble learning