Font Size: a A A

Research On Symbolic Execution Method Based On Machine Learning Solvability Prediction

Posted on:2024-02-09Degree:MasterType:Thesis
Country:ChinaCandidate:Z H LiFull Text:PDF
GTID:2568307085987299Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
With the rapid development of the Internet and the widespread application of information technology,various application software has become an indispensable tool in people’s daily lives.The multifaceted daily needs have led to the expansion of software scale and complexity of functions,while at the same time,the hidden vulnerabilities in software are also increasing day by day.As the core technology to ensure software quality,software testing can detect product errors in advance and improve software security.In recent years,Symbolic Execution has attracted extensive attention as an effective technology that can generate high coverage test suite and find deep errors in complex software applications.As the core technology of Symbolic Execution,constraint solving is often one of the main reasons that Symbolic Execution tools can’t be well applied to real programs.When the Symbolic Execution tool encounters some complex path constraints in the process of testing the program,it will cause the solver to spend a lot of time solving for it.When reducing the unlimited increase in solving time on a single path by setting a timeout threshold,the time spent on the abandoned path is in vain.Finally,the Symbolic Execution tool cannot cover more branches within a given time limit,and Symbolic Execution is inefficient.In view of the above problems,this thesis proposed a symbol execution method based on machine learning solvability prediction,and further introduced the online learning idea,weighting technology and the improved cluster-based under-sampling method to solve the problem of unbalanced data and poor scalability of the predictor caused by the introduction of machine learning technology.The main ideas of this thesis are as follows:(1)This method embedded a path constraint solvability prediction module into the traditional constraint solving process to predict the solvability of path constraints that had exceeded the solution time limit,and adopted different processing strategies based on the prediction results to reduce excessive invalid solving processes.In this method,the problem of path constraint solvability inference was solved as a machine learning problem,and the K-nearest neighbor algorithm was used to learn the relationship between path constraint and solvability.If the prediction result is solvable,it was the same as the normal symbol execution processed and directly sent to the solver for further solution.Otherwise,the current solution process was abandoned and a new path constraint was selected for processing.(2)To deal with the complexity of path constraint solvability inference problems,online learning strategies and weighting techniques were introduced in(1).Adding the constraint model generated by the program runtime to the KNN sample library is based on the assumption that path constraints from the same program have greater similarity in syntactic structure,which is more helpful for prediction.In addition,in order to further improve the accuracy of prediction,samples were given weights,and weights were updated based on the differences between the prediction results and the actual solution results.The prediction results were jointly determined based on this weight and distance.Due to the serious class imbalance and the computational complexity of the predictor in path constrained datasets,under-sampling method based on prototype selection and generation proposed in this thesis was applied to construct the predictor.Finally,a balanced dataset with a class balance ratio close to 1was obtained.(3)To verify the effectiveness of the prediction method proposed in this article,experiments were conducted on three data sets.In addition,this thesis also conducted comparative experiments from the perspective of a single program to evaluate the effectiveness of this method in reducing constraint solving time.The experimental results showed that the two methods proposed in this article perform better in various aspects than other methods of the same type.
Keywords/Search Tags:Symbolic execution, Constraint solving, K-nearest neighbors, Online learning, Class imbalance
PDF Full Text Request
Related items