With the rapid development of the Internet and the widespread use of open source software, reusing existing open source software or related code can greatly improve software development efficiency and reduce development costs. Software developers spend a lot of time searching the Internet for relevant software or code to reuse, so improving the efficiency of open source code search has attracted wide attention in industry. Searching with keywords that reflect code functionality is a commonly used open source code search method, but it answers queries mainly by syntactic matching of keyword content, so search accuracy is low and the user must manually filter the results to find the desired source code. Open source code search based on input and output matching addresses these shortcomings of keyword search and is an important search method based on semantic matching. This method encodes the code in the code repository as constraints, converts the input and output pairs provided by the user into constraints as well, and uses a Satisfiability Modulo Theories (SMT) solver to match the constraints and return the source code the user needs. Although search based on input and output matching compensates for the shortcomings of keyword search, it has obvious shortcomings of its own. First, it handles only sequential-structure code and does not deal with complex structures. Second, it converts the code into constraints during the code matching phase, which lowers search efficiency. After surveying the existing work, this paper improves the existing open source code search methods based on input and output to address these shortcomings. First, the code constraint transformation in the existing input and output source code search method 
is moved forward to the code organization stage: complex structures are converted into sequential structures, and the constraint transformation is then performed. During the code matching phase, the given input and output only need to be matched against the already converted constraints, which reduces search time. Second, a method is proposed for converting program code with selection structures into sequential-structure program code, and the logical correctness of the conversion is proved semantically. Based on the characteristics of the selection paths, the method turns code containing selection paths into a disjunction of multiple single-path code fragments, and the semantic correctness of this conversion is proved. On this basis, conversion algorithms are designed for double-branch structures, multi-branch structures, and nested-branch structures. Then, a method is proposed for converting program code containing loop structures into sequential structures, and the logical correctness of the conversion is again proved semantically. The idea is to convert the loop into a branch structure: the number of loop iterations is first determined by a difference equation, the resulting branch structure is then determined, and it is finally converted into a sequential structure. Finally, a tool that converts complex structures into sequential structures is implemented in the Python programming language. The tool includes a code loading module, a syntax correctness analysis module, and a code conversion module. Example runs show that the tool produces the expected converted code for complex input code, which is consistent with the theoretical analysis.
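To make the selection-structure conversion concrete, the following is a minimal sketch, not the tool described above. It assumes the input is a small Python snippet made only of assignments and `if`/`elif`/`else` statements, and it replaces SMT solving with direct evaluation of each path. The names `enumerate_paths`, `run_path`, `matches`, and `SOURCE` are hypothetical and chosen for illustration only.

```python
import ast

# Hypothetical example program: a multi-branch selection structure.
SOURCE = """
if x < 0:
    y = -x
elif x == 0:
    y = 0
else:
    y = x * 2
"""

def enumerate_paths(stmts):
    """Return every single-path, purely sequential version of a statement list.

    Each path is a list of ("assume", condition) and ("exec", statement) steps.
    The original code behaves like the disjunction of all returned paths.
    """
    if not stmts:
        return [[]]
    head, rest = stmts[0], stmts[1:]
    tails = enumerate_paths(rest)
    if isinstance(head, ast.If):
        cond = ast.unparse(head.test)  # requires Python 3.9+
        # Taken branch: the condition holds, then the body runs.
        heads = [[("assume", cond)] + p for p in enumerate_paths(head.body)]
        # Other branch: the negated condition holds, then the else part runs.
        # An elif is a nested ast.If inside orelse, so recursion covers it.
        heads += [[("assume", f"not ({cond})")] + p
                  for p in enumerate_paths(head.orelse)]
    else:
        heads = [[("exec", ast.unparse(head))]]
    return [h + t for h in heads for t in tails]

def run_path(path, inputs):
    """Execute one sequential path; return the final state or None if infeasible."""
    env = dict(inputs)
    for kind, text in path:
        if kind == "assume":
            if not eval(text, {}, env):
                return None  # a guard failed, so this path does not apply
        else:
            exec(text, {}, env)
    return env

def matches(paths, inputs, outputs):
    """Check whether some path maps the given inputs to the expected outputs."""
    for path in paths:
        env = run_path(path, inputs)
        if env is not None and all(env.get(k) == v for k, v in outputs.items()):
            return True
    return False

PATHS = enumerate_paths(ast.parse(SOURCE).body)
```

In this sketch the paths are computed once, ahead of any query, which mirrors the idea of moving the conversion to the code organization stage; a call such as `matches(PATHS, {"x": -3}, {"y": 3})` then only checks an input and output pair against the precomputed paths. A real implementation would presumably encode each path as an SMT constraint instead of evaluating it directly.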