| With the rapid expansion of the Internet, people are surrounded by ever more information: hundreds of millions of items are generated around the world every day. It is clearly impossible for a person alone to extract personally meaningful information from this complex and chaotic ocean of knowledge, so a way to quickly find what one is looking for is needed. Text similarity calculation was proposed to solve this problem. Because a computer cannot understand the meaning of text directly, algorithms are needed that give the computer a representation of text from which it can identify and retrieve what we need. This paper first studies the classic statistical vector space model (VSM) with the TF-IDF algorithm and compares it with a semantic similarity algorithm based on HowNet. It then proposes a combined algorithm that draws on the advantages of both types of algorithm to make text similarity calculation more accurate. It also puts forward the idea of obtaining the overall similarity by synthesizing partial similarities, in order to improve the text similarity calculation process. Using the NLPIR Chinese word segmentation system from the Institute of Computing Technology together with the improved algorithm, the paper presents an implementation of a Chinese text similarity computing system, tests it on multiple sets of texts of different types, and compares it with the traditional VSM-based TF-IDF algorithm; the results validate the improved algorithm. The method has practical significance and application prospects for processing and analyzing practical problems in the field of public opinion. |
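
For orientation, the baseline named in the abstract (a vector space model with TF-IDF weights, compared by cosine similarity) can be sketched as follows. This is a minimal illustration and not the paper's implementation: the function names `tfidf_vectors` and `cosine` are hypothetical, the smoothed IDF formula is one common variant chosen here as an assumption, and the input is assumed to be already segmented into tokens (for Chinese text, by a word segmenter such as NLPIR). The HowNet-based semantic similarity and the combined algorithm are not shown.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn tokenized documents into sparse TF-IDF vectors.

    docs: list of token lists (assumed to be pre-segmented).
    Returns one {term: weight} dict per document.
    """
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {}
        for term, count in tf.items():
            # Smoothed IDF (an assumption of this sketch, one common variant).
            idf = math.log((1 + n) / (1 + df[term])) + 1.0
            vec[term] = (count / len(doc)) * idf
        vectors.append(vec)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus: three already-tokenized documents.
docs = [
    ["text", "similarity", "calculation"],
    ["text", "similarity", "algorithm"],
    ["public", "opinion", "analysis"],
]
vecs = tfidf_vectors(docs)
sim_related = cosine(vecs[0], vecs[1])    # documents sharing two terms
sim_unrelated = cosine(vecs[0], vecs[2])  # documents sharing no terms
```

Because this baseline relies only on shared surface terms, two documents that express the same idea with different words score zero, which is the gap a semantic resource such as HowNet is meant to address.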