A Chinese Text Similarity Algorithm Based On Semantic Networks

Posted on: 2016-06-11

Degree: Master

Type: Thesis

Country: China

Candidate: N Q Zou

GTID: 2308330470960233

Subject: Software engineering

Abstract/Summary:

With the rapid expansion of the Internet, people are surrounded by ever more information, with new information generated around the world every day on a scale of hundreds of millions. No one can, unaided, pick out the information that is meaningful to them from this complex and chaotic ocean of knowledge, so some way of quickly finding what one is looking for is needed. Text similarity calculation was proposed to address this problem. A computer cannot understand the meaning of a text directly, however, so algorithms are needed that give text a representation the computer can process and use to find the texts we need.

This thesis first studies the classic statistical approach, the vector space model (VSM) with TF-IDF weighting, and compares it with a semantic similarity algorithm based on HowNet. It then proposes a combined algorithm that draws on the advantages of both types of algorithm, making text similarity calculation more accurate. It also proposes obtaining the overall similarity by aggregating partial similarities, in order to improve the similarity computation process. Using the NLPIR Chinese word segmentation system from the Institute of Computing Technology together with the improved algorithm, the thesis gives an implementation of a Chinese text similarity computation system. The improved algorithm is tested on multiple sets of texts of different types and compared with the traditional VSM-based TF-IDF algorithm, and the results validate the improvement.

The method has practical significance and application prospects for processing and analyzing practical problems in the field of public opinion.
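The VSM/TF-IDF baseline named in the abstract, and the idea of combining it with a semantic score, can be illustrated with a minimal Python sketch. This is not the thesis's algorithm: the abstract does not give formulas, so the smoothed IDF variant, the linear weighting in `combined_similarity`, and the `alpha` parameter are all assumptions made for illustration. The semantic score is passed in as a plain number because computing it would need HowNet, and the documents are assumed to be already segmented into words (for example by a segmenter such as NLPIR).

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector ({term: weight}) per tokenized document.

    Assumption: a smoothed IDF, log((1 + n) / (1 + df)) + 1, is used here;
    the thesis may use a different TF-IDF variant.
    """
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_similarity(vsm_sim, semantic_sim, alpha=0.5):
    """Hypothetical combination: a weighted average of the two scores.

    `semantic_sim` stands in for a HowNet-based score computed elsewhere;
    the linear form and `alpha` are illustrative assumptions.
    """
    return alpha * vsm_sim + (1 - alpha) * semantic_sim


# Usage with pre-segmented toy documents.
docs = [
    ["文本", "相似度", "计算"],
    ["文本", "相似度", "算法"],
    ["天气", "很", "好"],
]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]))  # shares two terms, so the score is above zero
print(cosine(vecs[0], vecs[2]))  # no shared terms
```

The last pair shows the weakness of a purely statistical model: documents with no words in common score zero even if they are related in meaning, which is the gap a HowNet-based semantic score is meant to fill.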

Keywords/Search Tags:

Text Similarity, HowNet, TF-IDF, VSM

Related items

1	The Text Similarity Study Base On Hownet
2	Research On Text Clustering Based On Hownet
3	Research On Algorithm Of Chinese Text Similarity Based On Semantics
4	Research On Chinese Text Similarity Computing Based On Semantic Weighted
5	A Chinese Text Similarity Algorithm Based On Semantic Networks
6	The Research Of Semantic Similarity Computing Algorithm Based On HowNet
7	Research On Text Similarity Algorithm Based On WMD Distance
8	Research On The Modular Chinese Sentence Similarity Computing Based On Hownet
9	Research On Text Similarity Measure Method Of Combining New Word Analysis And Semantic Analysis
10	Research Of Hownet Based Word Semantic Computation And Application