The Research On Block-based Semantic Similarity Of Sentences

Posted on: 2012-02-14
Degree: Master
Type: Thesis
Country: China
Candidate: J Li
Full Text: PDF
GTID: 2178330335990360
Subject: Computer application technology
Abstract/Summary:
In natural language processing, the relationships between sentences, and in particular the computation of sentence similarity, have long been an active and difficult research topic. Sentence similarity has a wide range of applications in automatic question answering, information retrieval, information filtering, intelligent search, machine translation, and other fields, and the accuracy of its computation directly affects research in those areas. However, the concept of sentence similarity is itself not clearly defined: it is usually left unstated whether similarity is meant at the grammatical, semantic, or pragmatic level, which has caused considerable difficulty for current research.

Based on HowNet, this thesis focuses on similarity at the semantic level for sememes, concepts, and sentences, and proposes a block-based algorithm for computing the semantic similarity of sentences. The method first uses the LTP platform to segment each sentence into words, find its predicate, and divide the sentence into blocks. It then disambiguates the words using the rich definitions, expansions, and examples in the HowNet library, so that after disambiguation each word in the sentence corresponds to exactly one HowNet concept. Finally, the blocks of the two sentences are compared, with each block given a different weight in the similarity computation. Experiments show that this method of computing sentence semantic similarity is practical and effective.

This thesis makes the following contributions:

(1) Building on the word similarity computation methods of Liu Qun, Jiang Min, Zhang Zhenxing, and others, it further explores the relationships between concepts in HowNet and proposes a HowNet-based method for computing concept similarity, which lays the foundation for the subsequent computation of sentence similarity.

(2) It proposes a word sense disambiguation strategy based on HowNet. The method extracts the words that HowNet lists with multiple concepts, expands and matches them, and uses each word's part of speech in the sentence, together with matches against the parts of speech before and after the word in fixed examples, to determine the word's concept accurately. As a result, every word in a sentence corresponds to a HowNet concept before the semantic similarity of the sentence is computed.

(3) It proposes a block-based method for computing the semantic similarity of sentences. Each sentence is treated as a unified whole: for the two sentences being compared, the ratio of the amounts of information they contain strongly affects their similarity. The same principle is applied to the similarity of concepts and of blocks, so that the sentence similarity computation is consistent from top to bottom.
Keywords/Search Tags:HowNet, Similarity of concept, Word sense disambiguation, Ratio of the amount of information, Sentences semantic similarity
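
The abstract does not give formulas, so the following is only a minimal sketch of how a block-weighted similarity scaled by a ratio of information amounts could be put together. Everything in it is an assumption made for illustration: the block names, the weights, the use of concept counts as the "amount of information", the best-match averaging inside a block, and the toy `exact_match` function that stands in for a HowNet-based concept similarity. It is not the thesis's actual algorithm.

```python
from typing import Callable, Dict, List

Concept = str                 # stand-in for a disambiguated HowNet concept
Block = List[Concept]         # one block of a sentence (hypothetical unit)
ConceptSim = Callable[[Concept, Concept], float]


def info_ratio(a: int, b: int) -> float:
    """Ratio of the smaller amount of information to the larger, in [0, 1]."""
    if a == 0 and b == 0:
        return 1.0
    return min(a, b) / max(a, b)


def block_similarity(x: Block, y: Block, concept_sim: ConceptSim) -> float:
    """Average best-match concept similarity, scaled by the size ratio."""
    if not x or not y:
        return 0.0
    short, long_ = (x, y) if len(x) <= len(y) else (y, x)
    best = [max(concept_sim(c, d) for d in long_) for c in short]
    return info_ratio(len(x), len(y)) * sum(best) / len(best)


def sentence_similarity(
    s1: Dict[str, Block],
    s2: Dict[str, Block],
    weights: Dict[str, float],
    concept_sim: ConceptSim,
) -> float:
    """Weighted sum of block similarities, scaled by the sentence-level ratio."""
    weighted = 0.0
    total_weight = 0.0
    for name, w in weights.items():
        b1, b2 = s1.get(name, []), s2.get(name, [])
        if not b1 and not b2:
            continue  # a block missing from both sentences carries no weight
        weighted += w * block_similarity(b1, b2, concept_sim)
        total_weight += w
    if total_weight == 0.0:
        return 0.0
    n1 = sum(len(b) for b in s1.values())
    n2 = sum(len(b) for b in s2.values())
    return info_ratio(n1, n2) * weighted / total_weight


def exact_match(c: Concept, d: Concept) -> float:
    """Toy concept similarity: 1.0 for identical concepts, otherwise 0.0."""
    return 1.0 if c == d else 0.0
```

Because the same `info_ratio` scaling is applied inside each block and again across the whole sentence, the sketch mirrors the top-down consistency described in contribution (3). In practice `exact_match` would be replaced by a graded concept similarity.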