
Paraphrase Recognition Based On Hybrid Circuit

Posted on: 2015-01-15  Degree: Master  Type: Thesis
Country: China  Candidate: Y L Guan  Full Text: PDF
GTID: 2268330428967676  Subject: Computer application technology
Abstract/Summary:
Paraphrase, which domestic researchers sometimes call "rewriting", is, as the name suggests, a different expression of the same meaning. Paraphrase is a very common phenomenon in natural language and plays an important role in natural language processing (NLP) applications. It is both a difficult and an active research topic in NLP, and it is attracting growing attention from researchers. The main subject of this study is paraphrase recognition based on resistance distance. Paraphrase recognition can help a real-time machine translation system handle unregistered (out-of-vocabulary) phrases; it can identify different formulations of the same question in a question answering system and so improve system performance; and it can be used for generation, compression, and similar-sentence recognition in multi-document summarization systems, among other uses.
This thesis proposes a new method for calculating the distance between two sentences. The method is analogous to computing a similarity, except that the direction is reversed: the smaller the resistance distance, the more similar the two sentences, whereas the larger a similarity score, the more similar the two sentences. We represent each of the two sentences as a weighted graph (V, E, ω) and then merge the two graphs: nodes for the same word are combined into one node, and the corresponding weights ω are combined as well. The reciprocal of each weight is treated as the "resistance" of that edge. The resistance distance between the two sentences is the total resistance of the resulting hybrid circuit divided by the number of nodes in the merged graph. Finally, we use this distance to decide whether the two sentences are paraphrases of each other. The accuracy and F1 value of this method are not high enough, so we propose an improved version.
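The graph merging step described above can be sketched roughly as follows. The abstract does not say how edges or weights are defined, so this sketch assumes that adjacent words are linked by an edge of weight 1 and that the weights of edges present in both sentences are summed when the graphs are merged. The function names `sentence_graph`, `merge_graphs`, and `edge_resistances` are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def sentence_graph(sentence):
    """Build a weighted graph (nodes, edges) from a sentence.

    Assumption (not from the thesis): adjacent words share an edge of weight 1.
    """
    words = sentence.lower().split()
    edges = defaultdict(float)
    for a, b in zip(words, words[1:]):
        if a != b:  # skip self-loops caused by a repeated word
            edges[frozenset((a, b))] += 1.0
    return set(words), dict(edges)


def merge_graphs(g1, g2):
    """Merge two graphs: identical words become one node, shared edge weights add.

    Assumption: "combining" weights means summing them.
    """
    nodes = g1[0] | g2[0]
    edges = defaultdict(float)
    for graph_edges in (g1[1], g2[1]):
        for edge, weight in graph_edges.items():
            edges[edge] += weight
    return nodes, dict(edges)


def edge_resistances(edges):
    """Treat each weight as a conductance, so the resistance is its reciprocal."""
    return {edge: 1.0 / weight for edge, weight in edges.items()}


nodes, edges = merge_graphs(
    sentence_graph("the cat sat down"),
    sentence_graph("the cat lay down"),
)
resistances = edge_resistances(edges)
```

Under these assumptions, an edge that occurs in both sentences ends up with a larger weight and therefore a smaller resistance, which matches the idea that shared structure should pull the two sentences closer together.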
We introduce the Laplacian matrix L to improve the calculation of resistance. As before, we represent the two sentences as weighted graphs (V, E, ω) and merge them, combining the nodes for the same word. We then write down the adjacency matrix A and the degree matrix D of the merged graph and form the Laplacian matrix L = D - A. From L we obtain its pseudo-inverse L+, from which the resistance between any two nodes can be calculated by formula. The resistance distance between the two sentences is again the total resistance of the hybrid circuit divided by the number of nodes in the merged graph, and we use this distance to decide whether the two sentences are paraphrases. Experimental results demonstrate the effectiveness of this method.
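As an illustration of the Laplacian-based calculation, the sketch below computes pairwise resistances with the standard resistance distance formula R(i, j) = L+[i][i] + L+[j][j] - 2 L+[i][j]. The abstract does not state which formula the thesis uses, so this choice is an assumption. The sketch also assumes that A holds the edge weights as conductances and that the graph is connected, in which case L+ can be obtained as (L + J/n)^-1 - J/n, where J is the all-ones matrix. All function names are hypothetical. How the pairwise values are aggregated into the "total resistance" is not specified in the abstract, so the sketch stops at pairwise resistances.

```python
def laplacian(n, weighted_edges):
    """Return L = D - A for an undirected graph given as (i, j, weight) triples."""
    L = [[0.0] * n for _ in range(n)]
    for i, j, w in weighted_edges:
        L[i][i] += w
        L[j][j] += w
        L[i][j] -= w
        L[j][i] -= w
    return L


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [
        list(row) + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix; the graph may be disconnected")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def laplacian_pseudo_inverse(L):
    """Pseudo-inverse of a connected graph's Laplacian: (L + J/n)^-1 - J/n."""
    n = len(L)
    shifted = [[L[i][j] + 1.0 / n for j in range(n)] for i in range(n)]
    inv = invert(shifted)
    return [[inv[i][j] - 1.0 / n for j in range(n)] for i in range(n)]


def resistance(L_plus, i, j):
    """Resistance distance between nodes i and j."""
    return L_plus[i][i] + L_plus[j][j] - 2.0 * L_plus[i][j]


# Path graph 0 - 1 - 2 with unit weights: two unit resistors in series.
path_plus = laplacian_pseudo_inverse(laplacian(3, [(0, 1, 1.0), (1, 2, 1.0)]))
r_path = resistance(path_plus, 0, 2)

# Triangle with unit weights: one resistor in parallel with two in series.
tri_plus = laplacian_pseudo_inverse(
    laplacian(3, [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)])
)
r_triangle = resistance(tri_plus, 0, 1)
```

The two small graphs can be checked by hand: the path gives a resistance of 2 between its end nodes, and the triangle gives 2/3 between any two nodes. With NumPy available, `numpy.linalg.pinv` would replace the hand-written inversion.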
Keywords/Search Tags: Paraphrase recognition, Resistance distance, Laplace matrix, Hybrid circuit