
Research On Evaluation Methods For Paraphrase Generation

Posted on: 2023-08-30  Degree: Master  Type: Thesis
Country: China  Candidate: X R Jian  Full Text: PDF
GTID: 2558306848955149  Subject: Computer technology
Abstract/Summary:
Paraphrase generation aims to produce a sentence (i.e., a paraphrase) that preserves the meaning of a given input sentence while expressing it differently; it has been widely used in customer-service-oriented dialogue and product introduction. Appropriate evaluation metrics are therefore a crucial factor in improving paraphrase generation. The mainstream paraphrase-generation metrics are borrowed from machine translation and evaluate semantic adequacy and fluency by computing the degree of matching between a paraphrase and a set of references. However, manually written references are limited: they cannot cover the rich variety of paraphrase phenomena and thus lead to incorrect evaluation results. In addition, no existing metric evaluates expression diversity. To address these problems, we investigate semantic evaluation metrics based on deep neural networks, diversity evaluation metrics, and combined metrics. The contributions are summarized as follows.

(1) We design a systematic analysis and evaluation method for the evaluation metrics of paraphrase generation. First, we analyze the factors considered by popular metrics, including n-grams, synonyms, and semantic matching at different granularities. Then, we design an annotation specification and use it to annotate a diversity-oriented evaluation dataset. We use Pearson's correlation coefficient and Spearman's correlation coefficient to evaluate the automatic metrics. The experimental results show that the deep-learning-based semantic consistency metrics outperform mainstream metrics by about 9%, which demonstrates the superiority of deep semantic metrics in the case of diversified paraphrases.

(2) We propose a comprehensive, deep-learning-based evaluation metric over three dimensions. First, we propose a semantic evaluation metric that relies on the input sentence, a fluency evaluation metric that relies on the references, and three diversity evaluation metrics based on surface information. Then, we propose unified evaluation metrics that combine the above metrics with three fusion methods: the linear form PEWF, the product form PEMF, and the exponentially weighted form WPEMF. Experimental results on the diversity-oriented dataset show that the proposed unified evaluation metrics outperform all of the single-dimension metrics. In particular, the WPEMF fusion metric achieves the best results, with Pearson's and Spearman's correlation coefficients of 57.40% and 59.20%, respectively.
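As a rough illustration only (the thesis's actual PEWF, PEMF, and WPEMF definitions appear in the full text, and the weights and score functions below are placeholders), score fusion over the three dimensions and correlation-based meta-evaluation against human judgments can be sketched as:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson computed on ranks (no tie correction)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    return pearson(ranks(x), ranks(y))

def fuse(semantic, fluency, diversity, mode="exp", weights=(1/3, 1/3, 1/3)):
    """Illustrative fusion of per-dimension scores in [0, 1]:
    linear weighted sum, plain product, and exponentially weighted product."""
    scores = (semantic, fluency, diversity)
    if mode == "linear":                 # linear-form combination
        return sum(w * s for w, s in zip(weights, scores))
    if mode == "product":                # product-form combination
        out = 1.0
        for s in scores:
            out *= s
        return out
    if mode == "exp":                    # exponentially weighted product
        return math.prod(s ** w for w, s in zip(weights, scores))
    raise ValueError(f"unknown mode: {mode}")

# Meta-evaluation: correlate fused metric scores with human annotations.
metric_scores = [fuse(0.9, 0.8, 0.4), fuse(0.6, 0.7, 0.9), fuse(0.3, 0.5, 0.2)]
human_scores = [0.85, 0.75, 0.30]
print(pearson(metric_scores, human_scores), spearman(metric_scores, human_scores))
```

A higher correlation with human judgments indicates a better automatic metric; the exponentially weighted product rewards balanced scores across dimensions, since a near-zero score on any single dimension drags the fused score down.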
Keywords/Search Tags: Paraphrase Evaluation, Paraphrase Generation, Semantic Representation, Diversity, Automatic Evaluation, Human Evaluation