Research On Neural Machine Translation Based English Grammatical Error Correction

Posted on: 2020-03-06
Degree: Master
Type: Thesis
Country: China
Candidate: J F Deng
GTID: 2415330590974447
Subject: Computer Science and Technology
Abstract/Summary:
The goal of grammatical error correction is to automatically correct grammatical errors in written text. At present, the mainstream approach treats it as a monolingual translation task, in which error correction is the process of translating a "wrong" sentence into a "right" one. This thesis investigates methods to improve the performance of grammatical error correction from three aspects: modeling, training algorithm, and data augmentation.

(1) Most grammatical errors occur within a local context, but some span multiple segments of the text. We apply Transformer, an advanced encoder-decoder model from neural machine translation, to the grammatical error correction task so that both local context and long-distance dependencies are taken into account when modeling the correction process. Experimental results on standard datasets show that Transformer is significantly superior to encoder-decoder models based on recurrent neural networks or convolutional neural networks.

(2) Typical neural machine translation suffers from problems such as exposure bias and loss-evaluation mismatch. In addition, automatic evaluation metrics often cannot fully reflect the real performance of a model. We propose an adversarial learning framework consisting of a discriminator and a generator. Given a sentence with grammatical errors, the discriminator is responsible for distinguishing whether its corrected version comes from the model output or from human annotation. The goal of the generator is to produce high-quality corrections that fool the discriminator. By competing against the discriminator, the generator is encouraged to produce corrections closer to human expression. We use the policy gradient method from reinforcement learning to overcome the optimization problems caused by the discrete nature of natural language text. The experimental results show that the proposed adversarial learning framework can improve the fluency of the corrected sentences output by the generator.

(3) At present, the scale of "error-correction" parallel corpora for grammatical error correction is limited, which directly hinders the application of machine translation approaches to this task. To alleviate this problem, we synthesize pseudo "error-correction" parallel sentence pairs using back-translation. To introduce a variety of grammatical errors, we first construct pseudo parallel sentence pairs with a sampling decoding strategy in back-translation, and we compare how pseudo data synthesized with different decoding strategies affects the training of the forward grammatical error correction model. Furthermore, we use our adversarial learning framework to improve the reverse grammatical error generation model used in back-translation, so as to construct more realistic pseudo "error-correction" parallel sentence pairs that help train the forward model. The experimental results show that our method can synthesize effective pseudo sentence pairs and improve the performance of grammatical error correction.
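The contrast between decoding strategies in back-translation can be illustrated with a small sketch. This is not the thesis implementation. It assumes a hypothetical toy error-generation model that scores each output token independently (`TOY_LOGITS`, with invented numbers), whereas a real reverse model would be an autoregressive sequence-to-sequence network. The function names `softmax`, `decode`, and `synthesize_pairs` are likewise illustrative.

```python
import math
import random

# Hypothetical toy "error generation" model: for each clean token, a small
# table of candidate outputs with unnormalized scores (logits).
# All numbers are invented for illustration only.
TOY_LOGITS = {
    "goes": {"goes": 2.0, "go": 1.0, "going": 0.2},
    "to": {"to": 2.5, "at": 0.5, "in": 0.3},
    "an": {"an": 2.0, "a": 1.2},
}


def softmax(logits, temperature=1.0):
    """Turn a dict of logits into a dict of probabilities."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def decode(tokens, strategy="greedy", temperature=1.0, rng=None):
    """Rewrite a clean sentence token by token with the toy model."""
    rng = rng or random.Random()
    output = []
    for tok in tokens:
        logits = TOY_LOGITS.get(tok)
        if logits is None:  # tokens without an entry are copied unchanged
            output.append(tok)
            continue
        probs = softmax(logits, temperature)
        if strategy == "greedy":
            # Always pick the single most probable candidate.
            output.append(max(probs, key=probs.get))
        elif strategy == "sample":
            # Draw a candidate in proportion to its probability.
            candidates = list(probs)
            weights = [probs[c] for c in candidates]
            output.append(rng.choices(candidates, weights=weights, k=1)[0])
        else:
            raise ValueError(f"unknown strategy: {strategy}")
    return output


def synthesize_pairs(clean_sentences, n_samples=3, seed=0):
    """Build (pseudo-erroneous source, clean target) training pairs."""
    rng = random.Random(seed)
    pairs = []
    for sentence in clean_sentences:
        tokens = sentence.split()
        for _ in range(n_samples):
            noisy = decode(tokens, strategy="sample", rng=rng)
            pairs.append((" ".join(noisy), sentence))
    return pairs
```

In this toy table the most probable output for every token is the clean token itself, so greedy decoding returns the sentence unchanged and yields no errors, while sampling can substitute lower-probability candidates such as "go" for "goes". That is one plausible reading of why a sampling strategy introduces a wider variety of errors. Raising `temperature` above 1.0 flattens the distribution and makes substitutions more frequent.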
Keywords/Search Tags: Grammatical Error Correction, Machine Translation, Generative Adversarial Network, Back-translation, Data Augmentation