
Research On Non-Autoregressive Models In Natural Language Generation

Posted on: 2024-06-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F Huang
Full Text: PDF
GTID: 1528307325966639
Subject: Computer Science and Technology
Abstract/Summary:
Natural language generation is a fundamental research direction in the field of artificial intelligence. While autoregressive generation has shown potential in producing high-quality texts, the academic community continues to explore more effective language modeling methods. In recent years, non-autoregressive generation, a new paradigm in natural language generation, has garnered extensive attention. Unlike autoregressive methods, which generate sequences from left to right, non-autoregressive methods predict all words in parallel, which speeds up generation and alleviates various biases associated with autoregressive generation. Despite these advantages, current non-autoregressive models suffer from low fluency in generation and poor versatility in applications. To address these challenges, we conduct four research works:

· Learning Theory of Non-Autoregressive Generative Models and Training Methods Based on Proxy Distribution: Based on information theory, we develop a learning theory for non-autoregressive models. This theory uncovers the problem of information loss during training and shows that the extent of this loss primarily depends on the target distribution. Accordingly, this work proposes a novel training method with a proxy distribution, effectively alleviating information loss. Experimental results on machine translation demonstrate that this method achieves significant improvements in fluency while maintaining a 15-fold speedup in inference.

· Non-Autoregressive Generative Model with Directed Acyclic Graph: To alleviate the problem of mixing multiple output modes during parallel prediction, we introduce a directed acyclic graph structure into non-autoregressive models. This structure effectively captures multiple possible outputs for a given input, thereby improving the fluency of generation. Experiments in machine translation demonstrate that this method outperforms non-autoregressive baselines by 3.1 BLEU-4. Moreover, it is the first non-autoregressive model to attain translation quality comparable to autoregressive models, while delivering 7- to 14-fold faster inference.

· Application of Non-Autoregressive Generative Models in Unsupervised Style Transfer: This work investigates the application of non-autoregressive models in unsupervised scenarios, focusing on style transfer. It proposes an unsupervised training objective for the non-autoregressive model and introduces a word alignment module to prevent the generation of irrelevant content. Experiments show that this model successfully mitigates the hallucination problem that occurs in autoregressive generation, and achieves improved quality and efficient inference for style transfer.

· Pretraining of Non-Autoregressive Generative Models and Their Applications in General-Purpose Generation Tasks: This work proposes a non-autoregressive model pre-trained on a large unlabeled corpus, which significantly improves generation quality on general-purpose downstream tasks. Experiments show that this model outperforms previous non-autoregressive baselines on multiple generation tasks, with an average score improvement of 4.2. Moreover, it is the first time a non-autoregressive model surpasses pre-trained autoregressive models on general-purpose generation tasks, while offering a 17-fold speedup in throughput.
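The sequential-versus-parallel distinction described above can be sketched as follows. This is a minimal toy illustration, not the dissertation's models: `predict_next` and `predict_all` are hypothetical stand-ins for real model calls, defined here over a tiny fixed vocabulary only so the sketch runs.

```python
def autoregressive_decode(predict_next, length):
    """Generate one token at a time; each step conditions on the prefix so far."""
    tokens = []
    for _ in range(length):            # `length` sequential model calls
        tokens.append(predict_next(tokens))
    return tokens

def non_autoregressive_decode(predict_all, length):
    """Predict every position at once: a single parallel model call."""
    return predict_all(length)

# Toy "models" over a 3-word vocabulary (hypothetical, for illustration only).
vocab = ["the", "cat", "sat"]
predict_next = lambda prefix: vocab[len(prefix) % len(vocab)]
predict_all = lambda n: [vocab[i % len(vocab)] for i in range(n)]

print(autoregressive_decode(predict_next, 5))     # 5 sequential calls
print(non_autoregressive_decode(predict_all, 5))  # 1 parallel call
```

Both routines produce a length-5 sequence, but the autoregressive one needs five dependent model invocations while the non-autoregressive one needs a single invocation; this is the source of the inference speedups reported above, at the cost of the multi-modality and fluency problems the four works address.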
Keywords/Search Tags:Natural Language Generation, Non-Autoregressive Model, Efficient Language Generation, Machine Translation