Research On Decoding Methods Of Statistical Language Models

Posted on:2022-11-20

Degree:Master

Type:Thesis

Country:China

Candidate:R W Tao

Full Text:PDF

GTID:2518306764467614

Subject:Computer Software and Application of Computer

Abstract/Summary:

Text generation is an important research direction in natural language processing with wide applications, such as machine translation, text summarization, and open-ended text generation. The best results are achieved by text generation systems that combine statistical language models with decoding algorithms. Compared with the large gains in generation quality attributed to language model training and architectural improvements, decoding algorithms are usually reported only as technical details of experiments, even though generating higher-quality text requires decoding algorithms that work in conjunction with the language model. This thesis proposes improvements from three perspectives: the probability distribution generated by the language model, decoding for open-ended generation, and decoding for constrained generation.

1. The probability distribution generated by the language model suffers from modeling error and exposure bias, which can cause the quality of the generated text to degrade continuously. This thesis analyzes the causes of these errors and proposes a way to correct and fine-tune the probability distribution: error samples are constructed and fused using the masking mechanism in the language model, which reduces the errors in the probability distribution. Experiments show that, under the same decoding algorithm, this correction method produces text with lower perplexity (PPL decreased by 0.41) and mitigates the degradation of generation quality caused by model errors.

2. Open-ended text sampling algorithms, which truncate low-probability characters before sampling, inevitably cause text degradation. This thesis analyzes the causes of this degradation, proves statistically that adjusting the sampling parameters cannot bring sampled text up to the quality of real text, and proposes a decoding model that fits, in an autoregressive way, the information sequence generated by the language model and guides the language model's generation at the inference stage. Experiments show that text produced with the decoding model is closer to real text on static metrics than text from other sampling algorithms (PPL difference of 0.13) and performs better in manual evaluation. Further experiments show that the decoding model is equally effective on texts from different corpora.

3. Constrained text generation uses beam search to decode the target, but the search results can be inconsistent with expectations: text obtained with a larger search width is of lower quality. This thesis attributes the inconsistency to the erroneous introduction of low-probability characters at large search widths, and therefore proposes a method that dynamically selects core characters based on the probability distribution produced during the search, which allows the search width to be increased while avoiding text degradation. Experiments show that, after low-probability characters are filtered out, beam search with a large search width achieves better results (BLEU increased by 0.32) on Chinese-English translation datasets.

Finally, based on these methods, an open-ended text writing system is designed and implemented that provides users with generated texts of different styles and types to assist them in writing. Text decoding algorithms play an important role in achieving high-quality text generation. This thesis complements existing work on decoding algorithms and provides new decoding ideas for both open-ended and constrained text generation.
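
For readers unfamiliar with the sampling algorithms discussed in point 2, the following is a minimal sketch of top-p (nucleus) truncation, one common algorithm that discards low-probability candidates before sampling, together with the standard perplexity computation. It is not the decoding model proposed in the thesis. The function names, the threshold `p=0.9`, and the toy distribution are assumptions made for illustration only.

```python
import math
import random


def top_p_filter(probs, p=0.9):
    """Keep the smallest set of most-probable tokens whose total mass reaches p,
    then renormalize so the kept probabilities sum to 1."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, mass = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        mass += prob
        if mass >= p:
            break
    return {token: prob / mass for token, prob in kept}


def sample(probs, rng):
    """Draw one token from a (renormalized) distribution."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def perplexity(token_probs):
    """Perplexity = exp of the average negative log-probability per token."""
    nll = -sum(math.log(q) for q in token_probs) / len(token_probs)
    return math.exp(nll)


# Hypothetical next-token distribution (made-up numbers, not from the thesis).
next_token = {"the": 0.5, "a": 0.3, "cat": 0.15, "zzz": 0.05}
filtered = top_p_filter(next_token, p=0.9)  # the low-probability tail is dropped
choice = sample(filtered, random.Random(0))
```

In this toy case the tail candidate `zzz` is removed before sampling. The truncation threshold is the kind of sampling parameter that, according to the abstract, cannot by itself bring sampled text up to the quality of real text.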

Keywords/Search Tags:

Decoding algorithm, Statistical language model, Probability distribution, Open-ended text generation, Constrained text generation

Related items

1	Research And Implementation Of Intelligent Algorithms For Open-ended Text Generation
2	Neural Network-based Inference Algorithm For Constrained Text Generation
3	Keyword-Constrained Text Generation
4	Research On The Scoring Method Of Open-ended Question Answer Based On Adversarial Text
5	Research On Semantic Text Exchange Method Based On Pre-trained BART Language Model
6	Research On Key Technologies Of Text Generation In Social Media
7	Research And Application Of End-to-end Chinese Text Generation
8	Research And Application On Controllable Text Generation Based On Pre-trained Language Models
9	Research On Few-shot Text Generation With Pre-trained Language Model
10	Research On Cross-Modal Natural Language Generation