
Research On Automatic Long Text Generation With Deep Learning

Posted on: 2023-06-07  Degree: Master  Type: Thesis
Country: China  Candidate: Y F Xiao  Full Text: PDF
GTID: 2568306794455064  Subject: Software engineering
Abstract/Summary:
With the vigorous development of artificial intelligence, deep learning has been widely applied in natural language processing. Text generation, one of the active topics in this field, aims to extract useful information from the analysis of massive text data and to produce new text based on that information. Text data can be divided into short texts and long texts by length. Most existing methods target short texts, and effective processing of long texts remains under-explored. Against this background, this thesis addresses two problems: models that cannot fully learn the global semantic information of long input texts, and long-text summaries that are either not concise or lack key information. The main contributions are as follows.

(1) Because the discriminator of a traditional generative adversarial network cannot fully learn global semantic information from long texts, a long-text generative adversarial network based on a multi-head self-attention mechanism is proposed. The model uses multi-head self-attention as a feature extractor over the preprocessed text, producing feature vectors that cover the global semantics of the input and strengthening the model's feature-extraction capability. Meanwhile, a gated recurrent unit encodes the text sequence; the encoded representation of the current word is combined with the extracted feature vector to predict the next word. Experiments show that the feature vectors extracted by the proposed model contain global semantic information and effectively improve the quality of long-text generation.

(2) To address the problems that extractive summaries of long texts are not concise enough and abstractive summaries lack key information, a hybrid long-text summarization model based on the Transformer is proposed. An extractive model first analyzes the original text to identify its key information, removes a large amount of redundant content, and retains only the sentences most relevant to the reference summary. BIO tags then assist the abstractive model in copying words with high predicted probability, so that the generated summary retains more key information. Experiments show that the proposed model extracts key content closer to the target summary and rewrites it, effectively alleviating the problem that long-text summaries are either not concise or miss key information, and bringing the generated results closer to the reference summaries.

(3) To meet practical needs for long-text generation, an automatic long-text generation system is designed around the two models above. The overall design and the main functional modules of the system are described in detail. Finally, the concrete implementations of the summarization module for Chinese and English news and of the text-generation module for Chinese and English poetry are demonstrated.
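The combination of a multi-head self-attention feature extractor with a GRU encoder described in (1) can be sketched roughly as follows. This is a minimal illustration in PyTorch, not the thesis's exact architecture: the module name, dimensions, and the concatenation used to combine the two feature streams are assumptions, and the adversarial training loop and discriminator are omitted entirely.

```python
import torch
import torch.nn as nn

class AttentiveGenerator(nn.Module):
    """Sketch: multi-head self-attention extracts global semantic
    features; a GRU encodes local sequential context; both streams
    jointly predict the next word. All sizes are illustrative."""

    def __init__(self, vocab_size=1000, embed_dim=128, num_heads=4, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Self-attention lets every position attend to the whole input,
        # covering global semantic information of the long text.
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        # The GRU encodes each word in its left-to-right context.
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        # Next-word prediction combines global and local features.
        self.out = nn.Linear(embed_dim + hidden_dim, vocab_size)

    def forward(self, tokens):                  # tokens: (batch, seq)
        x = self.embed(tokens)                  # (batch, seq, embed_dim)
        global_feat, _ = self.attn(x, x, x)     # global semantic features
        local_feat, _ = self.gru(x)             # per-step GRU encodings
        combined = torch.cat([global_feat, local_feat], dim=-1)
        return self.out(combined)               # (batch, seq, vocab_size)
```

In a GAN setting such as the one described, these next-word logits would feed the sampling step of the generator, with the discriminator scoring complete generated sequences.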
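The BIO-tag assistance described in (2) can be illustrated with a minimal sketch: given tokens from the extracted sentences and B/I/O tags marking key spans (B = begin, I = inside, O = outside), collect the spans that the abstractive model would be encouraged to copy. The function name and example data are hypothetical; in the actual model this signal is integrated with the Transformer's copy probabilities rather than applied as a standalone step.

```python
def extract_key_words(tokens, bio_tags):
    """Collect the key spans marked by BIO tags. These tagged words
    guide the abstractive decoder to copy key content, so the
    generated summary retains more of the original's key information."""
    spans, current = [], []
    for tok, tag in zip(tokens, bio_tags):
        if tag == "B":              # a new key span begins
            if current:
                spans.append(current)
            current = [tok]
        elif tag == "I" and current:  # continue the open span
            current.append(tok)
        else:                       # "O" (or stray "I") closes any open span
            if current:
                spans.append(current)
                current = []
    if current:
        spans.append(current)
    return [" ".join(s) for s in spans]

# Hypothetical example: two key spans are tagged in an extracted sentence.
tokens = ["the", "central", "bank", "raised", "interest", "rates", "today"]
tags   = ["O",   "B",       "I",    "O",      "B",        "I",     "O"]
print(extract_key_words(tokens, tags))  # ['central bank', 'interest rates']
```

Keeping only sentences close to the reference summary and then biasing the decoder toward these tagged spans is what lets the hybrid model stay concise without dropping key information.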
Keywords: Generative Adversarial Networks, Attention Mechanism, Deep Learning, Text Generation, Text Summarization