The rapid development of the Internet has led to explosive growth in text data, and a huge amount of textual information floods people's lives. Extracting the key content from this mass of text is a pressing problem, and automatic summarization technology based on deep learning can alleviate the information overload. Automatic summarization typically refers to a computer generating a concise summary from one or more documents to represent the core information of the original text. At present, sequence-to-sequence (seq2seq) models with an attention mechanism have achieved some success on automatic summarization tasks, but the summaries such models decode suffer from low semantic accuracy and a high rate of content repetition. Aiming at these problems in the current model framework, this paper proposes corresponding solutions. The main research work is as follows.

Firstly, a text summarization model incorporating pre-training and attention enhancement is proposed. In the preprocessing stage, Chinese text is segmented into words, which shortens the input compared with character-level processing, but the plain word vectors lack semantic links. This paper therefore builds on the Pointer-Generator Network (PGN) and introduces the encoder of the Transformer model to pre-train the input text, obtaining semantically linked word vectors as the model's input. In a seq2seq model with an attention mechanism, the attention weights at each decoding step are independent of one another: when decoding, the decoder can refer only to the attention weights of the current time step and ignores those of historical steps. An attention enhancement mechanism is therefore introduced so that the decoder's current attention weights also refer to the historical attention weights, reshaping the current attention distribution and allowing the attention mechanism to make more accurate decisions. Finally, beam search is optimized to suppress the decoder's tendency to predict short sentences. Experimental results on the NLPCC2018 and LCSTS datasets show that this model outperforms most mainstream generative models, verifying the effectiveness of the method.

Secondly, an automatic summarization model based on a gated network is proposed. After the input text sequence is encoded by the long short-term memory (LSTM) network, it carries a great deal of redundant information, so the summaries the decoder produces contain content that is semantically irrelevant to the original text, which degrades summary quality. This paper therefore introduces a gated network to filter the encoded sequence and remove redundant information, so that the context vector flowing from the encoder to the decoder contains more of the original text's key information and the generated summary is more accurate. Experimental results on the NLPCC2018 and LCSTS datasets demonstrate that this model effectively improves the quality of the generated summaries, and its ROUGE scores are significantly higher than those of the PGN model.
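For illustration, a minimal sketch of the pre-training step described above: a Transformer encoder maps segmented tokens to context-aware ("semantically linked") word vectors that could then feed the PGN encoder. The class name, layer sizes, and the learned positional embedding are assumptions, not the thesis's exact configuration.

```python
import torch
import torch.nn as nn

class ContextualEmbedder(nn.Module):
    """Transformer encoder used as an embedding/pre-training stage: it maps
    token ids to context-aware word vectors for the summarization model.
    Names and dimensions here are illustrative placeholders."""

    def __init__(self, vocab_size: int, d_model: int = 256,
                 n_layers: int = 2, max_len: int = 512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)  # learned positional encoding
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (B, T) -> (B, T, d_model) semantically linked vectors
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        return self.encoder(self.embed(token_ids) + self.pos(positions))
```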
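The attention enhancement can be sketched as a coverage-style term: the running sum of past attention weights enters the score for the current step, so history reshapes the current distribution. This is a hedged sketch assuming Bahdanau-style additive attention; the thesis's exact formulation may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EnhancedAttention(nn.Module):
    """Additive attention that also conditions on the accumulated
    historical attention weights. An illustrative sketch, not the
    thesis's exact attention enhancement mechanism."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.W_h = nn.Linear(hidden_dim, hidden_dim, bias=False)  # encoder states
        self.W_s = nn.Linear(hidden_dim, hidden_dim, bias=False)  # decoder state
        self.w_c = nn.Linear(1, hidden_dim, bias=False)           # attention history
        self.v = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, enc_states, dec_state, coverage):
        # enc_states: (B, T, H), dec_state: (B, H), coverage: (B, T)
        # coverage holds the sum of attention weights from all past steps.
        scores = self.v(torch.tanh(
            self.W_h(enc_states)
            + self.W_s(dec_state).unsqueeze(1)
            + self.w_c(coverage.unsqueeze(-1))
        )).squeeze(-1)                      # (B, T)
        attn = F.softmax(scores, dim=-1)    # current weights, reshaped by history
        context = torch.bmm(attn.unsqueeze(1), enc_states).squeeze(1)  # (B, H)
        return context, attn, coverage + attn  # updated history for the next step
```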
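A common way to realize the beam-search optimization against short outputs is length normalization of hypothesis scores; the sketch below uses the GNMT-style penalty as an assumed concrete choice rather than the thesis's exact method.

```python
def length_normalized_score(log_prob_sum: float, length: int,
                            alpha: float = 0.7) -> float:
    """Rescore a beam hypothesis so that short sequences are not unfairly
    favored. The GNMT-style penalty and the alpha value are assumptions."""
    penalty = ((5.0 + length) / 6.0) ** alpha
    return log_prob_sum / penalty

# Raw summed log-probabilities always prefer the shorter hypothesis
# (-4.0 > -5.0), but after normalization the longer sentence wins:
print(length_normalized_score(-4.0, length=5))   # ~ -2.80
print(length_normalized_score(-5.0, length=12))  # ~ -2.41
```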
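Finally, the gated filtering of the encoded sequence might look like the element-wise gate below, conditioned on each hidden state plus a global summary of the sequence; this gate design is an assumption for illustration, not the thesis's implementation.

```python
import torch
import torch.nn as nn

class GatedFilter(nn.Module):
    """Element-wise gate that filters redundant information from the
    encoder's hidden states before they reach the attention/decoder.
    An illustrative design; the actual gate may be parameterized differently."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.gate = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, enc_states: torch.Tensor) -> torch.Tensor:
        # enc_states: (B, T, H). Condition each position's gate on the state
        # itself plus a mean-pooled global summary of the whole sequence.
        global_ctx = enc_states.mean(dim=1, keepdim=True).expand_as(enc_states)
        g = torch.sigmoid(self.gate(torch.cat([enc_states, global_ctx], dim=-1)))
        return g * enc_states  # pass through key information, damp redundancy
```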