In the big data era of information explosion, text is the main source from which people obtain information, and people spend a great deal of time and energy reading it every day. With the leap in computing power, using computers to automatically generate text that expresses the main content of a long document would undoubtedly help solve the problem of information overload and save substantial human effort. Text summarization technology therefore came into being. At present there are two mainstream approaches to automatic text summarization: extractive summarization, which selects the sentence or sentences closest to the central idea of the original text as the summary, and abstractive summarization, in which the computer reads through the original text and produces a summary based on its understanding of the central idea of the whole document. With the development of deep learning, the Google Brain team proposed the sequence-to-sequence model in 2014, which sparked intensive research on end-to-end networks in NLP, and the sequence-to-sequence model has since become the mainstream research direction for abstractive summarization. However, applying sequence-to-sequence models to abstractive summarization still suffers from out-of-vocabulary (OOV) words and repetition. This thesis proposes several improved algorithms for these problems, mainly in the following aspects:

1. A model for abstractive text summarization based on the attention mechanism and the beam search algorithm is constructed to lay the foundation for the later research. Built on the Encoder-Decoder framework, the model uses a bidirectional RNN, which is well suited to sequence problems, as the encoder. In the decoding stage, the attention mechanism aligns the original text with the summary, and the beam search algorithm is used to approximate the globally optimal output sequence when generating words (a minimal beam search sketch is given after this list).

2. For the problems of OOV words and repetition, this thesis proposes a copy-generate model based on the intra-temporal attention mechanism. The intra-temporal attention mechanism reduces repetition by penalizing input positions that received high attention scores at earlier decoding steps. To expand the effective vocabulary, the copy-generate network computes a probability that decides whether to sample a word from the fixed vocabulary or to copy a word from the input sequence as the decoder output (see the second sketch below). Experiments on the dataset show that the copy-generate model based on intra-temporal attention improves performance considerably.

3. For the problems of repetition and semantic irrelevance, this thesis proposes a model based on a selective gated unit and a Maxout network. The selective gated unit applies a convolutional neural network to the encoder output at each time step; on top of the new representations generated by the CNN module, scaled dot-product attention is applied to extract N-gram features and global correlations. A further step sets an information-flow gate that controls how much encoder information passes to the decoder, achieving information selection (see the third sketch below). Finally, a Maxout network is adopted during decoding to further filter noise. The experimental results show that the model effectively alleviates the problem of repetition and has great application value.
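The following is a minimal sketch of the beam search decoding loop from point 1, in plain Python. The `step_fn` callback, the token ids, and the length-normalized final scoring are illustrative assumptions rather than the thesis's exact implementation: `step_fn(prefix)` stands in for one decoder step and returns candidate next tokens with their log-probabilities.

```python
def beam_search(step_fn, bos_id, eos_id, beam_width=4, max_len=50):
    """Keep the `beam_width` highest-scoring partial summaries at each step.

    `step_fn(prefix)` is a hypothetical decoder step: given a token-id
    prefix, it returns a list of (next_token_id, log_prob) candidates.
    """
    # Each hypothesis is (cumulative log-probability, token sequence).
    beams = [(0.0, [bos_id])]
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            if seq[-1] == eos_id:
                candidates.append((score, seq))   # finished: carry over as-is
                continue
            for tok, logp in step_fn(seq):        # expand one decoder step
                candidates.append((score + logp, seq + [tok]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]           # prune to the beam width
        if all(seq[-1] == eos_id for _, seq in beams):
            break
    # Length-normalize so longer summaries are not unfairly penalized.
    return max(beams, key=lambda c: c[0] / len(c[1]))[1]


# Toy usage: a fixed bigram table stands in for the decoder.
table = {0: [(1, -0.1), (2, -1.0)], 1: [(3, -0.2)], 2: [(3, -0.3)]}
print(beam_search(lambda seq: table.get(seq[-1], [(3, 0.0)]), bos_id=0, eos_id=3))
```

Keeping only `beam_width` hypotheses per step trades exhaustive search for a tractable approximation of the global optimum, which is why beam search is preferred over greedy decoding here.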
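For point 2, the sketch below shows the two mechanisms in PyTorch, with hypothetical tensor shapes and argument names (e.g. `extended_size`) assumed for illustration. Following the usual formulation of intra-temporal attention, each exponentiated score is divided by the sum of its own exponentiated scores from earlier decoding steps, and the copy-generate gate `p_gen` mixes the vocabulary distribution with the copy (attention) distribution over an extended vocabulary.

```python
import torch
import torch.nn.functional as F

def intra_temporal_attention(scores, past_exp_sum):
    """Normalize attention scores against their own history.

    scores:       (batch, src_len) raw attention logits at step t
    past_exp_sum: (batch, src_len) running sum of exp(scores) over
                  steps t' < t, or None at the first decoding step
    """
    exp_scores = torch.exp(scores)
    # Dividing by past exposure penalizes positions that already received
    # high attention, which discourages repeated phrases in the output.
    temporal = exp_scores if past_exp_sum is None else exp_scores / past_exp_sum
    attn = temporal / temporal.sum(dim=1, keepdim=True)
    new_sum = exp_scores if past_exp_sum is None else past_exp_sum + exp_scores
    return attn, new_sum

def copy_generate_distribution(vocab_logits, attn, src_ids, p_gen, extended_size):
    """Mix generating from the vocabulary with copying from the source.

    vocab_logits: (batch, vocab_size) decoder output logits
    attn:         (batch, src_len)    attention weights over the source
    src_ids:      (batch, src_len)    source token ids; OOV words get
                                      ids >= vocab_size in the extended vocab
    p_gen:        (batch, 1)          probability of generating vs copying
    """
    p_vocab = F.softmax(vocab_logits, dim=1)
    dist = vocab_logits.new_zeros(vocab_logits.size(0), extended_size)
    dist[:, :p_vocab.size(1)] = p_gen * p_vocab           # generate part
    dist.scatter_add_(1, src_ids, (1.0 - p_gen) * attn)   # copy part
    return dist
```

Because copied OOV words receive ids beyond the fixed vocabulary, the final distribution can place probability mass on source words the decoder could never generate on its own.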
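For point 3, here is a sketch of the selective gated unit and the Maxout layer, again with assumed dimensions and module names rather than the thesis's exact configuration. The gate is computed from the encoder states together with the CNN-plus-self-attention features, then multiplied element-wise with the encoder output to control the information flow to the decoder.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelectiveGate(nn.Module):
    """Gate encoder states with features from a CNN + self-attention pass."""
    def __init__(self, hidden, kernel_size=3):
        super().__init__()
        # 1-D convolution over time extracts local N-gram features.
        self.conv = nn.Conv1d(hidden, hidden, kernel_size,
                              padding=kernel_size // 2)
        self.gate = nn.Linear(2 * hidden, hidden)

    def forward(self, enc_out):               # enc_out: (batch, len, hidden)
        feats = self.conv(enc_out.transpose(1, 2)).transpose(1, 2)
        # Scaled dot-product self-attention over the CNN features
        # captures global correlations between source positions.
        scale = feats.size(-1) ** 0.5
        attn = F.softmax(feats @ feats.transpose(1, 2) / scale, dim=-1)
        ctx = attn @ feats
        # A sigmoid gate controls how much encoder information flows on.
        g = torch.sigmoid(self.gate(torch.cat([enc_out, ctx], dim=-1)))
        return g * enc_out

class Maxout(nn.Module):
    """Maxout layer: max over k linear pieces, used here to filter noise."""
    def __init__(self, in_dim, out_dim, k=2):
        super().__init__()
        self.k, self.out_dim = k, out_dim
        self.lin = nn.Linear(in_dim, out_dim * k)

    def forward(self, x):
        pieces = self.lin(x).view(*x.shape[:-1], self.out_dim, self.k)
        return pieces.max(dim=-1).values
```

Taking the maximum over several linear pieces lets the Maxout layer suppress weakly activated (noisy) features before the decoder's output projection.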