
Research On Abstract Text Summarization Based On Sequence To Sequence Model

Posted on: 2021-04-19  Degree: Master  Type: Thesis
Country: China  Candidate: Y S Shi  Full Text: PDF
GTID: 2428330626460371  Subject: Computer Science and Technology
Abstract/Summary:
In the information age, Internet data, and especially text data, is growing exponentially. It is therefore important to study how computers can compress textual information so that users can grasp its content quickly and accurately, that is, automatic text summarization. Automatic text summarization is a classic problem in natural language processing, and research methods fall into two types: extractive and abstractive summarization. Extractive methods select key segments of the original text and splice them together to produce a summary. Abstractive methods, by contrast, aim to generate a summary in a human-like writing style, which is considerably harder. The sequence-to-sequence framework has been widely applied to natural language processing tasks such as machine translation and dialogue systems, and it also offers a new research direction for text summarization. However, many problems remain when this architecture is applied to automatic summarization. This thesis improves the algorithm to generate better summaries. The main work includes the following aspects:

First, this thesis constructs a sequence-to-sequence benchmark summarization model. It uses RNNs as the encoder and decoder, together with an attention mechanism; it implements a pointer-copy network to handle out-of-vocabulary (OOV) words in the summary; and during decoding it applies the beam search algorithm so as to generate high-quality summaries quickly.

Second, to address the unbalanced training caused by uneven word frequencies in the benchmark model, this thesis proposes a summarization model based on multi-dimensional feature optimization. At the input, a multi-dimensional feature encoder encodes linguistic features of the original text, improving the robustness of the model's input representation. At the output, a word-level focal loss is used to compute the loss between the predicted and true tokens, mitigating the class-imbalance problem. Experimental results on two Chinese datasets, TTNews and LCSTS, show that the model based on multi-dimensional feature optimization improves summary quality while adding only a small number of parameters.

Third, the alignment attention mechanism of the benchmark model was originally proposed to solve the translation-alignment problem and does not fully meet the requirements of the summarization task. For this reason, we construct a new global attention mechanism based on a GRU and a multilayer dilated gated convolutional network. It ensures that the attention distribution produced during decoding reflects the importance of words in the original text, helping the model adapt better to the summarization task. Experimental results on the LCSTS and English Gigaword datasets show that the improved model has a wider attention distribution than the benchmark model, covers more source information, and generates higher-quality summaries.
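The thesis does not include implementation details for the beam-search decoding step of the benchmark model. As an illustration only, here is a minimal, model-agnostic beam search in plain Python; the `step_log_probs` callback (a function from a token prefix to next-token log-probabilities) is a hypothetical interface standing in for the decoder:

```python
import math

def beam_search(step_log_probs, beam_width=3, max_len=5, eos=0):
    """Generic beam search. step_log_probs(prefix) must return a
    dict {token_id: log_prob} for the next token given the prefix."""
    beams = [([], 0.0)]   # (token sequence, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        # expand every live hypothesis by every candidate next token
        candidates = []
        for seq, score in beams:
            for tok, lp in step_log_probs(seq).items():
                candidates.append((seq + [tok], score + lp))
        # keep only the top-k partial hypotheses
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            if seq[-1] == eos:
                finished.append((seq, score))   # hypothesis is complete
            else:
                beams.append((seq, score))
        if not beams:
            break
    finished.extend(beams)   # unfinished hypotheses still compete
    return max(finished, key=lambda c: c[1])[0]
```

Because the beam keeps several partial hypotheses alive, it can recover sequences whose first token is not the greedy choice, which is exactly why it tends to produce higher-quality summaries than greedy decoding.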
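The word-level focal loss mentioned above down-weights tokens the model already predicts confidently, so frequent, easy words contribute less to the gradient than rare ones. A minimal NumPy sketch of the standard focal-loss formula, FL(p_t) = -(1 - p_t)^γ · log(p_t), applied per token (the array shapes are illustrative assumptions, not the thesis's exact setup):

```python
import numpy as np

def focal_loss(probs, targets, gamma=2.0):
    """Per-token focal loss.
    probs:   (T, V) predicted probability distribution per position
    targets: (T,)   true token ids
    gamma=0 recovers ordinary cross-entropy."""
    p_t = probs[np.arange(len(targets)), targets]  # prob of the true token
    return -((1.0 - p_t) ** gamma) * np.log(p_t)
```

For a well-predicted token (p_t = 0.9, γ = 2) the modulating factor (1 - p_t)^γ = 0.01 shrinks its loss a hundredfold relative to cross-entropy, while poorly predicted rare tokens are shrunk far less, which is how the loss counteracts word-frequency imbalance.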
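The global attention mechanism relies on multilayer dilated gated convolutions to give each position a wide receptive field. The thesis does not publish code, so the sketch below shows only the core building block, a 1-D dilated convolution with a GLU-style gate, in NumPy on a single-channel signal for simplicity (the single-channel shapes and zero padding are assumptions made for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dilated_gated_conv(x, w, v, dilation):
    """1-D dilated convolution with a gated linear unit:
        out[t] = conv(x, w)[t] * sigmoid(conv(x, v)[t])
    x: (T,) input signal; w, v: (k,) filter taps (k odd);
    zero padding keeps the output length equal to T."""
    k = len(w)
    pad = (k - 1) * dilation // 2
    xp = np.pad(x, pad)
    a = np.zeros(len(x))
    b = np.zeros(len(x))
    for t in range(len(x)):
        # taps are spaced `dilation` positions apart
        taps = xp[t : t + (k - 1) * dilation + 1 : dilation]
        a[t] = taps @ w   # content path
        b[t] = taps @ v   # gate path
    return a * sigmoid(b)
```

Stacking such layers with dilations 1, 2, 4, ... makes the receptive field grow exponentially with depth, so attention scores computed from these features can reflect word importance across the whole source text rather than a local window.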
Keywords/Search Tags: Automatic text summarization, Sequence-to-sequence model, Multi-dimensional feature optimization, Global attention mechanism