With the development of information technology, people obtain ever more data in text form from cyberspace, and extracting useful information from this massive volume of text has become critical. Text summarization is a technique for condensing a long text into a short summary that contains only the crucial information. The explosive growth of text information and the emergence of new application scenarios have raised the requirements on text summarization, and methods have evolved from early rule-based algorithms to current approaches based on deep learning. Although existing deep learning models have achieved notable results, several problems remain. In view of these problems, this paper aims to design an efficient model for the text summarization task and to explore ways of generating high-quality summaries.

According to the way a summary is generated, summarization methods can be divided into extractive and abstractive methods. Extractive methods select a subset of sentences that represent the key points of a text and combine them into a summary, whereas abstractive methods encode the source text and then use natural language generation techniques to produce summaries that may contain words and phrases not found in the source documents. Based on the number of input documents, text summarization can also be divided into single-document and multi-document summarization. The multi-document summarization task handles multiple documents on the same topic, synthesizing different perspectives and making full use of the additional information to produce a more comprehensive and accurate summary. The work in this paper focuses on abstractive multi-document summarization.

Some multi-document summarization methods simply concatenate multiple documents and treat the task as single-document summarization. However, the input of multi-document summarization is usually long and highly redundant; concatenation ignores the relationships between documents, and the lengthy input makes it difficult for the model to capture critical information, causing it to generate low-quality summaries. Most existing models that do consider text relations rely on attention-based methods for relation learning, which cannot learn the relations between texts effectively.

The main contents of this research are as follows:
(1) Considering the inherent characteristics of the input of the multi-document summarization task, this paper designs a hierarchical encoder that matches the hierarchy of the input text and can learn representations of words, sentences, and documents. Meanwhile, a sparse attention mechanism is deployed in the decoder to enhance the model's ability to locate crucial information in long sequences.
(2) To address the problem of ignoring or insufficiently learning the relations between texts, we introduce static relation information to help the model learn the latent semantic associations between texts during encoding. Furthermore, a graph convolutional network is deployed to make full use of the relation graphs and generate high-level semantic representations.
(3) This paper compares our model with several existing models on a widely used multi-document summarization dataset. The results of both automatic and human evaluations confirm the effectiveness of the model design. In addition, the effect of the model's internal structure on generation quality is also examined.
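To make the graph-convolution step in contribution (2) concrete, the sketch below shows one standard GCN layer, H' = ReLU(D^(-1/2)(A+I)D^(-1/2) H W), applied to a small relation graph. This is a minimal illustration of the general technique, not the paper's actual implementation: the toy adjacency matrix, feature dimensions, and weights are all hypothetical.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, features, weights):
    """One graph-convolution layer: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adj)
    # Add self-loops so each node keeps its own representation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Symmetric degree normalization.
    deg = [sum(row) for row in a_hat]
    a_norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    # Aggregate neighbor features, project, and apply ReLU.
    h = matmul(matmul(a_norm, features), weights)
    return [[max(v, 0.0) for v in row] for row in h]

# Hypothetical relation graph over 4 text units (e.g. sentences),
# each with a 3-dimensional input representation.
adj = [[0.0, 1.0, 1.0, 0.0],
       [1.0, 0.0, 0.0, 1.0],
       [1.0, 0.0, 0.0, 1.0],
       [0.0, 1.0, 1.0, 0.0]]
feats = [[0.5, -0.2, 0.1],
         [0.3, 0.8, -0.4],
         [-0.1, 0.2, 0.6],
         [0.7, -0.5, 0.2]]
w = [[0.2, -0.1],
     [0.4, 0.3],
     [-0.3, 0.5]]
h_out = gcn_layer(adj, feats, w)  # 4 nodes, 2-dim output representations
```

Stacking such layers lets each text unit's representation absorb information from its neighbors in the relation graph, which is how a relation graph can yield the higher-level semantic representations the model design calls for.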