Research On Long Text Summarization Via Structural Information

Posted on: 2023-04-18
Degree: Master
Type: Thesis
Country: China
Candidate: J Z Li
GTID: 2568307124469664
Subject: Computer Science and Technology

Abstract/Summary:
Text summarization has long been an active and difficult research problem in natural language processing. In recent years, news summarization has been studied extensively and has become increasingly mature. Compared with news texts, whose structure is relatively fixed, some long texts, such as patent specifications and scripts, have a complex structure. Because there is little research on summarizing such texts, summaries generated by traditional summarization methods suffer from problems such as inaccurate content and incomplete coverage. Most previous studies have focused on sequential information such as semantic information, while related work shows that text structural information also plays an important role in summarization. This thesis aims to improve the quality of long text summarization by using text structural information, and conducts experiments on two representative text types: patent specifications and scripts. The work covers the following three aspects.

Firstly, to address the difficulty mainstream models have in making effective use of context structural information in long text, we propose a long text summarization method based on context structural information. This method first analyzes the structure of the original text and represents the context information through the juxtaposition structure and the relationships between sentences. A graph convolutional network is introduced in the encoder, where word-level encoding and sentence-level encoding represent the context relationships within and across sentences, respectively. The two encodings are fused into the final representation, which is passed to the decoder to generate the summary. The experimental results show that this method significantly improves ROUGE scores compared with traditional abstractive summarization.

Secondly, because traditional models cannot make full use of the dialogue information in long dialogue text, we propose a long text summarization method based on dialogue structure. This method uses the dialogue relationships between roles in the original text to construct the dialogue sequence structure and the dialogue interaction structure graph, models them with a graph neural network, represents the relationships among sentences through the dialogue structure, and finally performs extractive summarization with a classifier. The experimental results show that introducing dialogue structure improves the quality of the summaries.

Finally, to address the problems of insufficient scene structural information and missing key information in long dialogue text, we propose a long text summarization method based on scene structural information. This method combines extractive and abstractive summarization. First, building on the dialogue structure, we further analyze the scene structure and character structure of the original text, fuse these structures through a graph neural network, and extract key sentences as the input to the generation model. Then, based on the pre-trained BART model, scene and character embeddings are added to the input to enrich the semantic features of the text. Finally, the summary is generated by the self-attention decoder. The experimental results show that this method significantly improves ROUGE scores.
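
As a rough illustration of the dialogue-structure idea described above, the following Python sketch builds a small graph over utterances and applies one graph-convolution step to their features. It is not the thesis's implementation: the function names `build_dialogue_graph` and `gcn_layer`, the edge definitions (adjacent utterances for the sequence structure, consecutive lines by the same role for the interaction structure), and the toy features are all assumptions made for this example.

```python
import math


def build_dialogue_graph(speakers):
    """Return a symmetric adjacency matrix with self-loops over utterances.

    Assumed edge types (illustrative only, not taken from the thesis):
    - sequence edge: utterance i is linked to utterance i - 1
    - interaction edge: utterance i is linked to the previous
      utterance spoken by the same role
    """
    n = len(speakers)
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    last_seen = {}  # role -> index of that role's most recent utterance
    for i, role in enumerate(speakers):
        if i > 0:
            adj[i][i - 1] = adj[i - 1][i] = 1.0
        if role in last_seen:
            j = last_seen[role]
            adj[i][j] = adj[j][i] = 1.0
        last_seen[role] = i
    return adj


def gcn_layer(adj, feats, weight):
    """One propagation step: ReLU(D^-1/2 * A * D^-1/2 * H * W).

    adj already contains self-loops; feats is n x d, weight is d x o.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    # Symmetric degree normalization of the adjacency matrix.
    norm = [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    d_in, d_out = len(weight), len(weight[0])
    # Aggregate neighbor features, then apply the linear map and ReLU.
    agg = [[sum(norm[i][k] * feats[k][d] for k in range(n))
            for d in range(d_in)] for i in range(n)]
    return [[max(0.0, sum(agg[i][d] * weight[d][o] for d in range(d_in)))
             for o in range(d_out)] for i in range(n)]


# Toy usage: four utterances by three roles, 2-dimensional features.
speakers = ["A", "B", "A", "C"]
adjacency = build_dialogue_graph(speakers)
features = [[1.0, 0.0] for _ in speakers]
identity = [[1.0, 0.0], [0.0, 1.0]]
hidden = gcn_layer(adjacency, features, identity)
```

In a full extractive model, the resulting sentence representations would be passed to a classifier that decides whether each sentence belongs in the summary; that stage and any training are omitted here.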
Keywords/Search Tags: Long Text Summarization, Text Structural Information, Neural Network, Encoder-Decoder