Research And Application Of Extractive Text Summarization Method Based On Contrastive Learning

Posted on: 2024-07-31 | Degree: Master | Type: Thesis
Country: China | Candidate: S Gong | Full Text: PDF
GTID: 2568307103990129 | Subject: Mechanics (Professional Degree)
Abstract/Summary:
Text summarization aims to distill the important information in a document into a short, concise summary. Summarization methods fall roughly into two categories: abstractive and extractive. Abstractive summarization uses natural language generation techniques to produce new sentences that summarize the original text, while extractive summarization selects key fragments from the original text to compose a summary. As data volumes grow, text summarization has received widespread attention. Compared with abstractive summarization, extractive summarization is better able to produce summaries that are grammatical and factually consistent with the source.

In recent years, neural network-based methods have become the dominant approach to extractive summarization. These methods treat the original text as a linear sequence of sentences, label the sentences in the sequence based on their positional information, and then extract a fixed number of sentences to form the summary. However, they have the following limitations. First, they neglect the structural information of the document. Second, sentence position information alone does not capture the semantic relationships between sentences. Finally, they operate at the sentence level and extract a predetermined number of sentences, which limits the flexibility of the summary.

To address these problems, we first propose sentence-level extractive summarization models for short and long documents respectively. Building on these models, we then propose a summary-level extractive summarization framework that selects a variable number of sentences to form a summary. Our research is as follows:

(1) For short documents, we propose a sentence-level extractive summarization model with enhanced sentence centrality. This model replaces sentence position information with sentence centrality, which addresses the lead bias of sentences in news documents while strengthening inter-sentence correlations.

(2) For long documents, we propose an extractive summarization model based on a triplet position-enhanced heterogeneous tree. This method models the long document as a heterogeneous tree and uses triplet positions to label the nodes in the tree.

(3) We propose a contrastive learning-based summary-level framework that enables the model to select any number of sentences to form a summary. Based on this framework, we design an automatic summarization system, which demonstrates strong performance on four datasets: CNN/Daily Mail, XSum, PubMed, and arXiv.
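The idea in (1) of scoring sentences by centrality rather than by position can be illustrated with a minimal sketch. The abstract does not say how centrality is computed, so everything below is an assumption made for illustration: the bag-of-words vectors, the cosine similarity, and the helper names `bow`, `cosine`, and `degree_centrality` are hypothetical and are not the thesis's model. The sketch scores each sentence by its summed similarity to every other sentence, so a sentence can rank highly wherever it appears in the document.

```python
import math
from collections import Counter


def bow(sentence):
    """Lowercased bag-of-words counts (an illustrative stand-in for a learned encoder)."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors; 0.0 if either is empty."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def degree_centrality(sentences):
    """Score each sentence by its summed similarity to all other sentences."""
    vecs = [bow(s) for s in sentences]
    n = len(vecs)
    return [
        sum(cosine(vecs[i], vecs[j]) for j in range(n) if j != i)
        for i in range(n)
    ]


sentences = [
    "the cat sat on the mat",
    "the cat chased the dog",
    "stocks fell sharply today",
]
scores = degree_centrality(sentences)
# Rank sentence indices from most to least central.
ranking = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
```

In this toy input the third sentence shares no words with the others, so its centrality is zero regardless of its position; a neural model would presumably use learned sentence representations in place of word counts.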
Keywords/Search Tags:extractive summarization, contrastive learning, neural network