With the rapid development of web technology come the problems of data explosion and information overload, which have made automatic text summarization a research hotspot in computer science. Compared with other NLP tasks, automatic summarization faces two particular challenges: judging summary quality is highly subjective, and result summaries often contain substantial redundancy. Most existing models score sentences against a set of predefined features and select the top-k sentences as the summary. However, these ranking models score each sentence independently, without considering the relationships between sentences. Moreover, the predefined features are usually lexical or statistical and cannot capture the semantic meaning of the text. To address these shortcomings, we assume that a good summary can reconstruct the original document, and we propose a semantic reconstruction model based on this assumption: the model selects the sentences that best reconstruct the original document as the summary. Our work in this paper consists of two parts. (1) Semantic sentence representation. Since bag-of-words vectors cannot capture semantic meaning, we learn compact, semantic sentence representations in two ways: a weighted mean of word embeddings, and deep coding. These representations serve as the input to the reconstruction model. (2) Reconstruction strategy, the core of semantic reconstruction, which aims to find the most relevant sentences. Our strategies include a simple linear function and a flexible nonlinear function, based on quadratic programming and a neural network respectively. In addition, a redundancy reduction algorithm removes redundant sentences to improve summary quality. Experiments on the DUC datasets validate the effectiveness of our model.
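
To make the first part concrete, below is a minimal sketch of a weighted mean of word embeddings. The abstract does not specify the weighting scheme, so this sketch assumes a smooth-inverse-frequency style weight a/(a + p(w)); the `emb` and `word_freq` inputs are hypothetical placeholders for a pretrained embedding table and corpus unigram probabilities.

```python
import numpy as np

def sentence_embedding(tokens, emb, word_freq, a=1e-3):
    """Weighted mean of word embeddings for one sentence.

    tokens    -- list of word strings in the sentence
    emb       -- dict mapping word -> np.ndarray embedding (hypothetical input)
    word_freq -- dict mapping word -> unigram probability p(w)
    a         -- smoothing constant; weight = a / (a + p(w)) is an ASSUMED
                 scheme, since the abstract only says "weighted mean"
    """
    dim = len(next(iter(emb.values())))
    vecs, weights = [], []
    for w in tokens:
        if w in emb:
            vecs.append(emb[w])
            weights.append(a / (a + word_freq.get(w, 0.0)))
    if not vecs:
        return np.zeros(dim)          # no known words: fall back to zero vector
    V = np.stack(vecs)                # (n_words, dim)
    w = np.asarray(weights)[:, None]  # (n_words, 1)
    return (w * V).sum(axis=0) / w.sum()
```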
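For the second part, the sketch below illustrates one way the linear, quadratic-programming-based reconstruction strategy could look. The abstract does not give the exact objective or constraints, so this sketch assumes a nonnegative least-squares formulation (a simple QP instance): find nonnegative sentence weights whose linear combination best reconstructs a document vector, here assumed to be the mean of the sentence vectors, and keep the k most heavily weighted sentences.

```python
import numpy as np
from scipy.optimize import nnls

def select_summary_sentences(sent_vecs, k=3):
    """Pick k sentences whose nonnegative linear combination best
    reconstructs the document vector.

    sent_vecs -- np.ndarray of shape (n_sentences, dim), one row per
                 sentence representation
    Returns the indices of the k sentences with the largest weights.
    Both the document vector (mean of sentence vectors) and the NNLS
    objective are assumptions; the paper's exact formulation may differ.
    """
    doc_vec = sent_vecs.mean(axis=0)       # assumed document representation
    A = sent_vecs.T                        # (dim, n_sentences)
    # Solve min_x ||A x - doc_vec||^2  subject to  x >= 0
    weights, _residual = nnls(A, doc_vec)
    return np.argsort(weights)[::-1][:k]
```

A redundancy reduction step, in the spirit of the algorithm mentioned above, could then drop any selected sentence whose cosine similarity to an already chosen one exceeds a threshold.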