With the rapid development of modern science and technology, the amount of information on the Internet has grown at an alarming rate. Under the pressure of this explosion of network information and the resulting data overload, locating valuable information efficiently and accurately has become extremely important. Text summarization, a technology in the field of Natural Language Processing, is an effective means of analyzing and processing network information: it extracts and condenses text to convey the content or meaning of an article in concise, refined words. However, the coverage of text disseminated on the network keeps widening and its volume keeps growing, involving ever more entity nouns such as person names and place names. Current summarization models struggle to capture the key information and deep meaning of medium-length texts and begin to face the problem of long-distance dependency. In view of these issues, this research is carried out from the perspective of semantic analysis; the main research contents and contributions are as follows:

(1) To address the problem that generated summaries are not semantically fluent, we construct an abstractive summarization model based on bidirectional decoding, Bidecoder. Building on the sequence-to-sequence framework, the model improves the decoder by adopting a bidirectional decoding structure, which predicts in both directions and continuously refines the output by combining the two sets of predictions, thereby alleviating the error accumulation caused by a unidirectional structure. An attention mechanism is introduced during decoding, and attention is rationally allocated to improve the semantic consistency of the generated summary.

(2) To address the inaccurate recognition of entity information in medium-length texts, we construct an abstractive summarization model based on NER (Named Entity Recognition) tags
and bidirectional decoding, NER-Bidecoder. The model uses NER techniques to mark entities in the original text and divides them into four categories, PERSON, ORG, GPE, and MISC, representing person names, organization names, place names, and numbers, respectively. The NER-tagged text retains entity information after vectorization, which effectively alleviates the problem of incomplete entity information. The encoder uses the entity information to generate the intermediate semantic vector, while the decoder adopts the bidirectional decoding structure. An attention mechanism is introduced to capture deep semantic relationships from the time-series information, improving the entity completeness and coherence of the summary.

(3) To address the inaccurate understanding of sentence-level words in medium-length texts, we construct an abstractive summarization model based on BERT (Bidirectional Encoder Representations from Transformers) vectorization and bidirectional decoding, BERT-Bidecoder. The pre-trained BERT model is used in the vectorization stage, which makes full use of the contextual information of the vocabulary and yields a more global vector representation. This helps the encoder and decoder understand the full text, and an attention mechanism is introduced to strengthen the semantic relationships. The bidirectional decoding structure of the decoder further reduces error accumulation, alleviates the bias caused by unidirectional errors, and improves the coherence and generality of the summary.
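The combination step at the heart of bidirectional decoding can be sketched in miniature. The toy tokens, scores, and the keep-the-higher-score merge rule below are illustrative assumptions, not the thesis's actual combination method; they only show how a backward pass can correct a low-confidence forward prediction at the same position.

```python
def combine_bidirectional(forward, backward):
    """Merge two decoding passes over the same positions.

    forward, backward: lists of (token, score) pairs per position,
    with the backward pass already re-aligned left-to-right.
    Keeping the higher-scoring token at each position is one simple
    way to let the two directions correct each other's errors.
    """
    merged = []
    for (f_tok, f_score), (b_tok, b_score) in zip(forward, backward):
        merged.append(f_tok if f_score >= b_score else b_tok)
    return merged


# Toy example: the forward pass is unsure at position 1,
# where the backward pass is more confident.
fwd = [("the", 0.9), ("cat", 0.4), ("sat", 0.8)]
bwd = [("the", 0.6), ("dog", 0.7), ("sat", 0.9)]
print(combine_bidirectional(fwd, bwd))  # ['the', 'dog', 'sat']
```

A real implementation would combine full probability distributions rather than single tokens, but the position-wise reconciliation idea is the same.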
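The NER preprocessing step of contribution (2) can likewise be sketched: entities are marked in the source text so the category information survives vectorization. A real system would use a trained NER model (e.g. spaCy); the tiny gazetteer and tag format below are stand-ins for illustration only.

```python
import re

# Illustrative gazetteer mapping surface forms to the four entity
# categories named in the text; a trained NER model replaces this.
GAZETTEER = {
    "Alice": "PERSON",
    "Google": "ORG",
    "Paris": "GPE",
}
NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")  # numbers -> MISC


def tag_entities(text):
    """Wrap each recognized entity in its category tag."""
    for name, label in GAZETTEER.items():
        text = text.replace(name, f"<{label}>{name}</{label}>")
    # Tag bare numbers as MISC, per the four-category scheme.
    return NUMBER.sub(lambda m: f"<MISC>{m.group(0)}</MISC>", text)


print(tag_entities("Alice joined Google in Paris in 2020"))
# <PERSON>Alice</PERSON> joined <ORG>Google</ORG> in <GPE>Paris</GPE> in <MISC>2020</MISC>
```

After tagging, the encoder sees the entity boundaries and categories explicitly, which is what helps keep names and numbers intact in the generated summary.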
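Finally, the contrast that motivates contribution (3) is that a static embedding assigns one fixed vector per word, whereas a contextual model such as BERT gives the same word different vectors in different sentences. The 2-d vectors and the neighbor-averaging scheme below are made up purely to illustrate that contrast; BERT itself computes context through self-attention, not averaging.

```python
# Toy static word vectors (illustrative values only).
STATIC = {
    "bank": (1.0, 0.0),
    "river": (0.0, 1.0),
    "money": (1.0, 1.0),
}


def contextual(word, sentence, window=1):
    """Mix a word's static vector with its neighbors' vectors.

    This crude stand-in shows the key property of contextual
    representations: the output depends on the surrounding words.
    """
    i = sentence.index(word)
    neighbors = sentence[max(0, i - window):i] + sentence[i + 1:i + 1 + window]
    vecs = [STATIC[word]] + [STATIC[w] for w in neighbors if w in STATIC]
    n = len(vecs)
    return tuple(sum(component) / n for component in zip(*vecs))


# The same word "bank" gets two different vectors in two contexts.
print(contextual("bank", ["river", "bank"]))  # (0.5, 0.5)
print(contextual("bank", ["money", "bank"]))  # (1.0, 0.5)
```

It is this context sensitivity, supplied at the vectorization stage, that the abstract credits with helping the encoder and decoder understand the full text.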