
Research On Abstract Generation Technology Based On Patent Text

Posted on: 2021-02-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y Z Wang
GTID: 2427330611499044
Subject: Applied statistics
Abstract/Summary:
In today's increasingly fierce international competition, enterprises are constantly pursuing technological innovation in order to capture consumer markets quickly and strengthen the national economy. As an embodiment of the core technology of an enterprise, and even of a country, patents play an increasingly important role in this process, and research on patent technology has gradually become a focus of attention. In the era of patent big data, how to quickly and accurately retrieve patents in a target field from a massive patent database, and how to quickly identify the core technical content of related patents, have become the first problems to be solved in patent technology analysis. Against this background, this thesis draws on research in the field of automatic summarization, explores extractive algorithms suited to summarizing patent text, and extracts the core technical content of patent texts.

First, because the classic TextRank extractive summarization algorithm cannot express the semantic information of sentences, this thesis, with reference to the GloVe semantic feature representation method, proposes using the BERT pre-trained model to represent sentences as vectors, and builds an abstract extraction algorithm based on TextRank and BERT. Second, in view of the characteristics of patent text, it takes sentence position, sentence length, and topic relevance into account and uses them to correct the weights computed iteratively by the TextRank and BERT based algorithm. In addition, to address redundancy in the generated abstract, the maximal marginal relevance (MMR) algorithm is used to remove redundancy among the candidate abstract sentences, yielding an abstract extraction algorithm based on improved TextRank and BERT. Finally, taking patents from the computer industry as an example, this thesis builds an English patent database containing 100 patents. It applies the algorithms above to this database and evaluates the generated abstracts with the ROUGE family of metrics, which verifies the feasibility and rationality of the approach, and it assesses the strengths and weaknesses of the proposed algorithm by comparing the metric values.
Keywords/Search Tags: patent abstract generation, TextRank algorithm, semantic feature representation, redundancy processing, abstract evaluation
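
The pipeline described in the abstract (rank sentences with TextRank over sentence vectors, then filter redundant candidates with MMR) can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes the sentence vectors have already been produced by some encoder such as BERT, uses toy 2-dimensional vectors in their place, and omits the corrections for sentence position, sentence length, and topic relevance. The function names (`cosine`, `textrank`, `mmr_select`) and the parameter values (`damping`, `lam`) are hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def textrank(vectors, damping=0.85, iterations=50):
    """Weighted TextRank scores on a graph whose edge weights are the
    cosine similarities between sentence vectors (no self-loops)."""
    n = len(vectors)
    sim = [
        [max(cosine(vectors[i], vectors[j]), 0.0) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                sim[j][i] / out_weight[j] * scores[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            for i in range(n)
        ]
    return scores


def mmr_select(vectors, scores, k, lam=0.5):
    """Greedy maximal marginal relevance: trade off a sentence's TextRank
    score against its similarity to sentences already selected."""
    top = max(scores)
    relevance = [s / top for s in scores]  # rescale so the best score is 1
    selected = []
    candidates = list(range(len(vectors)))
    while candidates and len(selected) < k:
        def mmr(i):
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return sorted(selected)  # restore original sentence order


# Toy stand-ins for sentence embeddings (assumed, not real BERT output).
toy_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [1.0, 0.05]]
toy_scores = textrank(toy_vectors)
print(mmr_select(toy_vectors, toy_scores, k=2))
```

A larger `lam` favors the TextRank score, and a smaller one penalizes similarity to already selected sentences more strongly. The rescaling of scores inside `mmr_select` is a design choice of this sketch so that relevance and redundancy are on comparable scales.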