Research On Automatic Summary Technology Of Patent Texts Based On Semi-supervised Deep Learning

Posted on:2022-09-28

Degree:Master

Type:Thesis

Country:China

Candidate:Y X Zhu

Full Text:PDF

GTID:2507306572963039

Subject:Applied Statistics

Abstract/Summary:


Patents play an important role for enterprises and even countries in today’s society. Studying patent content and improving patented technology have increasingly become goals that people pursue. In an era of information overload, obtaining the core content of patents without browsing a large number of patent texts is an urgent problem in the field of patent analysis. Automatic summarization of patent texts is therefore of research significance. Against this background, this paper draws on related research in text summarization and proposes an automatic patent text summarization technique based on semi-supervised deep learning, with the goal of extracting the core content of patent texts. First, because the traditional TextRank algorithm ignores the attributes of the words themselves, this paper proposes TextRank-Attr, which incorporates word attributes such as TF-IDF, word length, and part of speech. Keywords and important sentences are extracted by computing a final weight that combines these attributes. Second, deep learning frameworks are widely used for text summarization, especially the RNN-based Seq2Seq model with attention, but fully labeled text data are difficult to obtain in practice. This paper therefore proposes an automatic summarization technique that combines semi-supervised learning with deep learning, that is, it trains the RNN-based Seq2Seq model with attention on both labeled and unlabeled data. Third, taking patent texts from wine and other wine-manufacturing industries in the biological field as an example for empirical analysis, this paper builds a data set of 1041 patent texts, applies the proposed semi-supervised deep learning model to it, and compares the results experimentally with the unsupervised TextRank-Attr algorithm and a supervised deep learning model to verify the validity of the model. Finally, this paper uses the ROUGE family of metrics as the evaluation standard. Because ROUGE depends on the quality of the reference summary, this paper additionally proposes to quantify the generated summary and the important sentences with the Word2Vec model, compute their similarity, and use that similarity to evaluate summary quality. Comparing this measure with the ROUGE values verifies the feasibility and rationality of the proposed model.
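The two evaluation ideas mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names (`rouge_n`, `mean_vector`, `cosine_similarity`) are hypothetical, ROUGE-N is computed in its standard n-gram overlap form, and the embedding-based score assumes that a sentence is represented by the average of its word vectors, which the abstract does not specify. The `embeddings` argument stands in for vectors that would come from a trained Word2Vec model.

```python
from collections import Counter
import math


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Standard ROUGE-N: clipped n-gram overlap between two token lists."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum((cand & ref).values())  # Counter & keeps the minimum counts
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


def mean_vector(tokens, embeddings):
    """Average the vectors of known tokens (assumed sentence representation)."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy inputs for illustration only; they are not data from the thesis.
generated = "the method ferments grape juice".split()
reference = "the method ferments red grape juice".split()
print(rouge_n(generated, reference, n=1))
print(rouge_n(generated, reference, n=2))
```

The embedding-based score needs no reference summary: it compares the generated summary with the important sentences extracted from the patent itself, which is the motivation the abstract gives for complementing ROUGE.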

Keywords/Search Tags:

Automatic summary of patent texts, TextRank-Attr algorithm, Seq2Seq, Semi-supervised learning


Related items

1	Applied Research On Employment Of University Students Based On Semi-supervised Learning
2	Semi-supervised Clustering Algorithm Based On Single Linkage Clustering
3	Semi-supervised Classification Research Based On Self-paced Learning And Sparse Self-expression
4	Semi-Supervised Chinese Text Classification Based On Selective Integration
5	Research On Automatic Evaluation Model And Algorithm Based On MOOC Video Subtitles And Learning Data
6	A Study On Risk Identification Of P2P Lending Platform Based On Semi-supervised Learning
7	Construction And Application Of Corpus For Primary School Mathematics Learners Based On NLP
8	Research On Application Of Employment Guidance For College Graduates Based On Improved Semi-Supervised Self-Training Method
9	Research And Implementation Of Text Summarization Based On Attention Mechanism
10	Design And Implementation Of Automatic Course Arrangement System