| Patent data is highly specialized text that covers technologies and applications in many fields. Enterprises and researchers usually study patent data to identify the hot technologies and development status of a research field and to provide suggestions for the development of the industry. At present, research on patent data for Chinese herbal medicines is relatively weak. Existing work mainly analyzes the structured fields of patent data by statistical counting and applies cluster analysis to the text data. The results of these mathematical and scientific methods are relatively simple and cannot provide accurate and effective suggestions for industrial development. With the rapid development of natural language analysis methods, topic models can be used to analyze and mine the patent data of Chinese herbal medicines, and text semantics can be used to derive the technical hotspots and development trends of the field, which can provide ideas for industrial development and for technology research and development. Based on the research status of patent documents related to Chinese herbal medicines, this paper carries out topic mining and application research on Chinese herbal medicine patent texts using a topic model. The main work is as follows: (1) A preprocessing method for Chinese herbal medicine patent abstract text is studied. It addresses the problem that word segmentation does not consider the professional vocabulary of the Chinese medicinal materials field, so that the characteristics of Chinese herbal medicine patent texts cannot be fully reflected. (2) Based on the abstract text of Chinese herbal medicine patent data, a text mining framework based on the LDA topic model is constructed. A topic analysis method based on hot topics and topic intensity is proposed, and industrial suggestions are made based on the analysis results. (3) A text classifier based on LDA-SVM is constructed for Chinese herbal medicines. It automatically classifies Chinese herbal medicine patent texts and addresses the problem of inaccurate IPC classification numbers in patent topic analysis. (4) A patent topic analysis system for Chinese herbal medicines is implemented. Its functional modules include a preprocessing module, a topic classification module, a topic analysis module and a basic function module. The system's analysis process was constructed and tested on Sanqi and Tianma patent data. The research shows that preprocessing the patent texts of Chinese herbal medicines highlights their characteristics. Combining the time series of Chinese herbal medicine patents with the topics of the patent texts shows that the number of Chinese herbal medicine patents has decreased in recent years; the reasons are considered, and industrial development suggestions are proposed according to the development trends of the patent topic areas. The topic analysis system improves the efficiency and effectiveness of topic analysis for Chinese medicinal materials and intuitively presents the topic development trends of Sanqi and Tianma. This paper comprehensively extracts the information in the current patent texts of major Chinese herbal medicines and provides decision support for the development of the Chinese herbal medicine industry. |
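
The topic intensity analysis mentioned in item (2) can be illustrated with a minimal sketch. It assumes that topic intensity is the mean share of a topic over the patents filed in a given year, which is a common definition but is not confirmed by the text above. The function names, the two-topic distributions and the years are hypothetical and chosen only for illustration; in practice the distributions would come from a fitted LDA model.

```python
from collections import defaultdict


def topic_intensity_by_year(doc_topics, years):
    """Average each topic's share over the documents filed in each year.

    Assumed definition: intensity of topic k in year t is the mean of the
    document-topic proportions for topic k over all documents from year t.
    """
    if len(doc_topics) != len(years):
        raise ValueError("doc_topics and years must have the same length")
    sums = {}
    counts = defaultdict(int)
    for theta, year in zip(doc_topics, years):
        if year not in sums:
            sums[year] = [0.0] * len(theta)
        for k, share in enumerate(theta):
            sums[year][k] += share
        counts[year] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sorted(sums)}


def hot_topic(intensity):
    """Return, for each year, the index of the topic with the highest intensity."""
    return {y: max(range(len(v)), key=v.__getitem__) for y, v in intensity.items()}


# Hypothetical document-topic distributions (each row sums to 1).
doc_topics = [
    [0.75, 0.25],
    [0.25, 0.75],
    [0.5, 0.5],
    [0.0, 1.0],
]
years = [2015, 2015, 2016, 2016]

intensity = topic_intensity_by_year(doc_topics, years)
hot = hot_topic(intensity)
```

Plotting each topic's intensity against the year gives the kind of trend curve that the abstract describes for Sanqi and Tianma; a rising curve would mark a topic as gaining attention under this assumed definition.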