
Forewarning Technology Of Chinese Patents Based On Burst Words Detection

Posted on: 2017-06-10
Degree: Master
Type: Thesis
Country: China
Candidate: W J Ma
Full Text: PDF
GTID: 2349330503492912
Subject: Computer technology
Abstract/Summary:
Patent information is a first-choice resource for competitive intelligence because it contains a large volume of technical information and knowledge. By fully mining and exploiting patent information, we can understand technology trends, technology branches, and the relations among technologies. It also helps in finding new technical fields, directions, and means. Establishing a patent forewarning mechanism, on the one hand, enables enterprises to gain the initiative in fierce market competition, to respond calmly to challenges from competitors, to avoid patent infringement, and to protect their intellectual property. On the other hand, it helps discover hot or emerging technologies in an industry in a timely manner, understand industry trends better, and provide a reference for strategic business development.
In this paper, patent data in the new energy vehicle field is the research object. Applying text mining and information extraction theory, we implemented an algorithm for burst word extraction and important patent forewarning on patents in this field. We also proposed and built a distributed patent information collection system suited to large-scale data acquisition, and implemented a Chinese patent forewarning system based on burst word detection that incorporates this work.
On the engineering side, the system draws on advanced engineering experience from large Internet companies in China and abroad. In its architectural design, the system takes into account practical issues such as high availability, high concurrency, and large-scale data processing.
Overall, the system is divided into four parts: a patent information collection subsystem, a burst word extraction subsystem, an important patent forewarning subsystem, and a competitor analysis subsystem.
(1) The patent information collection subsystem fetches the latest patent data from specified data sources, applies structured data parsing and information extraction to the fetched data, and transfers the processed data to the patent knowledge database in the master system. This subsystem makes substantial improvements on the engineering side: distributed deployment and HTTP proxy clusters make it better able to meet the demands of massive data processing.
(2) The burst word extraction subsystem applies the algorithm proposed in this paper to extract burst words from patent data and displays them to users.
(3) The important patent forewarning subsystem builds on burst word extraction. When the user designates a burst word or phrase, the subsystem recommends a set of patents closely related to the field that the burst word or phrase represents.
(4) The competitor analysis subsystem analyzes forewarning indicators for competitors in the Chinese patent forewarning field, including comprehensive competitiveness analysis, inventor partnership analysis, and analysis of trends in the number of patents.
Experiments on real datasets show that the proposed patent forewarning system based on burst word detection can greatly reduce wasted labor costs and improve the efficiency of patent information mining.
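The abstract does not specify the burst word algorithm or the recommendation method, so the following is only a minimal sketch of how the steps in (2) and (3) might fit together. It assumes a smoothed log frequency ratio as the burst score and a simple relative-frequency ranking for recommendation; neither is claimed to be the thesis's method. The function names, the `min_count` threshold, the sample tokens, and the patent identifiers are all hypothetical, and the input is assumed to be already segmented into tokens.

```python
import math
from collections import Counter


def burst_scores(docs_by_period, target, min_count=2):
    """Score terms by how over-represented they are in the target period.

    docs_by_period maps a period label to a list of token lists.
    The score is a log ratio of add-one-smoothed term probabilities
    (an assumed stand-in, not the algorithm from the thesis).
    """
    target_counts = Counter(t for doc in docs_by_period[target] for t in doc)
    base_counts = Counter(
        t
        for period, docs in docs_by_period.items()
        if period != target
        for doc in docs
        for t in doc
    )
    target_total = sum(target_counts.values())
    base_total = sum(base_counts.values())
    vocab = len(set(target_counts) | set(base_counts))

    scores = {}
    for term, count in target_counts.items():
        if count < min_count:
            continue  # skip terms too rare to call a burst
        p_target = (count + 1) / (target_total + vocab)
        p_base = (base_counts[term] + 1) / (base_total + vocab)
        scores[term] = math.log(p_target / p_base)
    return scores


def recommend_patents(patents, burst_word, top_k=3):
    """Rank patents by the relative frequency of the burst word.

    patents maps a (hypothetical) patent id to its token list.
    """
    ranked = []
    for patent_id, tokens in patents.items():
        if not tokens:
            continue
        freq = tokens.count(burst_word) / len(tokens)
        if freq > 0:
            ranked.append((freq, patent_id))
    # Highest frequency first; ties broken by patent id for determinism.
    ranked.sort(key=lambda item: (-item[0], item[1]))
    return [patent_id for _, patent_id in ranked[:top_k]]


# Toy, made-up data for illustration only.
docs_by_period = {
    "2014": [["battery", "motor"], ["battery", "charging"]],
    "2015": [["battery", "motor"], ["motor", "control"]],
    "2016": [
        ["battery", "wireless", "charging"],
        ["wireless", "charging", "coil"],
        ["wireless", "motor"],
    ],
}
patents = {
    "CN-A": ["wireless", "charging", "coil"],
    "CN-B": ["battery", "motor"],
    "CN-C": ["wireless", "motor", "control", "unit"],
}

scores = burst_scores(docs_by_period, target="2016")
top_word = max(scores, key=scores.get)
related = recommend_patents(patents, top_word)
```

In this toy data, "wireless" appears only in the 2016 documents, so it receives the highest score and the two patents that mention it are returned. A production system would likely replace both functions with more robust models and would add Chinese word segmentation before counting.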
Keywords/Search Tags:Patent Data, Text Mining, Natural Language Processing, Burst Word Detection, Patent Forewarning