Research On Patent Ontology Construction Oriented To Complex Semantic Relations | Posted on:2015-01-10 | Degree:Doctor | Type:Dissertation | Country:China | Candidate:R R Li | GTID:1108330467475138 | Subject:Computer software and theory | Abstract/Summary:

Patent data covers 95% of the world's advanced technology inventions. The quantity and quality of patents have become indicators of the competitiveness of an enterprise, an industry, and even a country's economy. Effective use of patent data can help enterprises save money and time when making research and development (R&D) decisions and avoid duplicated innovation. Patent data is now growing rapidly as awareness of intellectual property rights increases. In particular, there are often many patents relevant to the same technology, and they usually share similar details, so that they form a technology group.

Current patent analysis methods are usually based on statistical information about keywords: each patent document is modeled as a vector whose components are the weights of the keywords it contains. The similarities among the implementation principles and technical details of patents are calculated from these vectors, while semantic relations among keywords are not considered in the vector model. It is very common for different patents in a group to be described with a bag of varied technical keywords that have the same or similar meanings in a certain technical area. If the semantic relations among keywords are considered in patent analysis, better performance can be achieved.

Existing patent analysis methods, which mainly apply statistical analysis of technical keywords to calculate the similarity among patents under the vector space model, do not consider semantic relations among keywords.
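The keyword-weight vector comparison described above can be sketched as follows. This is a minimal illustration, not the method of any particular system: the raw term-frequency weighting, the function names `keyword_vector` and `cosine_similarity`, and the three keyword lists are assumptions made for the example. It compares one pair of patents that share literal keywords and one pair that uses different words for possibly similar components.

```python
import math
from collections import Counter


def keyword_vector(keywords):
    """Map a list of keywords to a sparse term-frequency vector (keyword -> weight)."""
    return Counter(keywords)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse keyword-weight vectors."""
    dot = sum(weight * b[key] for key, weight in a.items() if key in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical keyword lists, invented for illustration only.
patent_a = keyword_vector(["gear", "shaft", "bearing"])
patent_b = keyword_vector(["gear", "shaft", "housing"])
# Different words that a domain expert might treat as related to A's terms.
patent_c = keyword_vector(["cogwheel", "axle", "bushing"])

sim_ab = cosine_similarity(patent_a, patent_b)  # two of three keywords match literally
sim_ac = cosine_similarity(patent_a, patent_c)  # no literal keyword overlap
print(sim_ab, sim_ac)
```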
However, different technical keywords with identical or similar meanings in different patents within a cluster cannot be identified by the aforementioned model. A better result can therefore be achieved if semantic information is considered and used in the analysis process.

In this work, we study the extraction of information from patents, which contain abundant semantic relations, and the entity-based organization and management of that information, with the aim of deeper patent analysis based on the structural semantic information of patents.

Natural language understanding technologies offer a way to analyze and obtain the semantic relations contained in patent documents. Data extraction technologies that have been investigated and applied in fields such as economics, biology, and chemistry make it possible to obtain information about the objects described in documents. However, those technologies are not applicable either to extracting entity relations from Chinese patent documents or to organizing and managing those entity relations. Several problems remain in acquiring patent structural information and applying it to patent analysis:

(1) A patent document contains a wealth of entity relations related to its structure. When modeling the patent ontology, it is necessary to classify the concepts in patents and the relations among them, so as to reflect the differences and characteristics of the semantic relations mentioned in patent documents as fully and effectively as possible.

(2) A patent document illustrates the physical connections among components and the dynamic processes among components and objects. The textual expression is flexible, however, and the sentence structure is complicated. In addition, a large number of entities and relations are expressed with new technical terms unique to individual patents. An entity relation may be contained in a phrase, in a sentence, or across sentences.
All these factors should be considered in approaches to entity relation extraction from Chinese patent documents.

(3) Patent analysis based on a structural ontology should take into account the impact that the sets of semantic relations among entities in patents have on the analysis result. This process can be very complicated.

On the other hand, each patent document strictly follows written specifications. Moreover, a patent document must be reviewed and revised repeatedly before it can be published, so its quality is high. Although the new technologies described in patent documents vary across technical areas, the ways in which they are described share many common characteristics:

① New technical words built on basic terminology head-words are introduced in patent documents;
② Descriptions of the construction of patented technologies follow a few kinds of spatio-temporal sequence;
③ Entity relations related to processes and changes are included in the description of the new technology.

Taking these features of patent documents into account, it is meaningful to solve the problems of extracting relations from patent documents, which would benefit in-depth semantic analysis and provide quality data for mining the domain knowledge of technical patents.
Based on this idea, we study how to model a patented technology ontology and extract data from patent documents effectively, and we propose a method for analyzing patents using ontology knowledge.

Methods for building the patent structure ontology are studied with the novelty of the technology and the high writing quality of patent descriptions in mind. Our work mainly includes the following:

(1) Modeling the concepts and semantic relations related to technology structure

Motivated by the observation that entity relation instances are the most intuitive reflection of ontology concepts and the relations among them, a method for analyzing and mining relation instances is proposed. It consists of: obtaining a multi-level hierarchical classification of semantic relations by hierarchical clustering; labeling the patent structural graph with relation types and mining frequent patterns from the relational structural graph based on the classification; analyzing the common features of the different types of relations among patent entities and then deciding on the patent ontology classes and relational patterns; and defining inference rules over the classes and relations of the ontology so that implied semantic relations among entities can be obtained from the extracted relation instances. The experiment shows that the proposed modeling approach can reduce the cost of modeling the concepts and relations in a patent ontology. The schema information obtained by the ontology modeling method can organize and manage patent structure knowledge effectively in any specific technical field.

(2) Structured patent data acquisition with a self-learning method

In this work, a method is proposed to extract entity relations from patent documents by taking advantage of pattern features that reflect the written specifications of patent documents at all levels of textual segments.
In the text pre-processing stage, it forms multi-level pattern rules through statistical learning of various pattern features, such as word occurrences, phrase constructions, and sentence expressions, which come from entity relation instances in the corresponding text segments. Then, it takes a few entity relation instances as seeds and constructs an initial template based on the semantic characteristics of the seed relations, in order to extract more multi-entity relations from the corpus by self-learning. Finally, it obtains implicit entity relations across sentences by analyzing the details of textual data segmentation.

(3) Typical application of structured patent data

As a typical demonstrative application of the patent ontology, a method based on the idea of a greedy algorithm is proposed for the comparative analysis of patented technology structures composed of related semantic relations. A further method is proposed to calculate the similarity of patents based on the set of common sub-graphs obtained bottom-up. The patents are clustered according to structural similarity, and the similarity of the patentees' technical strengths is then analyzed. The results verify that patent structural knowledge can effectively improve the accuracy of patent analysis.

(4) Implementation of patent ontology construction and application

The construction process of the patent ontology is implemented: the patent schema obtained from instance mining is built with an ontology tool; information such as the feature words that reflect relations and the sentence patterns is extracted from documents; and instances are extracted from documents. A new patent analysis application is proposed that recommends potential collaborators based on the similarity between patentees.

Keywords/Search Tags: Patent Structure, Entity Relation, Ontology, Data Extraction, Collaborators Recommendation