Automatic Construction Of Ontology Based On Document Retrieval And Semantics Identification In The Oil Field

Posted on: 2018-11-07
Degree: Master
Type: Thesis
Country: China
Candidate: X N Zhang
GTID: 2381330596468737
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of computer science and technology, oil and gas exploration in oil fields is becoming increasingly digital and intelligent. Traditional oil production patterns are undergoing enormous changes, and intelligent production is becoming increasingly common. To realize intelligent production, we must overcome the challenge posed by the huge body of oil-field knowledge. The most classic and widely used method of knowledge representation is the ontology, which is built by ontology learning from existing information sources such as text files. Domain ontology construction currently faces many problems, such as independently developed systems, inconsistent data coding rules, and duplicated development of system software. To address these problems, we propose a method for constructing an oil-field domain ontology based on document retrieval and semantic relation identification.

The first task in ontology construction is to retrieve oil-field documents. After analyzing the advantages and disadvantages of traditional web crawlers, we combine a focused web crawler with an incremental web crawler to crawl the web, and we introduce a crawl queue to avoid duplication effectively. We build domain corpora of different scales as the information source for concept extraction. After analyzing the statistical method based on TF-IDF and the method based on linguistics, we design a method that combines the two and use it to extract concepts from different numbers of documents. The combined method is shown to be more accurate in concept extraction.

Semantic relations between concepts are identified in two parts: taxonomic relations and non-taxonomic relations. Taxonomic relations are identified by hierarchical clustering: a similarity matrix is computed, and parent and child clusters are determined by computing global similarity. Non-taxonomic relations are identified with association rules: the support and confidence of concepts are obtained, and the linking verbs between concepts are determined by computing mutual information.

After analyzing existing ontology learning tools, we build the domain ontology automatically from the extracted concepts and the relations between them, using a probabilistic ontology model and a data-driven approach. We export the result as OWL files and import them into the Protégé platform to visualize the domain ontology.
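As an illustration of the association-rule step mentioned in the abstract, the following is a minimal sketch of how support and confidence could be computed for pairs of concepts. It is not the thesis's implementation: the function name `concept_pair_rules`, the thresholds, and the toy data are assumptions made for this example, and each "transaction" is assumed to be the set of concepts found in one sentence.

```python
from collections import Counter
from itertools import combinations


def concept_pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for concept pairs.

    support(a, b)  = fraction of transactions containing both a and b
    confidence(a -> b) = count(a and b) / count(a)
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for concepts in transactions:
        unique = sorted(set(concepts))
        item_counts.update(unique)
        # Sorting first gives every pair one canonical (a, b) ordering.
        pair_counts.update(combinations(unique, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Confidence is directional, so test both a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)


# Hypothetical toy data: each set holds the concepts found in one sentence.
sentences = [
    {"reservoir", "well", "pump"},
    {"reservoir", "well"},
    {"well", "pump"},
    {"reservoir", "porosity"},
]

rules = concept_pair_rules(sentences)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

Concept pairs that pass both thresholds are candidates for a non-taxonomic relation. Choosing the linking verb between them by mutual information, as the abstract describes, would be a separate step that this sketch does not cover.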
Keywords: domain ontology, concept extraction, semantic relation identification, automatic construction