
Research And Implementation Of Core Algorithms In Knowledge Graph For Petroleum Exploration And Development

Posted on: 2017-09-24
Degree: Master
Type: Thesis
Country: China
Candidate: J N Wang
GTID: 2321330563950527
Subject: Computer Science and Technology
Abstract/Summary:
Large numbers of research reports are produced in the process of petroleum exploration and development. Traditional systems, such as information management systems based on relational databases and keyword-based information retrieval systems, cannot analyze, organize, and utilize the knowledge in these reports effectively. A knowledge graph can extract knowledge items, link them into a network, and query them using up-to-date technologies such as Machine Learning, Natural Language Processing, and the Semantic Web, so it can effectively address problems such as the query and reuse of knowledge. Because a knowledge graph depends on domain knowledge, this thesis focuses on the core algorithms for constructing a knowledge graph in the domain of petroleum exploration and development. It mainly studies the core algorithms in Topic Model, Named Entity Recognition, and Relation Extraction. The main contents of this thesis are as follows.

(1) In the process of building the knowledge graph, the number of topics is hard to determine and the quality of the topics is not good enough, because the domain is brand new. To deal with these problems, an interactive topic model is proposed that determines the number of topics dynamically and improves the quality of the topics by introducing user supervision. The performance of the interaction is also analyzed and discussed.

(2) In the process of extracting named entities for the knowledge graph, the algorithm does not take full advantage of the characteristics of the petroleum exploration and development domain. To deal with this problem, an algorithm based on distributed word representations and a neural network is proposed. In addition, unlabeled data is used to improve the algorithm by initializing the training data and the word embedding matrix.

(3) Most algorithms for distant supervision are feature-based, so they often encounter problems such as key features that are not distinct and data that is linearly inseparable. A pattern-based vector is designed to overcome these problems, and by introducing that vector, a pattern-based algorithm for distantly supervised relation extraction is proposed. The experimental results show that the algorithm improves the precision of distant supervision for relation extraction.
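
For readers unfamiliar with distant supervision, the following is a minimal sketch of the general idea only, not the algorithm proposed in the thesis. It assumes a small seed knowledge base of entity pairs and labels every sentence that mentions both entities of a pair with that pair's relation. The triples, the sentences, and the helper names `distant_label` and `between_pattern` are hypothetical, and the "pattern" here is simply the text between the two mentions, which is much cruder than the pattern-based vector the abstract describes.

```python
from typing import Dict, List, Tuple

# Hypothetical seed knowledge base: (head entity, tail entity) -> relation.
# These triples are invented for illustration; they are not from the thesis.
KB: Dict[Tuple[str, str], str] = {
    ("Well A", "Block X"): "located_in",
    ("Reservoir R", "Formation F"): "part_of",
}


def distant_label(
    sentences: List[str], kb: Dict[Tuple[str, str], str]
) -> List[Tuple[str, str, str, str]]:
    """Label a sentence with a relation whenever it mentions both entities
    of a knowledge-base pair (the basic distant supervision assumption)."""
    labeled = []
    for sentence in sentences:
        for (head, tail), relation in kb.items():
            if head in sentence and tail in sentence:
                labeled.append((sentence, head, tail, relation))
    return labeled


def between_pattern(sentence: str, head: str, tail: str) -> str:
    """Return the text between the two entity mentions as a crude surface
    pattern (an assumption of this sketch, not the thesis's vector)."""
    i, j = sentence.find(head), sentence.find(tail)
    if i < 0 or j < 0:
        return ""
    if i < j:
        return sentence[i + len(head):j].strip()
    return sentence[j + len(tail):i].strip()


sentences = [
    "Well A was drilled in Block X in 2010.",
    "Well A produced oil.",
    "Block X contains Well A.",
]
labeled = distant_label(sentences, KB)
patterns = [between_pattern(s, h, t) for s, h, t, _ in labeled]
```

Both the first and third sentences receive the `located_in` label because co-occurrence alone triggers it. Labels assigned this way are noisy, which is why precision is a natural thing to measure for distantly supervised relation extraction.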
Keywords/Search Tags: Knowledge Graph, Topic Model, Named Entity Recognition, Relation Extraction
Related items