Research On Construction Technology Of Knowledge Graph Based On Agricultural Thesaurus

Posted on: 2020-02-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: B Qiao
Full Text: PDF
GTID: 1363330620981009
Subject: Land Resource Science
Abstract/Summary:
The knowledge graph, a core technology of AI, has developed rapidly since Google introduced it in 2012 to improve search quality. Because it improves both the quality of search engine results and the accuracy of question-answering (Q/A) systems, it is widely used in areas such as intelligent search, intelligent Q/A, and personalized recommendation. Researchers in China and abroad have studied knowledge graph construction in their own fields, covering the system building, acquisition, integration, storage, reasoning, and application of knowledge, and have achieved certain results. This thesis focuses on constructing an agricultural knowledge graph. It draws on the Thesaurus of Agricultural Science, together with theories and methods such as recurrent neural networks, the conditional random field model, ensemble learning, joint entity-relation extraction models, and the BERT model, to study schema building and knowledge acquisition. The main work of the thesis is as follows:
(i) A thesaurus-based method for constructing the agricultural knowledge graph is studied. Most knowledge graphs are currently built from Wikipedia, Baidu's encyclopedia, and other public resources, from which conceptual ontologies, entities, and relationships are extracted. Because agricultural knowledge is scarce in these public resources, the thesis proposes a thesaurus-based method for constructing the agricultural knowledge graph. It discusses the rules for automatically converting the thesaurus into the schema layer and into the data layer of the knowledge graph, and automates the conversion of the Thesaurus of Agricultural Science into the agricultural knowledge graph. The result is a preliminary agricultural knowledge graph containing more than 60,000 agricultural thesaurus entities and more than 210,000 triples composed of thesaurus entities and their relationships. The results show that building the agricultural knowledge graph on the thesaurus is feasible and reliable. The method offers a new direction for constructing agricultural knowledge graphs and lays a high-quality data foundation for extending them.
(ii) An ensemble learning-based agricultural entity recognition model is studied. Entity recognition models commonly take truncated sentences as input, which easily loses contextual information across sentences. To address this issue, the thesis proposes an ensemble learning-based agricultural entity recognition model, ELER. An agricultural entity recognition dataset, AgriNER2018, is constructed to train the ELER model, with three entity types annotated: sediment, soil-forming, and topsoil. The dataset is divided into a training set of 1528 sentences, 71,736 tokens, and 1229 entities and a test set of 231 sentences, 10,242 tokens, and 127 entities. Compared with the BILSTM-CRF model, ELER improves accuracy and F1 by 2.32% and 2.92%, respectively, on AgriNER2018, and by 1.37% and 0.7%, respectively, on the standard CoNLL2003 dataset. The results show that the ELER model effectively improves entity recognition and that the improvement is larger on AgriNER2018, which shows that the model can be applied to specific agricultural fields that lack datasets.
(iii) A joint extraction model of agricultural entity relations based on BERT pre-training is studied. Joint entity-relation extraction models currently rely mostly on Word2vec to train word vectors, but Word2vec cannot model polysemous words. To address this problem, the thesis proposes a BERT pre-training-based joint extraction model of agricultural entity relations, BERT-BILSTM-LSTM. An agricultural entity relation extraction dataset, AgriRelation2018, is constructed to train the model; it annotates fruits, geographical locations, and the origin relationship between them. The dataset is divided into a training set of 1348 sentences and 1161 relational entity triples and a test set of 187 sentences and 133 relational entity triples. Compared with the LSTM-LSTM-Bias model, the BERT-BILSTM-LSTM model improves F1 by 2.8% on AgriRelation2018 and by 3.3% on the standard NYT dataset. The results show that the model overcomes the polysemy problem and can basically meet the needs of relation extraction in agricultural fields.
(iv) Agricultural knowledge graphs and related applications are designed and implemented. Building on the work above, the thesis constructs agricultural knowledge graphs and applications that provide thesaurus query, entity recognition, relation extraction, entity query, and relation query. Running these applications verifies the effectiveness of the research methods, models, and algorithms described above.
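The thesaurus-to-triples conversion described in (i) can be illustrated with a small Python sketch. The abstract does not state the actual conversion rules, so everything here is an assumption made for illustration: the input is taken to use the conventional thesaurus relation codes BT (broader term), NT (narrower term), RT (related term), and UF (used for), and the predicate names `subClassOf`, `relatedTo`, and `hasAlias`, the function `thesaurus_to_triples`, and the toy entries are all hypothetical and not taken from the thesis or from the Thesaurus of Agricultural Science.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["head", "relation", "tail"])

# Hypothetical mapping (an assumption, not the thesis's rules):
# relation code -> (graph predicate, whether head and tail are swapped).
RELATION_MAP = {
    "BT": ("subClassOf", False),  # term is narrower than its broader term
    "NT": ("subClassOf", True),   # narrower term: store as the inverse of BT
    "RT": ("relatedTo", False),   # associative relation
    "UF": ("hasAlias", False),    # non-preferred synonym of the term
}


def thesaurus_to_triples(entries):
    """Convert {term: {code: [targets]}} into a sorted, de-duplicated triple list."""
    triples = set()
    for term, relations in entries.items():
        for code, targets in relations.items():
            if code not in RELATION_MAP:
                continue  # relation codes outside this sketch are ignored
            predicate, swapped = RELATION_MAP[code]
            for target in targets:
                head, tail = (target, term) if swapped else (term, target)
                triples.add(Triple(head, predicate, tail))
    return sorted(triples)


# Toy entries invented for this example.
toy_entries = {
    "soil": {"NT": ["topsoil"], "RT": ["sediment"]},
    "topsoil": {"BT": ["soil"], "UF": ["surface soil"]},
}

for triple in thesaurus_to_triples(toy_entries):
    print(triple.head, triple.relation, triple.tail)
```

Normalizing NT into the inverse of BT means a hierarchy link recorded from both sides of the thesaurus produces a single triple. In this sketch the hierarchical triples would feed a schema layer, while alias and associative triples would feed a data layer. That split is also an assumption.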
Keywords/Search Tags: agricultural knowledge graph, construction method, agricultural science thesaurus, entity recognition, joint extraction