Research on the Construction and Application of the Plant Knowledge Graph PlantKG

Posted on: 2022-02-25 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Q Wang
GTID: 2480306527470464 | Subject: Computer Science and Technology
Abstract/Summary:
As a branch of artificial intelligence, knowledge graphs have important application value in intelligent question answering, expert systems, recommendation systems and other scenarios. Typical general-purpose knowledge graphs include Google Freebase, DBpedia and Baidu Zhixin. Among domain knowledge graphs, there are Bio2RDF in the life sciences and the TCM knowledge graph in medicine. In the plant domain, Beijing Forestry University has built a plant knowledge graph, but its data sources and scale are too limited to support knowledge graph applications well. In view of the diversity of plant knowledge and the existing problems in plant knowledge graph construction, this thesis studies the systematic construction of the plant knowledge graph PlantKG from multiple professional plant knowledge sources, covering the construction method, named entity recognition, knowledge fusion and knowledge graph based question answering. The main research work is as follows:

(1) Research on the construction method of the plant knowledge graph. The concept layer of PlantKG was constructed top-down. The instance layer was then constructed by extracting plant knowledge from semi-structured and unstructured data in Flora of China, Plant Network, Interactive Encyclopedia and Wikipedia, using web crawlers and deep learning methods. The schema of the concept layer was imported into the graph database, and the instance-layer data were then mapped onto the concept layer to build PlantKG.

(2) Research on a named entity recognition model for diseases in medicinal plant text, based on BERT + BiLSTM + CRF combined with an attention mechanism. To obtain plant knowledge from unstructured data, a disease named entity recognition method, BAC, based on a BERT + BiLSTM + ATT + CRF model was proposed to address the semantic sparsity of long sequences in medicinal plant text. The model adds an attention layer (ATT) to the combination of a bidirectional long short-term memory network (BiLSTM) and a conditional random field (CRF). The experimental data set was built by preprocessing and semi-automatically annotating medicinal plant texts. The experimental results show that the BAC method outperforms the traditional method in disease named entity recognition. The trained model was used to extract disease entities from medicinal plant text and match them with plant names to obtain triples.

(3) Research on plant knowledge fusion and knowledge Q&A. Plant knowledge from the different sources was fused and stored in the graph database Neo4j. The resulting PlantKG contains more than 67,000 entities and more than 608,000 entity relationships and attributes. Template-based plant knowledge queries over PlantKG returned correct results (at least in part), which indicates that the knowledge graph is valid.

The constructed PlantKG has been shared on GitHub. It provides users with knowledge retrieval for medicinal plants and related knowledge. In the future, more plant domain knowledge can be integrated to support knowledge retrieval, knowledge reasoning and other applications in the plant domain.
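The step from recognized disease entities to triples, and the template-based query over those triples, can be illustrated with a minimal Python sketch. It is not the thesis implementation: the BIO tag label `DIS`, the relation name `treats`, the question template, and the sample tokens and plant name `PlantA` are all assumptions made for illustration, and a plain list of tuples stands in for the graph database.

```python
import re
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def decode_bio(tokens: List[str], tags: List[str], label: str = "DIS") -> List[str]:
    """Collect entity strings from BIO tags (B-DIS / I-DIS / O).

    The tag scheme and the label name are assumptions, not taken from the thesis.
    """
    entities: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == f"B-{label}":
            if current:  # close the previous entity before starting a new one
                entities.append(" ".join(current))
            current = [token]
        elif tag == f"I-{label}" and current:
            current.append(token)
        else:
            if current:
                entities.append(" ".join(current))
            current = []
    if current:  # flush an entity that ends the sequence
        entities.append(" ".join(current))
    return entities


def build_triples(plant: str, diseases: List[str], relation: str = "treats") -> List[Triple]:
    """Pair one plant name with each extracted disease entity."""
    return [(plant, relation, disease) for disease in diseases]


def answer(question: str, triples: List[Triple]) -> List[str]:
    """Answer a single hypothetical question template by matching triples."""
    match = re.fullmatch(r"What diseases does (.+) treat\?", question.strip())
    if not match:
        return []
    plant = match.group(1)
    return [obj for subj, rel, obj in triples if subj == plant and rel == "treats"]


# Made-up sample sentence and tags, standing in for the output of a trained tagger.
tokens = ["relieves", "sore", "throat", "and", "cough"]
tags = ["O", "B-DIS", "I-DIS", "O", "B-DIS"]

diseases = decode_bio(tokens, tags)
triples = build_triples("PlantA", diseases)
results = answer("What diseases does PlantA treat?", triples)
```

In a graph database setting, each triple would become a relationship between a plant node and a disease node, and each question template would map to a parameterized query rather than a list comprehension.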
Keywords/Search Tags: knowledge graph, BERT, attention mechanism, bidirectional long short-term memory network (BiLSTM), conditional random field (CRF), knowledge Q&A