Research And Implementation Of Knowledge Fusion Method For Medical Field

Posted on: 2022-07-26
Degree: Master
Type: Thesis
Country: China
Candidate: M Wu
Full Text: PDF
GTID: 2494306740962689
Subject: Software engineering
Abstract/Summary:
With the continuous advancement of Web technology, the volume of data is growing explosively and its forms are becoming increasingly diverse. The knowledge graph offers a new approach to data storage and rapid retrieval: massive data is processed and stored in a knowledge base so that knowledge can be managed and applied efficiently. Medical care is one of the fields in which knowledge graphs are most widely applied. Transforming massive medical data into natural language that people can understand, and fusing medical knowledge from different sources to build an accurate and complete medical knowledge base, are important foundations for realizing "smart healthcare". Against this background, this thesis constructs a medical knowledge graph, supplements it through knowledge fusion, and on this basis provides a disease knowledge question-answering service. The main work covers the following four aspects:

1. Drawing on expertise from the medical field, the ontology of the medical knowledge graph is built and the schema (pattern) layer of the knowledge graph is defined. Crawling strategies are set for each of the major medical websites, their disease-related content is crawled with the Python 3-based Scrapy framework, and a medical corpus about diseases is constructed.

2. A named entity recognition model, Ng-BAC (Attention-based Bi-LSTM-CRF with Ngram), is designed and implemented. First, an N-gram algorithm is used to discover new words in the medical corpus and build a medical external dictionary. Next, the potential word information is integrated into the Bi-LSTM as an extension, and a gate structure dynamically routes information from different paths to each character. An attention mechanism is then introduced to assign different weights to the outputs of the Bi-LSTM layer and improve the accuracy of the output. Finally, a CRF layer labels the sequence. The hyperparameters of the model are selected through parameter comparison experiments, and the model's advantage is demonstrated through model comparison experiments.

3. An embedding-based entity alignment model, RD-HRGCNs (Relation-aware Dual RGCNs with Highway gates), is designed and implemented. First, the dual relation graph is constructed from the input primal entity graph. Next, the vertex representations of the dual relation graph and the primal entity graph are obtained iteratively through a graph attention mechanism, so that the two graphs can interact fully. The relation-aware entity representations in the primal entity graph are then fed into two-layer RGCNs with highway gates to further capture neighboring structure information. Finally, the resulting entity representations are used to calculate the distance between entities and decide whether two entities should be aligned. The individual components of the model are evaluated through ablation experiments, and the model's advantage is validated through model comparison experiments.

4. Using the models proposed in this thesis, knowledge extraction and knowledge fusion are carried out on the crawled medical data to construct the medical knowledge graph, which is visualized with Neo4j. On this basis, a disease knowledge question-answering service is implemented with the Flask framework.
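
The last step of the entity alignment model in item 3, deciding alignment from the distance between entity representations, can be illustrated with a minimal sketch. The abstract does not state the distance metric or the decision rule, so the L1 (Manhattan) distance, the nearest-neighbor choice, the `max_distance` threshold, the function names, and the toy embeddings below are all assumptions made for illustration, not the thesis's implementation.

```python
from typing import Dict, List, Optional, Tuple

Vector = List[float]


def l1_distance(a: Vector, b: Vector) -> float:
    """Manhattan distance between two equal-length embeddings (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return sum(abs(x - y) for x, y in zip(a, b))


def align_entities(
    source: Dict[str, Vector],
    target: Dict[str, Vector],
    max_distance: float,
) -> Dict[str, Optional[Tuple[str, float]]]:
    """Map each source entity to its nearest target entity.

    A pair is kept only if its distance is at most max_distance
    (a hypothetical threshold); otherwise the entity maps to None.
    """
    result: Dict[str, Optional[Tuple[str, float]]] = {}
    for s_name, s_vec in source.items():
        best: Optional[Tuple[str, float]] = None
        for t_name, t_vec in target.items():
            d = l1_distance(s_vec, t_vec)
            if best is None or d < best[1]:
                best = (t_name, d)
        if best is not None and best[1] <= max_distance:
            result[s_name] = best
        else:
            result[s_name] = None
    return result


# Toy 2-dimensional embeddings, invented purely for illustration.
kg_a = {"hypertension": [0.9, 0.1], "influenza": [0.1, 0.8]}
kg_b = {"high blood pressure": [0.88, 0.12], "asthma": [0.5, 0.5]}

matches = align_entities(kg_a, kg_b, max_distance=0.1)
for name, match in matches.items():
    print(name, "->", match[0] if match else "no alignment")
```

In this toy data, "hypertension" lies close to "high blood pressure" and is aligned, while "influenza" has no target within the threshold and is left unaligned. In practice the embeddings would come from the trained model rather than being written by hand.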
Keywords/Search Tags: Knowledge Graph, Knowledge Fusion, Knowledge Extraction, Entity Alignment, Named Entity Recognition