Research On Knowledge Graph Construction Based On Electronic Medical Record

Posted on: 2022-05-14; Degree: Master; Type: Thesis
Country: China; Candidate: G H Yuan; Full Text: PDF
GTID: 2504306509994279; Subject: Computer technology
Abstract/Summary:
The Electronic Medical Record (EMR) is a product of the informatization of medical and health services: a digital record of a patient that is saved, managed, transmitted, and reproduced by electronic equipment (computers, health cards, etc.). EMRs contain a large number of medical facts, but automatic information extraction is difficult because of their complex text structure, specialized vocabulary, and poor readability. With the accumulation of EMRs in China, it is of great significance to use natural language processing to automatically obtain and integrate useful medical information from this mass of records. This thesis uses natural language processing to study the construction of a medical knowledge graph from EMRs. The work has two parts.

(1) This thesis proposes an entity relationship extraction method for EMRs based on position (location) denoising and rich semantics. Entity relationship extraction in EMRs means determining the category of the relationship between given medical entities, and it is a key step in knowledge graph construction. Current approaches have two problems: 1) the generated position vectors contain noise; 2) the semantic representation of words in EMRs is insufficient. To address both, the thesis proposes an entity relationship extraction model based on position denoising and rich semantics. First, an attention score is calculated for each word from the position information and from word vectors trained on a domain corpus. The resulting weight is then combined with word vectors trained on a general-domain corpus, which denoises the position vectors and introduces richer semantics. Finally, the weighted word vectors are fed into a feature extraction model that extracts features and determines the entity relationship type. The method is evaluated on the 2010 i2b2/VA corpus, where it reaches an F1 value of 76.47%. After BERT is introduced, the F1 value reaches 83.05%, the best result on this corpus.

(2) This thesis proposes a framework for constructing knowledge graphs and builds a knowledge graph from Chinese EMRs. Chinese EMRs are semi-structured data that contain many proper nouns and abbreviations, and the texts differ considerably between departments. As a result, most current knowledge graphs based on Chinese EMRs are analyzed and constructed for a single department or a single disease. This thesis defines a construction framework that covers different departments and diseases and that can be extended and integrated to build more fine-grained knowledge graphs. The framework is applied to 15,835 Chinese EMRs, and the resulting knowledge graph contains 31,388 medical entities and 204,900 relationship pairs. In manual evaluation, the accuracy of entities and relationships reaches 92.95% and 84.05%, respectively, which supports the reliability of the knowledge graph.

To sum up, this thesis proposes an entity relationship extraction method and a knowledge graph construction framework for EMRs, constructs a knowledge graph from Chinese EMRs, and implements storage and visualization of the knowledge graph with Neo4j, providing a data basis for downstream tasks such as medical question answering and knowledge search.
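The word-weighting step described in part (1) can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so everything specific here is an assumption made for illustration: the function name `position_weighted_vectors`, the `query` vector, the linear distance penalty `decay`, and the use of a softmax are all hypothetical and are not taken from the thesis. The sketch only shows the general idea of scoring each word from its position and a domain-trained vector, then using that score to reweight a general-domain vector.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def position_weighted_vectors(domain_vecs, general_vecs, entity_positions,
                              query, decay=0.5):
    """Reweight general-domain word vectors by a position-aware score.

    Assumed scoring (not from the thesis): each word's score is the dot
    product of its domain-trained vector with a query vector, minus a
    penalty proportional to its distance from the nearest entity.
    """
    if len(domain_vecs) != len(general_vecs):
        raise ValueError("both vector lists must cover the same tokens")
    scores = []
    for i, vec in enumerate(domain_vecs):
        distance = min(abs(i - p) for p in entity_positions)
        scores.append(dot(vec, query) - decay * distance)
    weights = softmax(scores)
    # Scale each general-domain vector by its word's attention weight.
    weighted = [[w * x for x in vec] for w, vec in zip(weights, general_vecs)]
    return weights, weighted


# Toy sentence of four tokens with one entity at index 1.
domain = [[1.0, 0.0]] * 4
general = [[1.0, 2.0]] * 4
weights, weighted = position_weighted_vectors(domain, general, [1], [1.0, 0.0])
```

In this toy input every token has the same domain vector, so the weights differ only through distance to the entity: the entity token receives the largest weight and the farthest token the smallest. In a real model the weighted vectors would then be passed to the feature extraction model mentioned in the abstract.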
Keywords/Search Tags: Electronic Medical Records, Knowledge Graph, Position Vector Denoising, Rich Semantic