
Research On Medical Semantic Network Construction Method Based On Chinese Electronic Health Records Text

Posted on: 2020-04-01
Degree: Master
Type: Thesis
Country: China
Candidate: J Meng
GTID: 2404330578952485
Subject: Information management
Abstract/Summary:
With the development of medical informatization and computer hardware, electronic medical records have become widespread in China. The volume of electronic medical record data generated every day has therefore grown explosively, but the text data in these records is difficult to structure and to reuse. To find more information in electronic medical records, text mining of these records has become an active research topic, focusing mainly on two tasks: named entity recognition and entity relationship extraction. Text mining of English electronic medical records has already achieved fruitful results, but research on Chinese records is still in its infancy, for two reasons: (1) standardized, unified terminology is lacking, the terminology used in electronic medical record text is not standardized, and mature foreign knowledge bases cannot directly guide text mining of Chinese records; (2) corpora are lacking, in particular public annotated corpora and annotation guidelines. Electronic medical record text is also highly specialized, so non-experts have difficulty identifying the entities and relationships in it, which severely limits research on Chinese electronic medical record text mining.

Based on this, this thesis identifies named entities and extracts entity relationships from Chinese electronic medical record text in order to construct a medical semantic network. The main work includes the following aspects.

First, the thesis analyzes the data structure and language characteristics of electronic medical records and proposes a data cleaning model based on metadata. To address the non-uniformity of electronic medical records, a small corpus for a specific disease was annotated by the author. A CRFs model combined with dictionaries is then used to carry out the multi-category entity recognition task for that disease. This can expand the annotated corpus for electronic medical record entity recognition and lays the foundation for subsequent work such as entity relationship extraction and semantic network construction.

Then, for the entity relationship extraction task, the thesis refers to the semantic network structure of the UMLS to define the relationship types to be extracted. Taking advantage of the strengths of LSTM in relationship extraction from text, the Att_BiLSTM model is transferred to the medical domain to extract sentence-level semantic relationships between entities in electronic medical record text. The experimental results show good performance in recognizing TrCP, TrIP and TrAP, with F values of 0.862, 0.861 and 0.862 respectively. Next, MetaMap, a tool of the UMLS domain knowledge base, is introduced to obtain the corresponding concepts in the UMLS, and these mappings are treated as IS-A relationships. This not only links the work to an international knowledge base, which promotes Chinese medical text mining research, but also further complements the Chinese semantic network of the UMLS.

Finally, a medical semantic network for a specific disease, kidney cancer, is constructed from the Chinese electronic medical records dataset and visualized with the tool Gephi. This semantic network can support further research such as drug recommendation, disease prediction, and intelligent medical question answering systems.
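
To make the final step more concrete, the following is a minimal sketch of how extracted relationships could be assembled into a semantic network and written out as an edge list. It is an illustration under stated assumptions, not the thesis implementation: the triples, the entity names, the function names `build_network` and `to_edge_csv`, and the `Source`/`Target`/`Label` column layout (chosen because Gephi's spreadsheet import can read edge tables of that shape) are all hypothetical. Only the relationship labels TrCP, TrIP, TrAP and IS-A come from the abstract.

```python
import csv
import io
from collections import defaultdict

# Hypothetical (head, relation, tail) triples. The entity names are invented
# for illustration and are not taken from the thesis dataset.
TRIPLES = [
    ("sunitinib", "TrAP", "kidney cancer"),
    ("nephrectomy", "TrIP", "kidney cancer"),
    ("sunitinib", "TrCP", "hypertension"),
    ("kidney cancer", "IS-A", "Neoplastic Process"),
]


def build_network(triples):
    """Return an adjacency map: head entity -> list of (relation, tail)."""
    network = defaultdict(list)
    for head, relation, tail in triples:
        network[head].append((relation, tail))
    return dict(network)


def to_edge_csv(triples):
    """Serialize triples as a CSV edge list with Source, Target, Label columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Label"])
    for head, relation, tail in triples:
        writer.writerow([head, tail, relation])
    return buffer.getvalue()


network = build_network(TRIPLES)
edge_csv = to_edge_csv(TRIPLES)
print(edge_csv)
```

Keeping the network as plain triples means the output of the relationship extraction step and the concept mappings obtained through MetaMap can share one representation before any visualization tool is involved.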
Keywords/Search Tags:Chinese electronic medical records, Named entity recognition, CRFs model, Entity relationship extraction, Bidirectional LSTM model, UMLS