Research On Standardization Of Clinical Terms For Chinese Electronic Medical Record

Posted on: 2022-07-31  Degree: Master  Type: Thesis
Country: China  Candidate: M Y Li  Full Text: PDF
GTID: 2494306560491104  Subject: Computer technology
Abstract/Summary:
Standardizing clinical terminology in electronic medical records means mapping the clinical diagnosis entities that appear in the records to standard entities in a standard knowledge base. The standardized description of clinical diagnostic entities mainly follows the International Classification of Diseases, 10th Revision (ICD-10). Clinical terminology standardization is an important research topic in medical natural language processing and is the basis for subsequent mining and analysis of clinical texts. Research on standardizing English clinical terminology is relatively mature, but comparable work on Chinese is scarce, and some existing standardization tools are slow and inaccurate. The goal of this thesis is to study methods for standardizing disease diagnosis terms in electronic medical records and to improve both the efficiency and the accuracy of standardization.

This thesis frames clinical terminology standardization as a semantic similarity matching task and divides the process into two steps: candidate entity recall and candidate entity disambiguation. The main research work includes:

(1) Candidate entity recall. Based on an analysis of the characteristics of disease diagnosis terms in electronic medical records, candidate entities are generated from string similarity and from text statistical features, and the recall methods are compared experimentally to select an appropriate one. In addition, an Elasticsearch search engine interface is built to generate candidate entity sets simply and quickly.

(2) Candidate entity disambiguation. Deep learning semantic matching models based on BiLSTM and on BERT+Match CNN are built, and the best model is determined through experimental comparison and analysis. The selected model picks the standard entities out of the candidate set to achieve entity disambiguation.

(3) Handling multiple implications in clinical terms. A clinical term has multiple implications when the original term is associated with more than one standard entity. To address this, a method based on a BERT multi-class classification model is studied to predict how many standard terms an original term maps to. The final mapping is obtained by combining the predicted number of standard entities linked to the original diagnosis term with the reranked candidate entities.

This thesis proposes a "recall first, then disambiguate" process for clinical term standardization. A simple text matching method first produces the candidate entity set, which shrinks the search space of the disambiguation stage and so improves the overall efficiency of standardization. Disambiguation then uses a deep learning semantic matching model, and the BERT multi-class model that predicts the number of standard terms addresses the multi-implication problem, which improves the accuracy of standardization. The best accuracy obtained in the experiments is 85.93%.
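The "recall first, then disambiguate" flow described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: character-bigram Jaccard similarity stands in for the string-similarity recall step (the thesis uses an Elasticsearch interface), a simple token-overlap score stands in for the BiLSTM or BERT-based matching model, and the `predicted_count` argument stands in for the output of the BERT multi-class model. The knowledge base entries and all function names are hypothetical, and English terms are used only for readability; Chinese text would need character-level features or word segmentation.

```python
# Toy stand-in for a standard knowledge base (illustrative entries, not thesis data).
STANDARD_TERMS = [
    "type 2 diabetes mellitus",
    "type 1 diabetes mellitus",
    "hypertension",
    "chronic kidney disease",
    "acute bronchitis",
]


def char_bigrams(text):
    """Return the set of character bigrams of a term, ignoring case and spaces."""
    text = text.lower().replace(" ", "")
    return {text[i:i + 2] for i in range(len(text) - 1)}


def jaccard(a, b):
    """Jaccard similarity between the character-bigram sets of two strings."""
    ga, gb = char_bigrams(a), char_bigrams(b)
    if not ga or not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)


def recall_candidates(mention, terms, top_n=3):
    """Step 1 (recall): keep the top_n standard terms by string similarity."""
    ranked = sorted(terms, key=lambda t: jaccard(mention, t), reverse=True)
    return ranked[:top_n]


def match_score(mention, candidate):
    """Placeholder for a learned semantic matcher (assumption, not the thesis model):
    the fraction of candidate tokens that also occur in the mention."""
    mention_tokens = set(mention.lower().split())
    candidate_tokens = set(candidate.lower().split())
    if not candidate_tokens:
        return 0.0
    return len(mention_tokens & candidate_tokens) / len(candidate_tokens)


def standardize(mention, terms, predicted_count=1, top_n=3):
    """Step 2 (disambiguation): rerank the recalled candidates and keep as many
    as the (hypothetical) count predictor says the mention maps to."""
    candidates = recall_candidates(mention, terms, top_n)
    reranked = sorted(candidates, key=lambda c: match_score(mention, c), reverse=True)
    return reranked[:predicted_count]


if __name__ == "__main__":
    # Single-entity mention.
    print(standardize("diabetes type 2", STANDARD_TERMS, predicted_count=1))
    # Multi-implication mention: the count predictor is assumed to return 2.
    print(standardize("type 2 diabetes with hypertension", STANDARD_TERMS, predicted_count=2))
```

Because the reranker only scores the `top_n` recalled candidates rather than the whole knowledge base, the expensive matching step runs on a small set, which is the efficiency argument the abstract makes for splitting the process in two.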
Keywords/Search Tags:standardization of clinical terminology, entity linking, text matching, deep learning, multiple implication