
Chinese Electronic Medical Record Medical Entity Recognition Algorithm

Posted on: 2020-07-23
Degree: Master
Type: Thesis
Country: China
Candidate: C Z Cai
Full Text: PDF
GTID: 2404330596476640
Subject: Engineering
Abstract/Summary:
An electronic medical record (EMR) is a digital record of a patient's complete course of disease, and it helps doctors analyze cases and make medical decisions. Structured EMRs, which suffer from complex choices, constraints on the doctor's thinking, and highly repetitive cases, are gradually being replaced by unstructured EMRs that doctors write in natural language and that are structured afterwards (post-structured). Structured records nevertheless remain the basis of medical big data analysis. Transforming EMRs written in natural language into structured data that follows defined rules is therefore an important direction in medical informatics research, and the emergence of deep learning methods has made natural language processing for EMRs a research hotspot. This thesis studies deep-learning-based named entity recognition (NER), which recognizes and extracts entity nouns in medical text and thereby achieves the post-structuring of EMRs.

In the NER task, word embedding is the most important pre-training method: it maps the contextual information of a word to a vector. Unlike English word embedding research, which works at the word or sentence level, research on Chinese word embeddings focuses on the radicals and stroke information inside Chinese words and characters. This thesis therefore proposes a fusion word embedding model that contains both word information and sub-word information. The sub-word part combines characters and strokes, and it captures more intrinsic word information than existing word embedding methods. The model is assessed by extrinsic evaluation in four different NER models. The results show that using the proposed fusion model as the word embedding raises the F-score of the NER models by 1% compared with word2vec.

Research on NER for Chinese EMRs requires a large amount of labeled data. Hiring doctors and experts with the relevant background to label data consumes substantial manpower and material resources, and the return on that investment is extremely low. This thesis therefore proposes a medical entity recognition model based on crowdsourced annotation, which takes crowd-labeled EMRs as training input and uses adversarial training to reduce the differences among crowd annotations and improve the generalization of the model. Compared with other NER models that apply voting to the crowdsourced corpus, the F-score improves by 2%-3%, and precision and recall also improve.

Based on the concept of DevOps, this thesis also designs and develops an electronic medical record system that provides web applications for electronic medical records and a medical terminology dictionary. On the server side, the system monitors the application services, the database, and the server nodes. Docker container technology is used to implement a CI/CD pipeline covering code submission, testing, and service deployment. Finally, an API concurrency test is run while the monitoring module tracks hardware and node status in real time and raises an alarm when the load reaches a preset threshold, thereby verifying the stability of the system.
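The abstract compares against models that vote on the crowdsourced corpus and reports F-score, precision, and recall. As a rough illustration of those two ideas only, the sketch below aggregates per-token labels from several annotators by majority vote and scores a tag sequence with entity-level precision, recall, and F1. It is not the thesis's adversarial model. The BIO tagging scheme, the labels `DIS` and `DRUG`, and all function names are assumptions made for the example.

```python
from collections import Counter


def majority_vote(annotations):
    """Aggregate per-token labels from several annotators by majority vote.

    annotations: list of label sequences, one per annotator, equal length.
    Ties go to the label that appears first among the annotators.
    """
    if not annotations:
        return []
    length = len(annotations[0])
    if any(len(seq) != length for seq in annotations):
        raise ValueError("all annotators must label the same tokens")
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*annotations)]


def bio_spans(tags):
    """Return the set of (start, end, type) entity spans in a BIO sequence.

    An I- tag that does not continue an open span of the same type is ignored.
    """
    spans, start, kind = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing span
        continues = tag.startswith("I-") and kind == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, kind))
            start, kind = None, None
        if tag.startswith("B-"):
            start, kind = i, tag[2:]
    return spans


def entity_prf(gold_tags, pred_tags):
    """Entity-level precision, recall, and F1 using exact span matching."""
    gold, pred = bio_spans(gold_tags), bio_spans(pred_tags)
    true_pos = len(gold & pred)
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical crowd labels for one four-token sentence (three annotators).
crowd = [
    ["B-DIS", "I-DIS", "O", "O"],
    ["B-DIS", "O", "O", "B-DRUG"],
    ["B-DIS", "I-DIS", "O", "B-DRUG"],
]
voted = majority_vote(crowd)
gold = ["B-DIS", "I-DIS", "O", "O"]  # hypothetical expert annotation
precision, recall, f1 = entity_prf(gold, voted)
```

In this toy case the vote keeps a `DRUG` entity that the expert did not mark, so recall stays at 1.0 while precision drops to 0.5. Voting treats every annotator as equally reliable, which is the kind of limitation a model that learns from annotator differences is meant to address.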
Keywords/Search Tags:Electronic Medical Record, Natural Language Processing, Named Entity Recognition, Word Embedding, Crowdsourcing