Named Entity Recognition Of Electronic Medical Records Based On Deep Learning

Posted on: 2021-05-07
Degree: Master
Type: Thesis
Country: China
Candidate: B Y Yan
Full Text: PDF
GTID: 2404330611996832
Subject: Electronic and communication engineering
Abstract/Summary:
With the extensive application of artificial intelligence in the medical field, natural language processing technologies make it possible to intelligently manage a patient's medical history, diagnosis and treatment process, and discharge status. Applying this information to intelligent diagnosis and treatment is crucial to the construction of medical knowledge graphs, decision-support systems, and consultation systems. To address the low accuracy of current named entity recognition on electronic medical records and its need for large amounts of manual annotation, this thesis studies named entity recognition using a self-attention model combined with a bidirectional recurrent neural network. The main work includes:

The experimental data set uses the open-source electronic medical records released by the China Conference on Knowledge Graph and Semantic Computing (CCKS), and hundreds of original texts are preprocessed. The tag set {B, I, O} is used to label five types of medical entities: body parts, symptoms and signs, examinations, diseases, and drugs. Here B marks the beginning of an entity, I marks the inside of an entity, and O marks text outside any medical entity.

To address the low accuracy of traditional Conditional Random Field (CRF) medical entity recognition, a BiLSTM-CRF model based on the long short-term memory network is proposed, which combines the label constraints of the CRF to predict label sequences for medical text. The F1 value of this model is 3% higher than that of the CRF.

A Word2Vec-BiLSTM-CRF model is proposed, which adds Word2Vec word vectors to improve accuracy. A large volume of medical text is crawled from medical websites, and the gensim toolkit is used to train the word vectors. In a comparative experiment on the electronic medical record data set, the model using word vectors improved by 3% over the model without pre-trained word vectors and by 6% over the CRF.

Considering the slow convergence of long short-term memory networks, a Transformer-CNN-BiGRU-CRF model is proposed to speed up convergence. This method takes word vectors combined with positional encoding as input, extracts text features with the self-attention mechanism, and labels the extracted multi-dimensional feature sequence with the downstream structure. Experiments show that this model achieves the largest F1 increase without adding hand-crafted features.

Finally, considering the particular nature of medical data and the difficulty of obtaining it, this thesis splits the data set proportionally and runs experiments on a small portion of the data. In this setting, Transformer-CNN-BiLSTM-CRF is 7% higher than the baseline CRF model, which verifies the validity of the method when labeled data is limited.
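
The {B, I, O} labelling scheme described in the abstract can be illustrated with a minimal sketch. This is not the thesis's own code. The function names, the `B-TYPE`/`I-TYPE` tag format, the entity type names, and the English example sentence are all assumptions made for illustration; the abstract only states that B, I, and O tags are applied to five entity types.

```python
from typing import List, Tuple

# (start, end_exclusive, entity_type) over token positions; hypothetical format.
Span = Tuple[int, int, str]


def spans_to_bio(num_tokens: int, spans: List[Span]) -> List[str]:
    """Convert entity spans into one BIO tag per token."""
    tags = ["O"] * num_tokens  # everything is outside an entity by default
    for start, end, etype in spans:
        if not (0 <= start < end <= num_tokens):
            raise ValueError(f"span out of range: {(start, end, etype)}")
        tags[start] = f"B-{etype}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"          # remaining tokens of the entity
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Recover entity spans from BIO tags.

    An I- tag that does not continue an open entity of the same type
    is ignored (treated like O).
    """
    spans: List[Span] = []
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        if start is not None and tag == f"I-{etype}":
            continue                        # still inside the current entity
        if start is not None:
            spans.append((start, i, etype)) # current entity ends before i
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]       # open a new entity
    return spans


# Hypothetical example sentence and annotations.
tokens = ["patient", "reports", "chest", "pain", "after", "taking", "aspirin"]
gold_spans = [(2, 4, "SYMPTOM"), (6, 7, "DRUG")]
bio_tags = spans_to_bio(len(tokens), gold_spans)
```

A sequence labeller such as a CRF or BiLSTM-CRF would be trained to predict one such tag per token, and `bio_to_spans` shows how predicted tags could be turned back into entities for entity-level F1 scoring.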
Keywords/Search Tags: Deep Learning, Named Entity Recognition, Transformer, Electronic Medical Record, Natural Language Processing