Deep Learning Approach For Medical Named Entity Recognition

Posted on:2020-08-08

Degree:Doctor

Type:Dissertation

Country:China

Candidate:K Xu

Full Text:PDF

GTID:1364330572479190

Subject:Computer applications engineering

Abstract/Summary:

Medical named entity recognition plays an important role in biomedical research and has attracted extensive study in recent years. However, three problems remain. First, recognition accuracy: the number of new medical entities is growing rapidly, while the accuracy of traditional recognition methods is not high enough. Second, computational efficiency: deep learning-based recognition methods are not structurally efficient. Third, there is a lack of methods for multi-category medical entity recognition.

To improve the accuracy of medical named entity recognition, a semantic-based deep learning approach is studied. A character-based BiLSTM-CRF (CBLC) is proposed to capture the internal structure of a word through character-level word embeddings. A semantic BiLSTM-CRF (SBLC) is proposed, which trains word embeddings on a large body of medical resources carrying semantic information, uses BiLSTM-CRF to capture the relationship between the semantic context and the labels, and incorporates Ab3P to recognize abbreviations effectively. The results show that CBLC outperforms widely used baselines such as conditional random fields and dictionary matching, and that SBLC outperforms advanced approaches such as DNorm and TaggerOne.

Building on this semantic basis, and to address rare medical entity recognition and entity tagging inconsistency, a trie-based medical dictionary matching approach is first designed. Two deep learning approaches that integrate dictionary attention are then proposed: Dic-Att-BiLSTM-CRF (DABLC) and Dic-Att-BiGRU-CRF (DABGC). DABLC integrates dictionary matching and document-level attention into BiLSTM-CRF through a weighted combination. In DABGC, the input text is matched against the medical dictionary; at the same time, a bidirectional GRU network is trained on the word embeddings and outputs hidden states that carry context information. A multi-head attention mechanism then analyzes the structure between words. DABLC and DABGC make effective use of external dictionary resources to recognize rare and complex medical entities, further improving the accuracy of deep learning approaches.

To improve the computational efficiency of deep learning approaches, two accelerated approaches are proposed. First, Att-SGRU-CRF (ASC) improves training speed by using a sliced GRU network and a hierarchical computational structure. An attention mechanism addresses entity tagging inconsistency, and a CRF computes the optimal label sequence. Second, an attention-based iterated dilated convolutional network (AIDC) is proposed, which combines an iterated dilated convolutional network (IDC) with multi-head attention. AIDC feeds the word embeddings into the iterated dilated convolutional network to accelerate training, and outputs the final labels by combining the multi-head attention mechanism with a CRF. Compared with traditional neural networks, ASC is 50 times faster and also obtains a higher F1 score. AIDC is 1.9 times faster than BiLSTM while maintaining high recognition accuracy. The computational efficiency of the deep learning approach is thereby improved.

To address multi-category medical entity recognition, an approach named Text Classification Weighted Voting (TCWV) is proposed. Using a rank-constrained linear text classification model, it classifies texts efficiently from a small amount of training text. TCWV integrates multiple deep learning approaches through a weighted voting algorithm, and uses different categories of medical text to train the word embeddings for the corresponding named entity categories. On the disease, chemical, and gene datasets, TCWV obtains the highest F1 score, achieving the goal of multi-category medical named entity recognition.

The experimental results show that the proposed methods address several problems of current deep learning methods in medical named entity recognition, namely low recognition accuracy, low computational efficiency, and the lack of multi-category medical entity recognition. The work makes a positive contribution to research in medical informatics.
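
The trie-based dictionary matching step mentioned in the abstract can be illustrated with a minimal sketch. This is not the dissertation's implementation: the class names `TrieNode` and `DictionaryMatcher`, the whitespace tokenization, the lowercase normalization, and the greedy longest-match policy are all assumptions made here for illustration.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    """One node of a token-level trie (hypothetical helper)."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.label: Optional[str] = None  # set only where a dictionary term ends


class DictionaryMatcher:
    """Greedy longest-match lookup of dictionary terms in a token sequence."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def add(self, term: str, label: str) -> None:
        # Assumption: terms are split on whitespace and lowercased.
        node = self.root
        for token in term.lower().split():
            node = node.children.setdefault(token, TrieNode())
        node.label = label

    def match(self, tokens: List[str]) -> List[Tuple[int, int, str]]:
        """Return (start, end, label) spans; end is exclusive."""
        spans: List[Tuple[int, int, str]] = []
        i = 0
        while i < len(tokens):
            node: Optional[TrieNode] = self.root
            best: Optional[Tuple[int, int, str]] = None
            j = i
            while j < len(tokens):
                node = node.children.get(tokens[j].lower())
                if node is None:
                    break
                j += 1
                if node.label is not None:
                    best = (i, j, node.label)  # remember the longest match so far
            if best is None:
                i += 1  # no term starts at position i
            else:
                spans.append(best)
                i = best[1]  # continue after the matched term
        return spans


# Toy dictionary, invented for this example only.
matcher = DictionaryMatcher()
matcher.add("breast cancer", "Disease")
matcher.add("cancer", "Disease")
matcher.add("tamoxifen", "Chemical")

spans = matcher.match("Tamoxifen treats breast cancer".split())
print(spans)
```

In a dictionary-attention model such as those described above, spans like these would typically be converted into per-token features and combined with the neural encoder's output; how the dissertation performs that combination is not specified in the abstract.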

Keywords/Search Tags:

Medical named entity recognition, Deep learning, Ensemble learning, Natural language processing

Related items

1	Study On Named Entity Recognition Of Chinese Electronic Medical Record Based On Deep Learning
2	Named Entity Recognition Of Electronic Medical Records Based On Deep Learning
3	Research On Named Entity Recognition Of Electronic Medical Records Based On BERT Model
4	Research On Medical Knowledge Extraction In Electronic Medical Records Based On Deep Learning
5	Research On Medical Named Entity Recognition Based On XLNet-CRF
6	Medical Text Information Extraction Based On Deep Learning
7	Research Of Chinese Medicine Terminology Recognition Based On Deep Learning And Active Learning
8	Named Entity Recognition Of Online Medical Consulting Texts Based On Deep Learning
9	Research And Implementation Of Chinese Electronic Medical Record Named Entity Recognition Based On Deep Learning
10	Named Entity Recognition Of Chinese Medical Records Based On Deep Learning