Research On Medical Named Entity Recognition Based On XLNet-CRF

Posted on:2022-03-05

Degree:Master

Type:Thesis

Country:China

Candidate:H Jin

GTID:2494306602455674

Subject:Software engineering

Abstract/Summary:

Biomedical Named Entity Recognition (BioNER) is a basic and important biomedical information extraction task that aims to extract biomedical entities from biomedical texts. In recent years, deep learning has become the mainstream research direction for BioNER because of its data-driven context encoding capabilities. However, current models still have deficiencies: training costs are high and data is difficult to obtain, which has become a major bottleneck. To address these problems, this paper proposes the XLNet-CRF model architecture, which optimizes the model from two perspectives, encoding and decoding. Furthermore, this paper proposes noise reduction learning and hierarchical shared transfer learning to improve performance and generalization. The main contributions are as follows:

1. This paper proposes the XLNet-CRF architecture for BioNER for the first time. XLNet, pre-trained with the permutation language model, has stronger denoising, word-level encoding ability, and the CRF decodes the label context. In experiments comparing this model with the plain XLNet model, it achieved an average F1 improvement of 0.912.

2. This paper proposes two noise reduction models based on XLNet-CRF: shared labels and dynamic splicing. In experiments on 15 BioNER datasets, the two models achieved average F1 increases of 1.482 and 1.48 respectively, and state-of-the-art performance on 7 of the datasets. Further analysis and experiments show that both models strengthen the decoding capability of the conditional random field. The applicable scope of each model under different data characteristics is also studied.

3. To address the unstable performance and poor generalization of deep learning in BioNER, this paper proposes hierarchical shared transfer learning. By combining multi-task learning and fine-tuning, it fuses information at multiple levels: the bottom-level entity features and the upper-level data features. The model is trained on 14 datasets containing 4 types of entities and evaluated on 6 subtasks across 5 gold-standard datasets. Experimental results show that it generalizes better in biomedical entity recognition than multi-task learning or fine-tuning alone.
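
The abstract says that the CRF layer decodes the label context on top of the encoder's per-token scores. As a rough illustration of what that decoding step involves, the sketch below implements generic Viterbi decoding for a linear-chain CRF in plain Python. It is not the thesis's implementation: the function name `viterbi_decode`, the three-label O/B/I scheme, and all score values are hypothetical, and the emission scores are simply assumed to come from an encoder such as XLNet.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring label sequence for one sentence.

    emissions[t][j]   : score of label j at token t (assumed encoder output).
    transitions[i][j] : score of moving from label i to label j.
    """
    if not emissions:
        return []
    num_labels = len(emissions[0])
    # best[j] = best score of any path that ends in label j at the current token
    best = list(emissions[0])
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_best = []
        pointers = []
        for j in range(num_labels):
            # choose the previous label that maximizes the path score into j
            prev = max(range(num_labels), key=lambda i: best[i] + transitions[i][j])
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)
    # trace back from the best final label to recover the full path
    last = max(range(num_labels), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path


# Hypothetical toy setup: label indices 0 = O, 1 = B, 2 = I.
LABELS = ["O", "B", "I"]
FORBIDDEN = -1e4  # large negative score that rules out the O -> I transition
transitions = [
    [0.0, 0.0, FORBIDDEN],  # from O
    [0.0, 0.0, 0.0],        # from B
    [0.0, 0.0, 0.0],        # from I
]
emissions = [
    [1.0, 0.2, 0.1],  # token 0 prefers O
    [0.1, 0.3, 1.0],  # token 1 prefers I if labels are chosen independently
    [0.2, 0.1, 0.9],  # token 2 prefers I
]

path = viterbi_decode(emissions, transitions)
print([LABELS[j] for j in path])
```

Picking the top label per token independently would give the invalid sequence O, I, I here. With the transition scores taken into account, the decoder returns O, B, I, which is the kind of label-context correction a CRF layer contributes.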

Keywords/Search Tags:

Biomedical Natural Language Processing, BioNER, Permutation Language Model, Transfer Learning

Related items

1	Research Of Biomedical Named Entity Recognition Method Of BioBERT-based Hybrid Model
2	Information Extraction For Evidence Based Medicine Using Natural Language Processing
3	Construction Of Diabetes Knowledge Map Based On Chinese Natural Language Processing
4	Research On "Treatise On Febrile Diseases" Based On Natural Language Processing
5	Deep Learning For Natural Language Processing Of Classifying Neuro-Oncology Imaging Reports And Prognosis Analysis
6	Research On Detecting Biomedical Event And Its Trigger
7	Research On Extraction Method Of Biomedical Entity Relationship Based On Deep Learning
8	Research On BERT Model For Chinese Clinical Language Processing
9	Research On Label Technology Of Medical Data Based On Natural Language Processing
10	Research On The Key Techniques Of Biomedical Text Mining