Research On Chinese Medical Text Named Entity Recognition Method

Posted on: 2024-01-05
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Zhang
Full Text: PDF
GTID: 2544307115957469
Subject: Computer technology
Abstract/Summary:
With the continuous development of deep learning models in natural language processing, domestic medical evaluation tasks have multiplied, and information extraction has attracted wide attention from researchers. Named entity recognition for Chinese medical text is one of its critical tasks. Named entity recognition in the medical field makes it possible to obtain relevant knowledge quickly from massive medical data, which helps improve the efficiency and quality of medical research and supports downstream tasks such as medical question answering. Because medical texts contain a large amount of specialized knowledge and terminology, making machines recognize entities in Chinese medical texts with high accuracy is a key problem. This thesis focuses on methods for recognizing named entities in Chinese medical texts and proposes an entity recognition strategy based on multi-model fusion and a hybrid neural network. The main research content is as follows:

(1) Construction of datasets and dictionaries. The dataset used in this thesis comes mainly from the Chinese medical text named entity recognition evaluation task and comprises 20,000 records. The data were cleaned, split, and annotated with BIO tags to build the experimental dataset. A radical dictionary for the medical field was constructed by crawling online dictionaries and related materials and cleaning the collected data.

(2) Entity feature extraction based on a hybrid neural network. For the nine types of medical entities in medical texts, this thesis uses BiLSTM to extract contextual features and build feature vectors, uses IDCNN to extract local features while also capturing long-distance features, and builds an external feature vector from the radical dictionary. The three feature vectors are concatenated in the hidden layer and passed to the decoding layer, so that medical entities are recognized from multiple classes of features.

(3) Named entity recognition of Chinese medical text based on multi-model fusion. Building on the strong performance of pre-trained models in named entity recognition, this thesis fuses three pre-trained models, RoBERTa, RoFormerV2, and ERNIE-Gram, whose feature extraction mechanisms and pre-training methods differ substantially, at the decoding layer to identify medical entities. To handle nested entities in medical texts, GlobalPointer is used in the decoding layer, introducing the idea of global normalization to identify them.

In addition, this thesis designs and implements a corresponding Chinese medical text named entity recognition system and conducts experiments on relevant datasets to verify the effectiveness of the proposed method.
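The two labeling schemes mentioned in the abstract can be illustrated with a small sketch. It is not the thesis implementation: the function names `spans_to_bio` and `decode_spans`, the label `dis`, and the score-matrix layout are assumptions made for illustration. The first function produces character-level BIO tags for flat entities. The second keeps every span whose score exceeds a threshold, which is the general idea behind span-based decoders such as GlobalPointer and is why nested entities pose no problem for them.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label); hypothetical layout


def spans_to_bio(text: str, spans: List[Span]) -> List[str]:
    """Return character-level BIO tags for flat (non-overlapping) entity spans."""
    tags = ["O"] * len(text)
    for start, end, label in spans:
        if not 0 <= start < end <= len(text):
            raise ValueError("span is outside the text")
        if any(tag != "O" for tag in tags[start:end]):
            # One tag per character cannot represent overlapping or nested entities.
            raise ValueError("BIO cannot encode overlapping or nested spans")
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def decode_spans(
    scores: Dict[str, List[List[float]]], threshold: float = 0.0
) -> List[Span]:
    """Keep every (start, end_exclusive, label) whose span score exceeds the threshold.

    scores[label][i][j] is assumed to score the span from character i to
    character j inclusive. Spans are judged independently, so nesting is allowed.
    """
    found = []
    for label, matrix in scores.items():
        n = len(matrix)
        for i in range(n):
            for j in range(i, n):
                if matrix[i][j] > threshold:
                    found.append((i, j + 1, label))
    return sorted(found)


# "糖尿病" (diabetes) occupies characters 3..5 of the sentence.
print(spans_to_bio("患者有糖尿病", [(3, 6, "dis")]))
```

A BIO sequence assigns exactly one tag per character, so `spans_to_bio` rejects nested input, whereas `decode_spans` can return a short span and a longer span that contains it.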
Keywords/Search Tags:Chinese medical text, Hybrid neural network, Named entity identification, Multi-model fusion