Research On Named Entity Recognition For Chinese Medical Texts

Posted on: 2020-08-23
Degree: Master
Type: Thesis
Country: China
Candidate: G H Xu
Full Text: PDF
GTID: 2428330596468178
Subject: Software engineering
Abstract/Summary:
Named entity recognition (NER) aims to identify words and phrases that carry special meaning, or that are domain terminology, in unstructured text. As an important and foundational technique in natural language processing, NER is widely used in information retrieval, question answering, and related fields. In recent years, much of the work has focused on open domains, mainly in English. This paper studies the recognition of medical entities in Chinese medical texts. The research covers different types of medical text, including professional medical texts such as clinical electronic medical records and public medical texts such as medical queries and online question answering. First, this paper constructs a basic framework based on neural network methods. Then, we propose methods for acquiring and integrating external knowledge, making full use of dictionary information to enhance model performance. Finally, we take advantage of external data through transfer learning to further improve recognition performance. The main contributions of this paper are as follows:
· Named entity recognition based on neural network methods. To alleviate the dependency on feature engineering, this paper constructs a basic framework named NN-CRF for NER. Three datasets of Chinese medical texts are also collected and annotated for NER. The experiments then explore the effects of character input and word input on recognition performance. In addition, the experiments show that neural network methods can achieve better performance than traditional statistical machine learning methods without relying on feature engineering. Finally, we compare the performance of three typical neural networks and give practical guidance for rational model design.
· Enhancing named entity recognition with external knowledge. Considering the large amount of external resources in the medical field, we incorporate external knowledge into the model to recognize entities that are rare in, or absent from, the training sets. Taking medical dictionary information as an example, we propose two methods of external knowledge acquisition, the feature template method and the character-and-word combination method, and two methods of external knowledge integration, the direct input method and the indirect input method. The experiments verify the effectiveness of these approaches in improving the generalization ability of the model.
· Improving named entity recognition with external data. This paper leverages external data through transfer learning to alleviate the lack of annotated data. We introduce two methods to address the issue. On the one hand, we use a large amount of unlabeled data to pre-train the model with language modeling as the task, and transfer the learned parameters to accelerate model convergence and boost performance. On the other hand, we make full use of relevant annotated datasets under multi-task learning. Specifically, a shared-private parameter framework is proposed, and it is trained effectively under an improved iterative strategy to maximize performance on the target domain.
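As an illustration of how dictionary information might be turned into per-character features, the sketch below tags each character with its position inside the longest dictionary match. This is only one plausible reading of a dictionary feature template, not the method described in the thesis: the function name `dictionary_features`, the B/M/E/S/O tag scheme, the forward maximum matching strategy, and the three lexicon entries are all assumptions made for this example.

```python
from typing import List, Set


def dictionary_features(chars: str, lexicon: Set[str], max_len: int = 6) -> List[str]:
    """Tag each character by its position in the longest dictionary match.

    Tags: B (begin), M (middle), E (end), S (single-character entry),
    O (not covered by any dictionary entry).
    """
    tags = ["O"] * len(chars)
    i = 0
    while i < len(chars):
        matched = 0
        # Forward maximum matching: try the longest candidate first.
        for n in range(min(max_len, len(chars) - i), 0, -1):
            if chars[i:i + n] in lexicon:
                matched = n
                break
        if matched == 0:
            i += 1
            continue
        if matched == 1:
            tags[i] = "S"
        else:
            tags[i] = "B"
            for j in range(i + 1, i + matched - 1):
                tags[j] = "M"
            tags[i + matched - 1] = "E"
        i += matched
    return tags


# Hypothetical lexicon: diabetes, insulin, cough.
lexicon = {"糖尿病", "胰岛素", "咳"}
# "After the diabetes patient injected insulin, cough" (toy sentence).
sentence = "糖尿病患者注射胰岛素后咳"
features = dictionary_features(sentence, lexicon)
for ch, tag in zip(sentence, features):
    print(ch, tag)
```

In a character-based tagger, each tag could be mapped to a small embedding and concatenated with the character embedding, so that entities unseen in training but present in the dictionary still produce a distinctive input signal.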
Keywords/Search Tags: Named entity recognition, Medical texts, Neural network, External knowledge, Transfer learning