Research On Named Entity Recognition Based On Deep Learning

Posted on: 2024-01-24
Degree: Master
Type: Thesis
Country: China
Candidate: B Liu
Full Text: PDF
GTID: 2544307061469214
Subject: Computer application technology
Abstract/Summary:
Named Entity Recognition (NER) is one of the major means of information mining and plays an important role in knowledge graph construction and in recommendation systems. With the continued development of deep learning, NER performance has improved markedly, but challenges remain, notably inaccurate entity recognition. There are two main causes. First, annotated data in the medical field is scarce, which leads to poor recognition performance. Second, entity boundaries are difficult to determine, so semantic information is often extracted incompletely. To address these problems, this thesis uses deep learning methods to study the structural characteristics of electronic medical records and the key technologies of NER, and proposes two approaches for medical NER: a vocabulary enhancement method that introduces a counterfactual mechanism, and a method that combines multi-feature embedding with multi-network integration. The main contributions and innovations of this thesis are as follows:

(1) To address the shortage of labeled data in the medical field, this thesis proposes a model that enhances vocabulary by introducing a counterfactual mechanism. The model improves on the traditional lexical enhancement method. Following the counterfactual idea, it intervenes on the context in which an entity appears and replaces the entity with other entities of the same type to generate new data. A masked language model is then introduced to fill in the non-entity tokens in the context, so that data augmentation is completed in two steps. The model integrates the lexical information in the annotated corpus and also embeds character features and position features, which helps to determine entity boundary positions and enriches the character feature representation. Experimental results show that the model achieves superior performance on both datasets, effectively addressing the shortage of labeled data in the medical field and substantially enlarging the corpus. The integration of lexical information also enriches the features and improves the accuracy of entity recognition.

(2) To address inaccurate entity extraction and incomplete semantic information extraction, this thesis proposes a model that combines multi-feature embedding and multi-network fusion. Building on character embedding and word embedding, the model integrates radical features and external knowledge features, and constructs a multi-semantic dictionary to further determine the location of entity boundaries. An adaptive graph convolutional network is proposed as an improvement on the graph convolutional network; it captures global semantic information by repeatedly aggregating the features of adjacent nodes. Its output is fused with the features extracted by a long short-term memory network, giving dual-path feature extraction that captures deeper text features and enriches the representation of semantic information. Experiments on two datasets show that the model outperforms current advanced models, which addresses the problem of inaccurate entity recognition and demonstrates the effectiveness of the model.
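
The entity replacement step described in contribution (1) can be illustrated with a minimal sketch. This is not the thesis's implementation: the BIO tagging scheme, the English tokens, the `DIS` label, the function names `entity_spans` and `replace_entities`, and the hand-written lexicon are all assumptions made for illustration. The second step (filling non-entity context with a masked language model) is not shown.

```python
import random
from typing import Dict, List, Tuple


def entity_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end, type) for each BIO-tagged entity; end is exclusive."""
    spans = []
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        if tag.startswith("I-") and start is not None and tag[2:] == etype:
            continue  # still inside the current entity
        if start is not None:
            spans.append((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def replace_entities(
    tokens: List[str],
    tags: List[str],
    lexicon: Dict[str, List[List[str]]],  # hypothetical: type -> entity token lists
    rng: random.Random,
) -> Tuple[List[str], List[str]]:
    """Swap every entity for a different same-type entity drawn from the lexicon."""
    new_tokens: List[str] = []
    new_tags: List[str] = []
    cursor = 0
    for start, end, etype in entity_spans(tags):
        # Copy the untouched context before this entity.
        new_tokens.extend(tokens[cursor:start])
        new_tags.extend(tags[cursor:start])
        original = tokens[start:end]
        candidates = [e for e in lexicon.get(etype, []) if e != original]
        substitute = rng.choice(candidates) if candidates else original
        # Re-tag the substitute so labels stay aligned with the new length.
        new_tokens.extend(substitute)
        new_tags.extend([f"B-{etype}"] + [f"I-{etype}"] * (len(substitute) - 1))
        cursor = end
    new_tokens.extend(tokens[cursor:])
    new_tags.extend(tags[cursor:])
    return new_tokens, new_tags


if __name__ == "__main__":
    tokens = ["patient", "has", "type", "2", "diabetes", "today"]
    tags = ["O", "O", "B-DIS", "I-DIS", "I-DIS", "O"]
    lexicon = {"DIS": [["type", "2", "diabetes"], ["hypertension"]]}
    print(replace_entities(tokens, tags, lexicon, random.Random(0)))
    # (['patient', 'has', 'hypertension', 'today'], ['O', 'O', 'B-DIS', 'O'])
```

Because the substitute is re-tagged from its own length, the generated sentence stays correctly labeled even when the replacement entity is shorter or longer than the original, which is what makes the output usable as additional training data.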
Keywords/Search Tags: Named Entity Recognition (NER), vocabulary enhancement, counterfactual structure, multi-feature embedding, multi-network integration