Domain Adaptation Research And Application Of Named Entity Recognition

Posted on: 2021-01-27
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhang
GTID: 2428330605467909
Subject: Engineering
Abstract/Summary:
Named entity recognition (NER) is one of the core tasks in natural language processing. Its goal is to extract specific types of entities from text. The task has scientific significance and wide application value in downstream natural language processing tasks such as information retrieval, question answering, information extraction, text mining, and public opinion analysis. Existing research shows that NER performance in specialized domains (for example, social media and the medical domain) is limited by the small scale of high-quality annotated corpora and is worse than in traditional domains. This makes NER in specialized domains a challenging research area. How to carry out domain transfer and improve the performance of domain-specific models is the main research question of this thesis. The main work of the thesis includes:

(1) Summarizing the research background and development history of named entity recognition, analyzing the strengths and weaknesses of common NER models, and explaining the feasibility of transfer learning for NER.

(2) Summarizing and improving deep-learning approaches to named entity recognition to obtain the BiLSTM-CRF model. This model converts character text into low-dimensional dense vectors using GloVe and, at the same time, uses a bidirectional long short-term memory network to extract character-level features. The combined vector representation is passed to a CRF layer, which computes and outputs the optimal label sequence, yielding an end-to-end NER model.

(3) Designing and implementing the ERNIE-BiGRU-CRF model. Traditional word embedding methods map each word or character to a single vector, which cannot represent the ambiguity of a word in context. To address this, the model takes knowledge-enhanced, context-aware semantic representations of words from the ERNIE (Enhanced Representation through kNowledge IntEgration) pre-trained model. Knowledge from multiple data sources is introduced to generate the semantic vectors; the resulting embeddings are fed into the GRU layer to extract features, and the label sequence is finally obtained through the CRF layer.

(4) Designing and implementing a transfer learning neural network, TL-BiLSTM-CRF. First, the model is constructed by combining character embeddings, which carry character-level morphological features extracted by a bidirectional long short-term memory network, with word embeddings, which carry semantic, word-order, and other feature information. Second, the model is transferred by introducing a word adaptation layer that bridges the gap between the source and target embedding spaces using canonical correlation analysis.

The above models are experimentally verified on relevant datasets, using recall, precision, and F1, three metrics commonly used in natural language processing, as evaluation indicators. The experimental results show that the above transfer learning models are effective to a certain degree, which verifies the feasibility and effectiveness of the models proposed in this thesis.
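
The recall, precision, and F1 indicators named in the abstract can be illustrated with a small sketch. The abstract does not state the tagging scheme or the matching rule, so this example assumes BIO tags and exact-match, entity-level scoring (the convention popularized by the CoNLL shared tasks). The function names `extract_spans` and `prf1` are hypothetical and are not taken from the thesis.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive); an assumed representation.
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence.

    Assumption: a stray I- tag with no matching open span is ignored.
    """
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        # An open span continues only on an I- tag of the same entity type.
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((etype, start, i))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def prf1(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Entity-level precision, recall, and F1 over a list of sentences."""
    tp = n_pred = n_gold = 0
    for g, p in zip(gold, pred):
        gold_spans, pred_spans = extract_spans(g), extract_spans(p)
        tp += len(gold_spans & pred_spans)  # exact type-and-boundary matches
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example: the prediction finds the PER entity but misses the LOC entity.
gold = [["B-PER", "I-PER", "O", "B-LOC"]]
pred = [["B-PER", "I-PER", "O", "O"]]
precision, recall, f1 = prf1(gold, pred)
```

In the toy example, one of the two gold entities is recovered and the single predicted entity is correct, so precision is 1.0, recall is 0.5, and F1 is their harmonic mean. A token-level or partial-match scoring rule would give different numbers.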
Keywords/Search Tags:Natural language processing, Named entity recognition, Transfer learning, Recurrent neural network, Pre-trained language model