
Research On Identification Of Bacteria Named Entity Based On Deep Learning And Language Model

Posted on: 2021-05-27
Degree: Master
Type: Thesis
Country: China
Candidate: X S Li
GTID: 2370330605961306
Subject: Computer application technology
Abstract/Summary:
Interaction networks among bacteria are closely related to human health and the ecological environment. The medical literature describes a large number of bacterial interactions, and extracting and organizing them into a knowledge base would be valuable work. Text mining offers a feasible approach. It involves two core tasks, named entity recognition and interaction extraction, and bacterial named entity recognition is the key step on which interaction extraction depends. Bacterial named entities have distinctive characteristics, such as the continuous emergence of new entities, polysemy, and a large number of nested entities, which make their recognition a challenging task. To address this problem, this paper studies bacterial named entity recognition methods based on hybrid deep learning and a language model, and verifies the recognition performance of the models on experimental data sets. The main research work and contributions are as follows.

First, a hybrid deep learning framework for bacterial named entity recognition is proposed. Named entity recognition methods based on machine learning require features to be designed manually and then extracted and selected, and the extracted features generalize poorly. To address these problems, this paper proposes a hybrid deep learning framework (HDL-CRF) that combines convolutional neural networks (CNN), long short-term memory networks (LSTM), and conditional random fields (CRF) for bacterial named entity recognition. It is an end-to-end deep learning model that does not need complex feature extraction, and it achieves good results in the experiments.

Second, a bacterial named entity recognition method based on a language model is proposed. The meaning of a word changes with its context, but deep learning models use a word vector model to convert text into input vectors, so each word has a single fixed vector representation, which introduces training errors. To solve this problem, this paper proposes a bacterial named entity recognition method based on a language model, which can use a large-scale unlabeled corpus to learn word representations in different contexts. This is a dynamic word vector representation that better captures the meaning of a word in each context. The pre-trained BERT language model is used to learn the contextual representation of words, a bidirectional long short-term memory network is then used for feature extraction, and finally a conditional random field is used for label prediction. The experimental results show that the language model represents the semantic information between words better than the deep learning model, and that it also achieves better performance on the bacterial entity recognition task.

The bacterial named entity recognition methods proposed in this paper perform well, can quickly and effectively identify bacterial entities in large-scale medical texts, and lay a good foundation for subsequent bacterial interaction extraction.
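
As an illustration of the final label-prediction step, the sketch below shows how a linear-chain CRF layer can decode a tag sequence with the Viterbi algorithm. It is a minimal sketch under stated assumptions, not the thesis implementation: the BIO tag set (`O`, `B-BAC`, `I-BAC`), the example sentence, and all scores are hypothetical. In the described models the per-token scores would come from the CNN/LSTM or BERT/BiLSTM encoder, and the transition scores would be learned.

```python
from typing import Dict, List, Tuple

# Hypothetical BIO tag set for bacterial entities (not taken from the thesis).
TAGS = ["O", "B-BAC", "I-BAC"]


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
) -> List[str]:
    """Return the highest-scoring tag sequence for a linear-chain CRF.

    emissions[t][tag] is the encoder's score for `tag` at token t;
    transitions[(prev, cur)] is the score of moving from `prev` to `cur`
    (missing pairs default to 0.0).
    """
    if not emissions:
        return []
    # best[t][tag]: best score of any path that ends in `tag` at token t.
    best = [{tag: emissions[0][tag] for tag in TAGS}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in TAGS:
            prev = max(
                TAGS,
                key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0),
            )
            scores[cur] = (
                best[-1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[t][cur]
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(TAGS, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[str]:
    """Group B-/I- tagged tokens into entity strings."""
    spans: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag.startswith("I-") and current:
            current.append(token)
        else:
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


# Made-up scores standing in for encoder output on a four-token sentence.
tokens = ["Escherichia", "coli", "inhibits", "Salmonella"]
emissions = [
    {"O": 0.1, "B-BAC": 2.0, "I-BAC": 0.3},
    {"O": 0.5, "B-BAC": 0.4, "I-BAC": 1.5},
    {"O": 2.0, "B-BAC": 0.1, "I-BAC": 0.2},
    {"O": 0.3, "B-BAC": 1.8, "I-BAC": 1.9},
]
# An entity cannot continue (I-BAC) directly after a non-entity token (O).
transitions = {("O", "I-BAC"): -10.0}

tags = viterbi_decode(emissions, transitions)
entities = bio_to_spans(tokens, tags)
print(tags)
print(entities)
```

In this toy input the last token scores slightly higher for `I-BAC` than for `B-BAC` on its own. The transition penalty on `O` followed by `I-BAC` is what steers the decoder to `B-BAC`, which is the kind of sequence-level constraint a CRF layer adds on top of per-token predictions.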
Keywords/Search Tags:Text Mining, Bacteria Named Entity Recognition, Deep Learning, Microbial Interactions, Language Model