Biomedical named entity recognition and relation extraction are key tasks in biomedical information extraction, providing critical information for biomedical knowledge graphs, disease treatment, and drug development. In recent years, methods based on deep neural networks have become the mainstream approach to biomedical information extraction. Compared with traditional statistical machine learning methods, deep learning methods extract features automatically instead of relying on manually designed features. This thesis focuses on two tasks, biomedical named entity recognition and drug-drug interaction extraction, and addresses both with deep neural networks.

Biomedical named entity recognition faces problems such as highly sparse entities, fuzzy entity boundaries, and special characters within entities. For this task, we propose a neural network model based on CNN-BLSTM-CRF. The model extracts character-level features of each word with a Convolutional Neural Network (CNN). These character features are combined with word embeddings and fed into a bidirectional Long Short-Term Memory network (BLSTM) to learn context information. Finally, a Conditional Random Field (CRF) layer produces the globally optimal tag sequence. Experimental results show that the model achieves F1-scores of 89.09% and 74.40% on the BioCreative II GM corpus and the JNLPBA corpus, respectively.

Drug-drug interaction extraction faces difficulties such as similar contexts across instances and the absence of entity information. For this task, we propose a drug-drug interaction extraction model based on an attention mechanism that incorporates knowledge information. First, the model takes abstracts of the entities from Wikipedia and DrugBank as background knowledge and encodes them into knowledge vectors with the Doc2Vec model. Second, the model employs a bidirectional GRU (BGRU) to encode the input sequence and learn semantic information. Third, the knowledge-incorporating attention mechanism produces a new context representation that fuses this knowledge. Finally, the model uses the new context representation to predict the drug-drug interaction. Experimental results show that the model achieves an F1-score of 71.86% on the DDIExtraction 2013 corpus.

In conclusion, this thesis proposes a CNN-BLSTM-CRF model for biomedical named entity recognition; the experimental results on the BioCreative II GM and JNLPBA corpora show that the model is effective and generally applicable. For drug-drug interaction extraction, this thesis proposes a model based on an attention mechanism that incorporates knowledge information; the experimental results on the DDIExtraction 2013 corpus show that this attention mechanism is effective and generally applicable.
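
To illustrate how a CRF layer can produce a globally optimal tag sequence, the following is a minimal sketch of Viterbi decoding, the standard dynamic-programming procedure for this step. It is not the implementation used in this thesis. It assumes that an upstream encoder (for example, a BLSTM) has already produced a per-token score for every tag and that a tag-transition score matrix has been learned; the function name `viterbi_decode` and the parameters `emissions` and `transitions` are hypothetical names chosen for this example.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring tag sequence for one sentence.

    emissions[t][j]   -- assumed score of tag j at token position t
    transitions[i][j] -- assumed score of moving from tag i to tag j
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    # best[j] holds the score of the best path that ends in tag j
    # at the current position.
    best = list(emissions[0])
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_best = []
        pointers = []
        for j in range(num_tags):
            # Choose the previous tag that maximizes the path score into tag j.
            prev = max(range(num_tags), key=lambda i: best[i] + transitions[i][j])
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)
    # Start from the best final tag and follow the backpointers to the start.
    last = max(range(num_tags), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

Unlike choosing the best tag independently at each token, this decoding takes transition scores into account, so a strongly penalized transition (such as an inside tag directly following an outside tag) can change the tag chosen for an earlier token.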