Research On Extraction Method Of Biomedical Entity Relationship Based On Deep Learning

Posted on: 2020-09-30   Degree: Master   Type: Thesis
Country: China   Candidate: C Y Wang
GTID: 2404330575981213   Subject: Computer technology
Abstract/Summary:
In recent years the Internet has developed rapidly, and the number of published documents, especially in biomedicine, has grown dramatically, at an almost exponential rate. This massive biomedical literature contains a large number of medical entities, and the relationships among those entities carry a great deal of knowledge that researchers want to mine. Obtaining this information by manual reading is clearly not feasible, because it costs too much labor and time and is inefficient. With the growing popularity of artificial intelligence techniques such as machine learning, deep learning, and natural language processing, many researchers have put these techniques into practice, for example in text mining, which aims to uncover hidden knowledge in the massive literature efficiently and to support more in-depth research. In 2015, Obama proposed a plan for precision medicine. As precision medicine plans have spread, entity relationship extraction from the biomedical literature has received extensive attention, laying a good foundation for better serving precision medicine.
The biomedical literature contains a variety of medical entities, such as genes, drugs, and compounds, and a variety of relationships among them (gene-protein, drug-drug, protein-compound, etc.). Entity relationship extraction from biomedical literature is an important topic in natural language processing (NLP). Current approaches in the field fall into four groups: (1) co-occurrence, (2) rules, (3) machine learning, and (4) deep learning. Deep learning techniques are of high value in improving the accuracy of relation extraction results. Existing work on extracting these entity relationships can be roughly divided into two categories: (1) extracting non-specific relationships from a full-text corpus, and (2) extracting relationships of a specific medical domain from a short-text corpus, for example protein-protein interaction or drug-drug interaction. The first case can be handled with a co-occurrence method, and the second with rule-based and machine learning methods. In summary, however, few researchers extract non-specific domain relationships from short texts, although this task is of great significance for future research in the field. Traditional methods capture shallow semantic information well but learn deep semantic information poorly, a weakness that deep learning based methods can address.
This thesis proposes MAT-LSTM, a deep learning model built on LSTM, to extract non-specific entity relationships from short texts in the biomedical literature. The main work is as follows. First, word embeddings are obtained with the existing word2vec tool from the experimental corpus and a PubMed text corpus, and position embeddings are extracted at the same time. The two types of features are combined as the model input and passed through a bidirectional LSTM layer. The output is first processed by a "word granularity" attention layer, then by a "sentence granularity" attention layer, and finally mapped to n classes by a softmax function to predict the corresponding category.
The experiments are divided into verification experiments and application experiments. The verification experiments use three data sets, two BioCreative benchmark data sets and one BioNLP benchmark data set, to verify the validity of the proposed MAT-LSTM model. Other researchers have published results on these tasks, and in comparison with them the proposed model achieves good results. In the application experiments, the MAT-LSTM model is applied to extract the non-specific relationships contained in the PubMed literature. Most of the results extracted from PubMed with the proposed model were verified by experts, indicating the practical value of the MAT-LSTM model.
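The pipeline described in the abstract (hidden states, word-level attention, sentence-level attention, softmax) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the dot-product scoring, the toy vectors, and the assumption that the sentence-level layer pools several sentences into one vector are all hypothetical, and the hidden vectors below merely stand in for bidirectional LSTM outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_pool(vectors, context):
    """Score each vector against a context vector (assumed dot-product
    scoring), normalize the scores with softmax, and return the weights
    together with the weighted sum of the vectors."""
    weights = softmax([dot(v, context) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return weights, pooled


def classify(sentences, word_context, sent_context, class_weights):
    """sentences: list of sentences, each a list of per-word hidden vectors.
    Word-level attention pools each sentence, sentence-level attention pools
    the sentence vectors, and softmax maps the result to n class probabilities."""
    sent_vectors = [attention_pool(words, word_context)[1] for words in sentences]
    _, pooled = attention_pool(sent_vectors, sent_context)
    return softmax([dot(w, pooled) for w in class_weights])


# Toy stand-ins for BiLSTM outputs: two sentences with 2-dimensional states.
hidden = [
    [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    [[0.2, 0.8], [0.9, 0.1]],
]
word_context = [1.0, -1.0]    # hypothetical parameter (learned in a real model)
sent_context = [0.5, 0.5]     # hypothetical parameter (learned in a real model)
class_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # n = 3 classes

probs = classify(hidden, word_context, sent_context, class_weights)
predicted = max(range(len(probs)), key=probs.__getitem__)
```

In a trained model the context vectors and class weights would be learned jointly with the LSTM; here they are fixed only to show how the two attention layers reduce word states to one vector before the n-class softmax.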
Keywords/Search Tags: deep learning, natural language processing, biomedical entity relationship extraction