Research On Text Mining Method For New Medicine Discovery Of Chinese Medicine

Posted on: 2022-01-16
Degree: Master
Type: Thesis
Country: China
Candidate: J H He
GTID: 2504306524480894
Subject: Software engineering
Abstract/Summary:
In recent years, as the modernization of Chinese medicine has progressed, the research literature on Chinese medicine has grown exponentially. The development of new Chinese medicines faces several problems, such as a low profit-to-risk ratio, long development cycles, and weak underlying basic research. To address these problems, this thesis undertakes a multidisciplinary collaboration between Chinese medicine and computer science. Deep learning is applied to text mining in the medical and pharmaceutical domain, and drugs whose pharmacology is similar to that of a target drug are recommended according to the similarity of their pharmacological texts, thereby supporting virtual screening in the development of new Chinese medicines. Taking text as the research object, the thesis covers four pieces of work: pharmacological named entity recognition, drug-drug interaction extraction, new drug discovery based on text similarity learning, and the construction of a new drug discovery platform based on text mining. The specific work is as follows:

1. Data on the pharmacological effects of Chinese medicine are incomplete and non-standardized, pharmacological research results appear mostly in the literature, and no text mining research has been carried out in the field of Chinese medicine pharmacology. Research on pharmacological named entity recognition is therefore carried out. Because current named entity recognition models do not exploit the stroke semantics of Chinese characters, a stroke-based pharmacological named entity recognition model is built, together with a pharmacological named entity recognition corpus. The proposed model is evaluated on this corpus and on the SIGHAN 2006 public corpus, reaching F1-scores of 69.86 and 90.84, respectively.

2. Current drug-drug interaction extraction models do not take medical and pharmacological domain knowledge into account, so a feature-enriched drug-drug interaction extraction model is proposed. To make use of domain knowledge, word vectors are pre-trained on medical texts. Based on the characteristics of the samples in the corpus, the model learns the relative distance of each word to the drug pair in the sample. The model stacks a CNN and an RNN: the CNN extracts the n-gram features of each word, and the RNN computes the context features of the whole sentence. Evaluated on the DDIExtraction 2013 corpus, the model achieves a final F1 value of 73.9.

3. New Chinese medicines can be discovered on the basis of pharmacological similarity. At present, new drug discovery relies on pharmacological similarity experiments, which are time-consuming, so this thesis proposes using pharmacological text similarity to represent pharmacological similarity. A corpus for pharmacological text similarity learning is constructed with a clustering algorithm, and a text similarity learning model based on a Siamese network with an attention mechanism is then built. The proposed model is evaluated on the ATEC public corpus and on the constructed pharmacological similarity learning corpus, reaching F1 values of 54.4 and 45.8, respectively.

4. A new drug discovery platform based on text mining is designed and built with a browser/server architecture. The platform is developed with Java, HTML, JavaScript, MySQL, and the Spring Boot framework. It comprises three functional modules: user management, literature mining, and new drug discovery. The platform implements pharmacological named entity extraction, drug-drug interaction extraction, and new drug recommendation on literature uploaded by users. Based on text mining and pharmacological text similarity learning, the platform provides decision support for the research and development of new Chinese medicines.
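As an illustration of the relative-distance feature mentioned in item 2, the following minimal Python sketch computes, for every token in a sentence, its signed offset to each of the two drug mentions. It is not the thesis's implementation: the function name `relative_distances`, the placeholder tokens `DRUG1` and `DRUG2`, and the example sentence are all assumptions made for illustration. In a full model, such offsets would typically be mapped to learned position embeddings and concatenated with the word vectors.

```python
def relative_distances(tokens, drug1_idx, drug2_idx):
    """Return (token, offset to drug 1, offset to drug 2) for each token.

    Offsets are signed: negative means the token precedes the drug mention.
    Hypothetical helper, not taken from the thesis.
    """
    return [(tok, i - drug1_idx, i - drug2_idx) for i, tok in enumerate(tokens)]


# Made-up example sentence with the two drug mentions replaced by placeholders.
tokens = "DRUG1 may increase the effect of DRUG2".split()
features = relative_distances(
    tokens, tokens.index("DRUG1"), tokens.index("DRUG2")
)

for token, d1, d2 in features:
    print(f"{token:10s} d1={d1:+d} d2={d2:+d}")
```

Replacing the drug names with placeholders before computing offsets is a common preprocessing choice for this task, assumed here because the abstract does not specify one.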
Keywords/Search Tags: named entity recognition, drug-drug interaction extraction, text similarity calculation, discovery of new Chinese medicine