Predicting Drug-Diseases Association Through Multi-Source Heterogeneous Data

Posted on: 2022-05-04
Degree: Master
Type: Thesis
Country: China
Candidate: K Fu
GTID: 2504306533472594
Subject: Electronics and Communications Engineering
Abstract/Summary:
Drug repositioning is the process of identifying new indications for existing drugs and thereby discovering new drug treatments for diseases. Compared with traditional drug development, drug repositioning can effectively shorten the development cycle and reduce development costs. With the open availability of various biological information databases, drug-disease association analysis has become an important guiding method in drug repositioning. However, most existing association prediction algorithms suffer from defects such as inefficient use of multi-source data and weak model learning capability, so the features the models extract are unsatisfactory. In view of this, this thesis performs feature extraction on multi-source heterogeneous data using network embedding learning. The specific content is as follows:

(1) Construction of a drug-disease heterogeneous network based on multi-source heterogeneous data. To improve the utilization of multi-source data, this thesis builds three single-similarity heterogeneous networks from multi-source heterogeneous data and proposes a method for constructing a multi-source fusion heterogeneous network.

(2) A shallow embedding association prediction model based on random walks. To mine the topology information of heterogeneous networks more deeply, this thesis draws on the field of network embedding to design RWNN, a shallow embedding feature extraction model based on random walks, and implements drug-disease association prediction by building a machine learning classifier. RWNN improves on the traditional walking algorithm with a Meta Graph-based walking algorithm and then extracts node embedding features by training a special neural network model.

(3) A deep embedding association prediction model based on graph convolution. To improve the efficiency of embedding feature extraction and strengthen the learning of network node features, this thesis designs HNVAE++, a deep embedding feature extraction model based on graph convolution, and implements drug-disease association prediction by building a machine learning classifier. HNVAE++ uses a graph frequency-domain convolution kernel to explore the network structure and aggregate node features, and extracts node embedding features through a heterogeneous network variational autoencoder. In addition, to optimize the extraction of embedding features, HNVAE++ is trained using the idea of sampling reliable negative associations.

This thesis constructs a multi-source fusion heterogeneous network on each of two datasets and conducts experiments. On the two datasets, the random-walk-based shallow embedding model reaches AUC values of 0.9288 and 0.9297, respectively, and the graph-convolution-based deep embedding model reaches AUC values of 0.9569 and 0.9586, respectively. These results show that both association prediction models achieve good classification performance. This thesis includes 26 figures, 7 tables, and 103 references.
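As a rough illustration of the random-walk idea in item (2), the sketch below performs a type-constrained walk on a toy drug-disease heterogeneous network. It is not the thesis's RWNN or its Meta Graph-based walking algorithm, whose details are not given in this abstract; it substitutes a simpler walk that follows a repeating pattern of node types. The node names, edges, and the function names `build_adjacency` and `type_pattern_walk` are all hypothetical.

```python
import random

# Hypothetical toy network: three drugs (d*) and two diseases (s*).
NODE_TYPE = {
    "d1": "drug", "d2": "drug", "d3": "drug",
    "s1": "disease", "s2": "disease",
}
EDGES = [
    ("d1", "d2"), ("d2", "d3"),                 # drug-drug similarity links
    ("s1", "s2"),                               # disease-disease similarity link
    ("d1", "s1"), ("d2", "s1"), ("d3", "s2"),   # known drug-disease associations
]


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def type_pattern_walk(adj, node_type, start, pattern, walk_length, rng):
    """Random walk whose i-th node must have type pattern[i % len(pattern)].

    At each step a neighbour of the required type is chosen uniformly at
    random; the walk stops early if no such neighbour exists.
    """
    if node_type[start] != pattern[0]:
        raise ValueError("start node type must match the first pattern entry")
    walk = [start]
    for step in range(1, walk_length):
        wanted = pattern[step % len(pattern)]
        candidates = sorted(
            n for n in adj.get(walk[-1], ()) if node_type[n] == wanted
        )
        if not candidates:
            break
        walk.append(rng.choice(candidates))
    return walk


adj = build_adjacency(EDGES)
rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Alternate drug -> disease -> drug -> ... starting from drug d1.
walk = type_pattern_walk(adj, NODE_TYPE, "d1", ["drug", "disease"], 6, rng)
```

In embedding pipelines of this general kind, many such walks are collected from every node and used as training sequences for a model that learns one vector per node; how RWNN does this is left to the full thesis.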
Keywords/Search Tags: Drug-disease association prediction, Multi-source heterogeneous data, Random walk, Graph convolution, Embedded feature extraction