| Since the beginning of the Human Genome Project, novel omics technologies have been constantly emerging, and the life sciences have entered the multi-omics era. Compared with methods that use only a single data type, data integration approaches enable a more comprehensive and informative analysis of biomedical data. Integrating multiple data types can compensate for missing or unreliable information in any single data type, and multiple sources of evidence pointing to the same result are less likely to lead to false positives. New algorithms for integrating multi-omics or multi-dimensional biomedical data have therefore become indispensable key technologies for multi-omics research. Furthermore, the accumulation of multi-dimensional drug informatics data may offer new opportunities for drug research and development, especially drug repositioning. However, integrating multi-dimensional drug data for precision repositioning remains a pressing challenge. In this research, we first proposed a multi-omics data integration algorithm based on random walk with restart on a heterogeneous network. We used the algorithm to integrate similarity networks established from multiple omics data types. The algorithm consists of two main steps: (1) construction of a similarity network for each data type, and construction of a heterogeneous network in which corresponding samples of the multiple similarity networks are connected; (2) random walk with restart on the heterogeneous network. After a number of iterations, the walk converges to a stationary probability distribution, which is then used to integrate the similarity networks. We applied our method to TCGA data: three types of omics data were integrated, and network clustering (subtyping) was conducted. Experimental results show that our method performs better than previous methods. The subtyping results also provide an analytical basis for clinical applications. Secondly, we proposed a systematic and extensible framework for drug repositioning, namely Prediction of drug therapeutic property by
Integrating Multi-dimensional Data (PIMD). We introduced a multi-dimensional data integration algorithm to integrate multi-dimensional drug property data. To assess the analytical power of PIMD, we constructed an integrated drug similarity network (iDSN) from drug structure, side effect profile, and target protein sequence data, representing chemical, clinical, and pharmacological properties, respectively. Analysis of the contribution of each drug property indicated that PIMD made full use of both common and complementary information about drugs. According to a quantitative index, the iDSN performed better than published drug similarity networks constructed from a single drug property. Using spectral clustering, we distinguished 32 communities in the iDSN, providing clues for drug repositioning from two aspects: drug pairs with high iDSN similarity scores and unexpected drugs in each community. Additionally, we provided 5 kinds of drug and target enrichment analysis paradigms to label and annotate drug communities from multiple views for repositioning analysis. Within the top 20 recommended drug pairs, 7 drugs have been reported to be repurposed. Notably, PIMD is an open and extensible drug repositioning framework. The high extensibility and modularity of PIMD allow researchers to explore drugs from a wider range of fields. Finally, to achieve precision drug repositioning, we performed prediction of drug-target-disease relationships. Considering multi-dimensional drug data, multi-dimensional target data, and multi-dimensional disease data, we used a nonlinear integration algorithm to integrate the multi-dimensional data and constructed three integrated networks for drugs, targets, and diseases, respectively. The three networks are connected by known drug-target-disease relationships to form a drug-target-disease heterogeneous network. The random walk with restart algorithm is applied to the heterogeneous network to predict new drug-target, drug-disease, and target-disease associations. The
performance of our method is better than that of previous methods. In summary, this research has the following two innovations. First, the multi-omics data integration algorithm applies the theory of random walk to data integration. It is a network-based data integration strategy that makes full use of the information in each data type and the topological information of the entire network. Second, the two frameworks proposed here facilitate the integration of new drug information and knowledge as required, help achieve more accurate drug repositioning, and help reveal the modes of action of new drugs. |
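
The two-step integration algorithm summarized above (similarity networks joined into a heterogeneous network, followed by random walk with restart) can be illustrated with a minimal pure-Python sketch. It is not the implementation used in this work. All names (`build_transition`, `rwr`, `integrate`) and the parameters `jump` and `restart` are hypothetical, and several details that the abstract does not specify are assumptions made for illustration: how between-layer moves are weighted, how the walk is seeded, and how the stationary distribution is turned into an integrated similarity.

```python
def row_normalize(matrix):
    """Scale each row to sum to 1 (all-zero rows are left as zeros)."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([v / total if total > 0 else 0.0 for v in row])
    return out


def build_transition(similarities, jump=0.3):
    """Row-stochastic transition matrix over K * n nodes (layer a, sample i).

    Assumption: with probability `jump` the walker moves to the same sample
    in another layer; otherwise it moves within its layer in proportion to
    similarity. Every sample is assumed to have a positive off-diagonal
    similarity in every layer, so each row sums to 1.
    """
    k, n = len(similarities), len(similarities[0])
    stay = 1.0 - jump if k > 1 else 1.0
    trans = [[0.0] * (k * n) for _ in range(k * n)]
    for a, sim in enumerate(similarities):
        # Self-similarity is ignored for within-layer moves.
        within = row_normalize(
            [[0.0 if i == j else sim[i][j] for j in range(n)] for i in range(n)]
        )
        for i in range(n):
            for j in range(n):
                trans[a * n + i][a * n + j] = stay * within[i][j]
            for b in range(k):
                if b != a:
                    trans[a * n + i][b * n + i] = jump / (k - 1)
    return trans


def rwr(trans, seed, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * p * trans + restart * seed to convergence."""
    size = len(trans)
    p = list(seed)
    for _ in range(max_iter):
        nxt = [restart * s for s in seed]
        for u in range(size):
            if p[u] == 0.0:
                continue
            for v in range(size):
                nxt[v] += (1.0 - restart) * p[u] * trans[u][v]
        if sum(abs(x - y) for x, y in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


def integrate(similarities, jump=0.3, restart=0.7):
    """Integrated n x n similarity from per-sample stationary distributions.

    Assumption: the walk for sample i is seeded uniformly on its copies in
    all layers, and probabilities are summed over layers, then symmetrized.
    """
    k, n = len(similarities), len(similarities[0])
    trans = build_transition(similarities, jump)
    scores = []
    for i in range(n):
        seed = [0.0] * (k * n)
        for a in range(k):
            seed[a * n + i] = 1.0 / k
        p = rwr(trans, seed, restart)
        scores.append([sum(p[a * n + j] for a in range(k)) for j in range(n)])
    return [[(scores[i][j] + scores[j][i]) / 2.0 for j in range(n)] for i in range(n)]


# Toy input: two data types, four samples; samples 0-1 and 2-3 form two groups.
sim_a = [[1.0, 0.9, 0.1, 0.1], [0.9, 1.0, 0.1, 0.1],
         [0.1, 0.1, 1.0, 0.8], [0.1, 0.1, 0.8, 1.0]]
sim_b = [[1.0, 0.7, 0.2, 0.1], [0.7, 1.0, 0.1, 0.2],
         [0.2, 0.1, 1.0, 0.9], [0.1, 0.2, 0.9, 1.0]]
fused = integrate([sim_a, sim_b])
```

On this toy input, `fused` would be expected to score the pairs (0, 1) and (2, 3) higher than any cross-group pair, and a clustering step such as the spectral clustering mentioned above could then be run on it. The dense list-of-lists representation is chosen only to keep the sketch dependency-free; a sparse matrix library would be the natural choice at realistic scale.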