Predicting Disease Genes Based On Normalized Modules And Self-Adaptive Hopping Random Walk

Posted on: 2017-09-01
Degree: Master
Type: Thesis
Country: China
Candidate: J Yuan
GTID: 2334330488485686
Subject: Computer application technology
Abstract/Summary:
With the rapid development of biotechnology, biological data are growing geometrically. Discovering disease genes from known biological networks and multi-source data plays a pivotal role in pathology research, pharmaceutical research, and precision medicine. This thesis predicts disease genes on biological networks using normalized modules and a self-adaptive hopping random walk. The main work and contributions are as follows:

(1) Current module-based methods for predicting disease genes rely mainly on known modules or on the guilt-by-association assumption, but they often ignore both the topological features of disease genes in a specific network and isolated nodes. To address these problems, this thesis presents NMP, a method that predicts disease genes based on normalized modules and phenotype ontology. First, it defines the similarity between a disease and a gene based on a new phenotype ontology of genes and diseases. Then, because disease genes tend to form topological clusters, it normalizes the phenotype of the module in which a candidate gene is located and uses the result as the weight of that candidate gene. Finally, the effectiveness of NMP is validated by leave-one-out cross-validation and by literature evidence. The experimental results show that NMP outperforms the classic NetRank, NetScore, NetZcore, Flow, and RWR methods as well as the newer NDRC method.

(2) The protein interaction network is incomplete and contains many false positives and false negatives, so it is hard to improve the precision of disease gene prediction approaches that depend on a single network. Because research shows that mutations in functionally related genes may result in similar phenotypes, integrating phenotype and protein data can compensate for the defects of the current data and improve accuracy. Although current random walk methods on heterogeneous networks can predict disease genes efficiently, they require the hopping probability to be tuned repeatedly, so they do not generalize well. This thesis proposes LSAR, a method based on Laplacian normalization and Self-Adaptive hopping Random walk on heterogeneous networks, and validates it on 1428 known disease genes by leave-one-out and leave-two-out cross-validation. The results show that LSAR not only reduces the effort of setting the parameter but also outperforms the classic RWRH, CIPHER-SP, and CIPHER-DN methods as well as the newer RWRH-RE, RWHRHN, and LapRWRH methods. Based on the prediction results, we predict disease genes for breast cancer, diabetes mellitus, lung cancer, and obesity.
Keywords/Search Tags: Biological interaction network, Disease gene, Heterogeneous network, Self-adaptive hopping, Random walk
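
As an illustration of the random walk idea that LSAR builds on, the following minimal Python sketch runs a random walk with restart on a single toy network after symmetric (Laplacian) normalization of the adjacency matrix. It is not the LSAR method itself: the heterogeneous gene-phenotype network and the self-adaptive hopping probability are not specified in this abstract and are left out. The function names, the toy graph, and the restart value of 0.3 are assumptions made for illustration only.

```python
import math


def laplacian_normalize(adj):
    """Symmetric normalization: W[i][j] = A[i][j] / sqrt(deg_i * deg_j)."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    norm = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if adj[i][j] and deg[i] > 0 and deg[j] > 0:
                norm[i][j] = adj[i][j] / math.sqrt(deg[i] * deg[j])
    return norm


def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until the change is below tol."""
    n = len(adj)
    w = laplacian_normalize(adj)
    # Initial distribution: uniform over the seed (known disease gene) nodes.
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1.0 - restart) * sum(w[i][j] * p[j] for j in range(n)) + restart * p0[i]
            for i in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(nxt, p))
        p = nxt
        if delta < tol:
            break
    return p


def rank_candidates(scores, seeds):
    """Return non-seed node indices sorted by descending score."""
    seed_set = set(seeds)
    candidates = [i for i in range(len(scores)) if i not in seed_set]
    return sorted(candidates, key=lambda i: scores[i], reverse=True)


# Hypothetical toy network: a path 0 - 1 - 2 - 3 - 4, with node 0 as the seed gene.
adjacency = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
scores = random_walk_with_restart(adjacency, seeds=[0])
ranking = rank_candidates(scores, seeds=[0])
print(ranking)
```

In a leave-one-out setting, each known disease gene would in turn be removed from the seed set, the walk rerun, and the rank of the held-out gene among the candidates recorded.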