Research On Human Disease Genes Identification Algorithm Based On Integrating Multiple Networks

Posted on: 2016-10-04  Degree: Master  Type: Thesis
Country: China  Candidate: Q Chen
GTID: 2404330473464834  Subject: Software engineering
Abstract/Summary:
With the successful completion of the Human Genome Project, identifying disease-associated genes has become the basis for understanding pathogenesis, clinical diagnosis, and prevention, and it carries significant social and scientific value. Predicting the causative genes of genetic diseases has therefore gradually become one of the hotspots of biomedical research. With the development of bioinformatics, various types of biological network data now offer substantial help in predicting disease-causing genes. Systems biology research shows that the same disease, or similar diseases, are caused by functionally related genes. Building on this finding, many researchers have integrated protein interaction networks, disease-phenotype similarity networks, disease-gene association data, and other sources, so that the biological information underlying the different networks can be used effectively to predict human disease-causing genes. Based on an analysis of previous studies, this thesis brings human protein complex information into disease gene prediction and proposes the ENPCANG (ENhancing the Prioritization of CANdidate diseases Genes via integrating multiple network) algorithm, which combines the random walk with restart algorithm and the random walk on heterogeneous networks algorithm to mine information related to causative genes at the gene level and the module level, respectively. Compared with several recently proposed human disease gene prediction algorithms, ENPCANG improves prediction precision. ENPCANG is also applied to predict causative genes of Alzheimer's disease and breast cancer, and the experimental results further demonstrate the effectiveness of the method.
In addition, previous research shows that gene functional annotation data is among the most effective data for predicting disease-causing genes. Using KEGG pathway annotations, this thesis proposes a new method, Ke_Lap_RWRH (combining Kegg pathway and Laplacian network normalization based Random Walk with Restart on Heterogeneous network), which combines KEGG pathways with Laplacian normalization, a normalization that fully exploits the topological characteristics of the network, to predict complex disease genes. Laplacian normalization strengthens the weight of seed nodes in the network, which enhances the modular nature of both the PPI network and the disease similarity network. Experimental results show that Ke_Lap_RWRH achieves strong performance, and a case study on type II diabetes supports, to some extent, the feasibility of the algorithm for predicting disease genes.
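Both algorithms described above build on random walk with restart, so a minimal sketch of that building block may help. It is not the thesis's ENPCANG or Ke_Lap_RWRH implementation: it runs on a single network only, and the function names, the restart probability of 0.7, and the reading of "Laplacian normalization" as the symmetric form D^{-1/2} A D^{-1/2} are assumptions made for illustration.

```python
import numpy as np


def column_normalize(adj):
    """Column-stochastic normalization: each non-empty column sums to 1."""
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # leave isolated nodes as all-zero columns
    return adj / col_sums


def laplacian_normalize(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2} (assumed meaning of
    'Laplacian normalization'; the abstract does not give the formula)."""
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(deg[nonzero])
    return adj * inv_sqrt[:, None] * inv_sqrt[None, :]


def random_walk_with_restart(adj, seeds, restart=0.7,
                             normalize=column_normalize,
                             tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until convergence.

    adj     : square adjacency matrix of one network (e.g. a PPI network)
    seeds   : indices of known disease genes (the restart set)
    restart : restart probability; 0.7 is an illustrative value
    """
    w = normalize(np.asarray(adj, dtype=float))
    p0 = np.zeros(w.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (w @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p
```

Candidate genes would then be ranked by their steady-state score, for example with `np.argsort(-scores)`. Extending this to a heterogeneous network, as the thesis does, would mean stacking the gene network, the disease similarity network, and the gene-disease associations into one block transition matrix, which is not shown here.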
Keywords/Search Tags: disease gene, disease-phenotype similarity, protein-protein interaction, heterogeneous network, random walk with restart