
Prediction Method Of Pathogenic Genes On Heterogeneous Biological Network

Posted on: 2018-12-02, Degree: Master, Type: Thesis
Country: China, Candidate: N X Ding, Full Text: PDF
GTID: 2370330515955898, Subject: Computer technology
Abstract/Summary:
Predicting pathogenic genes on heterogeneous biological networks is an important research topic in bioinformatics. The prediction of genes related to genetic diseases is of great significance for the discovery of drug target genes, the improvement of medical care, prolonging patients' lives, and the implementation of biological experiments. With the development of computer technology and artificial intelligence, bioinformatics research has entered a new stage: many types of human biological data have been discovered and published, and they play an increasingly important role in the prediction of pathogenic genes. There are three main strategies for predicting gene-disease associations: the positional candidate cloning strategy, the positional strategy, and the non-positional strategy. With the completion of the Human Genome Project and the growth of related biological data, the positional candidate strategy has gradually become the main method for finding causative genes. With the advent of large-scale gene-related and disease-related biological data, link prediction algorithms can be used to predict pathogenic genes. In this dissertation, we focus on predicting disease-related genes on heterogeneous biological networks with data mining approaches. Specifically, we make the following two contributions:

(1) Using data on homologous genes from non-human species, this dissertation constructs a heterogeneous network and proposes a probability-based collaborative filtering model for predicting gene-disease associations. We assume that, within the same feature space, the smaller the Euclidean distance between two nodes, the more similar they are. Based on this hypothesis, the relationship between diseases and genes is recast as a binary classification problem by means of a probability distribution. To improve prediction accuracy, we also incorporate additional prior information and add two different constraints, yielding two improved models. To check the effectiveness of the proposed model, we run a number of experiments on real biological data and compare the results with existing prediction algorithms to analyze the model's performance.

(2) To deal with the lack of known negative samples on the heterogeneous biological network, we propose a matrix decomposition model based on positive-unlabeled (PU) learning. With this model, we treat prediction on the heterogeneous biological network as a recommendation problem of the kind found in recommender systems, and we incorporate PU learning into the inductive matrix completion model. The proposed model addresses the lack of negative samples on the biological network and can be used to predict novel pathogenic genes that do not appear in the training set.
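The abstract does not state the objective function of the PU learning-based model, so the following is only a minimal sketch of one common way to combine PU learning with inductive matrix completion, not the thesis's actual formulation. It assumes a weighted squared loss in which known associations get weight 1 and unlabeled gene-disease pairs get a smaller weight `alpha`, and it assumes that gene and disease feature matrices `X` and `Y` are available. All names (`pu_imc`, `score`, `alpha`, `lam`, `rank`) are hypothetical.

```python
import numpy as np

def pu_imc(P, X, Y, rank=2, alpha=0.1, lam=0.01, lr=1e-3, n_iter=500, seed=0):
    """Sketch of PU-weighted inductive matrix completion (assumed objective).

    P: (n_genes, n_diseases) matrix, 1 = known association, 0 = unlabeled.
    X: (n_genes, d_g) gene features; Y: (n_diseases, d_d) disease features.
    Minimizes 0.5 * sum(C * (X W H^T Y^T - P)^2) + 0.5 * lam * (|W|^2 + |H|^2)
    by plain gradient descent, with C = 1 on positives and alpha elsewhere.
    """
    rng = np.random.default_rng(seed)
    W = 0.1 * rng.standard_normal((X.shape[1], rank))
    H = 0.1 * rng.standard_normal((Y.shape[1], rank))
    C = np.where(P == 1, 1.0, alpha)  # unlabeled pairs are down-weighted, not trusted as negatives
    losses = []
    for _ in range(n_iter):
        S = X @ W @ H.T @ Y.T          # predicted association scores
        R = C * (S - P)                # weighted residual
        loss = 0.5 * np.sum(C * (S - P) ** 2) + 0.5 * lam * (np.sum(W ** 2) + np.sum(H ** 2))
        losses.append(loss)
        grad_W = X.T @ R @ Y @ H + lam * W
        grad_H = Y.T @ R.T @ X @ W + lam * H
        W -= lr * grad_W
        H -= lr * grad_H
    return W, H, np.array(losses)

def score(X_new, Y, W, H):
    """Score genes from their features alone, including genes unseen in training."""
    return X_new @ W @ H.T @ Y.T
```

Because the scores depend only on feature vectors, `score` can rank diseases for a gene that had no row in `P` during training, which is the inductive property the abstract refers to. The step size and weights here are illustrative and would need tuning on real data.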
Keywords/Search Tags: Biological Heterogeneous Network, Prediction of Pathogenic Genes, Collaborative Filtering