Reconstructing Rare Disease Classification by Integrating Systems-Level Molecular Data and Phenotypic Data

Posted on: 2019-05-11
Degree: Master
Type: Thesis
Country: China
Candidate: X Pan
GTID: 2394330566460747
Subject: Life medicine engineering
Abstract/Summary:
With the development of biomedical data-capturing technology, the omics sciences are producing more and more molecular and medical data. The number of different biological entities (e.g. genes, phenotypes, diseases) for which data can be collected is increasing significantly. Orphanet is a well-established classification of rare diseases. It relates and classifies rare diseases based on observed clinical symptoms and signs. However, the growing body of heterogeneous genomic and proteomic data has not yet fully contributed to this classification. Previous studies have indicated that genetic knowledge of a disease can determine its nosology and can be the most important factor in predicting associations between diseases. Meanwhile, the study of disease relationships has shifted from the simple sharing of single entities, such as genes, to the fusion of systems-level molecular data. Motivated by these works, we introduce a computational framework that integrates various biological networks using a graph-regularized nonnegative matrix tri-factorization (GNMTF) model. This model takes all network data in matrix form and performs simultaneous clustering of genes, phenotypes and rare diseases, inferring new relations between rare diseases. Remarkably, by fusing the gene interactome, the rare disease-phenotype network and the rare disease-gene network, 91% of the rare disease relations classified with our method agree with those in Orphanet. We then detect rare disease communities based on the newly inferred topology of the disease network and use link clustering to build a dendrogram. We find that rare diseases in the captured communities exhibit significant molecular relations. Furthermore, we examine the contribution of each included data source to the inferred model, further emphasizing the importance of the shift towards systems-level molecular data integration. Finally, we use matrix completion to predict gene-disease associations and find that this approach outperforms previous methods in candidate gene prioritization.
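To make the factorization step concrete, the following is a minimal sketch of a graph-regularized nonnegative matrix tri-factorization, not the thesis's actual implementation. It assumes a single disease-by-gene relation matrix `R` and a gene-gene adjacency matrix `A_gene`, and minimizes `||R - Gd S Gg^T||_F^2 + lam * tr(Gg^T L Gg)` with `L = D - A_gene` using standard multiplicative updates. The function name `gnmtf`, its parameters, the toy data, and the choice to keep `S` nonnegative without orthogonality constraints are all illustrative assumptions; the thesis fuses several networks at once and may use a different variant.

```python
import numpy as np

def gnmtf(R, A_gene, k_dis, k_gene, lam=0.1, n_iter=200, eps=1e-9, seed=0):
    """Sketch: R ~ Gd @ S @ Gg.T with Gd, S, Gg >= 0 and a graph penalty on Gg.

    R      : (n_diseases, n_genes) nonnegative relation matrix (assumed input)
    A_gene : (n_genes, n_genes) symmetric nonnegative adjacency (assumed input)
    """
    rng = np.random.default_rng(seed)
    n, m = R.shape
    Gd = rng.random((n, k_dis))        # disease cluster indicators
    S = rng.random((k_dis, k_gene))    # compressed disease-gene relations
    Gg = rng.random((m, k_gene))       # gene cluster indicators
    D = np.diag(A_gene.sum(axis=1))    # degree matrix of the gene network
    L = D - A_gene                     # graph Laplacian

    def loss():
        fit = np.linalg.norm(R - Gd @ S @ Gg.T) ** 2
        return fit + lam * np.trace(Gg.T @ L @ Gg)

    losses = [loss()]                  # loss at random initialization
    for _ in range(n_iter):
        # Multiplicative updates keep every factor nonnegative.
        Gd *= (R @ Gg @ S.T) / (Gd @ S @ Gg.T @ Gg @ S.T + eps)
        Gg *= (R.T @ Gd @ S + lam * A_gene @ Gg) / (
            Gg @ S.T @ Gd.T @ Gd @ S + lam * D @ Gg + eps)
        S *= (Gd.T @ R @ Gg) / (Gd.T @ Gd @ S @ Gg.T @ Gg + eps)
        losses.append(loss())
    return Gd, S, Gg, losses

# Toy data (fabricated for illustration only): two disease groups,
# each linked to its own block of interacting genes.
R = np.zeros((6, 8))
R[:3, :4] = 1.0
R[3:, 4:] = 1.0
A_gene = np.zeros((8, 8))
A_gene[:4, :4] = 1.0
A_gene[4:, 4:] = 1.0
np.fill_diagonal(A_gene, 0.0)

Gd, S, Gg, losses = gnmtf(R, A_gene, k_dis=2, k_gene=2)
disease_clusters = Gd.argmax(axis=1)   # hard cluster label per disease
R_hat = Gd @ S @ Gg.T                  # completed matrix; scores for unobserved pairs
```

In this sketch, the row-wise `argmax` of `Gd` gives a hard cluster assignment per disease, and the entries of `R_hat` can be read as association scores, which is the general idea behind using the reconstructed matrix for candidate gene prioritization.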
Keywords/Search Tags:rare disease classification, heterogeneous network fusion, genotype, phenotype, matrix factorization, candidate gene prioritization