
Research On The Application Of Ontologies In Disease Related Problems

Posted on: 2017-04-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Deng
GTID: 1360330542492891
Subject: Computer application technology
Abstract:
Bioinformatics tackles problems in biology using the technologies, theories, models and tools of information science and mathematics. Diseases bear directly on human health and are among the most important problems in life science. In recent years, systems biology approaches, especially those based on biological networks, have become powerful tools for studying diseases. Biological networks, including protein-protein interaction networks, gene regulatory networks and metabolic networks, not only capture the relationships between individual proteins, genes and diseases, but also represent the higher-level organization of cellular communication, including biological modules and pathways.

Ontologies are an important tool in biomedical research. Using controlled vocabularies made up of ontology terms, concepts such as genes and diseases can be described and compared in a formalized way. Biomedical ontologies, such as the Gene Ontology, the Human Phenotype Ontology, the Disease Ontology and the Mammalian Phenotype Ontology, can be used to characterize the similarities between genes and diseases from multiple perspectives. These ontologies are used in a range of studies, including disease gene prioritization and protein interaction prediction. This dissertation focuses on the application of ontologies to disease-related problems. Specifically, the following contributions are made.

1. Semantic similarities and enrichment analysis based on phenotype ontologies. Because no convenient tools existed for calculating semantic similarities based on the Human Phenotype Ontology and the Mammalian Phenotype Ontology, terms and their relationships are first extracted from the text-based ontology data. Tools for semantic similarity calculation and enrichment analysis based on these two ontologies are then built. They provide multiple semantic similarity measures and enrichment analysis methods for analyzing the phenotypic features of genes and diseases. The experimental results show that semantic similarities based on phenotype ontologies can be used to characterize the similarities between genes and diseases at the phenotype level.

2. Protein interaction prediction using heterogeneous features. The predictive abilities of biological and topological features are tested on different kinds of protein interaction data sets. Experiments on binary and co-complex protein interaction data sets show large differences in predictive ability among the features, which include semantic similarities based on the three sub-ontologies of the Gene Ontology, co-pathway similarity based on KEGG, and the topological structure of the protein interaction network. Combining heterogeneous features achieves better prediction performance than any single feature.

3. Integrating phenotypic features and tissue-specific information to prioritize disease genes. The proposed approach is based on a heterogeneous network model consisting of a protein interaction network and a disease network. Both networks are built by considering two factors: phenotypic features and tissue-specific data. The effectiveness of these two factors is then evaluated. The case studies reveal that integrating phenotypic features with a tissue-specific PPI network improves the prioritization results. The proposed method for building the disease network is also effective for diseases that are not in the commonly used 5080 disease similarity data set.

4. Integrating cross-species data to prioritize disease genes. Human and mouse phenotype ontologies are integrated so that terms can be mapped between the different ontologies. Using the associations between human diseases and mouse disease models, mouse experimental data are used in the prioritization of human disease genes. The experimental results show that integrating mouse phenotype data with human phenotype data improves the prioritization results. Based on the homology associations between human and mouse, tissue-specific mouse protein interaction data are integrated with the corresponding human data. The case studies reveal that integrating tissue-specific mouse protein interactions with human protein interactions improves the prioritization results.
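
The ontology-based semantic similarity mentioned in the first contribution can be illustrated with a minimal sketch. The abstract does not name the specific measures the tools provide, so the sketch assumes one widely used measure, Resnik similarity, in which the similarity of two terms is the information content (IC) of their most informative common ancestor. The term names, the `PARENTS` hierarchy and the `ANNOTATIONS` table are hypothetical toy data, not taken from the Human Phenotype Ontology or from the dissertation.

```python
import math

# Hypothetical toy ontology: each term maps to its parent terms (is_a edges).
PARENTS = {
    "root": [],
    "abnormal_heart": ["root"],
    "abnormal_brain": ["root"],
    "arrhythmia": ["abnormal_heart"],
    "cardiomyopathy": ["abnormal_heart"],
    "seizure": ["abnormal_brain"],
}

# Hypothetical gene-to-term annotations used to estimate term frequencies.
ANNOTATIONS = {
    "geneA": {"arrhythmia"},
    "geneB": {"cardiomyopathy"},
    "geneC": {"seizure"},
    "geneD": {"arrhythmia", "seizure"},
}


def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def information_content():
    """IC(t) = log(N / n_t), where n_t counts the genes annotated to t
    directly or through any descendant of t, and N is the number of genes."""
    counts = {term: 0 for term in PARENTS}
    for terms in ANNOTATIONS.values():
        covered = set()
        for term in terms:
            covered |= ancestors(term)
        for term in covered:
            counts[term] += 1
    total = len(ANNOTATIONS)
    return {t: math.log(total / c) for t, c in counts.items() if c > 0}


def resnik(term1, term2, ic):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(term1) & ancestors(term2)
    return max(ic.get(term, 0.0) for term in common)


def gene_similarity(gene1, gene2, ic):
    """Gene-level score taken as the maximum over all term pairs
    (one of several possible aggregation choices)."""
    return max(
        resnik(t1, t2, ic)
        for t1 in ANNOTATIONS[gene1]
        for t2 in ANNOTATIONS[gene2]
    )


if __name__ == "__main__":
    ic = information_content()
    print(gene_similarity("geneA", "geneB", ic))
    print(gene_similarity("geneA", "geneC", ic))
```

In this toy data, two heart-related terms share a more specific ancestor than a heart term and a brain term do, so the first pair scores higher. Real phenotype ontologies are much larger directed acyclic graphs, and other measures or aggregation schemes (for example, a best-match average over term pairs) may be preferable in practice.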
Keywords: biomedical ontologies, complex diseases, semantic similarity, disease genes, biological networks