Study of the Duplication, Divergence and Pleiotropy of Genes by Analysis of Gene Family Sequences

Posted on: 2011-04-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y W Ceng
Full Text: PDF
GTID: 1100360305997458
Subject: Biochemistry and Molecular Biology
Abstract/Summary:
A gene family is a set of genes that share a common ancestor and sequence similarity. Sequence analysis of gene families has permeated every area of biology and become a routine tool. It is widely used to investigate species origin and differentiation, to study evolutionary mechanisms, and to detect natural selection. In this dissertation, we tried to extract the information contained in gene families from three aspects: duplication time, functional divergence and gene pleiotropy. Although we used only sequences from vertebrates, flies and yeasts, the methods can be extended to similar research on other species. Our major research content and conclusions are as follows:

1. In the human genome, the age distribution of gene duplications has been found to show two waves of duplication and an ancient component. Because of functional constraints, gene duplications in different functional categories will probably show different age distributions. However, functional categories are correlated and affect each other, which results in similar gene retention patterns. Earlier findings indicate that some functional categories are retained preferentially at different stages of evolution, but the correlation between categories has not been investigated. We therefore developed a pipeline to estimate the age of duplication events and used it to date most duplication events in human and zebrafish. We analyzed the retention pattern of duplicate genes in different GO functional categories and found two distinct patterns. One cluster is associated with development and signaling; the other is associated with organismal physiological processes. To examine the first cluster in more detail, we used a stricter method to estimate the duplication ages of genes from three signaling-related gene superfamilies. Their age distribution pattern is similar to that of the GO "signal transduction" category. We also compared the age distributions of genes in different subcellular localizations and found that, from the nucleus to the extracellular space, the patterns are nearly the same. In addition, gene duplication has accelerated slightly in the most recent 600 million years. In summary, gene duplication patterns are consistent among function-related categories as well as among different subcellular localizations.

2. Gene duplication and the subsequent functional differentiation are considered the source of functional diversification of genomes. Although several models have been proposed to describe the patterns of functional divergence after gene duplication, an appropriate measure of the functional distance between duplicates is still needed. In this work, we proposed a simple method to measure the functional distance between each pair of subfamilies. We applied a new statistical test to ten three-cluster vertebrate gene families generated by two rounds of whole-genome duplication and found two patterns of functional divergence after gene duplication, indicating that the two rounds of duplication may have played distinct roles in functional diversification. Functional distance analysis may provide a simple measure of the level of functional divergence between gene clusters after gene duplication and shed further light on the mechanisms of functional innovation in functional genomics.

3. Biologists have long recognized the importance of gene pleiotropy, in which a single gene affects multiple traits and which is one of the most commonly observed attributes of genes. Yet the extent of gene pleiotropy remains seriously under-explored. Theoretically, Fisher's model assumes universal pleiotropy, that is, a mutation can potentially affect all phenotypic traits. Experimental assays of a gene, on the other hand, usually reveal only a few distinct phenotypes. We estimated the effective gene pleiotropy of 321 vertebrate genes and found that a gene typically affects 6-7 molecular phenotypes, each corresponding to a component of organismal fitness. The positive correlation of gene pleiotropy with the number of Gene Ontology biological processes, as well as with expression breadth, provides a biological basis for the sequence-based estimation of gene pleiotropy. Moreover, the degree of gene pleiotropy is restricted to a small, single-digit number of molecular phenotypes, indicating that caution is needed in theoretical analyses of gene pleiotropy that assume universal pleiotropy.

4. We introduce a software program, Genepleio, to calculate gene pleiotropy; we also evaluate the estimation error and give suggestions for improving the estimation. To broaden the application of the gene pleiotropy measure, we also designed a method to estimate site-specific gene pleiotropy. Using Genepleio, we studied the distribution of gene pleiotropy in vertebrates, flies and yeasts. Vertebrates and flies have similar gene pleiotropy distributions, with average gene pleiotropy below that of yeasts. Moreover, we calculated the gene pleiotropy of disease-related genes; it is only slightly lower than that of other genes.
Keywords/Search Tags: gene family, sequence analysis, functional differentiation, gene pleiotropy, evolutionary analysis