
Codon-Pair Usage And Genome Evolution

Posted on: 2010-03-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F P Wang
GTID: 1100360278968082
Subject: Theoretical Physics
Abstract/Summary:
Codon analysis and its applications in bioinformatics and evolutionary studies are important for investigating genome evolution, protein function, and the interaction between genetics and environment. It is well known that synonymous codon usage is non-random. Codon-pair usage, like codon usage, has also been found to be highly biased: the vast majority of prokaryotic and eukaryotic species have non-random codon-pair usage. In this dissertation, to identify possible evolutionary constraints that shape codon-pair context, we investigated codon-pair usage in the genomes of organisms at different evolutionary levels. The main contributions are summarized as follows:

1. The distributions of numbers of modes (DNM) of codon-pairs in protein-coding sequences (CDSs) and the frequencies of base-triplet pairs in intergenic sequences (IGSs) are analyzed in 110 fully sequenced genomes. We propose that these distributions follow a gamma distribution. By studying the shape parameter α of the gamma distribution, a distinct relation between α and genome evolution is obtained. The modes of evolution of CDSs and IGSs are significantly different. For codon-pairs in CDSs, α increases in the order Archaea, Bacteria, Eukaryota, and divides the species into these three evolutionary groups. For triplet pairs in IGSs, on the other hand, α classifies the species into two groups: Bacteria in one, and Archaea and Eukaryota in the other. The findings indicate that codon-pair contexts contain information about biological evolution, and suggest fundamental differences in the evolutionary constraints imposed on CDSs and IGSs among Archaea, Bacteria, and Eukaryota.

2. Based on codon-pair usage, a method for the similarity analysis of genomes is proposed. It can be used to construct phylogenetic trees from the codon-pair usage bias and from the dinucleotide frequencies within codon-pairs. A phylogenetic tree constructed from the dinucleotide frequencies within codon-pairs in 40 model organisms shows that the organisms are clearly divided into three evolutionary groups: Bacteria, Archaea, and Eukaryota. Another phylogenetic tree, constructed from an index reflecting codon-pair usage bias, is consistent with the tree based on the dinucleotide frequencies within codon-pairs. Our results indicate that the dinucleotide composition within codon-pairs, which reflects information about the evolution of life, is one of the determinants of codon-pair bias.

3. The patterns of codon-pair usage are analyzed in the genomes of Rickettsia bellii and Anaeromyxobacter dehalogenans 2CP-C, which have extremely biased genomic compositions. The results show significant differences in codon-pair usage bias between the leading and lagging strands, suggesting that codon pairing is influenced by strand-specific features. These features may include biased codon usage, gene orientation bias, and context-dependent codon bias. The asymmetry of codon-pair usage between the two DNA strands in these genomes therefore seems to result from strand-specific mutational biases and from natural selection, probably acting at the levels of replication, transcription, and translation.

4. Given that the shape parameter α of the gamma distribution is related to genome evolution, we first analyze the linear regression between the r values of codon-pairs and the α values of the gamma distribution in the genomes of ten archaea, fifteen bacteria, and five eukaryotes. The results show that the two parameters have a significant linear correlation for some codon-pairs. Second, we analyze the usage of the dinucleotide formed by the third-position nucleotide of the first codon and the first-position nucleotide of the second codon (cP3cA1) within a codon-pair; the result indicates that the usage of cP3cA1 is significantly biased. Finally, the modes of preferred and rejected codon-pairs are analyzed in the three domains of life, and they are found to differ among the three domains. These results confirm again that codon-pair usage is associated with genome evolution.

5. Codon-pair usage is analyzed in the genome of Anaeromyxobacter dehalogenans 2N-C. The codon-pair usage is found to be highly biased, and about 5.2% of codon-pair modes are absent from the genome. Our analysis shows that the pattern of codon pairing in the genome could be the result of at least three different forces: (ⅰ) the local and total genome GC content, (ⅱ) the dinucleotide composition of codon-pairs, and (ⅲ) the level of dipeptide conservation.
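The quantities discussed in the abstract can be illustrated with a small sketch. This is not the dissertation's code. It assumes that a codon-pair "mode" means one of the 61 × 61 ordered pairs of sense codons, that pairs are read in frame within a single CDS, and that a rough shape parameter α can be estimated by the method of moments (α ≈ mean² / variance) rather than by whatever fitting procedure the author used. All function names are hypothetical.

```python
from collections import Counter
from itertools import product
from statistics import mean, pvariance

STOP_CODONS = {"TAA", "TAG", "TGA"}
# The 61 sense codons of the standard genetic code (an assumption of this sketch).
SENSE_CODONS = [
    "".join(c) for c in product("ACGT", repeat=3)
    if "".join(c) not in STOP_CODONS
]
SENSE_SET = set(SENSE_CODONS)


def codon_pairs(cds):
    """Yield adjacent in-frame sense-codon pairs from one coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    for first, second in zip(codons, codons[1:]):
        # Skip pairs containing a stop codon or an ambiguous base.
        if first in SENSE_SET and second in SENSE_SET:
            yield first, second


def count_pairs(sequences):
    """Count codon-pair modes and junction (cP3cA1) dinucleotides."""
    pair_counts = Counter()
    junction_counts = Counter()
    for cds in sequences:
        for first, second in codon_pairs(cds):
            pair_counts[first + second] += 1
            # Third base of the first codon + first base of the second codon.
            junction_counts[first[2] + second[0]] += 1
    return pair_counts, junction_counts


def absent_fraction(pair_counts):
    """Fraction of the 61 * 61 possible codon-pair modes never observed."""
    total_modes = len(SENSE_CODONS) ** 2
    return 1 - len(pair_counts) / total_modes


def gamma_shape_moments(values):
    """Method-of-moments estimate of the gamma shape: mean^2 / variance."""
    m = mean(values)
    v = pvariance(values)
    return m * m / v


if __name__ == "__main__":
    toy_cds = ["ATGGCTGCTAAATAA"]  # toy input, not real genomic data
    pairs, junctions = count_pairs(toy_cds)
    print(dict(pairs))
    print(dict(junctions))
    print(absent_fraction(pairs))
```

On a real genome, `gamma_shape_moments` would be applied to the occurrence counts of the codon-pair modes. A maximum-likelihood fit would usually be preferable to the moment estimate, which is used here only to keep the sketch dependency-free.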
Keywords/Search Tags: Codon-pair usage, Triplet-pair usage, Γ(α,β) distribution, Genome evolution, Phylogenetic tree, Dinucleotide, Hierarchical cluster, Leading strand, Lagging strand, Asymmetry