Codon And Codon Pair Usage Bias

Posted on: 2013-05-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S J Tan
Full Text: PDF
GTID: 1310330482950204
Subject: Biology
Abstract/Summary:
Codon usage patterns vary across genes within a given genome and also across different organisms. This is known as synonymous codon usage bias. Codon bias has been shown to be affected by many factors, which can be broadly grouped into two classes: selectionist explanations and mutational (or neutral) explanations. One of these factors, base composition (or GC content), has attracted much attention. Many studies have shown that GC content in non-coding regions, GC content at the third codon position (GC3), and islands of CpG dinucleotides are closely related to the codon usage pattern. More and more whole-genome data are now being published, which provides a great opportunity to analyze codon usage patterns systematically across organisms.

By comparing codon and amino acid frequencies from different genomes, we found that both codon and codon pair usage patterns are driven by base composition. Codon pair usage shows several distinct patterns: long mononucleotide repeats are strongly avoided in codon pairs; codon pairs with C or G repeats are more strongly disfavored than those with A or T repeats; and the observed/expected (o/e) ratios correlate negatively with genomic GC content, particularly for C/G pairs. These results support natural selection against long C/G mononucleotide repeats, which could more easily induce frameshift mutations in coding sequences. Avoiding codons with relatively high GC content could therefore be a better strategy for minimizing the risk of generating long C/G mononucleotide repeats. This may also explain why almost all GC-poor codons are used for amino acids with few synonymous codons or for stop/start codons. In addition, we found a significant avoidance of C/G pairs in highly conserved regions, further supporting the view that long mononucleotide repeats may play an important role in the base composition and genetic stability of a gene and in gene function.

To compare codon usage frequencies, the coding sequences from different species, genes, or domains were each used as a unit. Linear regression analysis revealed a strong correlation between codon usage pattern and regional GC content. Codon and amino acid usage is constant across regions with nearly equal GC content, and the observed usage patterns deviate from each other as the difference in GC content increases. All of these results clearly show that the usage frequency of each codon tends to be the same at a given GC content of genes (or motifs), regardless of their origin, whether from eukaryotic or prokaryotic genes, indicating that GC content could be a general factor accounting for codon usage.

Our survey of widely represented species also revealed an interesting correlation between the average GC content among the synonymous codons for an amino acid (GCsyn) and the amino acid usage pattern. The usage of the four amino acids with the highest GCsyn and the five with the lowest GCsyn varied with regional GC content, whereas the usage of the other 11 amino acids, with intermediate GCsyn, showed less variation. On the other hand, for these 11 less variably used amino acids, the number of degenerate codons was significantly correlated with amino acid usage frequency regardless of regional GC content. The functional constraint of a gene determines its amino acid usage, which informs the proportion of high-GCsyn and low-GCsyn amino acids. We also show that the GC content of a region is primarily determined by changes in the usage of amino acids with extreme GCsyn, which contribute 76.7% of the change in GC bases. Therefore, amino acid usage, especially of amino acids with extreme GCsyn, might be a key factor determining the proportion (and number) of their synonymous codons and the local GC content, which in turn determines their codon usage. Our results are important for understanding some basic biological questions, such as the origin and evolution of codons and amino acids, as well as the important role of GC content in the codon usage pattern.
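The GCsyn quantity described above can be illustrated with a minimal Python sketch. It assumes that GCsyn is the unweighted mean GC fraction over an amino acid's synonymous codons in the standard genetic code; the thesis may define or weight it differently. The names `CODON_TABLE`, `gc_fraction`, and `gc_syn` are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict
from fractions import Fraction

# Standard genetic code in the conventional compact layout (base order T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def gc_fraction(codon):
    """Fraction of G or C bases in a single codon (0, 1/3, 2/3, or 1)."""
    return Fraction(sum(base in "GC" for base in codon), len(codon))


def gc_syn(table=CODON_TABLE):
    """Assumed definition: mean GC fraction over each amino acid's synonymous codons."""
    groups = defaultdict(list)
    for codon, amino_acid in table.items():
        if amino_acid != "*":  # skip stop codons
            groups[amino_acid].append(gc_fraction(codon))
    return {aa: sum(values) / len(values) for aa, values in groups.items()}


if __name__ == "__main__":
    # Print amino acids from highest to lowest GCsyn.
    for aa, value in sorted(gc_syn().items(), key=lambda item: item[1], reverse=True):
        print(aa, float(value))
```

Exact fractions are used so that ties between amino acids are not blurred by floating-point rounding. Under this assumed definition the ranking separates four amino acids at the high end and five at the low end from the rest, which matches the group sizes given in the abstract, although the abstract does not name the amino acids in each group.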
Keywords/Search Tags: codon bias, codon pair, mononucleotide, indel, GC content, codon usage pattern