
Research on the Features and Molecular Mechanism of Nucleotide Compositional Variability in Transcribed Regions Using a Comparative Genomic Strategy

Posted on: 2013-11-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D W Huang
GTID: 1220330467480029
Subject: Genomics
Abstract/Summary:
Transcribed regions are the main part of the genome that encodes genetic information. Imbalance in nucleotide composition is one of their most important genomic features and an important subject in genomic research. The GC gradient is a trend in nucleotide composition that runs from 5' to 3' along the transcribed region. It is observed in monocotyledons but not in dicotyledons, and it is considered a signature of transcription-coupled mutation. There has been no systematic study of the existence of the GC gradient, the factors that influence it, or its association with transcription-coupled mutation in bacteria and metazoans. Introns, which are abundant in metazoans, differ in nucleotide composition from coding regions. Metazoan introns expanded in length during the evolution from cold-blooded to warm-blooded vertebrates. The molecular mechanism of intron expansion, its effect on the GC gradient, and the accompanying changes in nucleotide composition remain unclear and require systematic study. Our research therefore used a comparative genomic strategy to examine the GC gradient in bacteria and metazoans, together with the intron expansion process in metazoans that affects the nucleotide composition of their transcribed regions.
We sequenced Bifidobacterium longum subsp. longum BBMN68 using a whole-genome shotgun strategy and obtained a circular chromosome of 2,265,943 bp with a GC content of 59.95%. GC skew analysis of BBMN68 and six other Bifidobacterium longum strains showed a non-classic GC skew pattern, indicating a specific feature of DNA replication. We then calculated the GC content at the first, second, and third codon positions (GC1, GC2, GC3) near the CDS start site (CDSST) of Bifidobacterium longum using a sliding-window strategy. In the intergenic region, GC1, GC2, and GC3 do not differ distinctly. Unlike in monocotyledons, we did not find the classic gradient in the coding region of Bifidobacterium longum.
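
The two sliding-window measurements mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the dissertation's pipeline: the function names, window sizes, and the convention of returning 0.0 for windows without G or C are assumptions made here, and GC skew is taken in its usual form (G - C) / (G + C).

```python
from typing import Dict, List, Tuple


def gc_skew_windows(seq: str, window: int, step: int) -> List[Tuple[int, float]]:
    """Return (window start, (G - C) / (G + C)) for each full window along seq."""
    seq = seq.upper()
    out = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        # Assumption: a window with no G or C is reported as skew 0.0.
        skew = (g - c) / (g + c) if (g + c) else 0.0
        out.append((start, skew))
    return out


def codon_position_gc(cds: str) -> Dict[str, float]:
    """GC fraction at codon positions 1, 2 and 3 of an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    result = {}
    for pos in range(3):
        bases = cds[pos:usable:3]  # every third base, starting at this codon position
        gc = sum(1 for b in bases if b in "GC")
        result[f"GC{pos + 1}"] = gc / len(bases) if bases else 0.0
    return result


def sliding_codon_gc(cds: str, window_codons: int, step_codons: int):
    """GC1/GC2/GC3 in windows measured in codons, counted from the CDS start."""
    out = []
    n_codons = len(cds) // 3
    for start in range(0, n_codons - window_codons + 1, step_codons):
        chunk = cds[start * 3:(start + window_codons) * 3]
        out.append((start, codon_position_gc(chunk)))
    return out
```

Measuring windows in whole codons keeps every window in frame, so GC1, GC2, and GC3 stay comparable from one window to the next.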
In the coding region of Bifidobacterium longum, we found that GC3 > GC1 > GC2, with an intergenic GC content of around 50%. Further investigation of another 1,640 sequenced bacterial genomes indicated three evident types of GC content pattern: GC1 > GC2 > GC3 in the coding region, with intergenic GC content from 10% to 40%; GC1 > GC3 > GC2 in the coding region, with intergenic GC content from 30% to 55%; and GC3 > GC1 > GC2 in the coding region, with intergenic GC content from 40% to 75%. The coefficient of variation (CV) of GC1, GC2, and GC3 is negatively correlated with intergenic GC content when that content is below about 40%, and positively correlated with it when that content is above about 50%.
We then investigated whether a GC gradient exists in the transcribed regions of metazoans. Warm-blooded vertebrates showed a GC gradient pattern similar to that of monocotyledons, consistent with the previous conclusion that the GC gradient emerged at the transition from cold-blooded to warm-blooded vertebrates. Among genes expressed in human testis, highly expressed genes tend to show a sharper GC gradient regardless of whether they have an intron in the 5' UTR, and among genes with a UTR5I (intron in the 5' UTR), the shorter the total UTR5I length, the sharper the GC gradient. The GC gradient is also weaker in genes with a longer 5' UTR than in those with a shorter one. When the CDSST shifts to the next exon, an intron that was in the CDS becomes part of the 5' UTR, which greatly lengthens the 5' UTR and weakens the GC gradient. UTR5I-containing genes tend to have more ATG codons downstream of the CDSST, which provide more candidate CDSSTs in the 3' direction when the original CDSST is destroyed by mutation.
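
As a rough illustration of how a genome could be assigned to one of the three GC content patterns and how the CV of GC1, GC2, and GC3 could be computed, consider the sketch below. The function names are hypothetical, and using the population standard deviation is an assumption; the abstract does not say which estimator was used.

```python
from statistics import mean, pstdev


def gc_order(gc1: float, gc2: float, gc3: float) -> str:
    """Rank the codon positions from highest to lowest GC content.

    Ties keep the GC1, GC2, GC3 input order because sorted() is stable.
    """
    ranked = sorted(
        [("GC1", gc1), ("GC2", gc2), ("GC3", gc3)],
        key=lambda item: item[1],
        reverse=True,
    )
    return ">".join(name for name, _ in ranked)


def coefficient_of_variation(values) -> float:
    """CV = standard deviation / mean (population SD, an assumption here)."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return pstdev(values) / m


# Made-up values for a hypothetical high-GC genome, for illustration only.
pattern = gc_order(0.62, 0.45, 0.80)
cv = coefficient_of_variation([0.62, 0.45, 0.80])
```

Computing the pattern and the CV per genome, then plotting the CV against intergenic GC content, would be one way to look for the change in the sign of the correlation between the low-GC and high-GC ranges.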
Because different isoforms transcribed from one gene can have different 5' UTR structures, and therefore different GC gradients, we used transcription start site (TSS) information to obtain the full-length 5' UTR of each isoform and analyzed multi-isoform genes further. Based on the mechanism by which the isoforms are transcribed, these genes fall into two evident types: the AS type, which uses adjacent promoters and alternative splicing (AS), and the AP type, which uses distal alternative promoters (AP) to transcribe different isoforms. AP-type genes are longer on average than AS-type genes, and the introns of AP-type genes tend to enlarge the difference in 5' UTR length between the isoforms' hnRNAs. AP-type genes have a sharper GC gradient than AS-type genes. We then calculated Ka/Ks using a sliding-window method and found that, in AP-type genes, the exons shared by the two isoforms are under stronger selective pressure than the upstream exons.
In further analysis of introns, we found that UTR5Is increased in both copy number and length during metazoan evolution, and that the growth in intron length is contributed mainly by intronic transposable elements (TEs). We propose that TEs with moderate GC content, such as DNA, LTR, and LINE elements, replicated early and caused the growth in intron length. The invasion of SINEs happened later and pushed the length and GC content of host introns to a high level. Introns make up the main part of the transcript unit and therefore contribute most of the imbalance in its nucleotide composition. Since transcription-coupled mutation and transposable element replication both depend on transcription and are heritable only in the germline, transcription in the germline must play an important role in gene evolution.
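
The contribution of intronic TEs to intron length described above can be quantified, for a single intron, as the fraction of its length covered by each TE class. The following is a minimal sketch under stated assumptions: coordinates are half-open intervals, the TE class labels and the tuple layout are hypothetical, and overlapping annotations of the same class are merged so that no base is counted twice.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), an assumption of this sketch


def covered_length(intervals: List[Interval]) -> int:
    """Total length covered by the union of half-open intervals."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap before this interval: close the current merged run.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or touching: extend the current merged run.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def te_fraction_by_class(
    intron: Interval, tes: List[Tuple[str, int, int]]
) -> Dict[str, float]:
    """Fraction of the intron covered by each TE class (e.g. 'SINE', 'LINE')."""
    i_start, i_end = intron
    length = i_end - i_start
    if length <= 0:
        raise ValueError("intron must have positive length")
    clipped: Dict[str, List[Interval]] = {}
    for te_class, start, end in tes:
        # Clip each annotation to the intron boundaries.
        start, end = max(start, i_start), min(end, i_end)
        if start < end:
            clipped.setdefault(te_class, []).append((start, end))
    return {cls: covered_length(ivs) / length for cls, ivs in clipped.items()}
```

Comparing these per-class fractions across species would be one way to examine which TE classes account for intron growth; the classes and coordinates would come from whatever repeat annotation is available.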
Keywords/Search Tags: Bifidobacterium longum, GC gradient, vertebrate, intron, transposable element