
Characteristics And Patterns Of LTR Retrotransposons Expansion In The Pine Genomes

Posted on: 2023-03-17  Degree: Doctor  Type: Dissertation
Country: China  Candidate: S S Zhou  Full Text: PDF
GTID: 1523307292973889  Subject: Tree genetics and breeding
Abstract/Summary:
LTR retrotransposons (LTR-RTs) are ubiquitous and are the dominant repeat elements in plant genomes. They play important roles in genome size and structural variation, gene gain and loss, regulation of gene expression, epigenetic regulation, and stress responses to a changing environment. Pines, which belong to the largest family and genus of conifers and are widely distributed, are important foundation species in forest ecosystems, supporting ecosystem stability, biodiversity, ecological security and the sustainable use of biological resources. Pines have large and complex genomes (17 Gb to 35 Gb), with LTR-RTs contributing up to 60% of the genome size. With the advent of modern sequencing technologies and the availability of genomic resources for many organisms, the widespread presence of LTR-RTs is now recognised. However, the mechanisms underlying the significant amplification of LTR-RTs in plant genomes, and the functional factors associated with it, are poorly understood.

This study established an efficient and accurate computational procedure for LTR-RT identification, annotation and classification, and used it to construct a large-scale LTR-RT dataset for plant genomes; explored the commonalities and specificities of LTR-RT class composition, number distribution and insertion time; and resolved the relationship between LTR-RT clearance capacity and genome size. The study focused on Pinus lambertiana (sugar pine) and Pinus taeda (loblolly pine). By providing genome-wide LTR-RT annotation, the study characterized the LTR-RT components in the genomes of the two pine species, reconstructed the history of LTR-RT expansion in each genome, examined the preferences of LTR-RT insertion sites, and identified cis-regulatory elements to elucidate potential gene regulatory functions. The main findings are as follows:

(1) This study developed an efficient and accurate computational procedure for LTR-RT identification, classification and functional component annotation, and constructed a database of intact LTR-RTs in plants, designed to classify and annotate intact LTR-RTs with a standardized procedure. An automated computational pipeline was implemented in Python by integrating structure-based LTR-RT identification algorithms and applying strict quality-control parameters. Validation against manually corrected rice data showed that the procedure is highly reliable. The database covers 93 families in 46 orders, including Rhodophyta, Chlorophyta, bryophyte, Pteridophyta, gymnosperm and angiosperm species. The dataset currently comprises a total of 2,593,685 intact LTR-RTs from the genomes of 300 plant species. The scripts of the computational procedure and the LTR-RT datasets are openly accessible.

(2) LTR-RT lineages are diverse in plant genomes, with a few lineages dominating each genome. In conifers, the Tat II and Tat III lineages of the Gypsy superfamily are dominant and have older insertion times. More closely related species share similar reverse transcriptase (RT) amino acid sequences, numbers and insertion times of their LTR-RTs. Conifers have almost identical distributions of LTR-RT number and insertion time, but only two species pairs (Pinus lambertiana and Pinus taeda; Picea glauca and Picea abies) have similar RT sequences.

(3) LTR-RT content is significantly correlated with plant genome size after the effect of phylogenetic relationships is removed. Genome size was negatively correlated with S:I values (the ratio of the number of solo LTRs to the number of intact LTR-RTs), independent of T:I values (the ratio of the number of truncated LTR-RTs to the number of intact LTR-RTs), and positively correlated with I+S+T values (birth; the sum of the numbers of solo LTRs, truncated LTR-RTs and intact LTR-RTs). The generally low S:I values and high I+S+T values in conifers suggest that conifer LTR-RTs have high birth rates and weak clearance. The clearance of LTR-RTs is mainly associated with unequal recombination.

(4) The increase in sugar pine genome size is largely due to extensive proliferation of just a handful of LTR-RT lineages, especially the Tat II lineage of Gypsy retrotransposons. Compared with loblolly pine, the Tat II lineage is significantly longer and younger in sugar pine.

(5) LTR-RT insertions are not random but display a strong preference for palindromes in the two pine species, with the most dominant lineage, Tat II, having lower GC content within the regions flanking the insertion site in sugar pine. LTR-RT insertions showed different preferences for sequence environment/context between the two species, with the dominant lineages showing fewer associations with nucleosome position and occupancy in sugar pine. The results suggest that the recent interspecific differential amplification of LTR-RT lineages (Tat II, etc.) was related to their different insertion preferences and the chromatin context in which they were involved. Different preferences led to different amplification capacities and thus to differences in genome size.

(6) This study discovered a large number of transcription factor binding sites (TFBS) for 11 (sugar pine) and 10 (loblolly pine) transcription factor (TF) families, embedded in 25% of LTR elements in the two pine genomes. Each pine genome shows a distinct TFBS profile in the LTR regions of most LTR-RT lineages, especially for the dominant Tat II and Gymco-II lineages, which carry more TFBS in their LTR regions for the TFs MYB and C2H2 but only a few for B3 in sugar pine. In addition, LTR-RTs may have species-specific and lineage-specific effects on gene regulation. The results suggest that the recent interspecific differential amplification of LTR-RT lineages (Tat II, etc.) is also related to the sequence composition of their LTR regions, and that the different cis-regulatory elements carried by LTR-RTs may determine their different amplification patterns.

To conclude, this study established a reliable analytical procedure for LTR-RT annotation and an LTR-RT database covering 300 plant species. Based on this database and extensive comparative analysis, it clarified the characteristics of LTR-RTs in pine species and identified functional factors associated with the differential amplification of LTR-RTs between pine species. This study sheds light on the evolutionary mechanisms and drivers of conifer genome complexity, and provides an important reference for further unraveling the roles of transposons in plant genome differentiation and adaptation. It also contributes to the discovery and utilization of conifer genomic resources, and provides new ideas and approaches for future molecular-assisted breeding.
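
The S:I, T:I and I+S+T indices defined in finding (3) can be illustrated with a minimal Python sketch. It is not the dissertation's released code: the class name `LtrCounts`, its field names, and the example counts are hypothetical and were chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LtrCounts:
    """Per-genome counts of LTR-RT copies by structural state (hypothetical container)."""
    intact: int      # I: intact LTR-RTs
    solo: int        # S: solo LTRs
    truncated: int   # T: truncated LTR-RTs

    def __post_init__(self) -> None:
        # Both ratios divide by the intact count, so it must be positive.
        if self.intact <= 0:
            raise ValueError("intact count must be positive to form S:I and T:I")
        if self.solo < 0 or self.truncated < 0:
            raise ValueError("counts cannot be negative")

    @property
    def s_to_i(self) -> float:
        """S:I, the ratio of solo LTRs to intact LTR-RTs."""
        return self.solo / self.intact

    @property
    def t_to_i(self) -> float:
        """T:I, the ratio of truncated LTR-RTs to intact LTR-RTs."""
        return self.truncated / self.intact

    @property
    def birth(self) -> int:
        """I+S+T, the sum of intact, solo and truncated copies."""
        return self.intact + self.solo + self.truncated


# Invented counts for illustration only; these are not results from the study.
example = LtrCounts(intact=1000, solo=200, truncated=3000)
print(f"S:I = {example.s_to_i:.2f}")
print(f"T:I = {example.t_to_i:.2f}")
print(f"I+S+T = {example.birth}")
```

Under the interpretation given in finding (3), a genome with a low S:I value and a high I+S+T value, like the invented example here, would be read as having a high birth rate and weak clearance of LTR-RTs.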
Keywords/Search Tags: LTR retrotransposons, conifers, genome size, pine species, expansion patterns