
Data Mining Of Tandem Repeat Sequences In Plant Genomes And Online Service Platform Development

Posted on: 2018-07-22
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Y Yu
GTID: 1310330518484781
Subject: Crop Genetics and Breeding
Abstract/Summary:
Tandem repeats are important components of plant genomes. In this study, we identified functional tandem repeats (tandem duplicated genes) and simple tandem repeats (microsatellite DNAs) in genome-sequenced plant species and carried out in-depth, genome-wide data mining and data integration. Finally, we developed two online service platforms, one for each type of tandem repeat.

I. The effect of tandem duplicated genes on the complexity of the S. indicum genome
S. indicum has experienced both whole-genome duplication (WGD) and tandem duplication (TD) events, which provide extensive genomic data for studying the complexity of its genome. Collinearity analysis of the two subgenomes of S. indicum indicated that their duplicated genes have undergone fractionation. Sequence divergence among the different types of gene pairs revealed that most TD events occurred after the WGD event, while others followed the ancestral gene order, indicating that some ancient TD events predated the WGD event. Functional comparison of the different types of gene pairs indicated that the WGD and TD events were both responsible for introducing genes that enabled exploration of novel and complementary functionalities, whilst maintaining individual plant ruggedness.

II. The effects of tandem duplicated genes on the cytochrome P450 gene superfamily
Using the released genomic data for A. thaliana, B. rapa and B. oleracea and the conserved-domain characteristics of the cytochrome P450 gene superfamily, we identified 251, 356 and 346 cytochrome P450 genes in the A. thaliana, B. rapa and B. oleracea genomes, respectively. Our comparison of the influence of WGD and TD events on the P450 gene superfamily between A. thaliana and the Brassica species indicated that the family-specific expansions in the Brassica lineage could be attributed to both WGD and TD, with WGD recognized as the major mechanism for the recent expansion of the P450 gene superfamily. Expression analysis of P450s from A. thaliana and the Brassica species indicated that, across different tissues in the Brassica species, WGD-type P450s shared the same expression pattern, which differed completely from that of TD-type P450s.

III. PTGBase: an integrated database to study tandem duplicated genes in plants
In this section, we identified tandem duplicated genes in 39 genome-sequenced plant species. First, we classified the 39 species into evolutionary subgroups according to their phylogenetic relationships and determined the orthologous gene groups within each subgroup using the OrthoMCL software. Second, we located the members of each orthologous gene group on the chromosomes or genomic scaffolds of the 39 plant genomes. Finally, we checked whether two or more genes from the same orthologous group lay next to each other on a chromosome or genomic scaffold, which confirmed a tandem array. We obtained 54,130 tandem arrays comprising 129,652 tandem duplicated genes from the 39 species. Using the normalized data on these tandem duplicated genes, we developed PTGBase, an integrated database to study tandem duplicated genes in plants.

IV. PMDBase: a database for studying microsatellite DNA and marker development in plants
We improved the Perl scripts released with MISA and developed a pipeline for microsatellite DNA identification and corresponding marker development. Using this pipeline, we identified 26,230,099 microsatellite DNAs in 110 genome-sequenced plant species and provided up to three pairs of primer sequences for each microsatellite DNA. Using the identification part of the pipeline, we also developed MISAweb, an online service platform for identifying microsatellite DNAs. Combining these two parts, we developed PMDBase, a comprehensive database for studying microsatellite DNA and marker development in plants. PMDBase can therefore display microsatellite DNAs across selected species in a genome-wide fashion and, through the MISAweb online service, identify all putative microsatellite DNAs within genomic regions of interest.

In this study, we investigated the influence of tandem duplicated genes on the evolution and formation of genomes and gene families, and developed an integrated database to study tandem duplicated genes in plants. Further, we identified microsatellite DNAs in different plant species and developed a database for studying microsatellite DNA and marker development in plants. We hope this work provides a biological model for studying the formation and evolution of genomes and gene families in plants, particularly in the context of the evolutionary history of flowering plants, and offers two integrated platforms for genomics, comparative genomics and molecular breeding in plants.
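The three-step tandem-array search described for PTGBase can be illustrated with a minimal Python sketch. It is not the dissertation's code: the function name `find_tandem_arrays`, the tuple layout of the gene records and the toy identifiers are all assumptions made for illustration, and orthologous group membership is taken as already computed (for example by OrthoMCL). "Next to each other" is read here as strictly adjacent in gene order on the same chromosome or scaffold.

```python
from itertools import groupby


def find_tandem_arrays(genes, group_of):
    """Return runs of >= 2 adjacent genes that share an orthologous group.

    genes    -- iterable of (gene_id, seqid, start) tuples (hypothetical layout)
    group_of -- dict mapping gene_id to an orthologous group id; genes missing
                from the dict are treated as belonging to no group
    """
    arrays = []
    # Order genes along each chromosome/scaffold by start coordinate.
    ordered = sorted(genes, key=lambda g: (g[1], g[2]))
    for _seqid, on_seq in groupby(ordered, key=lambda g: g[1]):
        run, run_group = [], None
        for gene_id, _, _ in on_seq:
            grp = group_of.get(gene_id)
            if grp is not None and grp == run_group:
                run.append(gene_id)          # extends the current array
            else:
                if len(run) >= 2:
                    arrays.append(run)       # close the previous array
                run = [gene_id] if grp is not None else []
                run_group = grp
        if len(run) >= 2:                    # array reaching the sequence end
            arrays.append(run)
    return arrays


# Toy data: g1/g2 are adjacent on chr1, g3 interrupts, g5/g6 are adjacent on chr2.
toy_genes = [
    ("g1", "chr1", 100), ("g2", "chr1", 200), ("g3", "chr1", 300),
    ("g4", "chr1", 400), ("g5", "chr2", 50), ("g6", "chr2", 150),
]
toy_groups = {"g1": "OG1", "g2": "OG1", "g3": "OG2",
              "g4": "OG1", "g5": "OG1", "g6": "OG1"}
print(find_tandem_arrays(toy_genes, toy_groups))
```

A looser definition that tolerates a few intervening genes is sometimes used for tandem duplicates; the abstract does not mention one, so the sketch keeps strict adjacency.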
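The microsatellite identification step behind PMDBase and MISAweb can likewise be sketched in a few lines of Python. This is an illustration of the general idea (scan for a short motif repeated at least a minimum number of times), not the improved MISA Perl scripts themselves. The function name `find_microsatellites` and the per-unit-length thresholds in `MIN_REPEATS` are assumptions chosen for the example; compound microsatellites and primer design are left out.

```python
import re

# Assumed thresholds: unit length -> minimum number of repeats (illustrative).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_microsatellites(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, repeat_count) tuples, 1-based inclusive."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least min_rep - 1 copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA", "ATAT"),
            # so each repeat is reported only under its shortest unit.
            if any(unit_len % k == 0 and motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len)):
                continue
            repeats = (m.end() - m.start()) // unit_len
            hits.append((m.start() + 1, m.end(), motif, repeats))
    return sorted(hits)


# Toy sequence: an (AT)6 repeat and an (A)10 repeat separated by unique bases.
toy_seq = "GGC" + "AT" * 6 + "CCG" + "A" * 10 + "TGC"
print(find_microsatellites(toy_seq))
```

In a marker-development pipeline, the flanking sequence around each reported interval would then be passed to a primer-design tool; that step is outside this sketch.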
Keywords/Search Tags: Genome, Tandem repeat sequence, Tandem duplicated genes, Microsatellite DNA, Database