
Phylogenetic Analysis Of Triticeae Based On Whole Genome And Transcriptome

Posted on: 2019-01-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Y Wang
GTID: 1313330545988209
Subject: Crop Genetics and Breeding
Abstract:
Triticeae is a major tribe of Poaceae and is extremely rich in genetic variation. The tribe is an important gene bank for breeding, forage use in agriculture, and livestock production. It is relatively young, having diverged from other Poaceae about 25 million years ago (MYA). Because of polyploidization, inter-generic hybridization, and introgression, researchers disagree on the definition of Triticeae, its taxonomic classification, and the evolutionary relationships among its species. These problems must be resolved in order to exploit the abundant gene resources of the tribe and to broaden the gene pool available for wheat. In this study, the evolutionary relationships of Triticeae species and the selection pressure on the different sub-genomes of bread wheat were investigated using a combination of bioinformatics, comparative genomics, and phylogenetic analysis. The main results are as follows:

1. Sequencing and assembly of the transcriptomes of 20 Triticeae species. These species belong to 9 genera and include 19 genome types; six of them are allopolyploid. Using short-read sequencing or full-length sequencing, we obtained high-quality transcript data for the 20 species through Trinity assembly, genome-guided Trinity assembly, reference-based short-read assembly, and full-length quality control. These assemblies provide a large amount of valuable benchmarking data for Triticeae phylogenetic analysis, genomic studies, and wheat molecular breeding.

2. Estimation of transcript integrity for the 20 species. The transcripts of each species were compared with 1,440 single-copy orthologs from embryophytes to estimate transcript integrity. The appropriate amount of sequencing data under each strategy could then be determined from the integrity. The missing rate for diploid Triticeae species was less than 11.3% when the amount of sequencing data was between 10 Gb and 20 Gb. The missing rate of orthologs was between 15.8% and 30.8% when about 8 Gb of full-length transcriptome data was generated on the PacBio Sequel platform. Accordingly, more than 16 Gb of full-length sequencing data should be generated for Triticeae species with relatively large genomes. For wild emmer, the missing rate was 1.9% and 83.8% duplicated genes were recovered under the genome-guided assembly strategy, which indicates that genome-guided assembly has advantages for transcriptome assembly in allopolyploid species.

3. Phylogenetic analysis of Triticeae at the whole-genome or transcriptome scale. By combining all-to-all blastn with phylogenetic tree topology, 20,577 homologous clusters and 1,348 orthologous clusters were identified in the transcriptomes of the 20 species (the three sub-genomes of bread wheat were treated as separate species) and the barley genome. The rice and Brachypodium distachyon genomes were used as outgroups. A species tree of the Triticeae species in this study was inferred from the 1,348 orthologous clusters concatenated in order. All Triticeae species in this study formed a monophyletic group. With the exception of Taeniatherum crinitum, the species from Triticineae and from Hordeineae each formed a monophyletic sub-group. Pseudosecale villosum, Eremopyrum triticeum, Eremopyrum distans, and Secale cereale were placed at the base of the Triticineae sub-group. The later-formed species in Triticineae were divided into three monophyletic sub-groups. In addition to species from Triticum and Aegilops, the three sub-groups also contained species belonging to other genera. This result does not support the hypothesis that Triticum and Aegilops originated from a single ancestor.

4. Identification and bioinformatic analysis of the single-copy gibberellin receptor (GID1) in Triticeae. A total of 27 GID1 orthologs were identified in 21 Triticeae species and were divided into two types. A-type GID1 was similar to the GID1 sequences of the Arabidopsis thaliana and rice genomes. B-type GID1 had an N-terminal extension that made it about 300 bp longer than A-type. Moreover, the N-terminal extensions of B-type sequences from different species were relatively conserved. A comparison of the three-dimensional structures of the two types of GID1 protein showed that the N-terminus of B-type GID1 had more α-helices. The Arabidopsis thaliana genome has three copies of GID1, whereas each genome or sub-genome of the Triticeae species had only one copy. The genomes of early-formed Triticeae species contained only A-type GID1, while the genomes of later-formed species contained both A-type and B-type GID1 sequences. This result indicates that the N-terminal extension of B-type GID1 emerged during Triticeae evolution through some mechanism such as insertion or chromosomal rearrangement.

5. Comparison of selection pressure between the sub-genomes of bread wheat. A total of 801 orthologous clusters were identified in the Brachypodium distachyon genome and the three sub-genomes of bread wheat (each containing at least one sequence from one sub-genome). Comparison of the Ka/Ks values of coding sequences from the different sub-genomes indicated that the trend in selection pressure was essentially the same across the three sub-genomes, and that the three sub-genomes play equally important roles in biological function. Fewer than 10% of the Ka/Ks values fell in the range 0.6-1.00, which suggests that most of the orthologous sequences were under selection and that fewer were in a neutral or nearly neutral state. Four orthologous clusters showed different selection pressure trends, which might be related to chromosomal rearrangement.
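
The Ka/Ks summary in result 5 can be illustrated with a minimal sketch. It assumes that one Ka/Ks value per orthologous cluster has already been computed by some external tool; the function names, the bin labels, and the example values are hypothetical and are not taken from the dissertation. The 0.6-1.00 window is the range quoted in the abstract, and values above 1 are labelled with the conventional interpretation of Ka/Ks > 1 as positive selection.

```python
from typing import Iterable, List


def classify_ka_ks(ka_ks: float, low: float = 0.6, high: float = 1.0) -> str:
    """Assign a Ka/Ks value to a coarse selection category.

    The labels are illustrative assumptions: values below `low` are treated
    as purifying selection, values in [low, high] as neutral or nearly
    neutral, and values above `high` as positive selection.
    """
    if ka_ks < 0:
        raise ValueError("Ka/Ks cannot be negative")
    if ka_ks < low:
        return "purifying"
    if ka_ks <= high:
        return "near-neutral"
    return "positive"


def fraction_in_range(values: Iterable[float], low: float = 0.6, high: float = 1.0) -> float:
    """Return the fraction of Ka/Ks values that fall inside [low, high]."""
    vals: List[float] = list(values)
    if not vals:
        raise ValueError("no Ka/Ks values supplied")
    hits = sum(1 for v in vals if low <= v <= high)
    return hits / len(vals)


# Hypothetical Ka/Ks values for ten orthologous clusters (not real data).
example = [0.10, 0.25, 0.70, 1.40, 0.05, 0.30, 0.20, 0.15, 0.40, 0.50]
share = fraction_in_range(example)
labels = [classify_ka_ks(v) for v in example]
print(f"{share:.0%} of values fall in 0.6-1.00")
print(labels.count("purifying"), "clusters classed as purifying")
```

Running the same functions separately on the values from each sub-genome would give one fraction per sub-genome, which is one simple way to compare the selection pressure trends described above.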
Keywords: Triticeae, phylogeny, comparative transcriptome, GID1, Ka/Ks