Brassica rapa Reference Genome Upgrade and Genome Evolution Analysis

Posted on: 2018-05-28
Degree: Master
Type: Thesis
Country: China
Candidate: C C Cai
GTID: 2333330518483683
Subject: Vegetable science
Abstract/Summary:
The crop species Brassica rapa belongs to the genus Brassica, which includes many important crops cultivated as vegetables, condiments, and oilseeds. The Brassica genomes have recently been sequenced extensively: a B. rapa draft reference genome in 2011, Brassica oleracea and Brassica napus in 2014, and Brassica nigra and Brassica juncea in 2016. The first released B. rapa reference genome serves as a valuable resource for genome assembly and annotation of other Brassicas, and it has been used widely in Brassica comparative genomics and in genome-based evolutionary analysis of Brassicaceae species.

However, the first B. rapa genome assembly (version 1.5) has a total size of only about 283.8 Mb, 58.52% of the estimated B. rapa genome (485 Mb). Because 41.48% of the genome is still missing from the assembly, important genes may well have been missed. In addition, the previous pseudo-chromosomes were constructed on a low-density genetic map built with InDel and SSR markers, which limited both the proportion of scaffolds assigned to chromosomes and the accuracy of the assignment. PacBio single-molecule sequencing data, which can effectively resolve repeat sequences and close gaps in scaffolds, were not available for the V1.5 assembly. For gene model prediction, large volumes of mRNA-Seq data were not used in the first annotation of the B. rapa genome, leaving considerable room for improvement. It is therefore necessary to improve the B. rapa reference genome by re-assembling it with high-quality Illumina and PacBio sequencing data, constructing pseudo-chromosomes on high-density genetic maps, and re-annotating genes with bulk mRNA-Seq data.

In this study, the assembly and annotation of B. rapa were substantially improved by adding more Illumina and PacBio genome data and mRNA-Seq data. Detailed comparisons of the gene models and transposable elements (TEs) predicted in the two genome assemblies were conducted. With B. rapa genome V2.0, we reconstructed the three sub-genomes and performed comparative and evolutionary genomic analyses covering over-retention of specific genes, biased fractionation, dominant gene expression among sub-genomes, and small RNA-mediated TE methylation in the regulation of expression between paralogs. The results are as follows:

1. The new assembly has a total size of 389.2 Mb, 106 Mb more than V1.5, and covers approximately 80.25% of the B. rapa genome. It contains 86,986 scaffolds, with a scaffold N50 of 3.38 Mb. Two high-density genetic maps were constructed, with total genetic distances of 1,316.731 cM and 1,391.516 cM, respectively. Approximately 330 Mb of the assembly was assigned to chromosomes.

2. A total of 48,826 protein-coding genes were predicted in V2.0, 7,652 more than in V1.5. TEs make up 32.30% (126 Mb) of the V2.0 genome, compared with only 25.44% in V1.5. Of the additional 106 Mb in V2.0, approximately 6 Mb are coding sequences (CDSs), 4 Mb are introns, 54 Mb are TEs, 30 Mb are intergenic components, and 12 Mb are gaps (Ns).

3. Detailed comparisons of the gene models and TEs predicted in the two genome assemblies support the high quality of the gene and TE annotation in B. rapa genome V2.0. With V2.0, we reconstructed the three sub-genomes; the LF sub-genome retained more gene copies than the MF1 and MF2 sub-genomes. Analysis of gene expression among sub-genomes showed that genes in sub-genome LF were dominantly expressed over genes in MF1 and MF2. GO term enrichment analysis showed that the genes over-retained after tandem duplication differed from those over-retained after whole-genome duplication (WGD), which supports the gene balance hypothesis. For the sub-genome pairs LF and MF1, and LF and MF2, small RNA-targeted TEs (RNA+ TEs) were negatively associated with the expression levels of gene doublets; that is, the dominantly expressed gene of a doublet showed a low level of RNA+ TEs in the 2 kb of its 5′ UTR region.

4. Two new biological findings emerged from B. rapa genome V2.0. First, many more tandem genes were assembled in the updated genome than in the previous assembly, and genes related to adaptability were found to have expanded through tandem duplication in B. rapa. Second, a unique TE expansion event (~6.5 million years ago) was detected in the new genome assembly but not in V1.5. This revises the previous view that no TE expansion event occurred in the B. rapa genome after the whole-genome triplication (WGT).
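The coverage percentages and the scaffold N50 quoted in the abstract rest on simple arithmetic, which the following Python sketch illustrates. It is not taken from the thesis: the function names `percent_of_genome` and `scaffold_n50` are hypothetical, the N50 definition used is the common one (the length at which scaffolds of that size or larger hold at least half of the assembled bases), and the toy scaffold lengths are invented for illustration. Only the megabase figures are copied from the abstract.

```python
def percent_of_genome(assembly_mb: float, genome_mb: float) -> float:
    """Share of the estimated genome covered by an assembly, in percent."""
    return 100.0 * assembly_mb / genome_mb


def scaffold_n50(lengths: list[int]) -> int:
    """Length L such that scaffolds of length >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Figures quoted in the abstract (Mb).
ESTIMATED_GENOME_MB = 485.0
V1_5_MB = 283.8
V2_0_MB = 389.2
# Reported breakdown of the sequence added in V2.0 (Mb).
ADDED_MB = {"CDS": 6, "introns": 4, "TEs": 54, "intergenic": 30, "gaps": 12}

# Invented scaffold lengths, used only to demonstrate the N50 calculation.
TOY_SCAFFOLDS = [80, 70, 50, 40, 30, 20, 10]

if __name__ == "__main__":
    print(f"V1.5 coverage: {percent_of_genome(V1_5_MB, ESTIMATED_GENOME_MB):.2f}%")
    print(f"V2.0 coverage: {percent_of_genome(V2_0_MB, ESTIMATED_GENOME_MB):.2f}%")
    print(f"Added sequence accounted for: {sum(ADDED_MB.values())} Mb")
    print(f"Toy N50: {scaffold_n50(TOY_SCAFFOLDS)}")
```

Under these assumptions the two coverage values round to the 58.52% and 80.25% stated in the abstract, and the five listed components sum to the reported 106 Mb. Computing the real N50 of 3.38 Mb would need the actual list of 86,986 scaffold lengths, which the abstract does not provide.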
Keywords/Search Tags: Brassica rapa, genome assembly, gene annotation, linkage map, transposable elements, genome evolution, comparative genomics