
Improving the Genome Annotation Information of Brassica napus L. by Integrated Analysis of Multi-transcriptome Datasets

Posted on: 2023-07-19
Degree: Master
Type: Thesis
Country: China
Candidate: L L Zhang
GTID: 2543306800492744
Subject: Genetics
Abstract/Summary:
Brassica napus L. (B. napus) is an important allopolyploid oil crop in China, with a complex genome and extensive variation among germplasms and ecotypes. The genomes of the French B. napus variety "Darmor-bzh" and the Chinese B. napus variety Zhongshuang 11 (ZS11) were first published in 2014 and 2017, respectively, providing an important foundation for post-genomics research in B. napus. However, the quality of the B. napus genome assembly still needs to be greatly improved compared with that of many other crops, such as rice, maize and soybean. Transcriptome sequencing has proven to be an important tool for exploring genome-wide expression patterns, mining specifically expressed genes and functional genes, studying gene structural variation, and improving genome annotation information. Therefore, to improve the genome annotation information of B. napus, this study integrated 10 transcriptome datasets constructed by our lab to analyze gene expression patterns at the genome-wide level, optimize gene structures, and identify new genes. The transcriptome datasets used in this study fall mainly into two categories: (1) the RNA-Seq datasets constructed in this study, generated from root and leaf samples of ZS11 at the five-leaf stage under low-phosphorus (LP) and low-potassium (LK) treatments for 1, 3, 5, 7 and 12 d; and (2) the 8 transcriptome datasets constructed in our previous studies, including the transcriptome datasets of ZS11 and Zhongyou 821 (ZY821) samples obtained at different growth stages; the transcriptome datasets of ZS11 roots at the five-leaf stage under treatment with the phytohormones IAA, 6-BA, ABA, ACC and GA3 for 0 (CK), 1, 3, 6, 12 and 24 h; and the transcriptome dataset of ZS11 roots and leaves at the five-leaf stage under low-nitrogen (LN) stress for 1, 3, 5, 7 and 12 d. Because these 8 datasets had previously been mapped to the "Darmor-bzh" reference genome, in this study we re-mapped and analyzed them against the ZS11 reference genome updated
in 2020. The main results of this study were as follows:

1. Construction and analysis of ZS11 transcriptome datasets at the five-leaf stage under LP and LK stresses. We constructed transcriptome datasets of ZS11 at the five-leaf stage under LP and LK stresses, comprising a total of 80 sequenced samples. We obtained 261.69 Gb and 278.33 Gb of clean data, respectively; the uniquely mapped percentage ranged from 79.17% to 96.36%, and 4,640 and 4,365 new genes were obtained, respectively. Alternative splicing (AS) analysis indicated that retained intron (RI) was the most frequent event type under both stresses, and that more AS events occurred in roots than in leaves. Differential expression analysis showed that the DEGs in leaves under both stresses responded to long-term stress, while the trend of DEGs in roots was volatile. Furthermore, functional annotation showed that the DEGs in roots were mainly enriched in phosphorylation and redox processes, whereas the DEGs in leaves were mainly enriched in photosynthesis and photosynthesis-related pathways. Under LK stress, the DEGs in roots were mainly involved in phenylpropane biosynthesis and MAPK signaling pathways, while the DEGs in leaves were mainly involved in photosynthesis and energy metabolism related pathways, including 135 DEGs (120 of them up-regulated) involved in the phosphorylation pathway, suggesting that K deficiency stress can promote energy metabolism and photophosphorylation by affecting the phosphorylation pathway.

2. Re-mapping and analysis of the 8 transcriptome datasets constructed in our previous studies.

(1) A total of 110,111 genes were annotated in the spatio-temporal transcriptome datasets. Among them, 75,779 and 76,725 expressed genes (FPKM ≥ 1.0) were identified in ZS11 and ZY821, respectively. In all, we obtained 80,018 non-redundant expressed genes in the two varieties, including 52,293 structurally optimized genes, 7,552 new genes, and 7,532 organ-specifically expressed
genes. K-means analysis showed that the expression patterns of the 80,018 expressed genes fell into two major types: type I genes showed similar expression profiles in the two varieties and were mainly enriched in organelle structure, basal metabolism and regulation, while type II genes showed opposite expression trends in the two varieties and were mainly enriched in protein biosynthesis and processing pathways. Differential expression analysis showed that the DEGs between the two varieties occurred mainly at the flowering and seed development stages, implying that the differences between the two varieties are mainly attributable to the growth and reproduction stages.

(2) We obtained 106,254 annotated genes in the five hormone-induction transcriptome datasets, of which 60,156 were expressed genes, including 44,440 structurally optimized genes, 3,095 hormone-induced expressed genes, and 4,299 new genes. A total of 37,505 non-redundant DEGs were identified under the five hormone treatments, and down-regulated DEGs outnumbered up-regulated DEGs, implying that exogenous hormone treatments tend to repress gene expression. Functional annotation indicated that the enrichment trends of DEGs under the five hormone treatments were similar: the DEGs were mainly enriched in metabolism, biosynthesis, and some key response processes. Trend and correlation analysis of DEGs indicated that gene expression under exogenous hormone treatments followed 5-7 major patterns, and that these patterns were generally not conserved across hormone treatments.

(3) In the LN transcriptome, we obtained 106,632 annotated genes, of which 67,008 were expressed genes, including 47,950 structurally optimized genes and 4,729 new genes. We identified 16,770 and 25,807 non-redundant DEGs in roots and leaves, respectively. We found that the DEGs in leaves were generally
responding to short-term LN stress, while those in roots were responding to long-term LN stress. Functional annotation showed that the DEGs in roots were mainly enriched in nitrogen metabolism, amino acid metabolism, transport and biological defense processes, while in leaves down-regulated DEGs generally outnumbered up-regulated DEGs, except in the photosynthesis and carbon fixation pathways, which showed the opposite trend.

3. Improvement of the genome annotation information of ZS11. Overall, we identified a total of 74,079 non-redundant expressed genes in the ZS11 genome by integrating the above 10 transcriptome datasets, including 1,072 hormone-induced specifically expressed genes and 6,633 nutrient-induced specifically expressed genes. Moreover, we obtained 27,692 functionally annotated new genes, which greatly supplemented and improved the ZS11 genome annotation information. Furthermore, 17 constitutively expressed genes were screened, providing more accurate reference genes for qRT-PCR analyses in B. napus. A total of 14,794 non-redundant genes with AS events were identified in the ZS11 genome, 51.34% of which had RI events. In all, 73,086 genes in this study were consistent with those of the ZS11 reference genome, indicating that their sequence information is of high confidence; and the sequences of 35,466 genes with isoforms (AS events) were refined, which is important for improving the quality of the ZS11 genome sequence information. Microcollinearity analysis illustrated that gene expansion in the ZS11 genome derived mainly from the allopolyploidization between Brassica rapa and Brassica oleracea, and that purifying selection was the main evolutionary force acting on duplicated gene pairs. Expression profile correlation analysis showed that duplicated gene pairs within the same sub-genome tended to undergo expression profile differentiation, while those between different sub-genomes tended to have redundant expression profiles.
Keywords/Search Tags:Brassica napus, Transcriptome, Expression profile analysis, Spatio-temporal expression profiles, Nutrient stress treatments, Hormone treatments, Functional annotation
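
The abstract counts "expressed genes" with a cutoff of FPKM ≥ 1.0 and then reports a non-redundant total across the two varieties. The Python sketch below illustrates one way such a count could be computed. It is not the thesis pipeline: the rule that a gene counts as expressed when it reaches the cutoff in at least one sample is an assumption, and the function names, gene IDs and FPKM values are hypothetical.

```python
FPKM_THRESHOLD = 1.0  # expression cutoff stated in the abstract


def expressed_genes(fpkm_by_gene, threshold=FPKM_THRESHOLD):
    """Return the IDs of genes whose highest FPKM across samples meets the cutoff.

    Assumption (not from the thesis): a gene is 'expressed' if it reaches
    the threshold in at least one sample.
    """
    return {
        gene
        for gene, values in fpkm_by_gene.items()
        if values and max(values) >= threshold
    }


def non_redundant_union(*gene_sets):
    """Merge per-variety gene sets so each gene ID is counted only once."""
    return set().union(*gene_sets)


# Hypothetical toy data: gene ID -> FPKM values across samples.
zs11 = {"geneA": [0.2, 3.1], "geneB": [0.0, 0.4], "geneC": [1.0, 0.0]}
zy821 = {"geneA": [5.0, 2.0], "geneB": [1.2, 0.1], "geneD": [0.9, 0.8]}

zs11_expressed = expressed_genes(zs11)
zy821_expressed = expressed_genes(zy821)
combined = non_redundant_union(zs11_expressed, zy821_expressed)

print(len(zs11_expressed), len(zy821_expressed), len(combined))
```

Because the two varieties share most expressed genes, the non-redundant total is much smaller than the sum of the per-variety counts, which is consistent with the abstract reporting 80,018 combined genes from 75,779 and 76,725.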