
Improved Reference Genome And Chromatin Interactions Analysis In Brassica Rapa

Posted on: 2019-03-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Zhang
Full Text: PDF
GTID: 1363330548987506
Subject: Vegetable science
Abstract/Summary:
Brassica rapa comprises various economically important species, many of which are extensively cultivated around the world as oil crops and vegetables. The first released draft genome, B. rapa genome v1.5, was created using a whole-genome shotgun strategy with Illumina short reads and facilitated genome assemblies of other Brassica species. A more recent release, B. rapa genome v2.0, resulted from iterative updates with additional short-read data. It was further updated to B. rapa genome v2.5 after the scaffold order was improved. The B. rapa genome not only underwent an additional whole-genome triplication (WGT) event after divergence from Arabidopsis thaliana but also shares with it the ancestral polyploidization events (α, β and γ) of the Cruciferae. Owing to the relatively recent WGT event, the B. rapa genome harbors highly repeated sequences and complicated centromeric regions, making it difficult to assemble the genome using short-read technologies alone. The inaccuracy and low contiguity of the current draft assemblies have limited their application in both genomic and genetic studies of B. rapa. Chromatin interactions affect gene expression and other processes in plants, but little is known about the effects of WGT on chromatin interactions in B. rapa. The main results of this study are as follows:

1. An improved assembly, B. rapa genome v3.0, was produced using a combination of single-molecule sequencing (PacBio), optical mapping (BioNano), and high-throughput chromosome conformation capture (Hi-C) technologies. The new assembly consisted of 1,476 contigs and 1,301 scaffolds, with a contig N50 of 1.45 Mb and a total length of 351.06 Mb. B. rapa genome v3.0 represents a ~27-fold (contig N50: 1,446 kb vs. 53 kb in v2.5) and ~31-fold (contig N50: 1,446 kb vs. 46 kb in v1.5) improvement in contiguity over the two previous assemblies. Compared with the previous assemblies, v3.0 was much improved in the number and size of gaps. A total of 45,985 protein-coding gene models were identified in v3.0; 98.75% (45,411 of 45,985) of the genes were annotated on chromosomes and only 1.25% (574 of 45,985) were located on unanchored scaffolds. A total of 134 Mb of repeats were identified, representing 37.51% of the assembled genome.

2. Detailed comparisons of gene models and transposable elements (TEs) in the three genome assemblies support the high quality of the annotation of B. rapa genome v3.0. A total of 2,077 tandem arrays (corresponding to 4,963 tandem genes) were identified in v3.0, whereas more tandem arrays (3,535 arrays, 8,002 genes) were identified in v2.5. Systematic analysis indicated that assembly errors introduced by gap closing with PacBio reads in v2.5 might have led to invalid annotation of tandem genes. In v3.0, 13,318 intact LTR retrotransposons (LTR-RTs) were identified, compared with only 4,129 and 801 intact LTR-RTs in v2.5 and v1.5, respectively. The insertion times of intact LTR-RTs indicated that the B. rapa genome has undergone three waves of LTR-RT expansion since it diverged from B. oleracea. The three subgenomes (LF, MF1, and MF2) were constructed based on the syntenic relationship between v3.0 and A. thaliana, and the genomic blocks, centromeres and telomeres were defined in B. rapa genome v3.0.

3. The effects of WGT on chromatin interactions in B. rapa were analyzed in comparison with Arabidopsis thaliana. A high-resolution whole-genome chromatin interaction heatmap of B. rapa was obtained at 100 kb resolution. Chromatin interactions between centromeres were higher than those between chromosome arms, and chromatin interactions between telomeres were also higher than those between chromosome arms, indicating that centromeres and telomeres were clustered at opposite sides of the nucleus. Chromatin interactions were detected for 3.31% of B. rapa paralog pairs, with a total of 1,022 contacts and an average interaction frequency of 2.40. Most of the chromatin interactions between paralogs in A. thaliana were lost in the B. rapa genome; only 8.50% of paralog pairs in B. rapa retained the chromatin interactions found in A. thaliana.
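
The contiguity and anchoring figures quoted in the abstract can be checked with a short sketch. The code below is only an illustration, not the pipeline used in the dissertation: the function names `n50` and `fold_improvement` and the toy contig lengths are assumptions introduced here, while the numeric inputs for the fold changes and the anchored-gene percentage are taken from the abstract.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def fold_improvement(new_n50_kb, old_n50_kb):
    """Ratio of the new contig N50 to an older contig N50."""
    return new_n50_kb / old_n50_kb


# Toy contig lengths (hypothetical, NOT data from the dissertation).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)

# Contig N50 values (kb) as reported in the abstract.
fold_vs_v25 = fold_improvement(1446, 53)  # v3.0 vs. v2.5
fold_vs_v15 = fold_improvement(1446, 46)  # v3.0 vs. v1.5

# Share of gene models anchored on chromosomes, from the abstract's counts.
anchored_pct = 100 * 45411 / 45985

print(toy_n50, round(fold_vs_v25), round(fold_vs_v15), round(anchored_pct, 2))
```

Rounding the two ratios reproduces the ~27-fold and ~31-fold improvements stated for v3.0, and the anchored share rounds to the reported 98.75%.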
Keywords/Search Tags:Brassica rapa, de novo genome assembly, PacBio, Hi-C, TE expansion, centromeres' site, chromatin interactions