Assembly of the Gossypium raimondii Genome Based on Hi-C Data

Posted on: 2019-06-19  Degree: Master  Type: Thesis
Country: China  Candidate: Q H Yang
GTID: 2393330545475992  Subject: Agricultural Extension
Abstract/Summary:
Cotton is an important natural fiber crop and holds an important position in the national economy. As Gossypium genomes have been sequenced in recent years, genome assembly has proved very difficult and sometimes inaccurate. The main reasons are the relatively large number of Gossypium chromosomes, the large genome size, and the abundance of repeat sequences. Hi-C (High-throughput Chromosome Conformation Capture) is a technique for studying how chromatin folds within the whole cell nucleus; it yields high-resolution information on the 3D structure of chromatin by capturing the interactions between any two chromatin loci. Hi-C data follow two regular patterns. First, the frequency of Hi-C interactions between pairs of loci within a chromosome is significantly higher than that between loci on different chromosomes; based on this pattern, scaffolds can be clustered into groups (pseudo-chromosomes). Second, the frequency of Hi-C interactions between a pair of loci is inversely related to the genomic distance between them; based on this pattern, the clustered scaffolds can be ordered and oriented. Wang et al. sequenced and assembled a draft genome of G. raimondii in 2012 using the next-generation Illumina paired-end sequencing strategy. Because of the low density of the genetic map, only about 73.2% of the assembled sequences (scaffolds) were anchored on the 13 G. raimondii chromosomes, and 52.4% of the scaffolds were ordered and oriented successfully. We therefore reassembled the G. raimondii genome based on Hi-C data. We carried out the Hi-C experiment on G. raimondii and filtered out invalid Hi-C interactions to obtain valid Hi-C data. We first corrected errors in the scaffolds using the Hi-C data and then clustered the scaffolds into 13 pseudo-chromosomes; within each pseudo-chromosome, we ordered and oriented the scaffolds using the Hi-C data. In total, 98.42% of the total length was clustered successfully, and 98.07% of the total length was ordered and oriented successfully. To verify the results, we drew a heat map of the Hi-C data and a collinearity map between the reassembled G. raimondii genome and the Paterson G. raimondii genome. The results show clearly that Hi-C data can be used to assemble the Gossypium raimondii genome and that this method is effective, accurate, and economical.
Keywords/Search Tags: Gossypium raimondii, genome, assembly, Hi-C