
Practical Strategies of Bacterial Genome Assembly: Hybrid Assembly

Posted on: 2016-03-05  Degree: Master  Type: Thesis
Country: China  Candidate: C Y Lin  Full Text: PDF
GTID: 2180330503953052  Subject: Bio-engineering
Abstract/Summary:
Background: Determining the genomic sequences of microorganisms is the basis and prerequisite for understanding their biology and characterizing their functions, and obtaining a complete genome sequence is the primary task. Illumina technology is now very mature, with read lengths having grown from the initial single-end 36 bp (GA) to the current paired-end 300 bp (MiSeq). A large number of genomes have been assembled de novo from Illumina sequence data, but these data have short read lengths and biased genome coverage, which leads to fragmented assemblies, especially in highly repetitive, highly heterozygous, and otherwise complex regions. The long reads produced by PacBio technology can compensate, to a certain extent, for the assembly defects caused by short Illumina reads. However, high cost and low throughput have limited the large-scale application of PacBio sequencing.

Results: We sequenced the genomes of S. enterica, S. aureus, and B. parapertussis with the Illumina HiSeq 2500 and PacBio RS systems, yielding more than 150X Illumina data and more than 50X PacBio data, respectively. We assembled the three genomes with Illumina data alone, with PacBio data alone, and with hybrid combinations of the two data types at different coverage depths. The results show that the hybrid strategy is feasible and that a good assembly can be achieved by combining 50X Illumina data with 20X PacBio data. For all three bacteria, the hybrid assembly produced a single contig; the contig N50 lengths were approximately 4.85 Mb, 2.91 Mb, and 4.18 Mb, respectively, the coverage ratio exceeded 0.99, and the similarity was above 0.95.

Conclusions: The hybrid assembly strategy makes full use of the high accuracy of Illumina data and the long reads of PacBio data. It reduces the difficulty of assembling complex regions, such as highly repetitive and highly heterozygous regions, and brings the assembly closer to completion. The strategy meets the requirements for assembly accuracy, integrity, and continuity while keeping sequencing cost as low as possible.
Keywords/Search Tags: Next-generation Sequencing, Third-generation Sequencing, De Novo Assembly, Illumina HiSeq, Pacific Biosciences
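
The abstract reports contig N50 values and coverage depths such as 50X and 20X. As a rough illustration of what these quantities mean, the following minimal Python sketch computes N50 from a list of contig lengths and the fraction of reads to keep when downsampling to a target depth. It is not the code used in the thesis; the function names `n50` and `subsample_fraction` and the example numbers passed to them are assumptions made for illustration only.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


def subsample_fraction(available_depth, target_depth):
    """Fraction of reads to keep to go from available_depth to target_depth.

    Depth (fold coverage, "X") is total sequenced bases divided by
    genome size, so keeping a random fraction f of the reads scales
    the depth by roughly f.
    """
    if available_depth <= 0 or target_depth <= 0:
        raise ValueError("depths must be positive")
    if target_depth > available_depth:
        raise ValueError("cannot subsample to a higher depth")
    return target_depth / available_depth


# A single-contig assembly has an N50 equal to that contig's length.
single_contig_n50 = n50([4_850_000])

# Hypothetical fragmented assembly, for contrast.
fragmented_n50 = n50([2, 2, 2, 3, 3, 4, 8, 8])

# Keeping about one third of a 150X data set gives roughly 50X.
illumina_fraction = subsample_fraction(150, 50)
```

When an assembly consists of a single contig, as reported for the three hybrid assemblies, N50 simply equals the length of that contig, so N50 alone no longer distinguishes assemblies and the coverage ratio and similarity become the more informative measures.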