
Preliminary Assembly And Analysis Of Chinese Sika Deer Genome And Single Nucleotide Polymorphisms Studies

Posted on: 2013-01-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H X Ba
Full Text: PDF
GTID: 1113330374458007
Subject: Special economic animal breeding
Abstract/Summary:
The advent and rapid development of next-generation sequencing technology have made it possible to analyse the whole genome of any species comprehensively. In this study, we sequenced the whole genome of an adult male Chinese sika deer (Cervus nippon hortulorum), one of the most important species for the Chinese deer industry, using a whole-genome shotgun sequencing approach on the SOLiD (Applied Biosystems) sequencing platform.

Following preliminary analysis of the raw sequencing data, we obtained 1.92 billion 50 bp paired-end reads, equivalent to 32x coverage of the sika deer genome. Two separate libraries were made, one with 1 kbp inserts and the other with 2 kbp inserts. To assemble the sika deer genome we combined several currently available techniques: global de novo assembly, reference-guided local assembly and conserved-synteny local assembly. In addition, we made use of the conserved synteny between the sika deer and cattle genomes. We generated 4.15 million contigs of 100 bp or longer, comprising a total of 1.83 billion base pairs of assembled sequence. The contig N50 is 695 bp, and the maximum contig length is 10.80 kbp. The quality of the assembled contigs was checked, and about 0.3% of them were found to be misassembled; these contigs were corrected before the subsequent analyses were carried out. We then used the paired-end information to generate 1.94 million scaffolds with an N50 of 21.6 kbp and a maximum length of 249 kbp. The assembled genome has 1.8 million gaps.

Next, we analyzed the assembled sika deer genome with a variety of bioinformatics approaches and tools. The main results and conclusions are as follows.

1. Sequencing bias across regions of the assembled sika deer genome led to markedly different coverage: the higher the GC content of a genome region, the lower its coverage. The assembled sika deer genome covers about 62% of the reference cattle genome (identity above 85%) and about 62% of the deer transcriptome sequence (identity above 90%).
2. Of the collected microsatellite primers for deer, cattle and sheep, 1,534 (61% of the total) were screened as candidates in the sika deer genome because they are conserved between the sika deer and the cattle and sheep.
3. The deer and cattle genomes are more conserved in functional regions than in non-functional regions. There is a strong positive correlation between SNP variation and indel variation across the different genome regions. When the SNPs between human and chimpanzee were compared with those between cattle and deer, the results once again supported the molecular clock theory.
4. Within the sequenced and assembled genome, 2.7 million SNPs were detected, equivalent to one SNP every 678 bp. The SNP heterozygous rates were 0.152%, 0.087% and 0.082% in the autosomal genome, exons and coding regions, respectively. The high SNP heterozygous rate in the sika deer genome suggests that the sika deer derives from different genetic backgrounds, with gene flow arising during long-term domestication and artificial breeding.
5. This study produced 6,367 SNP sites that are suitable for developing a medium-density SNP genotyping chip. Such a chip, if developed, could be applied to different deer species, including sika deer, red deer and wapiti. The study also demonstrated the feasibility of developing a high-density whole-genome SNP genotyping chip.
6. We further confirmed the claim that Numt sequences are widespread in vertebrate and invertebrate nuclear genomes, which is in accordance with the assembled sika deer mitochondrial genome. This implies that there are numerous Numt sequences in the sika deer nuclear genome, almost equivalent to 1,867 mitochondrial genomes...
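
The assembly statistics quoted in the abstract (N50, fold coverage, SNP density) follow from simple arithmetic, which the sketch below illustrates. It is not the pipeline used in the dissertation: the function names are hypothetical, N50 is computed by its usual definition (the length L such that sequences of length L or more hold at least half of the assembled bases), and the genome size of 3.0 Gbp used in the usage note is an assumption chosen only because it is consistent with the reported 32x coverage; the abstract does not state it.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Usual definition: sort lengths from longest to shortest and return the
    length at which the running total first reaches half of all bases.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50() needs at least one sequence length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def fold_coverage(read_count, read_length_bp, genome_size_bp):
    """Sequenced bases divided by genome size (genome size is an assumption)."""
    return read_count * read_length_bp / genome_size_bp


def bases_per_snp(assembled_bp, snp_count):
    """Average spacing between SNPs over the assembled sequence."""
    return assembled_bp / snp_count
```

With the figures reported above, `fold_coverage(1_920_000_000, 50, 3_000_000_000)` gives 32.0 under the assumed 3.0 Gbp genome size, and `bases_per_snp(1_830_000_000, 2_700_000)` gives roughly 678, matching the stated density of one SNP every 678 bp.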
Keywords/Search Tags: Sika deer genome sequencing, Whole-genome assembly, Genome analysis, SNP, Mitochondrial genome