
The Genome Assembly, Repeat Annotation And Comparative Analysis Of Mo17 And Teosinte

Posted on: 2017-03-24
Degree: Master
Type: Thesis
Country: China
Candidate: R R Wang
Full Text: PDF
GTID: 2180330485477700
Subject: Bioinformatics
Abstract/Summary:
Maize is the world's most productive crop and an important species in biological research. Teosinte is its wild ancestor, from which maize was domesticated thousands of years ago. Studying the differences between teosinte and maize can help identify key domestication genes, which in turn benefits maize breeding and other aspects of production practice.

With advances in sequencing technologies and bioinformatics algorithms, complex genomes can now be deciphered and analyzed at relatively low cost, which lays a solid foundation for addressing more important biological questions.

In this study, we sequenced ten individuals chosen from the BC2F6 progeny of a cross between Mo17, a maize variety, and teosinte (TEO). We inferred the Mo17 and teosinte genomes from these ten individuals using a bin map based on the B73 reference genome. We combined the reference genome, PacBio data, de novo assembly and long-insert libraries to obtain high-quality genome sequences. We then annotated the repeat sequences in the assembled genomes and detected SNPs and InDels between genomes by genomic sequence alignment. Finally, we detected presence/absence variation (PAV) and presence/absence (PA) genes by combining whole-genome alignment with read depth.

The main findings of this study are as follows:

1) Assembly results. The Mo17 contigs have a total length of 2,005,908,899 bp and an N50 of 60,508 bp; the Mo17 scaffolds have a total length of 2,041,547,554 bp and an N50 of 2,995,073 bp. The TEO contigs have a total length of 1,157,520,532 bp and an N50 of 26,638 bp; the TEO scaffolds have a total length of 1,204,281,382 bp and an N50 of 107,689 bp.

2) Annotation of repeat sequences. Repeat sequences make up 79.67% of the Mo17 genome and 72.79% of the TEO genome.

3) Identification of SNPs and InDels. Mo17 and B73 differ by 3,090,851 SNPs and 527,406 InDels. TEO and B73 differ by 3,697,069 SNPs and 578,249 InDels. TEO and Mo17 differ by 3,124,430 SNPs and 464,498 InDels.

4) Identification of PAV. Between Mo17 and B73, the total PAV length is 88,736,738 bp, with 1,293 PA genes. Between TEO and B73, the total PAV length is 133,138,377 bp, with 1,968 PA genes. Between TEO and Mo17, the total PAV length is 100,658,635 bp, with 1,854 PA genes.
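For readers unfamiliar with the N50 statistic quoted for the contigs and scaffolds, the following is a minimal sketch of the commonly used definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions, not taken from the thesis, and the thesis does not state which tool computed its values.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    # Walk from the longest sequence downward, accumulating length.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that brings the running sum to at least
        # half of the total assembly length defines the N50.
        if running >= half_total:
            return length


# Hypothetical toy assembly (not data from the thesis): total is 300 bp,
# and the two longest contigs (80 + 70) already reach the 150 bp midpoint.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

A larger N50 for the same total length means the assembly is made of fewer, longer pieces, which is why scaffold N50 values are typically much larger than contig N50 values.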
Keywords/Search Tags: maize, genome, assembly, repeat, sequence variation