
Software Development For Extracting SNPs Across A Genetic Mapping Population And Construction Of Genetic Linkage Maps In Populus

Posted on: 2020-04-11
Degree: Master
Type: Thesis
Country: China
Candidate: D Yao
Full Text: PDF
GTID: 2393330626451142
Subject: Tree genetics and breeding
Abstract/Summary:
Traditional molecular markers are limited by low throughput and the difficulty of genotyping individuals, which usually leads to sparse linkage maps and hinders further studies in quantitative trait loci (QTL) mapping, marker-assisted breeding, genome assembly, and comparative genomics. Restriction site-associated DNA sequencing (RAD-seq) identifies large numbers of single nucleotide polymorphisms (SNPs) across the genomes of many individuals quickly and cost-effectively, making it possible to construct high-density genetic linkage maps. Although many software packages are available for ecological and evolutionary studies, few effective tools exist for extracting genotype data from RAD-seq for genetic mapping, and extracting a large number of SNP genotypes from RAD-seq data remains a challenging problem. New software is therefore needed to obtain large numbers of high-quality SNP genotypes across a population, so that high-density genetic linkage maps can be constructed in forest trees.

We developed an integrated pipeline called gmRAD for generating SNP genotypes de novo across a genetic mapping population from RAD-seq data. The package handles paired-end (PE) reads and can also analyze reads of different lengths within or between samples. gmRAD is freely available at https://github.com/tongchf/gmRAD. As an analytical strategy, gmRAD implements its algorithm in five steps: (1) clustering the first (forward) reads of each parent; (2) building two parental references; (3) generating parental SNP catalogs; (4) calling SNP genotypes across all individuals; (5) filtering the genotype data for genetic linkage mapping according to segregation patterns, Mendel's law of segregation, and missing genotypes in the population. All steps can be completed with a simple command line, and individual steps can also be run separately if their prerequisite files are available.

We used gmRAD to analyze RAD-seq data from the two parents and their 418 progeny in an F1 hybrid population of Populus deltoides and Populus simonii, obtaining a large number of SNPs and constructing high-density linkage maps of the two parents. The RAD-seq data of the parents and progeny amounted to 1486.2 Gb, from which 4021 SNPs with the segregation type ab×aa and 2101 with the type aa×ab were generated. After two-point linkage analysis, 2018 SNPs of segregation type ab×aa were grouped into 19 linkage groups under a wide range of LOD thresholds from 7 to 55, while 2097 SNPs of segregation type aa×ab were also consistently grouped into 19 linkage groups under a wide range of LOD thresholds from 7 to 29. The number of linkage groups exactly matched the karyotype of Populus. Next, we used several genetic mapping software packages to order the SNPs within each linkage group and chose the optimal result for constructing the linkage map. Consequently, the maternal linkage map of the female parent Populus deltoides was constructed with the SNPs of segregation type ab×aa, spanning a total genetic length of 7838.48 cM, with group lengths ranging from 217.03 to 928.64 cM and an average distance of 1.96 cM between adjacent SNP markers. The paternal linkage map of Populus simonii was constructed with the SNPs of segregation type aa×ab, with a total length of 5506.35 cM, group lengths from 178.60 to 716.40 cM, and an average distance of 2.65 cM between adjacent SNP markers.

The gmRAD software developed in this study is a powerful tool for extracting a large number of SNPs from RAD-seq data across a mapping population for constructing high-density linkage maps. It can be used to construct high-density genetic maps in forest trees regardless of whether a reference genome is available, and it can also be applied to traditional backcross (BC) and F2 populations derived from inbred lines. The two Populus linkage maps constructed here are of high quality in terms of both the SNP genotype data and the SNP ordering within each linkage group, providing an important genetic resource for identifying QTLs, accelerating molecular breeding programs, assembling genome sequences, and performing comparative genomics in Populus.
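The filtering in step (5) can be illustrated with a minimal Python sketch. This is not gmRAD's actual code: the function names, the 20% missing-data cutoff, and the 0.01 significance level are all hypothetical choices made for illustration. It assumes a testcross marker (ab×aa or aa×ab) whose progeny should segregate 1:1 as ab and aa, and it applies a chi-square goodness-of-fit test with one degree of freedom.

```python
import math

def segregation_p_value(n_ab, n_aa):
    """P-value of a chi-square test (1 df) for 1:1 segregation of ab and aa."""
    total = n_ab + n_aa
    if total == 0:
        return 0.0  # no usable genotypes, so the marker cannot pass
    # With expected counts total/2 each, the statistic reduces to this form.
    chi2 = (n_ab - n_aa) ** 2 / total
    # Survival function of the chi-square distribution with 1 df.
    return math.erfc(math.sqrt(chi2 / 2.0))

def keep_marker(genotypes, max_missing=0.2, min_p=0.01):
    """Decide whether one SNP passes the (hypothetical) filter thresholds.

    genotypes: one entry per progeny, "ab", "aa", or None for missing.
    Any other value is treated as missing (an assumption of this sketch).
    """
    n_ab = sum(1 for g in genotypes if g == "ab")
    n_aa = sum(1 for g in genotypes if g == "aa")
    n_missing = len(genotypes) - n_ab - n_aa
    if not genotypes or n_missing / len(genotypes) > max_missing:
        return False
    return segregation_p_value(n_ab, n_aa) >= min_p
```

For example, `keep_marker(["ab"] * 50 + ["aa"] * 50)` keeps the marker, while a marker with 90 ab and 10 aa progeny is rejected for distorted segregation.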
Keywords/Search Tags:restriction site-associated DNA sequencing, genetic linkage map, single nucleotide polymorphism, software, poplar