Iterative Screen Regression: A New Approach to Dissecting Genetic Effects for Complex Traits

Posted on: 2019-01-04
Degree: Master
Type: Thesis
Country: China
Candidate: M Luo
GTID: 2393330545470020
Subject: Crop Genetics and Breeding

Abstract/Summary:
Genome-wide association studies (GWASs) have successfully identified thousands of genetic variants associated with complex traits in humans, animals, and plants. However, few GWASs have been able to identify all the genetic factors underlying these traits, which are influenced by a combination of multiple genetic and environmental factors. Most current analysis techniques suffer from the problem of missing heritability and explain only a small share of the phenotypic variance. Genome-wide association (GWA) methods commonly rely on mixed models to correct for population structure and thereby avoid spurious associations between a trait and genetic markers. However, single-locus analyses are inconsistent with the nature of complex traits, which result from the joint action of multiple genes (including epistasis). Further study of gene interactions may therefore help to detect genetic effects that univariate methods cannot detect, and new methods are needed that can detect additive and epistatic effects simultaneously.

Moreover, in addition to the discovery of trait-associated variants, there is increasing interest in predicting complex trait phenotypes from individual genotypic data in plant breeding, animal breeding, and humans. These predictions are based on the selection of SNPs and the estimation of their effects in a discovery sample, followed by validation in an independent sample with known phenotypes and, ultimately, application to samples with unknown phenotypes.

This study is based on our proposed statistical method, iterative screen regression (ISR), which first constructs a new model selection criterion, RIC (Regression Information Criterion), and then uses a special variable selection procedure to maximize the RIC, in order to achieve an optimal dissection of the genetic effects of complex traits. We mainly applied it to detect epistatic and additive effects in genome-wide association studies (GWAS) and genomic selection (GS). The study includes the following three parts:

(1) All common GWA methods rely on population structure correction to avoid false genotype-phenotype associations. However, population structure correction is a stringent penalization, which also impedes the identification of real associations. Here, we used recent statistical advances to propose ISR, which enables simultaneous multi-marker association and is shown to correct appropriately for population stratification and cryptic relatedness in GWAS. Analyses of simulated data suggest that ISR performed well in terms of power (sensitivity) versus false discovery rate (FDR) and specificity, and gave less biased (more accurate) estimates of effects (PVE) than existing multi-locus (mixed) models and the single-locus (mixed) model. We also show the practicality of our approach by applying it to rice, outbred mice, and A. thaliana datasets, in which it identified several new causal loci that other methods did not detect. ISR provides an alternative for multi-locus GWAS, and its implementation is computationally efficient, making the analysis of large datasets (n > 100,000) practicable.

(2) Interactions between genes, also known as epistasis, regulate many complex traits in different species. With the availability of low-cost genotyping, it is now possible to study epistasis on a genome-wide scale. However, identifying genome-wide epistasis is a supersaturated multiple regression problem, which is challenging and computationally inefficient. ISR is applicable to dissecting the genetic effects of complex traits in such problems. Two datasets of real genotypes (human and rice) were used to simulate phenotypic data, and ISR was then used to dissect the genetic architecture of complex traits in two population types commonly used in plants, IMF2 (rice) and MAGIC (barley). Simulations showed that our approach was significantly better than the commonly used exhaustive search method (PLINK) in terms of detection power and type I error. In the rice RIL dataset, our full QTL model included additive and dominance main effects of 1,619 markers and all pairwise interactions, for a total of more than 5 million possible effect terms. The QTL association mapping identified 42, 45, 38 and 48 significant effects for the four traits in two years, namely the number of panicles per plant, the number of grains per panicle, grain weight, and yield per plant. Most identified QTLs were involved in digenic interactions. In the barley MAGIC dataset, our QTL model included additive effects of 3,413 markers and all pairwise interactions, again with more than 5 million possible effect terms. The QTL association mapping identified 21 significant effects for the flowering time (FT) trait. We also observed that many of our findings (genomic regions with main or higher-order epistatic effects) overlap with known candidate genes already reported for these complex traits in rice, barley, and closely related species. All results suggest that ISR is an efficient approach to detecting high-order interactions in a multi-locus model.

(3) Although genome-wide association studies have identified markers associated with various complex traits and diseases, the ability to predict such phenotypes remains limited. A perhaps overlooked explanation lies in the limitations of the genetic models and statistical techniques commonly used in association studies. Moreover, in most cases these loci explain such a small fraction of phenotypic variability that their use for predicting diseases is limited. However, using genotype data to perform accurate genetic prediction of complex traits could facilitate genomic selection in animal and plant breeding programs, and could aid the development of personalized medicine in humans. Because most complex traits have a polygenic architecture, accurate genetic prediction often requires modeling all genetic variants together via polygenic methods. Herein, we also used ISR for genomic prediction. We compared ISR with several commonly used prediction methods in simulations, and further applied it to the prediction of 12 traits across four species: cattle, rice, wheat and mice. The results indicated that ISR had higher predictive power and stability than several commonly used polygenic prediction methods (e.g., rrBLUP, BSLMM and BayesC).
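
The "more than 5 million possible effect terms" quoted in part (2) can be checked with simple counting. The sketch below is illustrative only and is not taken from the thesis: the function name `n_effect_terms` is hypothetical, and it assumes that each marker pair contributes one interaction term per combination of main-effect types (four per pair when additive and dominance effects are both modeled, one per pair when only additive effects are modeled).

```python
from math import comb


def n_effect_terms(n_markers: int, with_dominance: bool) -> int:
    """Count main-effect and pairwise-interaction terms in a full model.

    Assumption (not stated in the abstract): with additive and dominance
    effects, each marker pair yields 2 x 2 = 4 interaction terms
    (additive-additive, additive-dominance, dominance-additive,
    dominance-dominance); with additive effects only, 1 term per pair.
    """
    per_marker = 2 if with_dominance else 1
    main_effects = per_marker * n_markers
    pairwise = comb(n_markers, 2) * per_marker ** 2
    return main_effects + pairwise


# Rice: additive + dominance main effects of 1,619 markers.
rice_terms = n_effect_terms(1619, with_dominance=True)
# Barley MAGIC: additive effects of 3,413 markers.
barley_terms = n_effect_terms(3413, with_dominance=False)

print(rice_terms)    # 5242322
print(barley_terms)  # 5825991
```

Under these assumptions both models exceed 5 million candidate terms, which is consistent with the figures in the abstract and shows why the problem is described as supersaturated: the number of candidate terms far exceeds any realistic sample size.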
Keywords/Search Tags: variable selection, linear model, nonlinear model, multi-loci model, genome-wide association study, genomic prediction