
Exploration And Application Of Association Mapping And Genomic Prediction

Posted on: 2017-01-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Xu
Full Text: PDF
GTID: 1313330518469224
Subject: Crop Genetics and Breeding
Abstract/Summary:
Association studies and genomic prediction are both crucial methods for detecting genes and achieving genetic improvement of complex quantitative traits in plant breeding. The application of association studies to the identification of genes underlying quantitative traits has developed widely since 2001, and statistical power and computational efficiency have increased with the development of numerous algorithms and software packages. However, multi-trait and multi-locus association studies have not attracted much attention. Meanwhile, genomic prediction has become increasingly popular in plant genomics research. Compared with traditional marker-assisted selection (MAS), genomic selection (GS) uses markers across the whole genome to predict the performance of progenies from hybridization or selfing. Various approaches, including parametric, semiparametric and nonparametric regression, have been applied to GS, which not only significantly shortens the breeding cycle but also effectively decreases the cost of breeding. With the development of molecular technology, various kinds of data, such as metabolomic and transcriptomic data, have become available and provide new data sources for GS. Nevertheless, there has been no systematic comparison of different methods and different types of data for GS.

In this study, we developed a new approach to multi-trait and multi-locus association analysis based on the LASSO and partial least squares (PLS), with the aim of increasing the statistical power of association studies. We also compared the predictability of GS across methods and data sources, and used omic data to predict the yield of hybrid rice. The research contains four parts.

1. Multi-locus and multiple-trait methods

(1) LASSO-based genome-wide association studies. We sampled 422 oil palms from the Angola germplasm collection and measured 13 economic traits on these palms. A total of 1081 informative SNP markers were identified from the 4451 markers on the Illumina Infinium BeadChip platform. Multi-locus genome-wide association studies (GWAS) were conducted using LASSO (least absolute shrinkage and selection operator) and GEMMA (genome-wide efficient mixed-model analysis). We identified 19 SNPs for the 13 traits. The majority of the QTLs were detected by LASSO, in which the p-values of individual markers were calculated from bootstrapped standard errors. The method detected QTLs with effects as small as 0.1% of the phenotypic variance. Many of the detected QTLs are near known QTLs detected in linkage studies reported by other research groups.

(2) A multivariate partial least squares approach to joint association analysis for multiple correlated traits. Many complex traits are highly correlated rather than independent. By taking the correlation structure of multiple traits into account, joint association analyses can achieve both higher statistical power and more accurate estimation. To develop a statistical approach to joint association analysis that includes allele detection and genetic effect estimation, we combined multivariate partial least squares regression with variable selection strategies and selected the optimal model using the Bayesian Information Criterion (BIC). We then performed extensive simulations under varying heritabilities and sample sizes to compare the performance of our methods with that of single-trait multi-locus methods. Joint association analysis has measurable advantages over single-trait methods, as it exhibits superior gene detection power, especially for pleiotropic genes. Sample size, heritability, polymorphic information content (PIC) and the magnitude of gene effects influence the statistical power, accuracy and precision of effect estimation in joint association analysis.

2. Comparison and application of statistical methods for genomic prediction

(1) Comparison of statistical methods for omic prediction of agronomic traits in maize. Genomic selection holds great promise for accelerating plant breeding via early selection before phenotypes are measured, and it offers major advantages over marker-assisted selection for highly polygenic traits. In addition to genomic data, the metabolome and transcriptome are receiving increasing attention as new data sources for phenotype prediction, and several statistical methods have been applied to predict phenotypes. We now have 100k SNPs, 28769 transcripts and 748 metabolites measured in 368 diverse inbred lines of maize. We compared the predictive abilities for six agronomic traits with the three data sources using eight representative methods: best linear unbiased prediction (BLUP), least absolute shrinkage and selection operator (LASSO), partial least squares (PLS), BayesA, BayesB, reproducing kernel Hilbert spaces regression (RKHS) and support vector machines (SVM-RBF and SVM-POLY). We found that GBLUP performs well overall across different traits and omic data types, and that genomic prediction performs better than transcriptomic and metabolomic prediction. Finally, we evaluated predictive performance using markers selected on the basis of GWAS results and found that this combination did not improve prediction.

(2) Omic prediction of yield-related traits in hybrid rice. Transcriptomic and metabolomic data are now available as potential resources for prediction. We found that the predictability of hybrid yield can be further increased using these omic data. LASSO and BLUP are the most efficient methods for yield prediction. For high-heritability traits, genomic data remain the most efficient predictors. When metabolomic data are used, the predictability of hybrid yield is almost doubled compared with genomic prediction. Of the 21945 potential hybrids derived from 210 recombinant inbred lines, selection of the top 10 hybrids predicted from metabolites would lead to an approximately 30% increase in yield.
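
The figure of 21945 potential hybrids is consistent with counting every unordered pair of distinct parents among the 210 recombinant inbred lines. The short Python check below reproduces that arithmetic. It is only a sketch under the assumption that reciprocal crosses are not counted separately and that selfs are excluded; the abstract does not state the crossing design, and the variable names are illustrative.

```python
from math import comb

# Values taken from the abstract.
n_lines = 210   # recombinant inbred lines used as potential parents
top_k = 10      # number of top-ranked predicted hybrids selected

# Assumption: one hybrid per unordered pair of distinct parents
# (no reciprocal crosses, no selfing).
n_hybrids = comb(n_lines, 2)

# Share of the candidate pool represented by the selected hybrids.
fraction_selected = top_k / n_hybrids

print(n_hybrids)                   # 21945
print(f"{fraction_selected:.4%}")  # share of all potential hybrids
```

Under this assumption the selected hybrids make up well under 0.1% of the candidate pool, which is why prediction is attractive here: only a small subset of the possible crosses would need to be made and phenotyped.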
Keywords/Search Tags:Association analysis, LASSO, PLS, Omic prediction, BLUP