Hierarchical extensions of Bayesian parametric models for whole genome prediction

Posted on: 2015-05-24
Degree: Ph.D
Type: Dissertation
University: Michigan State University
Candidate: Yang, Wenzhao
Full Text: PDF
GTID: 1470390017994308
Subject: Animal sciences
Abstract/Summary:
Whole genome prediction (WGP) is increasingly used to predict breeding values (BV) of plants and animals from single nucleotide polymorphism (SNP) marker panels. Two particularly popular WGP models, labeled BayesA and BayesB, specify all SNP-associated effects to be independent of each other. In this dissertation, we extend these two models to allow greater flexibility in inferring BV and SNP effects within three frameworks: 1) correlated SNP effects, 2) reaction norm modeling of genotype by environment interaction (GxE), and 3) bivariate WGP models. We complement these efforts by focusing on strategies for inferring key hyperparameters that anchor some of these specifications.

Based on a first-order nonstationary antedependence specification, we extended BayesA and BayesB to account for spatial correlation between SNP effects due to proximal quantitative trait loci (QTL); we label the corresponding extensions ante-BayesA and ante-BayesB, respectively. Using simulation studies and applications to the publicly available heterogeneous stock mice data and other benchmark data, we determined that antedependence models had significantly higher WGP accuracies than their conventional counterparts, especially at higher levels of linkage disequilibrium (LD).

Subsequently, we extended reaction norm (RN) and random regression (RR) models to account for GxE. Several specifications of the SNP-specific variance-covariance matrices (VCV) of intercept and slope effects were considered using independent inverted Wishart (IW) prior densities (IW-BayesA, IW-BayesB and IW-BayesC). Two potentially more flexible RR/RN models using a square-root-free Cholesky decomposition (CD) were also proposed (CD-BayesA and CD-BayesB). Based on an RN simulation study and an RR data analysis in pigs, RR/RN WGP models provided greater WGP accuracies than conventional WGP models, although differences between the competing IW- and CD-based methods were not substantial except with simpler genetic architectures (i.e., a low number of QTL).

We also developed bivariate WGP models based on more or less the same specifications for SNP-specific VCV as in the RR/RN models (i.e., IW-BayesA, CD-BayesA and CD-BayesB), comparing them to the more conventional bivariate genomic BLUP (bGBLUP) model. In an LD simulation study, the three bivariate trait models generally demonstrated higher WGP accuracy than univariate BayesA or BayesB when the number of pleiotropic QTL was relatively large and the heritability of the trait was low. Furthermore, in an application to data from pine trees, CD-BayesB exhibited higher predictive ability than the other competing models.

Comparisons between competing WGP models require appropriate tuning of key hyperparameters. Hence, we also studied three alternative Metropolis-Hastings (MH) sampling strategies for inferring key hyperparameters in BayesA and BayesB. In both simulation studies and an application to the heterogeneous stock mice data, strategies that relied more heavily on MH sampling of key hyperparameters demonstrated significantly greater computational efficiencies than strategies that deferred to Gibbs sampling.
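
As an illustration of the first-order antedependence idea mentioned in the abstract, the following minimal sketch simulates SNP effects through the recursion g_j = t_j * g_(j-1) + delta_j, where each SNP has its own antedependence coefficient t_j and its own innovation standard deviation. The function names, the constant coefficient of 0.8, and the unit standard deviations are illustrative assumptions and are not taken from the dissertation; the sketch only simulates effects and does not implement the ante-BayesA or ante-BayesB samplers.

```python
import random


def simulate_ante_effects(t, sd_delta, seed=0):
    """Simulate SNP effects under a first-order antedependence recursion.

    t[j]        : hypothetical coefficient linking SNP j to SNP j-1
                  (t[0] has no effect because there is no preceding SNP)
    sd_delta[j] : hypothetical SNP-specific innovation standard deviation
    """
    if len(t) != len(sd_delta):
        raise ValueError("t and sd_delta must have the same length")
    rng = random.Random(seed)
    effects = []
    prev = 0.0  # no SNP precedes the first one
    for t_j, sd_j in zip(t, sd_delta):
        delta_j = rng.gauss(0.0, sd_j)  # independent innovation for SNP j
        g_j = t_j * prev + delta_j      # effect depends on the neighbouring SNP
        effects.append(g_j)
        prev = g_j
    return effects


def lag1_correlation(x):
    """Pearson correlation between each effect and the effect of the next SNP."""
    a, b = x[:-1], x[1:]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


m = 5000  # assumed number of SNPs, chosen only for illustration
# All coefficients zero: effects are independent, as in conventional BayesA/BayesB.
independent = simulate_ante_effects([0.0] * m, [1.0] * m)
# Nonzero coefficients: effects of adjacent SNPs become correlated.
correlated = simulate_ante_effects([0.8] * m, [1.0] * m)
print(round(lag1_correlation(independent), 2), round(lag1_correlation(correlated), 2))
```

Setting every coefficient to zero recovers independent SNP effects, which is the assumption the abstract attributes to conventional BayesA and BayesB; allowing the coefficients and innovation standard deviations to differ by SNP is what makes the specification nonstationary.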
Keywords/Search Tags:Models, WGP, Key hyperparameters, SNP effects, Compared, Strategies