
Improved individual ancestry estimates for proper adjustment of ancestral confounding in association analysis

Posted on: 2009-03-30
Degree: Ph.D.
Type: Dissertation
University: Case Western Reserve University
Candidate: Parrado, Tony
Full Text: PDF
GTID: 1440390002490793
Subject: Biology
Abstract/Summary:
Case-control studies are susceptible to false-positive findings caused by population stratification, which arises either when allele frequencies differ between cases and controls because of differences in ancestry distribution, or when there is unrecognized stratification within cases or controls because subgroups with different ancestries are present. Accurate estimates of admixture proportions at the individual level are therefore important. Statistical methods exist to infer individual ancestry from genetic data and to assign admixed individuals probabilistically to two or more populations jointly. We seek to improve the accuracy of individual ancestry estimates (IAEs) obtained with these statistical approaches, thereby reducing the number of false-positive findings due to ancestry in case-control studies.
We evaluate several approaches to improving the accuracy of the IAEs using the methods implemented in Structure (Pritchard, Stephens, and Donnelly 2000) and in the principal components approach (Zhang et al. 2002). First, we consider whether using prior information to preselect the prior admixture distribution parameter (α) improves the IAEs. We show that the IAEs are insensitive to the preselected α parameter. Second, we assess the importance of including pseudo-ancestral subjects (PAs) during the inference process and conclude that including PAs does not improve the accuracy of the IAEs when markers that are moderately or highly informative for ancestry are used. Third, we determine the number of markers required to obtain accurate IAEs, given the absolute allele frequency difference (delta) between the parental populations at the preselected SNPs, the level of divergence between the parental populations, and the genetic contribution of the parental populations to the admixed sample. We show that the number of SNPs necessary to infer accurate IAEs depends not only on the distribution of delta values, but also on the range of ancestry contribution from the population that contributes less to the mixture. Finally, we determine whether combining sociocultural information (e.g., great-grandparental origin) with genetic information to infer a genetic background variable reduces the number of false-positive results. We find no statistically significant difference in the number of false-positive results when great-grandparental origin is combined with the SNP data to derive the genetic background variable for each subject. Our findings will help improve study designs that control for population stratification in association studies of admixed populations.
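
As a rough illustration of the two quantities discussed in the abstract, the sketch below computes delta for each SNP and a single individual's ancestry proportion under a simple two-population model. It is not the Structure algorithm or the principal components approach; it is a plain grid-search maximum-likelihood estimate that assumes the parental allele frequencies are known and the SNPs are independent. The function names (`delta`, `estimate_ancestry`), the example frequencies, and the grid resolution are all hypothetical.

```python
import math


def delta(p1, p2):
    """Absolute allele frequency difference between two parental populations, per SNP."""
    return [abs(a - b) for a, b in zip(p1, p2)]


def estimate_ancestry(genotypes, p1, p2, steps=1000):
    """Grid-search MLE of the ancestry proportion q from population 1.

    Assumptions (illustrative only): parental frequencies p1 and p2 are known,
    SNPs are independent, and each genotype is the count (0, 1, or 2) of the
    reference allele, drawn with allele frequency q*p1 + (1 - q)*p2.
    """
    best_q, best_ll = 0.0, -math.inf
    for i in range(steps + 1):
        q = i / steps
        ll = 0.0
        for g, a, b in zip(genotypes, p1, p2):
            # Admixed allele frequency, clamped away from 0 and 1 to keep log finite.
            p = min(max(q * a + (1 - q) * b, 1e-9), 1 - 1e-9)
            ll += g * math.log(p) + (2 - g) * math.log(1 - p)
        if ll > best_ll:
            best_q, best_ll = q, ll
    return best_q


# Hypothetical parental allele frequencies at three ancestry-informative SNPs.
pop1 = [0.9, 0.8, 0.1]
pop2 = [0.1, 0.2, 0.9]

deltas = delta(pop1, pop2)
q_hat = estimate_ancestry([1, 1, 1], pop1, pop2)
```

With only three markers the estimate is very coarse; in a sketch like this, adding SNPs with larger delta values is what sharpens the likelihood around the true proportion.
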
Keywords/Search Tags: Improve, Individual ancestry, False positive, Population, Findings, Stratification, Studies, Estimates