
Using Dimension Reduction Techniques to Model Genetic Relationships for Association Studies

Posted on: 2013-02-09
Degree: Ph.D
Type: Dissertation
University: Carnegie Mellon University
Candidate: Crossett, Andrew
Full Text: PDF
GTID: 1453390008476446
Subject: Statistics
Abstract/Summary:
Cryptic relatedness can inflate false positive rates above their nominal level in genome-wide association (GWA) tests. One way this confounding arises in genetic studies is through inherent ancestral differences between supposedly unrelated cases and controls. A common way to alleviate the problem is to use a family-based design. Unfortunately, it is not always easy to collect enough families for the test to have acceptable power. Often, however, the researcher has both case-control and family-based data. To that end, we propose a method for jointly analyzing the two study designs, called matched-conditional logistic regression (mCLR). We match individuals between the studies based on an eigenanalysis of genotype information and then perform conditional logistic regression on the estimated strata. Once samples are well matched, mCLR yields power comparable to that of competing methods while ensuring excellent control of Type I error.

Another source of cryptic relatedness is a researcher's desire to sample individuals from an isolated population. Standard GWA tests do not apply because everyone in the study is related at some level. Most algorithms developed for this setting rely on knowing the relatedness between the individuals in the study. Unfortunately, estimates of pairwise relatedness are typically noisy. We developed a method called Treelet Covariance Smoothing (TCS) that refines genetically inferred relationships. We apply this method to both simulated and freely available datasets to demonstrate its advantages. In particular, we use the less noisy relationship estimates to obtain better estimates of heritability, a key concept in quantitative genetics. Finally, we develop a subsampling technique for choosing the tuning parameter in TCS that exploits the vast amount of genotype information available.
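
As a rough illustration of the matching idea behind mCLR (an eigenanalysis of genotype data followed by grouping samples into strata), the following minimal Python sketch may help. It is not the dissertation's procedure: the function names `top_eigenvectors` and `match_to_strata`, the per-SNP standardization, the use of two components, and the nearest-neighbor matching rule are all assumptions made for illustration, and the conditional logistic regression step is omitted.

```python
import numpy as np


def top_eigenvectors(genotypes, k=2):
    """Return top-k principal-component scores for an
    (n_samples, n_snps) matrix of 0/1/2 allele counts.

    Hypothetical helper; the standardization used here is an assumption.
    """
    g = np.asarray(genotypes, dtype=float)
    # Center each SNP column, then scale by its standard deviation.
    g = g - g.mean(axis=0)
    sd = g.std(axis=0)
    # Drop monomorphic SNPs (zero variance) to avoid division by zero.
    keep = sd > 0
    g = g[:, keep] / sd[keep]
    # Left singular vectors scaled by singular values give the PC scores.
    u, s, _ = np.linalg.svd(g, full_matrices=False)
    return u[:, :k] * s[:k]


def match_to_strata(case_control_scores, family_scores):
    """Assign each case-control sample to its nearest family sample
    in eigen-space; the returned index serves as a stratum label.

    Nearest-neighbor matching is an assumed stand-in for the actual
    matching rule, which the abstract does not specify.
    """
    diff = case_control_scores[:, None, :] - family_scores[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    return dist.argmin(axis=1)
```

In a full analysis, the stratum labels would then be passed to a conditional logistic regression routine, so that each test of association is carried out within groups of genetically similar individuals.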
Keywords/Search Tags: Relationships, Relatedness