
Optimizing rare variant association studies in theory and practice

Posted on: 2015-05-04
Degree: Ph.D.
Type: Dissertation
University: Harvard University
Candidate: Wang, Ran
GTID: 1474390017996033
Subject: Biology
Abstract/Summary:
Genome-wide association studies (GWAS) have greatly improved our understanding of the genetic basis of complex traits. However, GWAS have two major limitations. First, most common variants identified by GWAS, individually or in combination, explain only a small proportion of heritability. This raises the possibility that additional forms of genetic variation, such as rare variants, could contribute to the missing heritability. Second, GWAS typically cannot identify which genes are affected by the associated variants. Examining rare variants, especially those in coding regions of the genome, can help address both issues. Moreover, several recent studies have identified low-frequency variants at both known and novel loci associated with complex traits, suggesting that functionally significant rare variants exist in the human population.

However, without a sufficiently large sample size, we are underpowered to detect rare variant effects because of the low allele frequencies and the large number of rare variants in the exome. This dissertation is divided into two parts that explore strategies for optimizing the power of rare variant association studies. First, we developed a cost-efficient pooled sequencing scheme, together with an analytic framework that ensures low false positive and false negative rates in variant discovery. We showed that this strategy is well suited to follow-up studies of candidate genes and to identifying potential genetic diagnoses in well-phenotyped patients. Second, we employed forward simulation to assess the usefulness of founder populations in rare variant association studies and to compare the efficiency of exome array genotyping with that of high-coverage exome sequencing. We developed a novel simultaneous simulation of sequence variation in the non-Finnish European and Finnish populations that closely approximates the empirical sequence data. We showed that studies of founder populations such as Finland can substantially increase power for discovery in a subset of genes, and that exome chip genotyping is currently much more cost-efficient than exome sequencing. Taken together, our results highlight the usefulness of studying diverse populations (ideally founder populations) and of employing cost-efficient study designs, such as exome chip followed by pooled sequencing, to boost the power of rare variant association studies.
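
The dependence on sample size noted in the abstract can be illustrated with a minimal sketch. This is not the dissertation's method and it is not a power calculation for an association test; it only shows how rarely a low-frequency allele is sampled at all. The function names and the assumption that the 2n chromosomes are independent draws at a fixed allele frequency are illustrative choices made here.

```python
def expected_copies(allele_freq: float, n_individuals: int) -> float:
    """Expected number of copies of the allele among 2n chromosomes."""
    return 2 * n_individuals * allele_freq


def prob_at_least_one_copy(allele_freq: float, n_individuals: int) -> float:
    """Probability that the allele appears at least once in the sample.

    Assumes (for illustration only) that the 2n chromosomes of n diploid
    individuals are independent draws with the same allele frequency.
    """
    return 1.0 - (1.0 - allele_freq) ** (2 * n_individuals)


if __name__ == "__main__":
    # Hypothetical allele frequencies and sample sizes, chosen for illustration.
    for freq in (0.01, 0.001, 0.0001):
        for n in (500, 5000, 50000):
            print(
                f"freq={freq:<7} n={n:<6} "
                f"expected copies={expected_copies(freq, n):8.1f} "
                f"P(seen at least once)={prob_at_least_one_copy(freq, n):.3f}"
            )
```

Under these assumptions, the number of observed copies scales linearly with both allele frequency and sample size, so a tenfold rarer variant needs roughly a tenfold larger sample to be observed equally often.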
Keywords/Search Tags: Association studies, GWAS, Exome, Sequencing