Statistical Inference In Genetic Association Studies

Posted on: 2010-07-08    Degree: Doctor    Type: Dissertation
Country: China    Candidate: M Yuan    Full Text: PDF
GTID: 1114360275955461    Subject: Probability theory and mathematical statistics
Abstract/Summary:
Genetic association analysis is commonly used to detect susceptibility genes for human diseases. The case-control design, the matched case-control design and the family-based design are three common approaches to collecting data in association studies. A critical issue in association studies is how to increase the power of association tests, especially in genome-wide association studies. Under certain ideal conditions, the genotype frequencies of a population remain stable, as described by the Hardy-Weinberg equilibrium law. A deviation from Hardy-Weinberg equilibrium may indicate mutation or an association between the gene and the disease. The interaction pattern between alleles, i.e. the genetic model, is reflected in the mode of Hardy-Weinberg disequilibrium. In this dissertation, we investigate how to estimate the genetic model using information on Hardy-Weinberg disequilibrium, and we propose efficiency-robust tests that incorporate the estimated genetic model into trend tests for the three designs mentioned above. Pearson's chi-square test and the Cochran-Armitage trend test are the most frequently used methods in association studies. Pearson's chi-square test does not depend on the underlying genetic model and is thus robust. The Cochran-Armitage trend test is derived for a specific genetic model and is therefore sensitive to model specification.
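As a point of reference for the two standard tests named above, the following is a minimal sketch for a single 2×3 genotype table, not the dissertation's own code. It assumes genotypes are ordered by the number of copies of the candidate risk allele, and that the trend scores are (0, x, 1) with x = 0, 0.5 and 1 for the recessive, additive and dominant models, which is the usual textbook convention. The function names and the counts are hypothetical.

```python
import math


def catt_z(cases, controls, x):
    """Cochran-Armitage trend statistic with scores (0, x, 1).

    cases/controls: counts of genotypes carrying 0, 1, 2 copies of the
    candidate risk allele (assumed ordering). x is the heterozygote score:
    0 = recessive, 0.5 = additive, 1 = dominant.
    """
    scores = (0.0, x, 1.0)
    R, S = sum(cases), sum(controls)
    N = R + S
    n = [r + s for r, s in zip(cases, controls)]
    # Score-weighted difference between case and control counts.
    u = sum(w * (S * r - R * s) for w, r, s in zip(scores, cases, controls))
    # N * sum(x_i^2 n_i) - (sum(x_i n_i))^2, the spread of the scores.
    a = N * sum(w * w * m for w, m in zip(scores, n)) \
        - sum(w * m for w, m in zip(scores, n)) ** 2
    return math.sqrt(N) * u / math.sqrt(R * S * a)


def pearson_chi2(cases, controls):
    """Pearson chi-square statistic for a 2xJ table (no continuity correction)."""
    R, S = sum(cases), sum(controls)
    N = R + S
    stat = 0.0
    for r, s in zip(cases, controls):
        m = r + s
        if m == 0:
            continue  # an empty column contributes nothing
        exp_r, exp_s = R * m / N, S * m / N
        stat += (r - exp_r) ** 2 / exp_r + (s - exp_s) ** 2 / exp_s
    return stat


def normal_two_sided_p(z):
    """Two-sided p-value of a standard normal statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def chi2_df2_p(stat):
    """Upper-tail p-value of a chi-square statistic with 2 degrees of freedom."""
    return math.exp(-stat / 2.0)


# Hypothetical counts, for illustration only.
cases, controls = (30, 50, 40), (50, 50, 20)
for label, x in (("recessive", 0.0), ("additive", 0.5), ("dominant", 1.0)):
    z = catt_z(cases, controls, x)
    print(label, "CATT p =", normal_two_sided_p(z))
print("Pearson p =", chi2_df2_p(pearson_chi2(cases, controls)))
```

The trend test spends one degree of freedom on a direction fixed by the scores, while Pearson's test spends two degrees of freedom without fixing any direction, which is the trade-off between power and robustness described in the abstract.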
The Cochran-Armitage trend test is most powerful under the true genetic model; however, it suffers a substantial loss of power if the genetic model is misspecified. In practice, because the mechanisms of complex diseases are complicated, the true model is usually unknown, so efficiency-robust tests are desirable. MAX-type tests and tests based on genetic model selection are the most popular methods. Although genetic model selection and the related efficiency-robust tests have been studied thoroughly for the case-control design, studies are still lacking for other important designs, such as the matched case-control design, the family trio design and the two-stage design in genome-wide association studies. Matching and family-based sampling are two ways to control for confounding factors in association studies. Estimating the genetic model for stratified data differs from estimating it for case-control data. In this study, we propose novel methods to estimate the genetic model for matched case-control data and for family trio data. For matched case-control data, we use the margins to construct a new Hardy-Weinberg disequilibrium test, followed by the corresponding matched trend test.
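For orientation, the sketch below illustrates genetic model selection from Hardy-Weinberg disequilibrium in the plain (unmatched) case-control setting only. The dissertation's new tests for matched and family trio data are not reproduced here. The sketch assumes the disequilibrium coefficient Δ = P(BB) − P(B)², a normalised case-control difference in Δ as the selection statistic, and a threshold c = 1.645. The threshold, the sign convention (B is taken to be the risk allele) and all function names are assumptions made for illustration.

```python
import math


def hwd_coefficient(counts):
    """Estimate Delta = P(BB) - P(B)^2 from genotype counts (AA, AB, BB)."""
    n = sum(counts)
    p_bb = counts[2] / n
    p_b = (counts[2] + 0.5 * counts[1]) / n
    return p_bb - p_b ** 2


def hwd_trend_z(cases, controls):
    """Normalised difference of the HWD coefficients in cases and controls.

    Uses the pooled allele frequency p and the approximate null variance
    p^2 (1 - p)^2 (1/R + 1/S) of the difference under Hardy-Weinberg
    equilibrium.
    """
    R, S = sum(cases), sum(controls)
    N = R + S
    pooled = [r + s for r, s in zip(cases, controls)]
    p_b = (pooled[2] + 0.5 * pooled[1]) / N
    diff = hwd_coefficient(cases) - hwd_coefficient(controls)
    return math.sqrt(R * S / N) * diff / (p_b * (1.0 - p_b))


def select_model(cases, controls, c=1.645):
    """Pick a genetic model from the sign and size of the HWD statistic.

    Assumes B is the risk allele: an excess of BB homozygotes among cases
    points to a recessive model, a deficit to a dominant model, and no
    clear signal defaults to the additive model.
    """
    z = hwd_trend_z(cases, controls)
    if z > c:
        return "recessive"
    if z < -c:
        return "dominant"
    return "additive"


# Hypothetical counts (AA, AB, BB), for illustration only.
print(select_model((300, 400, 400), (360, 480, 160)))
```

The selected model would then fix the scores of the trend test that follows. Because selection and testing use the same data, the null distribution of the final statistic has to account for the selection step, which is part of what a two-phase procedure of this kind must address.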
For family trio data, we derive the Hardy-Weinberg disequilibrium test from the score test of the conditional likelihood function. The optimal TDT-type test is then applied to test for association. Theoretical results, simulations and real data analysis demonstrate that our method selects the true model at a high rate and that the resulting test is more efficiency-robust than the MAX, MERT and Pearson's chi-square tests. The two-stage design is a widely used strategy in genome-wide association studies to reduce experimental cost and increase statistical power. We propose to select a small fraction of the hundreds of thousands of SNPs in the first stage using DNA pooling, and then to apply a trend test based on genetic model selection to the SNPs selected in the first stage. A joint analysis is finally applied to test for association between the genetic markers and the disease. Simulation studies show that the proposed approach performs better than existing approaches in the literature. The proposed test is therefore cost-effective and holds promise for real-world applications. Finally, we study the MIN2 test, which combines Pearson's chi-square test and the Cochran-Armitage trend test, and extend it to 2×J contingency tables with multiple ordered categories. We derive the asymptotic distribution of MIN2 and its p-value under the null hypothesis. Simulation results show that the extension can be used not only for retrospective studies but also for prospective and cross-sectional studies. More importantly, the proposed MIN2 is the most efficiency-robust of the existing robust tests. The proposed method can also be applied to general 2×J contingency tables with multiple ordered categories.
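The MIN2 idea can be sketched for a 2×3 table as the smaller of the two p-values from the additive Cochran-Armitage trend test and Pearson's chi-square test. The dissertation derives the asymptotic null distribution of MIN2, and that derivation is not reproduced here. The sketch below substitutes a label-permutation approximation, which is a different and more generic technique, to turn the statistic into a p-value. The function names, the number of permutations and the counts are hypothetical.

```python
import math
import random


def catt_additive_p(cases, controls):
    """Two-sided p-value of the additive trend test, scores (0, 0.5, 1)."""
    scores = (0.0, 0.5, 1.0)
    R, S = sum(cases), sum(controls)
    N = R + S
    n = [r + s for r, s in zip(cases, controls)]
    u = sum(w * (S * r - R * s) for w, r, s in zip(scores, cases, controls))
    a = N * sum(w * w * m for w, m in zip(scores, n)) \
        - sum(w * m for w, m in zip(scores, n)) ** 2
    z = math.sqrt(N) * u / math.sqrt(R * S * a)
    return math.erfc(abs(z) / math.sqrt(2.0))


def pearson_p(cases, controls):
    """Upper-tail p-value of Pearson's chi-square test, 2 degrees of freedom."""
    R, S = sum(cases), sum(controls)
    N = R + S
    stat = 0.0
    for r, s in zip(cases, controls):
        m = r + s
        if m == 0:
            continue
        exp_r, exp_s = R * m / N, S * m / N
        stat += (r - exp_r) ** 2 / exp_r + (s - exp_s) ** 2 / exp_s
    return math.exp(-stat / 2.0)


def min2_stat(cases, controls):
    """MIN2 statistic: the smaller of the two p-values (not itself a p-value)."""
    return min(catt_additive_p(cases, controls), pearson_p(cases, controls))


def min2_permutation_p(cases, controls, n_perm=1000, seed=0):
    """Approximate the p-value of MIN2 by permuting case/control labels.

    This stands in for the asymptotic distribution derived in the thesis.
    """
    rng = random.Random(seed)
    totals = [r + s for r, s in zip(cases, controls)]
    genotypes = [g for g, m in enumerate(totals) for _ in range(m)]
    n_cases = sum(cases)
    observed = min2_stat(cases, controls)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(genotypes)
        perm_cases = [0, 0, 0]
        for g in genotypes[:n_cases]:  # first n_cases subjects act as cases
            perm_cases[g] += 1
        perm_controls = [m - c for m, c in zip(totals, perm_cases)]
        if min2_stat(perm_cases, perm_controls) <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical counts, for illustration only.
cases, controls = (30, 50, 40), (50, 50, 20)
print("MIN2 statistic:", min2_stat(cases, controls))
print("permutation p-value:", min2_permutation_p(cases, controls))
```

Taking the minimum of two p-values inflates the type I error if the minimum is read as a p-value directly, which is why a null distribution for MIN2 is needed, whether asymptotic as in the thesis or approximated by resampling as in this sketch.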
Keywords/Search Tags:Genetic association studies, Case-control design, Family data, Matched-pair, Genome-wide association studies, Pearson's chi-square test, Cochran-Armitage trend tests, Hardy-Weinberg disequilibrium coefficient, MAX, MIN2, GMS, Robust test