
Haplotype-based Association Studies

Posted on: 2012-12-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L N Jin
Full Text: PDF
GTID: 1224330368995635
Subject: Probability theory and mathematical statistics
Abstract/Summary:
The completion of the Human Genome Project has greatly enriched human genetic data in both quantity and quality, and it is easy to get lost in this ocean of information. Statistics, as a powerful tool for data analysis, has drawn the attention of a growing number of researchers and plays an irreplaceable role in genetic epidemiology.
Association analysis investigates genetic variation with the aim of detecting genetic associations with observable traits, and it plays an increasing part in understanding the genetic basis of disease. Haplotypes, a common form of genetic data, are generally considered to carry more linkage disequilibrium (LD) information. Compared with other approaches, haplotype-based association studies are believed to provide higher resolution and potentially greater power for identifying genetic disease associations, especially for rare diseases in case-control studies. Modeling haplotypes, however, runs into statistical problems caused by rare haplotypes: the large number of parameters limits power and reduces efficiency. Haplotype clustering offers an appealing solution. This dissertation proposes new statistical methods that incorporate the structural information of the loci in order to improve power in haplotype-based association studies.
We first present APEG, a haplotype clustering method for haplotype-based association studies that applies the "affinity propagation" clustering algorithm with the EG distance. The EG distance is a new similarity measure designed specifically for haplotypes; it incorporates haplotype structure information, which is believed to enhance power and provide high resolution for identifying associations between genetic variants and disease. Our simulation studies show that the proposed approach has advantages over other methods in detecting disease-marker associations. We also apply the method to a real data set, where it yields quite accurate estimates in fine mapping.
We then develop a non-parametric method based on U-statistics, called U-EGS, which is asymptotically normally distributed and makes no assumption about the distribution of the samples. Simulations show that the U-statistic with EGS, which can incorporate locus information, achieves greater power than the U-statistic without locus information under different parameter settings and different disease models.
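As a rough illustration of how a similarity-based U-statistic can contrast cases with controls, the following Python sketch may help. It does not reproduce U-EGS: the abstract does not define the EGS similarity, so a plain allele-matching kernel is used as a stand-in, and the names `match_similarity`, `u_statistic`, and `similarity_contrast`, as well as the toy haplotypes, are hypothetical. Haplotypes are assumed to be equal-length strings of 0/1 alleles.

```python
from itertools import combinations


def match_similarity(h1, h2):
    """Stand-in kernel (not EGS): fraction of loci where two haplotypes match."""
    if len(h1) != len(h2):
        raise ValueError("haplotypes must have the same number of loci")
    return sum(a == b for a, b in zip(h1, h2)) / len(h1)


def u_statistic(sample, kernel=match_similarity):
    """Degree-2 U-statistic: mean kernel value over all unordered pairs."""
    pairs = list(combinations(sample, 2))
    if not pairs:
        raise ValueError("need at least two haplotypes")
    return sum(kernel(a, b) for a, b in pairs) / len(pairs)


def similarity_contrast(cases, controls, kernel=match_similarity):
    """Difference in average pairwise similarity between cases and controls."""
    return u_statistic(cases, kernel) - u_statistic(controls, kernel)


# Toy data, invented for illustration only.
cases = ["0110", "0110", "0111", "0110"]
controls = ["0110", "1001", "0011", "1100"]
contrast = similarity_contrast(cases, controls)
print(contrast)
```

A positive contrast means that haplotypes from cases resemble one another more than haplotypes from controls do. Turning the contrast into a test would require a variance estimate, which this sketch omits. A kernel that weights loci by their position or structure could be passed through the `kernel` argument in place of the simple matching kernel.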
Keywords/Search Tags: haplotype-based association, logistic regression model, haplotype clustering, case-control study, U-statistics, entropy