Statistical methods for discovering disease susceptibility genes in human populations

Posted on: 2003-05-15
Degree: Ph.D.
Type: Thesis
University: Carnegie Mellon University
Candidate: Zhang, Xiaohua
GTID: 2464390011486782
Subject: Statistics
Abstract/Summary:
This thesis develops statistical methods for discovering disease susceptibility (DS) genes in human populations. Specifically, we model linkage disequilibrium (LD) measures across genetic markers using free-knot splines with non-IID errors; discover candidate DS genes using microarrays; and propose a feasible way to implement both microarray and linkage disequilibrium studies for pharmacogenomics in current clinical trials.

Bayesian models of free-knot splines are developed for correlated errors, non-constant variance, or both. These models are generalized into a single model, and the distributional properties of the general model are derived and proved. Algorithms are designed for MCMC simulation of the fitted curve and its functionals. The correlation of LD among genetic markers is modeled with exponential decay through half decay and unit decay. Variances of the LD measures are estimated with the Delta method. Gene differentiation measures are employed in LD mapping for both bi-allelic and multi-allelic markers. The free-knot splines developed in this thesis, smoothing splines, and gene differentiation measures are then applied to LD data for three diseases (Hereditary Hemochromatosis, Cystic Fibrosis, and Huntington Disease) as well as to simulated data. The results demonstrate that Nei's GST can be applied as an LD measure for k-allelic markers and is comparable to or better than other LD measures for bi-allelic markers, and that the confidence intervals obtained with free-knot splines have the right coverage while those obtained with smoothing splines do not.

This thesis also develops a method and related algorithms for identifying candidate DS genes from the whole genome using microarrays. In this method, categorization through clustering and scoring is used to reduce the impact of heavy noise and to capture the major biological features of gene expression levels. The method and algorithms are applied to a leukemia database. The results demonstrate that the candidate DS genes selected by this method are not only differentially expressed but also achieve a low misclassification rate when used to discriminate between patients.

Finally, this thesis proposes a feasible strategy for implementing pharmacogenomics in current clinical trials. In this strategy, microarrays are first used to screen for candidate DS genes or chromosome regions of interest; association studies are then used to determine the impact of gene variation on drug action in more detail.
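To make concrete the use of a gene differentiation measure as an LD measure, the following is a minimal sketch of Nei's GST in its common textbook form, GST = (HT - HS) / HT. It assumes that disease and normal chromosomes are treated as two equally weighted subpopulations; the abstract does not state the exact estimator or weighting used in the thesis, so this is an illustration rather than the thesis's implementation. The function names `allele_frequencies` and `nei_gst` are hypothetical.

```python
from collections import Counter


def allele_frequencies(alleles, allele_set):
    """Return the frequency of each allele in allele_set within one sample."""
    counts = Counter(alleles)
    n = len(alleles)
    return [counts[a] / n for a in allele_set]


def nei_gst(disease_alleles, normal_alleles):
    """Nei's G_ST = (H_T - H_S) / H_T for one marker with k >= 1 alleles.

    Assumption: the two chromosome samples are treated as two
    subpopulations with equal weights (not confirmed by the abstract).
    """
    if not disease_alleles or not normal_alleles:
        raise ValueError("both samples must be non-empty")

    # Work over the union of alleles so the measure handles k-allelic markers.
    allele_set = sorted(set(disease_alleles) | set(normal_alleles))
    p_dis = allele_frequencies(disease_alleles, allele_set)
    p_nor = allele_frequencies(normal_alleles, allele_set)

    # Within-subpopulation gene diversity, averaged over the two samples.
    h_dis = 1.0 - sum(p * p for p in p_dis)
    h_nor = 1.0 - sum(p * p for p in p_nor)
    h_s = (h_dis + h_nor) / 2.0

    # Total gene diversity, computed from the averaged allele frequencies.
    p_bar = [(a + b) / 2.0 for a, b in zip(p_dis, p_nor)]
    h_t = 1.0 - sum(p * p for p in p_bar)

    # A marker with a single allele overall carries no differentiation.
    if h_t == 0.0:
        return 0.0
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    disease = ["A", "A", "A", "B"]
    normal = ["A", "B", "B", "B"]
    print(nei_gst(disease, normal))
```

Under these assumptions the measure is 0 when the two samples have identical allele frequencies and 1 when they are fixed for different alleles, and the same code applies unchanged to bi-allelic and multi-allelic markers.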
Keywords/Search Tags:Gene, Method, Candidate DS, Disease, Free-knot splines, Thesis, Used