
Efficient confidence sets for disease gene locations

Posted on: 2008-08-21
Degree: Ph.D.
Type: Dissertation
University: Case Western Reserve University
Candidate: Sinha, Ritwik
Full Text: PDF
GTID: 1440390005478299
Subject: Biology
Abstract/Summary:
In positional cloning of disease susceptibility genes, identification of a linked chromosomal region via linkage studies is often followed by fine mapping with association studies. Efficiency can be gained via an intermediate step in which confidence regions for the locations of disease genes are constructed. We propose and explore the properties of two novel practical approaches, one frequentist and one Bayesian, to constructing such intervals using affected sibling pair data.

The first approach draws upon a promising paradigm, Confidence Set Inference (CSI) [Papachristou and Lin, 2006a], that converts a sequence of tests into a confidence interval. CSI replaces the traditional null hypotheses of no linkage with a new set of null hypotheses, each stating that the chromosomal position under consideration is in tight linkage with a trait locus. It was originally proposed for the Mean test statistic (CSI-Mean). We postulate that a more efficient test statistic, the Maximum LOD Score (MLS), will lead to more efficient confidence sets when used in the CSI framework, and we propose a procedure that tests the CSI null hypotheses using the MLS statistic (CSI-MLS). Compared to CSI-Mean, CSI-MLS provides tighter confidence regions over a range of single- and two-locus disease models. In addition, the MLS test is more powerful than the Mean test in testing the CSI null over a wide range of disease models. Furthermore, CSI-MLS is computationally much more efficient than CSI-Mean. The CSI framework requires knowledge of some disease-model-related parameters. In practice, such knowledge is often absent and a two-step procedure may be employed. The advantages of CSI-MLS over CSI-Mean are preserved in this practical setting as well.

Though the CSI framework compares favorably with its competitors, and CSI-MLS further improves its statistical and computational efficiency, the two-step procedure is often conservative. This motivates us to explore a Bayesian approach that formulates the disease gene location as a parameter, to seek possible improvements. In this case, credible intervals with a uniform prior on the location are confidence regions. A Metropolis-Hastings algorithm is implemented to sample from the posterior distribution, and Highest Posterior Density intervals for the disease gene location are constructed. The proposed Bayesian method is shown to provide precise confidence sets with correct coverage probabilities when compared to competing methods. The two novel methods are applied to a Rheumatoid Arthritis data example.
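
As a rough illustration of the Bayesian step described above, the following Python sketch runs a random-walk Metropolis-Hastings sampler for a location parameter under a uniform prior on a chromosomal segment, then takes the shortest interval covering 95% of the draws as a Highest Posterior Density interval. It is not the dissertation's implementation: the function names, the segment bounds, the tuning values, and in particular `toy_log_likelihood` (a single-peaked stand-in for the affected sibling pair likelihood) are assumptions made only for this example.

```python
import math
import random


def toy_log_likelihood(tau):
    """Hypothetical stand-in for the affected sibling pair log-likelihood.

    A smooth single-peaked curve centred at 40 cM; the real likelihood
    would be computed from the marker data and the disease model.
    """
    return -0.5 * ((tau - 40.0) / 5.0) ** 2


def metropolis_hastings(log_lik, lo, hi, n_iter=20000, step=5.0, seed=0):
    """Random-walk Metropolis-Hastings for a location tau in [lo, hi].

    With a uniform prior on [lo, hi], the posterior is proportional to the
    likelihood inside the segment and zero outside it.
    """
    rng = random.Random(seed)
    tau = 0.5 * (lo + hi)          # start in the middle of the segment
    cur = log_lik(tau)
    samples = []
    for _ in range(n_iter):
        prop = tau + rng.gauss(0.0, step)   # symmetric proposal
        if lo <= prop <= hi:                # outside the prior support: reject
            new = log_lik(prop)
            if rng.random() < math.exp(min(0.0, new - cur)):
                tau, cur = prop, new
        samples.append(tau)
    return samples


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing `mass` of the draws.

    This matches the HPD interval when the posterior is unimodal.
    """
    s = sorted(samples)
    n = len(s)
    k = max(1, math.ceil(mass * n))         # number of draws inside the interval
    best = min(range(n - k + 1), key=lambda i: s[i + k - 1] - s[i])
    return s[best], s[best + k - 1]


if __name__ == "__main__":
    draws = metropolis_hastings(toy_log_likelihood, lo=0.0, hi=100.0)
    kept = draws[2000:]                     # discard burn-in draws
    print(hpd_interval(kept, mass=0.95))
```

The shortest-interval rule is only appropriate for a unimodal posterior; a multimodal posterior for the location would call for an HPD region made of several disjoint intervals.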
Keywords/Search Tags: Disease, Confidence sets, CSI framework, Efficient, Location