Haplotype inference using a hidden Markov model with efficient Markov chain sampling

Posted on: 2008-06-12
Degree: Ph.D.
Type: Thesis
University: University of Toronto (Canada)
Candidate: Sun, Shuying
Full Text: PDF
GTID: 2440390005956206
Subject: Statistics
Abstract/Summary:
Knowledge of haplotypes is useful for understanding block structures of the genome and finding genes associated with disease. Direct measurement of haplotypes in the absence of family data is presently impractical. Hence several methods have been developed previously for reconstructing haplotypes from population data. In this thesis, a new population-based method is developed using a Hidden Markov Model (HMM) for the source of ancestral haplotype segments. A higher-order Markov model is used to account for linkage disequilibrium in the ancestral haplotypes. The HMM includes parameters for the genotyping error rate, the mutation rate, and the recombination rate. Four mutation models with varying number of parameters are developed and compared. Parameters of the model are inferred by Bayesian methods, using Markov Chain Monte Carlo (MCMC). Crucial to the efficiency of the Markov chain sampling is the use of a Forward-Backward algorithm for summing over all possible state sequences of the HMM. This model is tested by reconstructing the haplotypes of 129 children in the data set of Daly et al. (2001) and of 30 children in the CEU and YRI data of the HAPMAP project. For these data sets, family-based haplotype reconstructions found using MERLIN (Abecasis et al. 2002) are used to check the correctness of the population-based reconstructions. The results of this HMM method are quite close to the family-based reconstructions and comparable to the PHASE program (Stephens et al. 2001, Stephens and Donnelly 2003, Stephens and Scheet 2005) and the fastPHASE program (Scheet and Stephens 2006). The recombination rates inferred from this HMM method can help to predict haplotype block boundaries, and identify recombination hotspots.
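To illustrate why summing over all state sequences with a forward recursion matters for efficiency, here is a minimal sketch of the forward pass on a toy two-state HMM. It is not the thesis's model: the state count, the probability tables, and the function names `forward_likelihood` and `brute_force_likelihood` are assumptions made for illustration only. In the thesis, the hidden states correspond to the source of ancestral haplotype segments, and the recursion is used inside an MCMC sampler. The sketch only shows that the recursion gives the same likelihood as explicit enumeration of every state path, at a cost that grows linearly rather than exponentially in sequence length.

```python
from itertools import product


def forward_likelihood(init, trans, emit, obs):
    """Return P(obs), summed over all hidden state paths, in O(T * K^2) time."""
    n_states = len(init)
    # alpha[k] = P(observations so far, current hidden state = k)
    alpha = [init[k] * emit[k][obs[0]] for k in range(n_states)]
    for symbol in obs[1:]:
        alpha = [
            emit[k][symbol] * sum(alpha[j] * trans[j][k] for j in range(n_states))
            for k in range(n_states)
        ]
    return sum(alpha)


def brute_force_likelihood(init, trans, emit, obs):
    """Same quantity by enumerating all K^T state paths (for checking only)."""
    n_states = len(init)
    total = 0.0
    for path in product(range(n_states), repeat=len(obs)):
        p = init[path[0]] * emit[path[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= trans[path[t - 1]][path[t]] * emit[path[t]][obs[t]]
        total += p
    return total


# Hypothetical toy parameters (not taken from the thesis).
init = [0.6, 0.4]                    # initial state probabilities
trans = [[0.9, 0.1], [0.2, 0.8]]     # trans[j][k] = P(next state k | state j)
emit = [[0.95, 0.05], [0.10, 0.90]]  # emit[k][s] = P(symbol s | state k)
obs = [0, 0, 1, 1]                   # observed symbols, e.g. alleles coded 0/1

fast = forward_likelihood(init, trans, emit, obs)
slow = brute_force_likelihood(init, trans, emit, obs)
print(abs(fast - slow) < 1e-12)
```

A full Forward-Backward scheme would add a backward pass, either to compute posterior state probabilities or to draw a state sequence, and for long sequences the recursion is usually run with scaling or in log space to avoid numerical underflow.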
Keywords/Search Tags:Haplotype, Markov model, Markov chain, HMM, Using