
A family-based likelihood ratio test for general pedigree structures that allows for missing data and genotyping errors

Posted on: 2008-03-13
Degree: Ph.D
Type: Thesis
University: State University of New York at Stony Brook
Candidate: Yang, Yang
Full Text: PDF
GTID: 2440390005478989
Subject: Biology
Abstract/Summary:
The purpose of this work is to design a likelihood ratio test (LRT) that uses information from both affected and unaffected individuals in a general pedigree to test for association between a marker and a disease. The null hypothesis is that of equal marker penetrances, and the alternative hypothesis implies the presence of both allelic association and linkage between the disease and marker loci. The test is based on a conditional likelihood that is a product of two factors. The first factor, LFounder, uses founders' genotypes and phenotypes to estimate population frequencies of marker genotypes. The second factor, LNonfounder, evaluates disequilibrium in the transmission of marker alleles from parents to offspring. The test statistic built on this conditional likelihood accommodates two problems: (1) missing parental genotypes, and (2) random genotyping errors. Derivations of the conditional likelihoods are given for trios (two parents and a child), general nuclear families, multiple-marriage nuclear families, and zero-looped three- and four-generation pedigrees. For example, the following scenarios are considered for a general nuclear family: complete parental genotype data and no genotyping errors; only one genotyped parent and no genotyping errors; no parental genotype data and no genotyping errors; and each of the previous three scenarios with genotyping errors. A robust algorithm, grid-UOBYQA, is used to locate log-likelihood maxima under the null and alternative hypotheses as well as to estimate marker penetrances and population genotype frequencies.

The results of a null simulation study suggest that the test statistic follows a central chi-square distribution with one degree of freedom under the null hypothesis, even in the presence of missing data and genotyping errors. A power comparison based on a 2^3 factorial design shows that this LRT is more powerful than the original transmission disequilibrium test (TDT), even when 20% of genotypes in trios are missing and 1% of genotypes are mistyped. Including the information from unaffected children in the likelihood calculation appears to increase the power to test marker-disease association. Finally, applying this LRT to an idiopathic scoliosis dataset and a psoriasis dataset recovers the previously published significant associations between the markers and the disease.
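To illustrate how a statistic of this kind would be referred to its null distribution, the following is a minimal sketch, not the thesis implementation. It assumes the maximized log-likelihoods under the null and alternative hypotheses are already available (the thesis obtains them with grid-UOBYQA, which is not reproduced here). The function names and all numeric inputs are hypothetical. For comparison, it also computes the classical TDT statistic from counts of transmitted and untransmitted alleles in heterozygous parents.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a central chi-square with 1 degree of freedom.

    Uses the identity P(chi2_1 > x) = erfc(sqrt(x / 2)).
    """
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0))


def lrt_statistic(loglik_null, loglik_alt):
    """Likelihood ratio statistic 2 * (logL_alt - logL_null).

    Clamped at zero to guard against small negative values caused by
    numerical noise in the optimizer (an assumption of this sketch).
    """
    return max(2.0 * (loglik_alt - loglik_null), 0.0)


def tdt_statistic(transmitted, untransmitted):
    """Classical TDT statistic (b - c)^2 / (b + c).

    b and c count how often heterozygous parents transmit or do not
    transmit the allele of interest to affected children.
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    return (transmitted - untransmitted) ** 2 / total


if __name__ == "__main__":
    # Hypothetical maximized log-likelihoods, for illustration only.
    lrt = lrt_statistic(loglik_null=-105.0, loglik_alt=-102.5)
    print("LRT statistic:", lrt, "p-value:", chi2_sf_1df(lrt))

    # Hypothetical transmission counts, for illustration only.
    tdt = tdt_statistic(transmitted=30, untransmitted=10)
    print("TDT statistic:", tdt, "p-value:", chi2_sf_1df(tdt))
```

Both statistics are compared against a chi-square distribution with one degree of freedom, which is the null distribution the simulation study above suggests for the LRT.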
Keywords/Search Tags:Test, Genotyping errors, Likelihood, Data, Missing, General, LRT, Marker