
Multiple hypothesis testing: Methodology, software implementation, and applications to genomics

Posted on: 2010-01-01
Degree: Ph.D
Type: Dissertation
University: University of California, Berkeley
Candidate: Gilbert, Houston Nash
GTID: 1448390002987660
Subject: Biology
Abstract/Summary:
Modern biological experiments are increasingly characterized as high-throughput and high-dimensional. The research contained in this dissertation concerns statistical approaches for the control of Type I errors, i.e., false positive results, arising in particular in functional genomic settings. The material is organized as follows.

Chapter 1 introduces general issues surrounding multiple testing and its increasing importance in applied settings, while Chapter 2 lays out the statistical framework for our methods and reviews recent proposals for the estimation of a test statistics joint null distribution.

Chapter 3 extends the discussion of a test statistics null distribution to include a computationally efficient, continuous t-statistic-specific null distribution useful for testing hypotheses involving means, linear model regression coefficients, and correlation parameters.

Chapter 4 introduces and characterizes powerful empirical Bayes multiple testing procedures for controlling generalized tail probability and expected value Type I error rates. These empirical Bayes procedures make effective use of the test statistics null distributions described in the earlier chapters.

Chapter 5 illustrates the implementation of our methodological work in Chapters 1-4 through the open-source multtest software package, available as part of the Bioconductor project (http://www.bioconductor.org). Careful consideration has been given in the design phase to allow for wide, modular functionality while also ensuring that our methods are accessible to the user. An application to a gene expression microarray experiment highlights the new software developments.

Chapter 6 contains an application to the problem of graphical model selection. Using a dataset from Arabidopsis thaliana, we observe that our methodology and software may be used for detecting significant edges in a graph and, hence, for aiding the reconstruction of biological networks.

Chapter 7 concludes the dissertation, providing a summary of the research as well as directions for future work.
Keywords/Search Tags:Chapter, Testing, Software, Multiple