Multiple hypothesis testing: Methodology, software implementation, and applications to genomics
Posted on: 2010-01-01 | Degree: Ph.D. | Type: Dissertation | University: University of California, Berkeley | Candidate: Gilbert, Houston Nash | GTID: 1448390002987660 | Subject: Biology

Abstract/Summary:

Modern biological experiments are increasingly high-throughput and high-dimensional. The research in this dissertation concerns statistical approaches for controlling Type I errors, i.e., false positive results, arising in particular in functional genomic settings. The material is organized as follows.

Chapter 1 introduces general issues surrounding multiple testing and its increasing importance in applied settings, while Chapter 2 lays out the statistical framework for our methods and reviews recent proposals for estimating the joint null distribution of the test statistics.

Chapter 3 extends the discussion of the test statistics null distribution to include a computationally efficient, continuous t-statistic-specific null distribution useful for testing hypotheses involving means, linear model regression coefficients, and correlation parameters.

Chapter 4 introduces and characterizes powerful empirical Bayes multiple testing procedures for controlling generalized tail probability and expected value Type I error rates. These empirical Bayes procedures make effective use of the test statistics null distributions described in the earlier chapters.

Chapter 5 illustrates the implementation of our methodological work in Chapters 1-4 through the open-source multtest software package, available as part of the Bioconductor project (http://www.bioconductor.org). Careful consideration was given in the design phase to allowing wide, modular functionality while also ensuring that our methods are accessible to the user.
An application to a gene expression microarray experiment highlights the new software developments.

Chapter 6 contains an application to the problem of graphical model selection. Using a dataset from Arabidopsis thaliana, we observe that our methodology and software may be used to detect significant edges in a graph and, hence, to aid the reconstruction of biological networks.

Chapter 7 concludes the dissertation, providing a summary of the research as well as directions for future work.

Keywords/Search Tags: Chapter, Testing, Software, Multiple