On data-driven chi square statistics

Posted on: 2010-11-14
Degree: Ph.D.
Type: Dissertation
University: Lehigh University
Candidate: Qian, Huiyu
GTID: 1442390002980369
Subject: Mathematics
Abstract/Summary:
Pearson chi square tests are popular because they are intuitive, natural, and easy to carry out for most categorical data sets. When the population is continuous, however, the cells must be constructed before the test can be applied. Moreover, the power of a chi square test with arbitrarily selected cells is very unstable for continuous data and depends on the choice of the cells. We propose several data-driven chi square tests in which the choice of cells is based on the data itself. Our main concerns are two-cell data-driven chi square tests for data on a line and on a circle. For data on a line, the tests require a minimum cell length epsilon to avoid singularity. We study how to choose a proper value of epsilon and the set of possible cutpoints. For directional data, we show that the circular two-cell data-driven chi square test with equal cell lengths is equivalent to Ajne's N test. By comparing with several related tests, we find that our proposed tests are more powerful against a generic alternative than a particular Pearson chi square test whose cells are chosen without examining the data. Examples of applications of the methods are also given.
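
The abstract does not spell out how the statistic is computed, so the following is only a rough sketch of what a two-cell data-driven chi square test on a line might look like, not the dissertation's actual procedure. It assumes the data have already been mapped to Uniform(0, 1) under the null hypothesis, that the candidate cutpoints form an evenly spaced grid on [eps, 1 - eps], that the test statistic is the largest two-cell Pearson statistic over that grid, and that the null distribution is approximated by simulation. The function names, the default eps = 0.1, the grid size, and the Monte Carlo calibration are all illustrative assumptions.

```python
import random


def two_cell_chi_square(u, c):
    """Pearson chi square statistic for the cells [0, c) and [c, 1].

    Under Uniform(0, 1) the expected counts are n*c and n*(1 - c), and the
    two Pearson terms combine into (n1 - n*c)^2 / (n*c*(1 - c)).
    """
    n = len(u)
    n1 = sum(1 for x in u if x < c)  # observed count in the first cell
    return (n1 - n * c) ** 2 / (n * c * (1 - c))


def data_driven_statistic(u, eps=0.1, grid_size=81):
    """Largest two-cell statistic over an assumed grid of cutpoints.

    Keeping every cutpoint at least eps away from 0 and 1 keeps the
    denominator n*c*(1 - c) away from zero.
    """
    step = (1 - 2 * eps) / (grid_size - 1)
    cuts = [eps + k * step for k in range(grid_size)]
    return max(two_cell_chi_square(u, c) for c in cuts)


def monte_carlo_p_value(u, eps=0.1, grid_size=81, n_sim=2000, seed=0):
    """Approximate the null distribution by simulating uniform samples
    (an assumed calibration, chosen here only for illustration)."""
    rng = random.Random(seed)
    observed = data_driven_statistic(u, eps, grid_size)
    n = len(u)
    exceed = 0
    for _ in range(n_sim):
        sim = [rng.random() for _ in range(n)]
        if data_driven_statistic(sim, eps, grid_size) >= observed:
            exceed += 1
    return (exceed + 1) / (n_sim + 1)
```

For a sample `u` of values in [0, 1], `monte_carlo_p_value(u)` returns an approximate p-value for uniformity. Because the cutpoint is chosen after looking at the data, the maximized statistic no longer follows the usual chi square distribution with one degree of freedom, which is why this sketch calibrates it by simulation instead of a chi square table.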
Keywords/Search Tags: Chi square