Discrepancy-based model selection criteria using cross validation

Posted on: 2003-12-20
Degree: Ph.D
Type: Thesis
University: University of Missouri - Columbia
Candidate: Davies, Simon Lee
Full Text: PDF
GTID: 2460390011987349
Subject: Statistics
Abstract/Summary:
An important component of any linear modeling problem is determining an appropriate size and form for the design matrix. Improper specification may substantially affect both the estimators of the model parameters and the predictors of the response variable: underspecification may lead to results that are severely biased, whereas overspecification may lead to results with unnecessarily high variability.

Model selection criteria provide a powerful and useful tool for choosing a suitable design matrix. Once a setting has been proposed for an experiment, data can be collected, leading to a set of competing candidate models. One may then attempt to select an appropriate model from this set using a model selection criterion.

In this thesis we establish four frameworks, each beginning with previously proposed model selection criteria that target a well-known traditional discrepancy: the Kullback-Leibler discrepancy, the Gauss discrepancy, the transformed Gauss discrepancy, and the Kullback symmetric discrepancy. These criteria are developed using the bias adjustment approach. Prior work has focused on finding approximately or exactly unbiased estimators of these discrepancies. We expand on this work by showing that the criteria which are exactly unbiased also serve as the minimum variance unbiased estimators.

In many situations, the predictive ability of a candidate model is its most important attribute. In light of our interest in this property, we also concentrate on model selection techniques based on cross validation. New cross validation model selection criteria that serve as counterparts to the standard bias adjusted forms are introduced, together with descriptions of the target discrepancies upon which they are based. We then develop model selection criteria which are minimum variance unbiased estimators of the cross validation discrepancies. Furthermore, we argue that these criteria serve as approximate minimum variance unbiased estimators of the corresponding traditional discrepancies.

We propose a general framework to unify and elucidate part of our cross validation criterion development. We show that for the cross validation analogue of a traditional discrepancy, we can always find a "natural" criterion which serves as an exactly unbiased estimator. We study how the cross validation criteria compare to the standard bias adjusted criteria as selection rules in the linear regression framework. To this end, we conclude the development of each of the four frameworks with simulation results that illustrate how frequently each criterion identifies the correctly specified model among a sequence of nested fitted candidate models. Our results indicate that the cross validation criteria tend to outperform their bias adjusted counterparts.

We close by evaluating the performance of all the model selection criteria considered throughout our work in a simulation study based on a sample of data from the Missouri Trauma Registry.
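
The comparison described in the abstract, a bias adjusted criterion against a cross validation counterpart over a sequence of nested linear regression candidates, can be illustrated with a minimal sketch. It does not reproduce the criteria developed in the thesis: it assumes AIC as a familiar bias adjusted criterion and the leave-one-out PRESS statistic as a familiar cross validation criterion, and the function names `fit_criteria` and `select_orders`, the sample size, the coefficients, and the candidate orders are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_criteria(X, y):
    """Return (AIC, PRESS) for an ordinary least squares fit of y on X.

    AIC and PRESS are stand-ins chosen for this sketch, not the thesis criteria.
    """
    n, k = X.shape
    # A thin QR factorization gives the hat-matrix diagonal (leverages)
    # without forming the full n x n hat matrix.
    Q, _ = np.linalg.qr(X)
    leverage = np.sum(Q**2, axis=1)
    resid = y - Q @ (Q.T @ y)
    sigma2_hat = np.sum(resid**2) / n            # ML estimate of the error variance
    aic = n * np.log(sigma2_hat) + 2 * (k + 1)   # AIC up to an additive constant
    # Leave-one-out prediction errors follow from residuals and leverages.
    press = np.sum((resid / (1.0 - leverage)) ** 2)
    return aic, press


def select_orders(n=30, true_order=3, max_order=7, seed=0):
    """Simulate one data set and return the order chosen by AIC and by PRESS."""
    rng = np.random.default_rng(seed)
    # Nested candidates: the order-k model uses the first k columns.
    X_full = np.column_stack([np.ones(n), rng.normal(size=(n, max_order - 1))])
    beta = np.ones(true_order)                   # hypothetical true coefficients
    y = X_full[:, :true_order] @ beta + rng.normal(size=n)
    scores = np.array(
        [fit_criteria(X_full[:, :k], y) for k in range(1, max_order + 1)]
    )
    return scores.argmin(axis=0) + 1             # [order by AIC, order by PRESS]


if __name__ == "__main__":
    aic_order, press_order = select_orders()
    print("order selected by AIC:", aic_order)
    print("order selected by PRESS:", press_order)
```

Repeating `select_orders` over many seeds and counting how often each criterion returns `true_order` would give a rough analogue of the selection-frequency simulations the abstract mentions, under the assumptions stated above.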
Keywords/Search Tags:Model, Cross validation, Using, Minimum variance unbiased estimators, Discrepancy, Results