
Statistical Missing Data and Computation Problems: Theories and Applications in Astrophysics, Finance and Economics

Posted on: 2012-09-10
Degree: Ph.D.
Type: Dissertation
University: Harvard University
Candidate: Li, Zhan
GTID: 1460390011462904
Subject: Statistics
Abstract/Summary:
Missing data problems are ubiquitous in statistics. The groundbreaking EM algorithm offers a simple procedure with sound theoretical properties for missing data problems that admit a parametric likelihood: each iteration is guaranteed not to decrease the observed-data likelihood, and the output is generally a (local) MLE. When no parametric likelihood is available, however, the problem becomes much harder. Many approaches have been proposed in the literature, but they tend to be developed case by case, without unifying principles. We aim to fill this gap by establishing a general framework of "self-consistent" estimators, which can be computed by an easy-to-follow algorithm and have good theoretical properties.

We begin with an overview of the concept and examples of the "self-consistent" estimator, which was first proposed in Efron (1967) and has attracted considerable interest in the literature on nonparametric estimation with missing data, especially for survival or distribution functions. We also review the ES algorithm of Elashoff and Ryan (2004), a generalization of EM to problems defined by estimating equations. It therefore forms an intermediate step between the EM algorithm, which requires a parametric likelihood, and our algorithm, which requires neither a parametric likelihood nor estimating equations. We then present two general approaches to establishing the algorithmic convergence and theoretical properties of the "self-consistent" estimator. The first is based on contraction mapping theory, and we apply it to wavelet denoising with soft thresholding. The second is based on fixed point theory, and we illustrate it with wavelet denoising with hard thresholding and with lasso regression.

Besides theoretical and methodological developments, we also apply missing data methods in several interdisciplinary settings. The first application is in astrophysics, where we provide algorithms adapted from the EM algorithm for analyzing a set of astrophysics data, first to identify the light curves of stars and then to detect events based on those light curves. The second application is a general equilibrium model of the U.S. economy, used to evaluate the impact of tax policy on U.S. economic growth. Latent variables representing production and consumption biases are introduced into the model, and Kalman filters are used to handle these unobserved variables. The last application is financial credit risk modeling. Latent gamma variables are introduced into the credit risk model of Duffie and Singleton (1999) to capture contagion effects between defaults. We also present efficient algorithms and large deviation theory for the credit risk models.
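
To make the notion of self-consistency concrete, the following is a minimal sketch of the classical self-consistency iteration for right-censored survival data, the setting in which Efron (1967) introduced the idea. It is an illustration only and is not taken from the dissertation: the function names, the toy data, and the fixed iteration count are all assumptions. Each censored observation's unit of probability mass is redistributed over the later event times in proportion to the current estimate, and the estimate is then recomputed from the completed data. For comparison, the sketch also computes the Kaplan-Meier (product-limit) estimate, which is the known fixed point of this iteration.

```python
import numpy as np


def self_consistent_masses(times, events, n_iter=200):
    """Iterate the self-consistency update for right-censored data.

    times  : observed times (event time or censoring time)
    events : 1 if the time is an observed event, 0 if censored
    Returns the distinct event times and the probability mass on each.
    Assumes at least one observed event (hypothetical helper).
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    n = len(times)

    support = np.unique(times[events])             # candidate jump points
    counts = np.array([np.sum(times[events] == s) for s in support], dtype=float)
    censored = times[~events]

    p = np.full(len(support), 1.0 / len(support))  # arbitrary positive start
    for _ in range(n_iter):
        new_p = counts.copy()                      # fully observed events
        for c in censored:
            later = support > c                    # where the hidden event may lie
            tail = p[later].sum()
            if tail > 0:
                # Spread one unit of mass over the later points, in
                # proportion to the current estimate.
                new_p[later] += p[later] / tail
        p = new_p / n                              # re-estimate from completed data
    return support, p


def kaplan_meier_masses(times, events):
    """Product-limit estimate, returned as jump sizes at the event times."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    support = np.unique(times[events])
    surv, masses = 1.0, []
    for s in support:
        at_risk = np.sum(times >= s)
        deaths = np.sum(times[events] == s)
        new_surv = surv * (1.0 - deaths / at_risk)
        masses.append(surv - new_surv)
        surv = new_surv
    return support, np.array(masses)


if __name__ == "__main__":
    t = [1, 2, 3, 4, 5]   # toy data, largest observation is an event
    d = [1, 0, 1, 0, 1]
    print(self_consistent_masses(t, d))
    print(kaplan_meier_masses(t, d))
```

On the toy data the two sets of masses agree, which is the sense in which the estimator is "self-consistent": applying the update to the estimate returns the same estimate. If the largest observation were censored, this sketch would simply leave that observation's mass unassigned.
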
Keywords/Search Tags:Missing data, EM algorithm, Theories, Theoretical properties, Credit risk, Parametric likelihood, Astrophysics, Application