Some methods for robust inference in econometric factor models and in machine learning

Posted on:2015-08-06

Degree:Ph.D

Type:Dissertation

University:Boston University

Candidate:Nikolaev, Nikolay

GTID:1479390017992973

Subject:Statistics

Abstract/Summary:

Traditional multivariate statistical theory and its applications often rest on specific parametric assumptions. For example, it is often assumed that the data follow a (nearly) normal distribution. In practice, this assumption is rarely true, and the underlying data distribution is often unknown. Violations of the normality assumption can be detrimental to inference. Two areas affected in particular are quadratic discriminant analysis (QDA), used in classification, and principal component analysis (PCA), commonly employed in dimension reduction. Both PCA and QDA involve the computation of empirical covariance matrices of the data. In econometric and financial data, non-normality is often associated with heavy-tailed distributions, which can create significant problems in computing the sample covariance matrix. Furthermore, in PCA, non-normality may lead to erroneous decisions about the number of components to retain, because of the unexpected behavior of the eigenvalues of the empirical covariance matrix.

In the first part of the dissertation, we consider the so-called number-of-factors problem in econometric and financial data: determining the number of sources of variation (latent factors) that are common to a set of variables observed multiple times (as in time series). The approach commonly used in the literature is PCA followed by examination of the pattern of the resulting eigenvalues. We employ an existing technique for robust principal component analysis, which produces properly estimated eigenvalues that are then used in an automatic inferential procedure for identifying the number of latent factors. In a series of simulation experiments, we demonstrate the superiority of our approach over other well-established methods.

In the second part of the dissertation, we discuss a method to normalize the data empirically so that classical QDA for binary classification can be used. In addition, we successfully overcome the usual issue of a large dimension-to-sample-size ratio through regularized estimation of precision matrices. Extensive simulation experiments demonstrate the advantages of our approach in terms of accuracy over other classification techniques.

We illustrate the efficiency of our methods in both situations by applying them to real datasets from economics and bioinformatics.
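
To make the number-of-factors problem concrete, the following is a minimal sketch of one generic way to read a factor count off the eigenvalue pattern of a covariance matrix. It is not the dissertation's procedure: the abstract does not name its robust PCA technique or its inferential rule, so this sketch assumes the plain sample covariance matrix, Gaussian simulated data, and a simple eigenvalue-ratio rule. The function names `estimate_num_factors` and `simulate_factor_data`, and the parameters `k_max`, `T`, `N`, and `r`, are hypothetical and chosen for illustration only.

```python
import numpy as np


def estimate_num_factors(X, k_max=8):
    """Eigenvalue-ratio estimate of the number of common factors.

    X is a (T, N) array: T observations of N variables.
    Assumption: uses the ordinary sample covariance, not a robust estimate.
    """
    Xc = X - X.mean(axis=0)                      # center each variable
    cov = Xc.T @ Xc / (X.shape[0] - 1)           # N x N sample covariance
    eig = np.linalg.eigvalsh(cov)[::-1]          # eigenvalues, descending
    ratios = eig[:k_max] / eig[1:k_max + 1]      # ratio of adjacent eigenvalues
    return int(np.argmax(ratios)) + 1            # largest drop marks the factor count


def simulate_factor_data(T=400, N=40, r=3, seed=0):
    """Simulate X = F L' + E with r latent factors (illustrative data only)."""
    rng = np.random.default_rng(seed)
    F = rng.standard_normal((T, r))              # latent factors
    L = rng.standard_normal((N, r))              # factor loadings
    E = rng.standard_normal((T, N))              # idiosyncratic noise
    return F @ L.T + E


if __name__ == "__main__":
    X = simulate_factor_data()
    print(estimate_num_factors(X))
```

With heavy-tailed noise, the sample covariance eigenvalues used here can behave erratically, which is the situation the abstract says motivates replacing them with robustly estimated eigenvalues.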

Keywords/Search Tags:

Methods, Data, Econometric, PCA

Related items

1	A Macro-Level Analysis of Safety Data Using Geospatial Techniques and Spatial Econometric Methods and Model
2	Econometric Methods For Mixed-Frequency Data: Theory And Application
3	The Research On Econometric Modeling Methods And Quantitative Market Trading Behaviors Based On Ultra High Frequency Data
4	Essays on the econometric analysis of welfare
5	Parameters Solution Methods And Application Research Of Stochastic Frontier Models For Panel Data
6	Spatial Econometric Model Selection, Estimation And Its Application Based On The Comparison Of Classical Methods And Mcmc Methods
7	Financial constraints in United States agricultural cooperatives: Theory and panel data econometric evidence
8	Investment Theory Of Econometric Research Based On The Value Of Buffett's Methods
9	Essays on models of the term structure of interest rates and econometric methods for continuous time stochastic processes
10	Rural roads, education, and agriculture: A micro-econometric evaluation study using Trinidad and Tobago data