
Error correction for high-dimensional data via convex programming

Posted on: 2010-09-11
Degree: Ph.D
Type: Thesis
University: University of Illinois at Urbana-Champaign
Candidate: Wright, John N
Full Text: PDF
GTID: 2440390002973073
Subject: Engineering
Abstract/Summary:
Modern data processing applications in signal and image processing, web search and ranking, and bioinformatics are increasingly characterized by large quantities of very high-dimensional data. As datasets grow larger, however, methods of data collection have necessarily become less controlled, introducing corruption, outliers, and missing data. This thesis addresses the basic question of when and how simple data representations can be recovered from such non-ideal observations. In particular, we develop theoretical explanations for the efficacy of convex programming approaches to error correction for both vector- and matrix-valued data.

For vector-valued observations, we prove that if a signal is sufficiently sparse with respect to a highly correlated basis, then as long as the fraction of errors is bounded away from one, it can be recovered from almost any error by solving a simple convex program. This result suggests that accurate recovery of sparse and nonnegative signals is possible and computationally feasible even with almost all of the observations corrupted.

For matrices, we consider the fundamental problem of recovering a low-rank matrix from large but sporadic corruption. We prove that "almost all" matrices of low enough rank can be efficiently and exactly recovered from almost all error sign-and-support patterns, again by solving a simple convex program. This result holds even when the rank is proportional to the observation dimension (up to a logarithmic factor) and a non-vanishing fraction of the observations are corrupted. It also implies the first proportional growth result for completing a low-rank matrix from a small fraction of its entries.

Finally, we show how these theoretical developments lead to simple, scalable, and robust algorithms for face recognition in the presence of varying illumination and occlusion. The idea is extremely simple: seek the sparsest representation of the test image as a linear combination of training images plus a sparse error term due to occlusion. In addition to achieving excellent performance on public databases, this approach sheds light on several important issues in face recognition, such as the choice of features and robustness to corruption and occlusion.
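The two convex programs described above can be stated directly in an off-the-shelf modeling tool. The sketch below is illustrative only and is not code from the thesis: it assumes Python with the NumPy and CVXPY packages, and the function names sparse_error_correction and robust_pca are hypothetical. The first program recovers a sparse signal together with a sparse error term by L1 minimization; the second splits an observed matrix into a low-rank part plus sparse corruption via a nuclear-norm-plus-entrywise-L1 program. The same vector program also underlies the face-recognition scheme in the final paragraph, with the columns of A taken to be training images.

    import numpy as np
    import cvxpy as cp

    def sparse_error_correction(A, y):
        # Recover a sparse signal x and a sparse error e from y = A x + e
        # by solving:  minimize ||x||_1 + ||e||_1  subject to  A x + e = y.
        m, n = A.shape
        x = cp.Variable(n)
        e = cp.Variable(m)
        problem = cp.Problem(cp.Minimize(cp.norm(x, 1) + cp.norm(e, 1)),
                             [A @ x + e == y])
        problem.solve()
        return x.value, e.value

    def robust_pca(M, lam=None):
        # Split an observed matrix M into low-rank L plus sparse corruption S
        # by solving:  minimize ||L||_* + lam * sum_ij |S_ij|  subject to  L + S = M.
        m, n = M.shape
        if lam is None:
            lam = 1.0 / np.sqrt(max(m, n))  # common weighting in the robust PCA literature
        L = cp.Variable((m, n))
        S = cp.Variable((m, n))
        problem = cp.Problem(cp.Minimize(cp.norm(L, "nuc") + lam * cp.sum(cp.abs(S))),
                             [L + S == M])
        problem.solve()
        return L.value, S.value

For large matrices, a generic conic solver will be slow; specialized first-order methods (e.g. augmented Lagrangian schemes) are typically used in practice, but the underlying convex program is the same.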
Keywords/Search Tags: Data, Error, Convex