Embedding and hallucination for image and video

Posted on: 2008-06-17
Degree: Ph.D.
Type: Thesis
University: The Chinese University of Hong Kong (Hong Kong)
Candidate: Liu, Wei
Full Text: PDF
GTID: 2448390005461974
Subject: Computer Science
Abstract/Summary:
In this thesis, we propose a novel dimensionality reduction algorithm called graph-regularized projection (GRP) to tackle the problem of semi-supervised dimensionality reduction, which is rarely investigated in the literature. Given partially labeled data points, GRP aims at learning a projection, both smooth and discriminative, from high-dimensional data vectors to their latent low-dimensional representations. Motivated by graph regularization, a recent semi-supervised learning technique, we develop a graph-based regularization framework to enforce smoothness along the graph on the desired projection, which is initiated by margin maximization. As a result, GRP has a natural out-of-sample extension to novel examples and thus can be generalized to the entire high-dimensional space. Extensive experiments on a synthetic dataset and several real databases demonstrate the effectiveness of our algorithm.

Next, this thesis addresses the problem of how to learn an appropriate feature representation from video to benefit video-based face recognition. By simultaneously exploiting spatial and temporal information, the problem is posed as learning a Spatio-Temporal Embedding (STE) from raw videos. The STE of a video sequence is defined as its condensed version, capturing the essence of the space-time characteristics of the video. Relying on the co-occurrence statistics and supervised signatures provided by training videos, STE preserves the intrinsic temporal structures hidden in the video volume while encoding the discriminative cues into the spatial domain. To learn the STE, we propose two novel techniques, Bayesian keyframe learning and nonparametric discriminant embedding (NDE), for temporal and spatial learning, respectively. Based on the learned STEs, we derive a statistical formulation of the recognition problem with a probabilistic fusion model. On a large face video database containing more than 200 training and testing sequences, our approach consistently outperforms state-of-the-art methods, achieving perfect recognition accuracy.

For face identification, especially by humans, it is desirable to render a high-resolution (HR) face image from a low-resolution (LR) one, which is called face hallucination or face super-resolution. A number of super-resolution techniques have been proposed in recent years. For face hallucination, however, exploiting the special properties of faces is conducive to generating the HR face images.

In this thesis, we propose a new face hallucination framework based on image patches, which integrates two novel statistical super-resolution models. Considering that image patches reflect the combined effect of personal characteristics and patch location, we first formulate a TensorPatch model based on multilinear analysis to explicitly model the interaction between multiple constituent factors. Motivated by Locally Linear Embedding, we develop an enhanced multilinear patch hallucination algorithm, which efficiently exploits the local distribution structure in the sample space. To better preserve subtle facial details, we derive the Coupled PCA algorithm to learn the relation between the HR residue and the LR residue, which is used to compensate for the error residue in hallucinated images. Experiments demonstrate that our framework not only maintains the global facial structures well but also recovers detailed facial traits in high quality. (Abstract shortened by UMI.)...
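
The abstract states that the patch hallucination algorithm is motivated by Locally Linear Embedding. As a rough illustration of that general idea only, and not of the thesis's TensorPatch or Coupled PCA models, the sketch below reconstructs an LR patch from its nearest LR training patches using weights that sum to one, then reuses those weights on the paired HR patches. The function names, the regularization constant, and the flattened-patch array layout are assumptions made for this example.

```python
import numpy as np


def lle_weights(query, neighbors, reg=1e-3):
    """LLE-style weights: minimize ||query - sum_i w_i * neighbors[i]||^2
    subject to sum_i w_i = 1. `reg` is an assumed regularization constant."""
    diffs = neighbors - query                 # (k, d) offsets from the query
    gram = diffs @ diffs.T                    # (k, k) local Gram matrix
    k = len(neighbors)
    # Regularize so the system stays solvable when k exceeds the patch dimension
    # or when neighbors coincide with the query.
    gram = gram + (reg * np.trace(gram) + 1e-12) * np.eye(k)
    w = np.linalg.solve(gram, np.ones(k))
    return w / w.sum()                        # enforce the sum-to-one constraint


def hallucinate_patch(lr_patch, lr_train, hr_train, k=5):
    """Estimate an HR patch from an LR patch (hypothetical helper).

    lr_train: (n, d_lr) flattened LR training patches
    hr_train: (n, d_hr) flattened HR patches paired row by row with lr_train
    """
    dists = np.linalg.norm(lr_train - lr_patch, axis=1)
    idx = np.argsort(dists)[:k]               # k nearest LR training patches
    w = lle_weights(lr_patch, lr_train[idx])
    return w @ hr_train[idx]                  # same weights applied in HR space


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    lr_train = rng.normal(size=(100, 9))      # synthetic 3x3 LR patches
    hr_train = rng.normal(size=(100, 81))     # synthetic 9x9 HR patches
    estimate = hallucinate_patch(rng.normal(size=9), lr_train, hr_train)
    print(estimate.shape)
```

The sum-to-one constraint makes the weights invariant to a constant shift of all patches, which is the property that lets weights computed in the LR space be transferred to the HR space in this kind of neighbor-based scheme.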
Keywords/Search Tags: Video, Hallucination, Image, GRP, Embedding, Face, Problem, STE