Font Size: a A A

Computing three-dimensional scene from a single image by bottom-up/top-down Bayesian inference

Posted on:2006-08-03Degree:Ph.DType:Thesis
University:University of California, Los AngelesCandidate:Han, FengFull Text:PDF
GTID:2458390005997379Subject:Computer Science
Abstract/Summary:
It is common experience for human vision to perceive full 3D shape and scene from a single 2D image with the occluded parts "filled-in" by prior visual knowledge. Thus, computing the 3D structures of all the objects in the scene from a single image is a fundamental problem in computer vision. In this thesis, we propose a bottom-up/top-down Bayesian inference framework to compute the 3D structures of objects in the scene from a single image, which integrates the involved visual tasks (segmentation, perceptual grouping, object detection and recognition, 3D reconstruction) in a principled way and incorporates the prior visual knowledge in the inference.; The output of the inference framework is a hierarchical "parsing graph" with the scene label at the top (or root), objects with 3D structures and their parts at intermediate nodes, and image pixels at the bottom. The number of layers in this parsing graph is determined by the types of objects or visual patterns. The nodes in this parsing graph correspond to visual patterns represented by probabilistic models. The parsing graph also has both top-down connections and horizontal spatial connections, which correspond to the generative models and spatial relations modeled by Markov Random Field (MRF) respectively.; Formulated in Bayesian framework, the inference algorithm computes the parsing graph from the input image by optimizing a posterior probability. In this optimization process, we integrate two popular computing paradigms in computer vision: generative methods, and discriminative methods. The former formulates the posterior probability to maximize in terms of generative models for images defined by likelihood functions and priors. The latter computes discriminative proposals using some bottom-up tests to drive the maximizing process in the solution space. Thus, the inference algorithm achieves both speed and consistency.; We also investigate three mechanisms to efficiently construct the parsing graph based on the properties of visual patterns being computed: bottom-up construction mechanism, top-down construction mechanism, and bottom-up/top-down construction mechanism.
Keywords/Search Tags:Image, Scene, Single, Bottom-up/top-down, Inference, Visual patterns, 3D structures, Construction mechanism
Related items