Computing three-dimensional scene from a single image by bottom-up/top-down Bayesian inference

Posted on: 2006-08-03

Degree: Ph.D.

Type: Thesis

University: University of California, Los Angeles

Candidate: Han, Feng

GTID: 2458390005997379

Subject: Computer Science

Abstract/Summary:

Human vision routinely perceives the full 3D shape and layout of a scene from a single 2D image, with occluded parts "filled in" by prior visual knowledge. Computing the 3D structures of all the objects in a scene from a single image is therefore a fundamental problem in computer vision. In this thesis, we propose a bottom-up/top-down Bayesian inference framework that computes these 3D structures from a single image. The framework integrates the visual tasks involved (segmentation, perceptual grouping, object detection and recognition, and 3D reconstruction) in a principled way and incorporates prior visual knowledge into the inference.

The output of the inference framework is a hierarchical "parsing graph" with the scene label at the top (the root), objects with 3D structures and their parts at intermediate nodes, and image pixels at the bottom. The number of layers in the parsing graph is determined by the types of objects or visual patterns. Each node corresponds to a visual pattern represented by a probabilistic model. The parsing graph has both top-down connections, which correspond to generative models, and horizontal connections, which correspond to spatial relations modeled by Markov random fields (MRFs).

Formulated in a Bayesian framework, the inference algorithm computes the parsing graph from the input image by maximizing a posterior probability. This optimization integrates two popular computing paradigms in computer vision: generative methods and discriminative methods. The former formulate the posterior probability to be maximized in terms of generative models for images, defined by likelihood functions and priors. The latter compute discriminative proposals using bottom-up tests to drive the maximization through the solution space. The inference algorithm thus achieves both speed and consistency.

We also investigate three mechanisms for efficiently constructing the parsing graph, chosen according to the properties of the visual patterns being computed: a bottom-up construction mechanism, a top-down construction mechanism, and a bottom-up/top-down construction mechanism.
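
The interplay between discriminative proposals and a generative posterior can be illustrated with a deliberately tiny sketch. It is not the thesis's algorithm: the 1D "image", the two labels, the Gaussian likelihood, the Potts-style MRF prior, the threshold test, and every name and constant below are assumptions made only for illustration. A cheap bottom-up test proposes a label for each site, and a proposal is kept only if it raises the log-posterior (log-likelihood plus log-prior).

```python
# Toy sketch only: all names, labels, and constants are hypothetical.
LABEL_MEANS = {"dark": 0.2, "bright": 0.8}  # assumed mean intensity per label
SIGMA = 0.2  # assumed noise level of the Gaussian likelihood
BETA = 1.0   # assumed strength of the Potts smoothness prior


def log_likelihood(pixels, labels):
    """Generative term: log p(I | W), Gaussian per label, up to a constant."""
    return sum(-((x - LABEL_MEANS[lab]) ** 2) / (2 * SIGMA ** 2)
               for x, lab in zip(pixels, labels))


def log_prior(labels):
    """MRF (Potts) prior: penalize neighbouring sites whose labels differ."""
    return -BETA * sum(a != b for a, b in zip(labels, labels[1:]))


def log_posterior(pixels, labels):
    """Unnormalized log p(W | I) = log p(I | W) + log p(W)."""
    return log_likelihood(pixels, labels) + log_prior(labels)


def bottom_up_proposals(pixels):
    """Discriminative test: a simple threshold proposes one label per site."""
    return [(i, "bright" if x > 0.5 else "dark") for i, x in enumerate(pixels)]


def infer(pixels, init="dark", sweeps=5):
    """Greedy search: accept a proposal only if the posterior increases."""
    labels = [init] * len(pixels)
    best = log_posterior(pixels, labels)
    for _ in range(sweeps):
        changed = False
        for i, proposed in bottom_up_proposals(pixels):
            if labels[i] == proposed:
                continue
            candidate = labels[:i] + [proposed] + labels[i + 1:]
            score = log_posterior(pixels, candidate)
            if score > best:  # top-down verification by the generative model
                labels, best = candidate, score
                changed = True
        if not changed:
            break
    return labels, best


if __name__ == "__main__":
    print(infer([0.1, 0.2, 0.9, 0.85, 0.15])[0])
    print(infer([0.1, 0.2, 0.55, 0.15, 0.1])[0])
```

In the first call the two bright sites are relabeled because the gain in likelihood outweighs the boundary penalty. In the second, the threshold test proposes "bright" for the 0.55 site, but the prior penalty for an isolated label exceeds the likelihood gain, so the proposal is rejected. A greedy search of this kind only finds a local maximum of the posterior; it stands in here for whatever search strategy the thesis actually uses.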

Keywords/Search Tags:

Image, Scene, Single, Bottom-up/top-down, Inference, Visual patterns, 3D structures, Construction mechanism

Related items

1	Research On Indoor Scene Modeling From A Single Image
2	Image Object Recognition Based On Visual Perception Mechanism
3	Research On Target Recognition Methods In Remote Sensing Images Based On Knowledge Inference And Visual Mechanism
4	Research And Implementation Of Scene Graph Generation Algorithm Based On Attention Mechanism
5	Research On Bottom-up Visual Saliency Detection Algorithms
6	High-quality Monocular Scene Depth Inference Method For Heterogeneous Camera Data
7	Research On Bottom-up Visual Attention
8	Study Of Visual Selective Attention Models
9	Research Of Salient Regions Detection Based On Bottom-Up Visual Attention Model
10	Image Target Segmentation Of The Natural Scene Based On Visual Attention Mechanism