Semantic Video Object Segmentation And Its Performance Evaluation For Content-based Multimedia Applications

Posted on: 2005-02-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: G B Yang
GTID: 1118360122496201
Subject: Communication and Information System
Abstract/Summary:
Classical video coding standards such as H.26x and MPEG-1/2 are frame-based and require no segmentation of the video scene. Their high compression performance has made them widely used in video applications. With the proliferation of multimedia information, users are no longer satisfied with simple navigation of video content and require object-based functionalities. MPEG-4 therefore introduces the concept of the video object to support content-based functionalities, and MPEG-7 defines a universal, standardized description of various multimedia objects. According to the MPEG-4 verification model, a video sequence must be segmented into semantic video objects, whose motion, shape and texture information are coded separately. The main benefits are: improved coding efficiency, by allocating different bit rates to different video objects according to their importance to the human visual system; object-based scalability, which yields better visual quality in low-bit-rate applications; and content-based storage, interactivity and retrieval, by organizing video content around video objects.

Although MPEG-4 introduces the concept of the video object, it does not specify any concrete technique for obtaining video objects from a video sequence. On the one hand, the semantic homogeneity of a video object is hard to model with low-level features, so a generic segmentation algorithm for arbitrary video sequences remains an unresolved classical problem. On the other hand, prior knowledge can often be exploited in specific applications.

Therefore, this dissertation focuses on the methodology and techniques of video object segmentation under the MPEG-4 framework and on their application in content-based multimedia systems. The main objectives are as follows.
For some specific video sequences, such as head-and-shoulder sequences, video object segmentation should run in real time. Automatic video segmentation achieves better results on sequences with simple or still backgrounds. For sequences with complex backgrounds, semi-automatic segmentation can achieve satisfactory results, and the required human intervention should be simple. The major work of this dissertation is as follows.

First, two automatic video object segmentation schemes are proposed. The first is based on background registration and change detection, and consists of preprocessing, background registration, background buffering, change detection and post-processing. It needs no computation-intensive operations such as motion estimation, and it can overcome the influence of shadows and illumination variation. It also produces background information, which lets it support MPEG-4 sprite coding. The second is an improved spatio-temporal segmentation. Temporal segmentation is based on change detection; its key step is the selection of the threshold, which is obtained through threshold analysis. Spatial segmentation, the core of the whole algorithm, is a wavelet-based watershed scheme.

Second, a semi-automatic video object segmentation algorithm is proposed. To help users define the initial object contour, a modified intelligent scissors tool is proposed on the basis of the original intelligent scissors. By introducing a bounding box, simplified image features and improved search strategies, it runs about 6 to 8 times faster with only a slight loss of segmentation accuracy, which fully meets the requirements of initial object extraction in semi-automatic segmentation. To keep errors from accumulating and propagating during object tracking, the video is decomposed based on the rigidity of the video object and on global/local histogram comparisons.
Then, region-based backward projection is used to interpolate the video object planes (VOPs) of successive frames. Thanks to video decomposition and human intervention, the algorithm can largely solve the occlusion problem. Experimental results demonstrate that it achieves better segmentation results than COST211 AM.

Third, video object segmentation using cellular neural networks is proposed. Since most of th...
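
As an illustration of the first automatic scheme mentioned above (background registration followed by change detection), the following is a minimal Python sketch and not the dissertation's implementation. The class name, the per-pixel stillness counter, and both parameter values (`diff_threshold`, `stationary_frames`) are assumptions introduced here. Preprocessing, post-processing, and the handling of shadows and illumination changes are omitted.

```python
from typing import List, Optional

Frame = List[List[int]]  # grayscale image: rows of 0..255 intensities


class BackgroundRegistrationSegmenter:
    """Toy background registration + change detection (illustrative only)."""

    def __init__(self, height: int, width: int,
                 diff_threshold: int = 15, stationary_frames: int = 3) -> None:
        self.h, self.w = height, width
        self.diff_threshold = diff_threshold        # hypothetical value
        self.stationary_frames = stationary_frames  # hypothetical value
        self.prev: Optional[Frame] = None
        # How many consecutive frames each pixel has stayed unchanged.
        self.still_count = [[0] * width for _ in range(height)]
        # Background buffer; None means "not registered yet".
        self.background: List[List[Optional[int]]] = [
            [None] * width for _ in range(height)
        ]

    def process(self, frame: Frame) -> List[List[int]]:
        """Return a binary mask for this frame (1 = object, 0 = background)."""
        mask = [[0] * self.w for _ in range(self.h)]
        for y in range(self.h):
            for x in range(self.w):
                pixel = frame[y][x]
                # Frame difference against the previous frame.
                moving = (self.prev is not None and
                          abs(pixel - self.prev[y][x]) > self.diff_threshold)
                # Background registration: a pixel that stays still long
                # enough is stored once in the background buffer.
                self.still_count[y][x] = 0 if moving else self.still_count[y][x] + 1
                if (self.background[y][x] is None and
                        self.still_count[y][x] >= self.stationary_frames):
                    self.background[y][x] = pixel
                # Change detection: prefer the registered background and
                # fall back to the frame difference where none exists yet.
                bg = self.background[y][x]
                if bg is not None:
                    mask[y][x] = int(abs(pixel - bg) > self.diff_threshold)
                else:
                    mask[y][x] = int(moving)
        self.prev = [row[:] for row in frame]
        return mask
```

Because the mask is computed against the registered background rather than only the previous frame, an object that enters the scene and then stops moving remains in the mask. The accumulated `background` buffer is the kind of by-product that a sprite-style background representation could be built from. No motion estimation is involved, only per-pixel differences and counters.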
Keywords/Search Tags: video object segmentation, cellular neural networks, performance evaluation, MPEG-4