
Key Technologies On Video Semantic Information Extraction

Posted on: 2006-08-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y L Yu
Full Text: PDF
GTID: 1118360155972177
Subject: Information and Communication Engineering
Abstract/Summary:
Video semantic information refers to information about the objects in a video stream, including object shape, spatial relations between objects, and events related to objects. When a video stream is described by semantic information extracted from it, the compression ratio can be increased considerably, and access, indexing, retrieval, and manipulation of video can be enhanced. The topic has attracted much attention from academia and industry. This dissertation focuses on the key technologies for video semantic information extraction, in three fundamental parts.

The first part addresses statistical change detection for the automatic segmentation of semantic video object planes. First, an algorithm combining statistical change detection with a spatial-temporal filter is proposed for video object segmentation; it effectively removes the background uncovered by object motion and works well even when the object moves very slowly. Second, video object segmentation based on a background frame is investigated, and an efficient background construction method is proposed that uses information from multiple frame-to-frame change detection results and is not affected by slow object motion. Finally, an algorithm that uses multiple successive frame differences to segment head-and-shoulder video streams is proposed; it gives good segmentation results at very low computational complexity.

The second part studies scalable tracking of semantic video objects. To meet the differing requirements on processing speed and accuracy of object tracking in different video applications, a scalable object tracking framework is designed that consists of two layers: high-level and low-level image analysis. In the high-level image analysis, object tracking establishes the trajectory correspondence of high-level semantic characteristics of objects (bounding box, area, centroid, etc.) across frames.
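As a rough illustration of the high-level correspondence step, the sketch below links objects in consecutive frames by centroid distance and area similarity. The abstract does not state the matching rule, so greedy nearest-centroid matching is used here as a stand-in; `ObjectState`, `match_objects`, and the thresholds `max_dist` and `max_area_ratio` are hypothetical names and values, not taken from the dissertation.

```python
from dataclasses import dataclass
from math import hypot
from typing import Dict, List


@dataclass
class ObjectState:
    label: str   # hypothetical object identifier within one frame
    cx: float    # centroid x coordinate
    cy: float    # centroid y coordinate
    area: float  # number of pixels covered by the object


def match_objects(prev: List[ObjectState], curr: List[ObjectState],
                  max_dist: float = 30.0,
                  max_area_ratio: float = 1.5) -> Dict[str, str]:
    """Link each previous-frame object to at most one current-frame object."""
    candidates = []
    for p in prev:
        for c in curr:
            dist = hypot(p.cx - c.cx, p.cy - c.cy)
            ratio = max(p.area, c.area) / max(min(p.area, c.area), 1e-9)
            # Keep only pairs that are close and of similar size (assumed gates).
            if dist <= max_dist and ratio <= max_area_ratio:
                candidates.append((dist, p.label, c.label))

    # Greedy assignment: accept the closest pairs first, one link per object.
    candidates.sort()
    links: Dict[str, str] = {}
    used = set()
    for _, p_label, c_label in candidates:
        if p_label not in links and c_label not in used:
            links[p_label] = c_label
            used.add(c_label)
    return links
```

Objects left without a link would be treated as newly appeared or disappeared; a real tracker would also need to handle merges and splits, which this sketch ignores.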
In the low-level image analysis, regions are segmented with the fuzzy C-means clustering algorithm according to low-level visual characteristics (color, edge, texture, etc.). Region characteristic descriptors are projected onto the next frame as a prediction of its segmentation, so that object regions are obtained with precise borders and correct correspondence between successive frames. Through interaction between the high-level and low-level image analysis, the temporal coherence and spatial accuracy of objects are enhanced, based on the precise borders and correspondence relationships of the regions that make up the objects.

Finally, hierarchical description and extraction technologies for video semantic information are studied. A hierarchical representation model of semantic information is proposed that organizes detailed information at the shot layer, the semantic video object layer, the semantic video object plane layer, and the semantic video object region layer, and that effectively integrates high-level semantic information with low-level visual information. A structural description scheme for video semantic information is designed: relationships between objects at different layers are described by an object hierarchy, and relationships between objects at the same layer are described by an entity relation graph, forming a multi-level abstraction of video semantic information. Based on the proposed model, a video query system is designed with which users can efficiently browse and search a video database by characteristics from different layers.
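The four-layer model and the two kinds of relationship can be pictured with a small data structure. The following is a minimal sketch under assumptions: the class `Node`, the layer names in `LAYERS`, the `find` query, and the triple format for the entity relation graph are all hypothetical and are not the description scheme defined in the dissertation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple

# Assumed layer order, from coarsest to finest.
LAYERS = ("shot", "object", "object_plane", "region")


@dataclass
class Node:
    name: str
    layer: str
    features: Dict[str, Any] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        # Object hierarchy: a child must sit exactly one layer below its parent.
        if LAYERS.index(child.layer) != LAYERS.index(self.layer) + 1:
            raise ValueError(f"{child.layer} cannot sit directly under {self.layer}")
        self.children.append(child)
        return child

    def find(self, layer: str, **criteria: Any) -> List["Node"]:
        # Depth-first search for nodes at one layer whose features match.
        hits = []
        if self.layer == layer and all(
                self.features.get(k) == v for k, v in criteria.items()):
            hits.append(self)
        for child in self.children:
            hits.extend(child.find(layer, **criteria))
        return hits


# Entity relation graph: same-layer relations as (subject, relation, object).
Relation = Tuple[str, str, str]

shot = Node("shot_1", "shot")
person = shot.add(Node("person", "object", {"category": "person"}))
car = shot.add(Node("car", "object", {"category": "vehicle"}))
plane = person.add(Node("person@frame0", "object_plane", {"frame": 0}))
plane.add(Node("face", "region", {"color": "skin"}))
relations: List[Relation] = [("person", "left_of", "car")]
```

A query such as `shot.find("region", color="skin")` searches by a low-level visual feature, while `shot.find("object", category="vehicle")` searches by a semantic one, which is the kind of cross-layer browsing the abstract describes.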
Keywords/Search Tags: Semantic video object, Semantic video object plane, Video semantic information, Statistical change detection, Segmentation and Tracking, Characteristic descriptor, MPEG-4, MPEG-7