Video Retrieval With Coding Based Track Clusters

Posted on: 2012-01-08  Degree: Master  Type: Thesis
Country: China  Candidate: L H Xing  Full Text: PDF
GTID: 2218330338465045  Subject: Computer system architecture
Abstract/Summary:
In recent years, global informatization and economic globalization have become the trend of the times. Studying new broadband services, developing network multimedia applications, and improving people's quality of life have therefore become issues of common concern all over the world. Images and videos make up the main part of broadband services and multimedia information, and video, which combines image, text, sound and other media, has the strongest expressive force. The research and development of video multimedia services has therefore become a major research field in information science and technology. How to retrieve information effectively from massive video data is a key issue in developing video multimedia services. Because of the deficiencies of the keyword-based retrieval method, the Content-Based Video Retrieval (CBVR) method was proposed.

The main goal of this paper is to retrieve the shots in a video that contain the search target. The user supplies an image containing the search target, and features are extracted from the search target. These features are matched against the features of candidate targets in the video database to obtain the retrieval results.

Video contains richer information than text or images, but its content is not directly available and cannot be retrieved in the way text is. To achieve content-based video retrieval, the video must first be preprocessed, which includes video structure analysis and video feature extraction. Video structure analysis divides the video into shots by means of shot boundary detection. Video feature extraction produces a description of the video shots using features such as color, texture or shape. Content-based video retrieval then depends on these video features. Therefore, this paper first extracts SIFT (Scale Invariant Feature Transform) features from the video frames.
SIFT features are local image features that are invariant to rotation, scaling and illumination changes, and remain partially stable under changes in viewing angle, affine transformation and noise. Second, shot boundaries are detected by matching local invariant feature vectors between adjacent video frames, and the video is segmented accordingly. Third, the video features are tracked to extract the stable features, which are called tracks. Fourth, the pixels of the image are quantized in RGB space and coded by color. The features in the first frame of each shot are detected with MSER (Maximally Stable Extremal Regions), and the main colors of the features are counted. The features are then clustered using their color information and spatial location information. Finally, the tracks falling in the cluster regions are counted, and multiple candidate targets, each expressed by a track cluster, are generated for the shots. To improve retrieval efficiency, these candidate targets are used to represent the shots during video retrieval. At retrieval time, the similarity between the search target and the candidate targets within each shot of the video database is measured. All video shots containing the search target are returned in order of similarity.
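
The color coding and clustering steps described above can be illustrated with a small sketch. It is not the thesis implementation: the abstract does not state the quantization level, the distance threshold, or the clustering algorithm, so the 2 bits per channel, the `max_dist` value, and the union-find grouping below are all assumptions chosen for illustration. The function names are hypothetical, and the regions are passed in directly instead of being detected with MSER.

```python
import math
from collections import Counter


def color_code(r, g, b, bits=2):
    """Quantize an 8-bit RGB pixel to `bits` bits per channel (assumed
    level) and pack the three quantized values into one integer code."""
    shift = 8 - bits
    return ((r >> shift) << (2 * bits)) | ((g >> shift) << bits) | (b >> shift)


def dominant_codes(pixels, top=1, bits=2):
    """Count the color codes of a region's pixels and return the `top`
    most frequent codes, i.e. the region's main colors."""
    counts = Counter(color_code(r, g, b, bits) for r, g, b in pixels)
    return [code for code, _ in counts.most_common(top)]


def cluster_regions(regions, max_dist=50.0):
    """Group regions given as (x, y, code) tuples. Two regions are linked
    when they share a color code and their centers lie within `max_dist`
    pixels (assumed rule); linked regions end up in the same cluster.
    Returns a sorted list of clusters, each a list of region indices."""
    parent = list(range(len(regions)))

    def find(i):
        # Follow parent links to the cluster root, shortening the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (xi, yi, ci) in enumerate(regions):
        for j in range(i):
            xj, yj, cj = regions[j]
            if ci == cj and math.hypot(xi - xj, yi - yj) <= max_dist:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(regions)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

With 2 bits per channel every pixel maps to one of 64 codes, so two reddish regions that are close together fall into one cluster, while a distant region or one with a different main color stays separate. Counting the tracks that fall inside each cluster would then give the candidate targets for a shot.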
Keywords/Search Tags: content-based video retrieval, shot segmentation, cluster, color code