
Spatiotemporal-Phrase Based Video Retrieval

Posted on: 2012-12-13
Degree: Master
Type: Thesis
Country: China
Candidate: C N Liu
Full Text: PDF
GTID: 2218330338964827
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet and multimedia technology, video resources have become abundant and are used everywhere. Scholars and users pay increasing attention to research on video information, which carries lively and rich content. Consequently, the efficient use and reasonable organization of video information have become an urgent issue. Traditional text-based search methods cannot meet the requirements of such information processing, so content-based video retrieval (CBVR) technology was proposed. CBVR uses computers to automatically analyze and organize video content; users can submit sample data or a description to find the required video data.

In this paper, building on research in content-based video retrieval and considering the spatial relations between feature tracks, a novel concept, the spatiotemporal-word, is proposed. Spatiotemporal-phrases are constructed for each shot, and each shot is quantized into a vector that records the frequency of each spatiotemporal-phrase in the shot. This helps to increase retrieval efficiency. Experimental results show that the proposed spatiotemporal-phrase based video retrieval method is effective and lays a foundation for future research.

The whole video is first preprocessed. (1) To avoid losing image information, the system uses all frames of the video for retrieval, which does not affect on-line retrieval performance; although this increases preprocessing time, it avoids the information loss incurred by the key frames used in traditional video retrieval. (2) The SIFT algorithm is used to detect features in each frame and compute their descriptors. (3) Features in adjacent frames are matched, and the video is segmented into shots based on the number of matched features.
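As a rough illustration of step (3), shot segmentation by matched-feature counts could be sketched as follows. This is a minimal, hypothetical sketch: the `match_counts` list and `threshold` value below are illustrative assumptions, not values from the thesis, and the SIFT matching itself is presumed to have already been done.

```python
# Hypothetical sketch: segment a video into shots by thresholding the
# number of feature matches between each pair of adjacent frames.
# match_counts[i] = number of matched features between frame i and i+1.

def segment_shots(match_counts, threshold):
    """Return shots as (start_frame, end_frame) index pairs, inclusive.

    A shot boundary is declared between frames i and i+1 whenever the
    match count drops below `threshold` (few features survive a cut).
    """
    shots = []
    start = 0
    for i, count in enumerate(match_counts):
        if count < threshold:          # likely a cut between frames i and i+1
            shots.append((start, i))
            start = i + 1
    shots.append((start, len(match_counts)))  # final shot runs to the last frame
    return shots

# Toy example: cuts after frame 2 and after frame 5 (counts drop to ~0).
print(segment_shots([120, 95, 3, 110, 102, 2, 98], threshold=20))
# → [(0, 2), (3, 5), (6, 7)]
```

In practice the threshold would be tuned (or made adaptive) to the match statistics of the video.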
(4) Each feature is tracked within its shot; stable feature tracks are obtained and their track descriptors are generated. (5) All track descriptors of each shot are clustered into a codebook that contains all the spatiotemporal-words. (6) Spatiotemporal-phrases are constructed for each shot based on the spatial relationships between feature tracks, and a shot vector recording the frequency of each spatiotemporal-phrase is obtained.

In the retrieval stage, a query image covering the query object is given to the system. SIFT descriptors are extracted from the query region and quantized to spatiotemporal-words; spatiotemporal-phrases are then constructed for the query image, so that a query vector is finally obtained. Similarity values between the query vector and each shot vector are computed and sorted. Finally, the target shots containing the query object are returned to the user, and the query object is localized in the frames of these shots. Compared with the method proposed in [1], our spatiotemporal-phrase based video retrieval approach demonstrates better performance.

During query-object localization, the bounding box of the query object may contain much background image content, so a simple and effective method for optimizing the localization is proposed; experimental results show that this method is effective.
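The quantization and ranking pipeline of steps (5)-(6) and the retrieval stage can be sketched roughly as follows. This is a minimal illustration with toy 2-D descriptors: the codebook, descriptor values, and shot names are all hypothetical, and the spatial-relationship construction of phrases is omitted for brevity (plain words stand in for phrases).

```python
import math

# Hypothetical sketch: quantize descriptors to the nearest codeword
# (spatiotemporal-word), build a frequency vector per shot, and rank
# shots by cosine similarity to the query vector.

def quantize(descriptor, codebook):
    """Index of the nearest codeword under Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda i: math.dist(descriptor, codebook[i]))

def frequency_vector(descriptors, codebook):
    """Histogram of codeword occurrences over a set of descriptors."""
    vec = [0] * len(codebook)
    for d in descriptors:
        vec[quantize(d, codebook)] += 1
    return vec

def cosine(u, v):
    """Cosine similarity between two frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Toy codebook of three 2-D "words" and two shots' track descriptors.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
shots = {
    "shot_a": [(0.1, 0.0), (0.9, 0.1), (1.1, -0.1)],
    "shot_b": [(0.0, 0.9), (0.1, 1.1), (0.0, 1.0)],
}
query = [(0.2, 0.1), (0.8, 0.0)]

qvec = frequency_vector(query, codebook)
ranked = sorted(shots,
                key=lambda s: cosine(qvec, frequency_vector(shots[s], codebook)),
                reverse=True)
print(ranked)  # shot_a shares the query's words, so it ranks first
```

In the actual system the codebook would come from clustering track descriptors (e.g. k-means over all shots), and the vectors would count spatiotemporal-phrases rather than single words.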
Keywords/Search Tags:Feature extraction, Track extraction, Spatiotemporal-word, Spatiotemporal-phrase, Object retrieval