
Study of Video Abstract Generation Based on Feature Clustering

Posted on: 2017-12-23    Degree: Master    Type: Thesis
Country: China    Candidate: J M Shang    Full Text: PDF
GTID: 2348330536476729    Subject: Signal and Information Processing
Abstract/Summary:
Content-based video retrieval and indexing is one of the hot topics in the multimedia field. Extracting the highlights of a video to generate a summary that reflects the characteristics of the data has high practical value and broad applications. This paper presents a feature-clustering-based method for generating video abstracts; the main work covers video feature extraction, feature dimension reduction, feature fusion, feature clustering, clustering-based key-frame extraction, and abstract generation.

Feature extraction: This paper adopts a new descriptor that fuses the color-texture feature CEDD (Color and Edge Directivity Descriptor) with a visual vocabulary (bag-of-visual-words) histogram. CEDD uses fuzzy classification to combine two commonly used low-level features, color and texture; it gives good results while requiring little storage and allowing fast processing. The bag of visual words, in turn, quantizes SIFT features against a dictionary to produce a histogram description. The dictionary in this paper is built by K-means clustering over roughly 4.5 million sample frames drawn from movies, animation, news, music videos, sports, and real-time recordings, yielding a dictionary of 10,000 visual words.

Feature dimension reduction: To reduce the time complexity of subsequent processing, principal component analysis (PCA) is used to reduce the dimensionality of the histogram and improve computational efficiency.

Feature fusion: The CEDD feature and the dimension-reduced bag-of-visual-words histogram are fused to make the descriptor more comprehensive. In experimental testing, the fused descriptor is evaluated with ANMRR on the James Wang image database; the resulting ANMRR value is only 0.24, better than that of the other features compared.

Feature clustering: This paper selects SGONG adaptive clustering to extract key frames. Unlike other clustering methods, SGONG does not require the number of clusters to be set manually; it clusters the input data automatically. An experiment on six videos measuring recall and precision shows that the proposed method outperforms traditional shot-segmentation methods based on color features.

Key-frame extraction and abstract generation: The frame nearest to each cluster center is selected as the key frame. The user-selected summary length determines the number of key frames and the number of associated reference frames, so a summary of any length can be generated while preserving accuracy and fluency. Experiments on a 15-minute news video produced summaries of two lengths, 90 seconds and 5 minutes. The 90-second summary contains 101 key frames and 2207 reference frames, with a precision of 80.0% and a miss rate of 2.0%; the 5-minute summary contains 530 key frames and 5542 reference frames, with a precision of 97.2% and a miss rate of 0.
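As an illustration of the kind of pipeline described above, the sketch below builds bag-of-visual-words histograms from SIFT descriptors with a K-means vocabulary, reduces them with PCA, clusters the frames, and keeps the frame nearest to each cluster center as a key frame. It is a minimal sketch under stated assumptions, not the thesis implementation: the CEDD color-texture component is omitted, ordinary mini-batch K-means stands in for the SGONG adaptive network, and the frame-sampling step, vocabulary size, and cluster count are illustrative parameters; OpenCV with SIFT support and scikit-learn are assumed to be available.

```python
# Minimal sketch of a clustering-based key-frame pipeline (assumes OpenCV with SIFT
# and scikit-learn). NOT the thesis code: CEDD is omitted and mini-batch K-means
# stands in for the SGONG adaptive clustering described in the paper.
import cv2
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.decomposition import PCA


def sample_frames(video_path, step=30):
    """Read every `step`-th frame from the video."""
    cap = cv2.VideoCapture(video_path)
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            frames.append(frame)
        idx += 1
    cap.release()
    return frames


def bovw_histograms(frames, vocab_size=1000):
    """Quantize SIFT descriptors against a K-means vocabulary (bag of visual words)."""
    sift = cv2.SIFT_create()
    per_frame = []
    for f in frames:
        gray = cv2.cvtColor(f, cv2.COLOR_BGR2GRAY)
        _, desc = sift.detectAndCompute(gray, None)
        per_frame.append(desc if desc is not None else np.empty((0, 128), np.float32))
    all_desc = np.vstack([d for d in per_frame if len(d)])
    vocab = MiniBatchKMeans(n_clusters=vocab_size, random_state=0).fit(all_desc)
    hists = np.zeros((len(frames), vocab_size), np.float32)
    for i, desc in enumerate(per_frame):
        if len(desc):
            words = vocab.predict(desc)
            hists[i] = np.bincount(words, minlength=vocab_size)
            hists[i] /= hists[i].sum()  # normalize so frame length does not dominate
    return hists


def key_frames(frames, n_clusters=10, n_components=64):
    """PCA-reduce the histograms, cluster the frames, and keep the frame
    nearest to each cluster center as a key frame."""
    hists = bovw_histograms(frames)
    n_components = min(n_components, *hists.shape)
    reduced = PCA(n_components=n_components).fit_transform(hists)
    km = MiniBatchKMeans(n_clusters=min(n_clusters, len(frames)), random_state=0).fit(reduced)
    keys = []
    for c in range(km.n_clusters):
        members = np.where(km.labels_ == c)[0]
        if len(members):
            dists = np.linalg.norm(reduced[members] - km.cluster_centers_[c], axis=1)
            keys.append(int(members[np.argmin(dists)]))
    return sorted(keys)


# Example usage (hypothetical file name):
# frames = sample_frames("news_15min.mp4")
# print(key_frames(frames))
```

In the thesis, the number of key frames follows from the user-selected summary length rather than a fixed cluster count, and SGONG grows the number of clusters automatically; the fixed n_clusters here is only a stand-in for that behavior.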
Keywords/Search Tags: CEDD, Bag of Visual Words, ANMRR, SGONG, Video Abstract