| Content-based sports video analysis has received increasing attention in recent years. Much work has been done on structure analysis and event extraction, but existing methods have two limitations. First, they generalize poorly, because they target only a limited set of sports such as football and baseball. Second, they do not address scalability, which is important for emerging applications such as mobile video access. In this thesis, we propose a general framework for analyzing periodically structured, scored game video (such as tennis and table tennis). A potential application of the framework is flexible video summarization, which can meet the needs of cell phone and Palm PDA users. The thesis uses tennis and table tennis as running examples. The framework is based on a thorough comparison of existing multimodal information fusion methods and on the observation that racquet sports video is periodic. It is a general sports video content analysis method built on audio/visual mid-level features, domain rules, context, and highlight ranking. Its advantages include simplicity, intuitiveness, generality, context sensitivity, and affectivity. The details are as follows. First, to extract audio/visual mid-level features while preserving generality, we adopt supervised audio classification and unsupervised scene clustering. Audio is a robust cue, and a supervised audio classifier generalizes across videos of the same type of sport. When the method is applied to other sports video, such as diving or baseball, only a small amount of audio labeling is needed. Because scenes differ greatly in appearance, we adopt unsupervised scene clustering for its universality; it groups video shots with similar visual content into the same cluster. 
This thesis proposes a new, effective scene clustering method that automatically determines its stopping point without prior knowledge. Second, a multimodal information fusion method is used to extract the structural events. By analyzing the periodicity of racquet sports, we propose a general fusion rule, the temporal voting strategy, which is suited to analyzing periodically structured, scored games. It assigns labeled audio keywords to clusters along the time axis. The semantic meaning of each cluster is then obtained by voting over its audio keywords, which yields the structured events. Because this method combines the semantic meaning of the audio keywords with the reliable boundaries of the unsupervised scene clusters, the structured events are extracted more accurately. Third, for the extraction of specified events, highlight ranking is adopted in order to solve the generality problem. This thesis draws on affective experience theory and designs three steps as follows. The highlight rank is determined by an optimal quantization process based... |
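The temporal voting strategy described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `temporal_vote`, the tuple layouts, the cluster identifiers, and the audio keyword labels (`ball_hit`, `applause`, `commentary`) are all hypothetical, and a simple majority vote is assumed as the voting rule.

```python
from collections import Counter, defaultdict


def temporal_vote(shots, audio_keywords):
    """Assign each scene cluster the audio keyword that occurs most often in its shots.

    shots: list of (start_sec, end_sec, cluster_id) produced by scene clustering.
    audio_keywords: list of (time_sec, label) produced by audio classification.
    Returns {cluster_id: winning_label}; clusters that receive no keyword are omitted.
    Both input layouts are assumptions made for this illustration.
    """
    votes = defaultdict(Counter)
    for start, end, cluster_id in shots:
        for time_sec, label in audio_keywords:
            # A keyword votes for the cluster of the shot it falls in on the time axis.
            if start <= time_sec < end:
                votes[cluster_id][label] += 1
    # Majority vote per cluster (assumed rule; ties go to the label seen first).
    return {cid: counter.most_common(1)[0][0] for cid, counter in votes.items()}


# Hypothetical data: two visual clusters alternating periodically, as in a racquet game.
shots = [(0.0, 4.0, "A"), (4.0, 6.0, "B"), (6.0, 11.0, "A"), (11.0, 13.0, "B")]
audio_keywords = [
    (1.0, "ball_hit"), (2.0, "ball_hit"), (3.5, "applause"),
    (4.5, "applause"), (5.0, "commentary"), (5.5, "applause"),
    (7.0, "ball_hit"), (9.0, "ball_hit"), (12.0, "applause"),
]

labels = temporal_vote(shots, audio_keywords)
print(labels)  # {'A': 'ball_hit', 'B': 'applause'}
```

In this toy example, cluster A collects mostly `ball_hit` votes and would be read as in-play footage, while cluster B collects mostly `applause` votes and would be read as a break. The occasional mismatched keyword is outvoted, which is the intuition behind combining noisy audio labels with cluster boundaries.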