Highlights Detection In Soccer Videos Based On Multimodal Fusion

Posted on: 2021-02-01
Degree: Master
Type: Thesis
Country: China
Candidate: P Cheng
GTID: 2427330602981630
Subject: Engineering

Abstract/Summary:
Soccer is one of the most popular sports in the world, and soccer video has a wide audience. Because matches are long, however, different viewers are interested in different content: some like to watch shots on goal, penalties and similar events, while others prefer midfield play. Given the massive volume of video data, producing highlight reels through traditional manual editing wastes human resources and cannot guarantee that the reels are generated in time. This thesis therefore focuses on detecting highlight events in soccer video. The main research work covers three aspects: first, shot annotation of soccer video; second, highlight detection based on a single modality; third, highlight detection based on multimodal fusion.

First, the structure of soccer video is analyzed, and the shot is selected as the basic unit for feature extraction and analysis. For shot segmentation, after analyzing the characteristics of shot transitions in soccer video, the twin comparison algorithm is used to detect those transitions and thereby segment the video into shots. Each independent shot then needs to be given a semantic label. In this thesis, shots are labeled as goal shots, corner kick shots, penalty kick shots or foul shots. Based on the characteristics of the scene when a highlight event occurs, shots are labeled with the help of hand-crafted rules, which reduces the time and effort of manual labeling.

Second, highlight events in soccer games are presented as specific sequences of shots. When a conventional CNN extracts image features, it cannot preserve the changes in motion state between adjacent frames of the video sequence. Therefore, for highlight detection based on single-modality image features, a 3D CNN is proposed to extract features from soccer video. For highlight detection based on audio features, this thesis proposes feeding the spectrum map corresponding to the Mel-frequency cepstral coefficients (MFCCs) into a CNN for training, which yields the audio-based detection results.

Finally, a multimodal fusion method is used to fuse the recognition results based on image features with those based on audio features. Experiments show that the highest recognition accuracy, 96%, is obtained with multimodal fusion, which is higher than the accuracy of single-modality recognition.
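The shot segmentation step described above can be illustrated with a minimal sketch of the twin comparison idea. The abstract does not give the thesis's features or thresholds, so everything concrete here is an assumption: frames are represented by normalized color histograms, the frame distance is the sum of absolute bin differences, and the names `hist_diff`, `twin_comparison`, `t_high` and `t_low` are hypothetical. A difference above the high threshold is treated as an abrupt cut; a run of differences between the two thresholds is treated as a candidate gradual transition, which is accepted only if the difference between its first and last frame reaches the high threshold.

```python
from typing import List, Sequence, Tuple

Histogram = Sequence[float]


def hist_diff(a: Histogram, b: Histogram) -> float:
    """Sum of absolute bin differences between two frame histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))


def twin_comparison(
    hists: List[Histogram], t_high: float, t_low: float
) -> List[Tuple[str, int, int]]:
    """Return (kind, start_frame, end_frame) for each detected boundary.

    kind is "cut" for an abrupt change and "gradual" for a slow transition.
    Thresholds are illustrative assumptions, not values from the thesis.
    """
    boundaries: List[Tuple[str, int, int]] = []
    n = len(hists)
    i = 0
    while i < n - 1:
        d = hist_diff(hists[i], hists[i + 1])
        if d >= t_high:
            # Large jump between consecutive frames: abrupt cut.
            boundaries.append(("cut", i, i + 1))
            i += 1
        elif d >= t_low:
            # Moderate change: frame i may start a gradual transition.
            start = i
            j = i + 1
            # Extend the candidate while consecutive differences stay
            # between the low and high thresholds.
            while j < n - 1 and t_low <= hist_diff(hists[j], hists[j + 1]) < t_high:
                j += 1
            # Accept only if the start and end frames differ enough overall.
            if hist_diff(hists[start], hists[j]) >= t_high:
                boundaries.append(("gradual", start, j))
            i = j
        else:
            i += 1
    return boundaries


if __name__ == "__main__":
    # Toy two-bin histograms: a hard cut, then a slow fade back.
    frames = [
        [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0],
        [0.25, 0.75], [0.5, 0.5], [0.75, 0.25], [1.0, 0.0], [1.0, 0.0],
    ]
    print(twin_comparison(frames, t_high=1.5, t_low=0.3))
```

In practice the histograms would come from decoded video frames and the two thresholds would be tuned on annotated soccer footage; the toy input above only exercises the two boundary types.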
Keywords/Search Tags: soccer event detection, multimodal fusion, 3D CNN, semantic annotation