
Video Vulgar Action Segmentation Based On Deep Learning

Posted on: 2024-05-06
Degree: Master
Type: Thesis
Country: China
Candidate: R Xu
GTID: 2568307103974819
Subject: Computer Science and Technology
Abstract/Summary:
With the continuous development of social media and Internet communication technology, the explosive growth of Internet video has made vulgar information more accessible. Effective regulation of vulgar information on the Internet is crucial to maintaining a healthy Internet environment. The rapid development of video platforms provides convenient entertainment but also spreads vulgar information. Unlike pornographic actions, vulgar actions in videos may not expose sexual organs. Instead, they often carry "Side Ball" (borderline) information, which makes them difficult to distinguish from ordinary actions. Given the complexity of video information on the Internet, traditional vulgar content monitoring methods can no longer keep pace with the data growth of current video platforms. In recent years, deep learning-based video processing models have performed well in many fields. This thesis studies video vulgar action segmentation based on deep learning.

(1) To solve the problem of blurred boundaries between consecutive vulgar actions, this thesis presents a Boundary Filtered U-shaped Temporal Convolutional Network (BFUTCN) to segment vulgar actions. Firstly, because the "Side Ball" information in vulgar actions is hard to distinguish from normal actions using image features alone, BFUTCN introduces a U-shaped structure into the encoder-decoder temporal convolutional network to extract spatiotemporal information, and it uses video context to strengthen the model's feature recognition ability. Secondly, this thesis introduces a boundary filtered network to extract action boundary information, which combines frames whose boundaries are unclear with boundaries predicted at higher confidence. By combining these features, the model improves the accuracy of action segmentation. Lastly, BFUTCN presents a segment refinement method to suppress action fragmentation in long actions, which significantly reduces over-segmentation errors without losing accuracy. Since there is no public vulgar dataset on the Internet, this thesis builds the Vulgar dataset under the guidance of experts. The proposed method achieves competitive results on general action datasets and state-of-the-art results on the Vulgar dataset.

(2) To address the challenge of densely labeling vulgar action datasets, this thesis introduces a Parallel Subspace Network (PSN) framework based on timestamp annotation. On the one hand, PSN leverages action eigenvectors from multiple subspaces to enhance the model's resilience to complex backgrounds. On the other hand, the efficacy of weakly supervised vulgar action segmentation depends on the quality of the pseudo labels, so this thesis uses Viterbi decoding and a semi-supervised clustering algorithm for pseudo-label generation in order to make full use of the timestamp supervision. In addition, a new subspace learning loss function is introduced that prioritizes the dynamic motion features in vulgar videos while suppressing the static background features. Because few timestamp-supervised action segmentation models exist, this thesis compares classic action segmentation models across different levels of supervision and benchmarks PSN against the most representative TSS methods. The experimental results show that PSN achieves excellent weakly supervised action segmentation on the Vulgar and Breakfast datasets, and its performance exceeds that of some fully supervised methods.

This thesis effectively addresses the problems of blurred vulgar action boundaries and dense dataset annotation in video vulgar action segmentation, providing assistance for future research on vulgar action segmentation.
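The abstract does not describe how Viterbi decoding turns timestamp annotations into pseudo labels, so the following is only a minimal sketch of one common way to do it, not the thesis's actual procedure. It assumes frame-wise class log-probabilities from some segmentation model and one annotated frame per action segment, and it finds the best monotonic segmentation in which each segment contains its own annotated frame. The function name `viterbi_pseudo_labels` and its arguments are hypothetical, and the semi-supervised clustering step mentioned in the abstract is not modeled here.

```python
import numpy as np


def viterbi_pseudo_labels(log_probs, timestamps):
    """Generic constrained Viterbi alignment (illustrative assumption only).

    log_probs:  (T, C) array of frame-wise class log-probabilities.
    timestamps: list of (frame_index, class_id), sorted by frame index,
                one annotated frame per action segment.
    Returns a length-T array of class ids forming contiguous segments,
    where segment i contains the i-th annotated frame.
    """
    T = log_probs.shape[0]
    frames = [t for t, _ in timestamps]
    classes = [c for _, c in timestamps]
    N = len(classes)

    # forced[t] is the segment index an annotated frame must belong to, else -1.
    forced = np.full(T, -1)
    forced[frames] = np.arange(N)

    NEG = -np.inf
    score = np.full((T, N), NEG)        # best log-score ending at (frame, segment)
    back = np.zeros((T, N), dtype=int)  # 0 = stayed in segment, 1 = advanced

    def emit(t, i):
        # An annotated frame may only be explained by its own segment.
        if forced[t] != -1 and forced[t] != i:
            return NEG
        return log_probs[t, classes[i]]

    score[0, 0] = emit(0, 0)
    for t in range(1, T):
        for i in range(N):
            stay = score[t - 1, i]
            advance = score[t - 1, i - 1] if i > 0 else NEG
            if advance > stay:
                best, back[t, i] = advance, 1
            else:
                best, back[t, i] = stay, 0
            score[t, i] = best + emit(t, i)

    # Backtrace from the last segment at the last frame.
    labels = np.empty(T, dtype=int)
    i = N - 1
    for t in range(T - 1, -1, -1):
        labels[t] = classes[i]
        i -= back[t, i]
    return labels
```

In a timestamp-supervised setup of this kind, the returned frame labels would typically serve as training targets for the next round of model updates, so the quality of the alignment directly limits the quality of the supervision.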
Keywords/Search Tags:Vulgar Action Segmentation, U-shaped Network, Boundary Filtered, Temporal Convolutional Network, Timestamp-level Supervision, Subspace Learning, Viterbi Decoding