
Research On Semi-Supervised Video Object Segmentation Method Based On Deep Learning

Posted on: 2024-09-27
Degree: Master
Type: Thesis
Country: China
Candidate: L Li
GTID: 2568307079955579
Subject: Information and Communication Engineering
Abstract/Summary:
With the rapid development of Internet technology, video applications have proliferated and huge amounts of video data are generated continuously, so automatic video processing and analysis technology is urgently needed. Computer vision is the core technology for processing and analyzing massive video data intelligently, and video object segmentation is one of its basic tasks and a foundation of video analysis. In the semi-supervised setting, the task is to segment the foreground object at pixel level throughout the video sequence, given the segmentation mask of the foreground object in the first frame. In recent years, semi-supervised video object segmentation methods based on deep learning have attracted growing attention and become a research hotspot. Methods based on the space-time memory network currently achieve advanced performance and handle difficult scenes well, including target deformation, occlusion, disappearance and reappearance, and fast movement. However, these methods still have two limitations: non-local matching makes them prone to interference from similar objects, and they neglect temporal smoothness. To address these two limitations, this thesis studies semi-supervised video object segmentation on the basis of the space-time memory network. The main work is as follows.

(1) To address interference from similar objects, this thesis designs a Kernel-guided Attention Matching Network (KAMNet), which uses a spatial-temporal attention mechanism to improve the model's discrimination between foreground objects and background regions. KAMNet also uses a Gaussian kernel to guide the matching between the current frame and the reference set, turning non-local matching into local matching. To verify the effectiveness of KAMNet, experiments are carried out on the DAVIS datasets. The results show that KAMNet achieves good accuracy and fast running speed without online learning, striking a balance between accuracy and speed. The visualization results show that KAMNet is robust to interference from similar objects.

(2) Because methods based on the space-time memory network do not make full use of the relationship between frames, this thesis combines a temporal smoothness constraint with a spatial constraint and designs a Spatial-Temporal Constraint Matching Network (STCM-Net) to improve on KAMNet. First, a region tracking module in STCM-Net establishes pixel-level region tracking between adjacent frames to impose the temporal smoothness constraint. Then, the kernel guidance module in STCM-Net imposes the spatial constraint with a Gaussian kernel, which guides the network from non-local matching to local matching. Experimental results on the DAVIS datasets show that STCM-Net further improves the segmentation accuracy of KAMNet, which indicates that a temporal smoothness constraint can significantly improve the performance of video object segmentation algorithms.
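The idea of using a Gaussian kernel to turn non-local matching into local matching can be illustrated with a minimal NumPy sketch. This is not the thesis's implementation: the function name `gaussian_guided_matching`, the dot-product similarity, the choice to centre each kernel on a memory pixel's best-matching query position, and the `sigma` value are all assumptions made for illustration.

```python
import numpy as np


def gaussian_guided_matching(query_key, memory_key, height, width, sigma=2.0):
    """Hypothetical kernel-guided matching between a query frame and a memory set.

    query_key:  (C, height * width) features of the current frame
    memory_key: (C, N) features of the reference (memory) pixels
    Returns weights of shape (N, height * width); each column sums to 1.
    """
    # Non-local similarity between every memory pixel and every query pixel.
    sim = memory_key.T @ query_key                      # (N, H*W)

    # Assumption: each memory pixel is anchored at its best-matching query position.
    best = sim.argmax(axis=1)                           # (N,)
    best_y, best_x = np.divmod(best, width)

    # Coordinates of every query position on the flattened grid.
    ys, xs = np.mgrid[0:height, 0:width]
    ys, xs = ys.reshape(-1), xs.reshape(-1)

    # 2D Gaussian centred on each anchor: distant query positions are down-weighted.
    dist2 = (ys[None, :] - best_y[:, None]) ** 2 + (xs[None, :] - best_x[:, None]) ** 2
    kernel = np.exp(-dist2 / (2.0 * sigma ** 2))        # (N, H*W)

    # Kernel-weighted softmax over the memory axis (numerically stabilised).
    sim = sim - sim.max(axis=0, keepdims=True)
    weights = np.exp(sim) * kernel
    return weights / weights.sum(axis=0, keepdims=True)


# Usage: a 4x4 query frame matched against 32 memory pixels (e.g. two 4x4 frames).
rng = np.random.default_rng(0)
q = rng.standard_normal((8, 16))
m = rng.standard_normal((8, 32))
w = gaussian_guided_matching(q, m, height=4, width=4)
```

A smaller `sigma` concentrates each memory pixel's influence around its anchor, which is one way to suppress matches to similar-looking objects elsewhere in the frame; a very large `sigma` recovers plain non-local softmax matching.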
Keywords/Search Tags: Semi-supervised Video Object Segmentation, Space-time Memory Network, Attention Mechanism, Kernel Guidance, Spatial-temporal Constraint