
Research And Application Of Video Object Segmentation Based On Samples Generation Via Deep Learning

Posted on: 2021-04-22
Degree: Master
Type: Thesis
Country: China
Candidate: S Wang
Full Text: PDF
GTID: 2428330626958583
Subject: Computer technology
Abstract/Summary:
In recent years, with the popularity of smartphones and the maturity and commercial deployment of 4G and 5G, short-video and live-streaming services have sprung up rapidly. At the same time, recognizing, understanding, and retrieving massive amounts of video has become an urgent need. Video carries more information than text or images, so it is more difficult to process. As a basic task in video processing, video object segmentation segments a specific object in a sequence of video frames with pixel-level precision, and it is widely used in autonomous driving, video retrieval, video surveillance, and video semantic understanding. Research on and implementation of video object segmentation is therefore very important.

However, state-of-the-art semi-supervised video object segmentation methods have two problems. First, neural networks demand a large number of training samples; if there are not enough, the trained model cannot fully extract the underlying features of the image. Second, scenes are easily affected by object occlusion, image blur, abrupt changes of the object, and background clutter, so the network model cannot segment the complete object contour. These two problems weaken the robustness and accuracy of video object segmentation algorithms. To address them, this thesis proposes a data augmentation method based on random grid-hiding and a sample generation method based on generative adversarial networks. The specific research work and innovations are as follows:

(1) Because traditional data augmentation methods cannot handle object occlusion, this thesis proposes a sample generation method based on random grid-hiding. During training of the neural network model, the algorithm uses random grid-hiding to blind part of the image, which yields sufficient training samples, and the network model then learns the relevant features from the remaining image content. This improves the robustness of the video object segmentation network when the foreground object is occluded, and it also alleviates the over-fitting and fake-label problems in neural network training.

(2) To address background interference and background clutter around moving objects in video, this thesis proposes a sample generation method based on a generative adversarial network with a triplet loss function, which improves video object segmentation performance in this scenario. First, the algorithm builds the video object segmentation model on the principle of generative adversarial networks. Second, joint training with the triplet loss function allows the segmentation model to be trained better in the semi-supervised setting. Finally, iterative training between the generator network and the discriminator network yields a better semi-supervised video object segmentation model.

(3) To apply the two proposed algorithms in engineering practice, this thesis designs and implements a comprehensive application platform for video object segmentation. Through this platform, users can customize the training and testing of the video object segmentation algorithms proposed in this thesis.

Experiments show that the research in this thesis has the following advantages. First, the method is an effective supplement to currently popular data augmentation methods. Second, it improves the robustness of video object segmentation based on convolutional network models under various interference conditions. Finally, it is broadly applicable and can be extended to the training of other networks, such as those for image classification and recognition, object detection, and person re-identification. This thesis contains 42 figures, 12 tables, and 81 references.
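
The random grid-hiding idea described in contribution (1) can be sketched in a few lines of NumPy. This is a minimal illustration, not the thesis's implementation: the abstract does not specify the grid layout or hiding rule, so the function name `random_grid_hide`, the square cell size `cell`, the per-cell hiding probability `hide_prob`, and the constant `fill` value are all assumptions made for the example.

```python
import numpy as np


def random_grid_hide(image, cell=16, hide_prob=0.25, fill=0, rng=None):
    """Blank out randomly chosen grid cells of an image.

    Hypothetical helper: the grid size, hiding probability, and fill
    value are illustrative assumptions, not values from the thesis.

    image:     array of shape (H, W) or (H, W, C)
    cell:      side length of each square grid cell, in pixels
    hide_prob: independent probability that a given cell is hidden
    Returns (augmented_copy, boolean_pixel_mask_of_hidden_area).
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]

    # Number of grid cells needed to cover the image (ceiling division).
    rows = -(-h // cell)
    cols = -(-w // cell)

    # One hide/keep decision per grid cell.
    hide_cells = rng.random((rows, cols)) < hide_prob

    # Expand cell decisions to pixel resolution and crop to the image size.
    hide_pixels = hide_cells.repeat(cell, axis=0).repeat(cell, axis=1)[:h, :w]

    # Work on a copy so the original training frame is left untouched.
    out = image.copy()
    out[hide_pixels] = fill
    return out, hide_pixels


if __name__ == "__main__":
    frame = np.full((120, 160, 3), 255, dtype=np.uint8)  # stand-in for a video frame
    augmented, hidden = random_grid_hide(frame, cell=20, hide_prob=0.3)
    print(augmented.shape, hidden.mean())  # fraction of pixels hidden varies per call
```

In a segmentation training loop, one plausible choice is to apply the hiding to the input frame only and leave the ground-truth mask unchanged, so that the network has to infer the object contour from the visible content; whether the thesis does this is not stated in the abstract.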
Keywords/Search Tags: video object segmentation, random grid-hiding, samples generation, Generative Adversarial Networks (GAN), triplet loss function