Font Size: a A A

Study On Image/Video Object Segmentation

Posted on:2019-03-18Degree:DoctorType:Dissertation
Country:ChinaCandidate:H X XiaoFull Text:PDF
GTID:1368330611492980Subject:Control Science and Engineering
Abstract/Summary:PDF Full Text Request
With the increasing popularity of smart devices and the rapid development of social networks and social media,digital image and video,as the main carriers for recording vi-sual information,are rapidly changing the way people live and produce.Explosive image and video data not only bring challenges such as huge data storage and processing diffi-culties,but also provide opportunities for mining and understanding of image/video data.Unlike object classification and detection,object segmentation is a high-level,fine-grained task that accurately locates and provides detailed boundary information for a given type of target,and is used in autonomous driving,video editing,image search,medical image analysis and other fields which have shown great potential.In this thesis,we have car-ried out in-depth research work on multiple sub-tasks of object segmentation,including visual saliency detection,image semantic segmentation and video object segmentation.The main innovative research results of this thesis include the following aspects:1.A novel robust dictionary representation model is proposed and applied to bottom-up image saliency detection.The dictionary representation model adopts the?2,1,1 norm which effectively avoids the conflict between the kernel norm and the sparse norm in the decomposition process of the existing low-rank feature matrix.Besides,the proposed method enhances the robustness of the dictionary representation against the outliers in the data.The results on multiple image saliency detection datasets show that the pro-posed method is superior to existing matrix decomposition-based methods and effectively reduces the computational complexity.2.A self-explanatory convolutional neural network model is proposed and used to solve the problem of image saliency detection.The model consists of two sub-modules,i.e.,the saliency detection network and the distraction detection network.The saliency detection network densely utilizes convolution features of different scales and integrates these features through various types of connections,which enhances the connection be-tween the final classifier and features.The distraction detection network analyzes the sen-sitivity of the saliency detection network to a particular input through an interpretable dis-traction mining method,and further improves the results of the saliency detection network by correcting the input image.The results on multiple image saliency detection datasets indicate that the method is superior to existing convolutional neural network based meth-ods.3.We define a semi-supervised image semantic segmentation problem,i.e.,cross-category semi-supervised image semantic segmentation,and propose a transferable con-volutional neural network model to solve this problem.The method uses the property between similar object categories and transfers knowledge learned on an object category?such as the category dog?to other similar object categories?e.g.cat,horse,cow?.In order to further transfer the segmentation knowledge,the method also utilizes the strategy of adversarial training to train the semantic segmentation model.In the case where only50%of the category has pixel-level annotations,the proposed method achieves 96.5%performance of the fully supervised learning model.4.We propose to deeply exploit the motion information in the problem video ob-ject segmentation.The method uses motion cues to correct and integrate the video frame features of the current frame image adjacent in the time domain,thereby enhancing the feature representation of the convolutional neural network for the video frame.In addi-tion,the method extracts the prior knowledge from the motion cues to effectively filter out the noisy objects or regions,and assists the prediction of the segmentation model.The results on the DAVIS-16,Youtube-Objects and SegTrack-v2 datasets show that the proposed method can provide more accurate segmentation results.5.A fast video object segmentation method based on meta-learning framework is proposed.The method transforms the video object segmentation problem from data-driven to task-driven,and performs fast and accurate meta-learning of the segmentation model on multiple similar video object segmentation tasks.The method also proposes a novel online adaptation strategy to ensure that the segmentation model can continu-ously adjust its model parameters over time.The results on the DAVIS-16,DAVIS-17and Youtube-Objects datasets show that the proposed method can significantly reduce the single frame processing time without sacrificing the accuracy of the model.
Keywords/Search Tags:Saliency detection, Semantic segmentation, Video object segmentation, Deep learning, Convolutional neural network
PDF Full Text Request
Related items