
Research On Deep Portrait Matting Via Double-grained Segmentation And Multi-scale Sub-objectives Consistency

Posted on: 2024-07-25  Degree: Master  Type: Thesis
Country: China  Candidate: Z W Ma  Full Text: PDF
GTID: 2568306929994619  Subject: Computer Science and Technology
Abstract/Summary:
Image matting is an image processing technique that aims to accurately extract the alpha matte of a foreground object. It has a wide variety of applications, such as image synthesis, movie production, and video conferencing. Early matting methods, represented by color-key matting, usually require the background of the input image to be a single color. Although this requirement improves the accuracy of the matting result, it places strict demands on the background color, so such methods are generally used only in specific fields (such as movie production). To remove the constraint on background color, natural image matting methods that can be applied to general backgrounds have emerged. However, these methods still have several problems. First, lacking the necessary prior information, they generally require the user to manually provide a trimap for the original image, and their predictions remain blurry and inaccurate in areas of high transparency, such as the boundary of the foreground. Second, deep matting methods suffer from insufficient datasets and from training images that contain only a single foreground. Finally, in segmented matting models, the outputs of the sub-networks are often inconsistent with one another. These problems greatly degrade the predictive performance of matting models.

To address these problems, we propose the following three contributions:

1) To address the dependence on trimaps and the insufficient prediction accuracy, we propose an end-to-end deep portrait matting model based on double-grained segmentation. It extracts the alpha matte of foreground portraits directly and accurately from input RGB images without any prior information (such as trimaps). The model is composed of three sub-networks: first, the crude segmentation network, which segments a crude trimap using a deep semantic segmentation network; second, the fine segmentation network, which takes the output of the previous stage as input and segments a fine eleven-value segmentation map using a shallow encoder-decoder network; and finally, the fine matting network, which integrates and refines the predictions of the first two stages and outputs a high-quality alpha matte. Experimental results show that our model's predictions are close to, or even better than, those of some state-of-the-art models.

2) To improve matting on multi-person images, we propose a new maximum fusion strategy, which simulates multiple people in the same image in the training set. Experimental results show that this strategy not only improves the prediction accuracy of MODNet on single-person scenes but also enhances its generalization to multi-person images.

3) To address the inconsistency among sub-network predictions in segmented networks, we propose a self-supervision method based on multi-scale sub-objectives. With this method, the known region, unknown region, and regional distribution of the sub-objectives are learned at different scales, which improves both the consistency among sub-objectives and the convergence speed.

Moreover, in addition to comparing our proposed model with other classical matting methods, we conduct ablation experiments to verify the effectiveness of each part of our model.
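As an illustration only, the sketch below shows one plausible reading of two ideas mentioned in the abstract: fusing the alpha mattes of two portraits by a pixel-wise maximum to simulate a multi-person training sample, and deriving a three-value trimap from an alpha matte. The abstract does not specify how the maximum fusion strategy is implemented, so the fusion rule, the function names `max_fuse` and `to_trimap`, and the 0/128/255 trimap encoding are all assumptions rather than the thesis's actual code. Alpha mattes are represented as nested lists of floats in [0, 1].

```python
# Hypothetical sketch: the function names, the fusion rule, and the
# trimap encoding are assumptions, not taken from the thesis.

def max_fuse(alpha_a, alpha_b):
    """Fuse two same-sized alpha mattes by taking the pixel-wise maximum.

    Assumed interpretation of a "maximum fusion" of two portraits: a pixel
    is as opaque as the more opaque of the two foregrounds covering it.
    """
    return [
        [max(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(alpha_a, alpha_b)
    ]


def to_trimap(alpha, eps=1e-6):
    """Map an alpha matte to a three-value trimap.

    0   = known background (alpha close to 0)
    255 = known foreground (alpha close to 1)
    128 = unknown / transition region (everything in between)
    """
    def classify(a):
        if a <= eps:
            return 0
        if a >= 1.0 - eps:
            return 255
        return 128

    return [[classify(a) for a in row] for row in alpha]


if __name__ == "__main__":
    person_a = [[0.0, 0.5], [1.0, 0.0]]
    person_b = [[0.0, 0.2], [0.3, 1.0]]
    fused = max_fuse(person_a, person_b)
    print(fused)
    print(to_trimap(fused))
```

Under this reading, the fused matte can be paired with correspondingly blended foreground colors and composited onto a background with the standard compositing equation I = αF + (1 − α)B to form a multi-person training image.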
Keywords/Search Tags: Deep portrait matting, Double-grained segmentation, Fine segmentation network, Refined matting network, Maximum fusion strategy, Self-supervision method on multi-scale sub-objectives