In the past decade, with the improvement in the parallel computing capability of GPU hardware, deep learning has developed rapidly. Research on and applications of convolutional neural networks in computer vision have advanced accordingly, driving the development of image segmentation technology. Semantic segmentation is an image segmentation method that classifies every pixel in an image. Since the Fully Convolutional Network (FCN) was proposed in 2015, FCN-based semantic segmentation methods have been widely studied by scholars at home and abroad. These methods can handle complex data sets by fitting the data distribution with a large number of parameters. Owing to the limits of hardware computing power, high-precision real-time semantic segmentation is urgently needed in many application fields. However, most current semantic segmentation methods focus on improving accuracy while ignoring segmentation speed, and existing real-time methods suffer from insufficient segmentation accuracy. To address this problem, this paper proposes an improved real-time semantic segmentation method based on DeepLabv2. Experiments on the Cityscapes and Pascal VOC 2012 data sets achieve 68.2% mIoU and 75.3% mIoU, at segmentation speeds of 31 FPS and 87 FPS, respectively. Compared with DeepLabv2, this method makes the following three improvements:

(1) Improved network structure. In the encoding stage, depthwise separable convolution is used to reduce the amount of computation; in the decoding stage, a Feature Pyramid Network (FPN) decoding process is added to reduce the number of parameters of Atrous Spatial Pyramid Pooling (ASPP). This greatly reduces the dot-product operations between high-resolution features and parameters, increasing the segmentation speed of the model.

(2) Improved loss function. This paper observes a one-to-many mapping between feature points and pixels in semantic segmentation, so that multiple pixels can constitute a subgraph. From this point of view, Triplet Loss is introduced for metric learning, so that the features extracted by the model in Euclidean space have small intra-class distances and large inter-class distances, improving the segmentation ability of the model.

(3) Sample importance weights derived from the data set. This paper argues that severe sample imbalance in semantic segmentation data sets hinders improvement in segmentation accuracy, and that it is difficult to manually select appropriate importance weights for many categories. To alleviate this problem, a method based on data statistics is proposed to select sample importance weights.

In summary, this paper optimizes the neural network architecture of the DeepLabv2 model for real-time segmentation, introduces Triplet Loss as the loss function, and computes sample importance weights from data set statistics, thereby improving the segmentation accuracy of the model.
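The computational saving from depthwise separable convolution in improvement (1) can be sketched by comparing parameter counts. This is a simplified count that ignores bias terms, and the layer sizes below are illustrative assumptions, not dimensions taken from the paper's network:

```python
# Parameter counts for a k x k convolution layer (bias terms ignored).
def conv_params(k, c_in, c_out):
    # Standard convolution: every output channel mixes all input channels.
    return k * k * c_in * c_out

def dw_sep_params(k, c_in, c_out):
    # Depthwise separable: one k x k filter per input channel (depthwise),
    # followed by a 1 x 1 pointwise convolution that mixes channels.
    return k * k * c_in + c_in * c_out

# Illustrative layer size (assumed, not from the paper):
k, c_in, c_out = 3, 256, 256
print(conv_params(k, c_in, c_out))    # 589824
print(dw_sep_params(k, c_in, c_out))  # 67840
```

For a 3x3 layer the reduction factor is roughly 1/9 + 1/c_out, here about 8.7x fewer parameters, which is where the speedup in the encoding stage comes from.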
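The Triplet Loss introduced in improvement (2) penalizes an anchor feature that lies closer to a different-class (negative) feature than to a same-class (positive) feature by less than a margin. A minimal NumPy sketch follows; the margin value and the toy feature vectors are assumptions for illustration, not the paper's settings:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    # Hinge on Euclidean distances: pull anchor-positive together,
    # push anchor-negative apart by at least `margin`.
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(anchor - negative)
    return max(0.0, d_ap - d_an + margin)

a = np.array([0.0, 0.0])
p = np.array([0.0, 1.0])   # same class: distance 1.0
n = np.array([0.0, 1.5])   # different class: distance 1.5, still too close
print(triplet_loss(a, p, n))  # 0.5 -> loss pushes the negative further away
```

Minimizing this loss over many (anchor, positive, negative) feature triplets yields the small intra-class / large inter-class distance structure described above.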
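The abstract does not detail how improvement (3) computes weights from data statistics. One common scheme consistent with the description is median-frequency balancing, sketched below as an assumed illustration (the weighting rule and the per-class pixel counts are hypothetical, not the paper's method):

```python
import statistics

def importance_weights(pixel_counts):
    # Median-frequency balancing: weight_c = median(freq) / freq_c,
    # so rare classes get weights > 1 and frequent classes get weights < 1.
    total = sum(pixel_counts)
    freqs = [c / total for c in pixel_counts]
    median_freq = statistics.median(freqs)
    return [median_freq / f for f in freqs]

# Illustrative per-class pixel counts (e.g. road, car, pedestrian):
print(importance_weights([900, 90, 10]))  # ~[0.1, 1.0, 9.0]
```

Such weights are then typically multiplied into the per-pixel loss, so that scarce classes contribute more to the gradient despite covering few pixels.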