To better meet the practical demand for accurate and fast semantic segmentation, this thesis studies real-time semantic segmentation models based on deep convolutional neural networks, aiming to improve both segmentation accuracy and inference speed. Many recent real-time segmentation networks adopt a multi-branch structure that extracts spatial and semantic features separately, but they usually lack communication between branches, which degrades segmentation performance. To address this problem, a new real-time semantic segmentation network (PBSNet) is proposed to enhance the interaction between contextual and spatial features. The model uses a lightweight classification network as the feature extractor, from which a spatial path and a semantic path branch off to retain rich detail and semantic information. In the decoder, a semantic enhancement module (SEM) is designed to explore the relationships among high-level semantic features and strengthen the model's ability to distinguish similar objects. An information exchange module (IM) is developed to let spatial and semantic features interact: through bi-directional vertical propagation and adaptive spatial attention, the feature representations are enhanced, yielding semantically and spatially enriched features. Finally, an attention fusion module (AFM) aggregates the multi-scale features to produce the final segmentation prediction. Results on the Cityscapes dataset demonstrate the superiority of PBSNet over state-of-the-art methods, achieving a balance of accuracy and efficiency with 74.6% mIoU at 82.5 fps.

Secondly, most existing real-time segmentation models that combine boundary cues with attention mechanisms perform worse than models that rely on only one of the two. Therefore, a new real-time semantic segmentation network based on scale-adaptive attention and boundary awareness (SABA) is proposed to explore how attention and edge information can further improve segmentation accuracy. In the encoder, a short-term dense cascade module is selected as the basic block, extracting multi-scale features with a small number of parameters. Inspired by ASPP, a depth residual pyramid (DRP) is placed at the end of the encoder to enrich the feature context, with residual connections enabling each branch to learn different feature information. A scale-adaptive attention module (SAM) is designed to generate segmentation predictions at three different scales, and three boundary-aware modules (BAM) connected in series combine them into the final prediction through (a) prediction offsets between successive scales, (b) learnable weights that control the relative contributions, and (c) progressive fusion guided by a segmentation loss and a boundary loss. Experiments on the Cityscapes and CamVid datasets show that, on a single RTX 2080 Ti GPU, SABA reaches 52.63 fps with 76.7% mIoU and 80.15 fps with 72.23% mIoU respectively, achieving a better balance between efficiency and performance.
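To make the branch-interaction idea behind the IM concrete, the following is a minimal PyTorch sketch of bi-directional exchange between a high-resolution spatial branch and a low-resolution semantic branch. The layer choices (1x1 projections, sigmoid spatial gates, bilinear resampling) are illustrative assumptions, not the exact design described in the thesis.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InformationExchange(nn.Module):
    """Sketch: bi-directional exchange between spatial and semantic features,
    gated by adaptive spatial attention (hypothetical layout)."""
    def __init__(self, spatial_ch, semantic_ch):
        super().__init__()
        # project each branch into the other's channel dimension
        self.sem_to_spa = nn.Conv2d(semantic_ch, spatial_ch, 1, bias=False)
        self.spa_to_sem = nn.Conv2d(spatial_ch, semantic_ch, 1, bias=False)
        # one-channel spatial attention gate per branch
        self.spa_gate = nn.Conv2d(spatial_ch, 1, 3, padding=1)
        self.sem_gate = nn.Conv2d(semantic_ch, 1, 3, padding=1)

    def forward(self, spatial, semantic):
        # downward path: semantic context modulates the spatial features
        sem_up = F.interpolate(self.sem_to_spa(semantic),
                               size=spatial.shape[2:], mode='bilinear',
                               align_corners=False)
        spatial_out = spatial + sem_up * torch.sigmoid(self.spa_gate(spatial))
        # upward path: spatial detail modulates the semantic features
        spa_down = F.adaptive_avg_pool2d(self.spa_to_sem(spatial),
                                         semantic.shape[2:])
        semantic_out = semantic + spa_down * torch.sigmoid(self.sem_gate(semantic))
        return spatial_out, semantic_out
```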
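The DRP can be read as an ASPP-style context module whose dilated branches are chained with residual connections, so each branch refines the previous one and learns complementary context. The sketch below follows that reading; the dilation rates and the depthwise-separable layout (which keeps the parameter count small) are assumptions for illustration.

```python
import torch
import torch.nn as nn

class DepthResidualPyramid(nn.Module):
    """Sketch: dilated branches with residual links between them,
    inspired by ASPP (rates and layout are assumptions)."""
    def __init__(self, channels, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                # depthwise dilated conv keeps parameters low
                nn.Conv2d(channels, channels, 3, padding=d, dilation=d,
                          groups=channels, bias=False),
                nn.Conv2d(channels, channels, 1, bias=False),  # pointwise mix
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        ])
        self.project = nn.Conv2d(channels * len(dilations), channels, 1)

    def forward(self, x):
        outs, prev = [], x
        for branch in self.branches:
            # residual link: each branch builds on the previous branch's output
            prev = branch(prev) + prev
            outs.append(prev)
        return self.project(torch.cat(outs, dim=1))
```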
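Finally, the series of BAMs can be illustrated by a fusion step that combines a coarse and a fine prediction with a learnable contribution weight plus a predicted offset, applied progressively over the three scales. This is one plausible reading of points (a) through (c) in the abstract, not the exact BAM design; the stage names in the usage comment are hypothetical, and in training each stage would additionally receive segmentation and boundary supervision.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BoundaryAwareFusion(nn.Module):
    """Sketch: fuse coarse and fine class maps with a learnable weight
    and a predicted inter-scale offset (hypothetical layout)."""
    def __init__(self, num_classes):
        super().__init__()
        # (b) learnable scalar controlling the coarse prediction's contribution
        self.alpha = nn.Parameter(torch.tensor(0.5))
        # (a) offset head: correction predicted from the stacked predictions
        self.offset = nn.Conv2d(2 * num_classes, num_classes, 3, padding=1)

    def forward(self, coarse, fine):
        coarse = F.interpolate(coarse, size=fine.shape[2:],
                               mode='bilinear', align_corners=False)
        fused = self.alpha * coarse + (1 - self.alpha) * fine
        return fused + self.offset(torch.cat([coarse, fine], dim=1))

# (c) progressive fusion over three scales; p8, p4, p2 are class maps at
# strides 8, 4 and 2 (hypothetical names), e.g. for Cityscapes' 19 classes:
# fuse1, fuse2 = BoundaryAwareFusion(19), BoundaryAwareFusion(19)
# final = fuse2(fuse1(p8, p4), p2)
```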