Real-time Semantic Segmentation Based On Deep Learning

Posted on: 2022-03-16
Degree: Master
Type: Thesis
Country: China
Candidate: H L Chen
GTID: 2558307169479444
Subject: Information and Communication Engineering

Abstract/Summary:
Image semantic segmentation is an important task in computer vision. It predicts a semantic label for each pixel based on the texture and color information of the image. Two factors have a significant impact on segmentation performance: abstract semantic information and spatial structure information. In recent years, the technique has been widely used in fields such as autonomous driving, robotics, and augmented reality, and these applications place higher real-time requirements on semantic segmentation. The central difficulty of real-time semantic segmentation is obtaining high-resolution feature maps with strong semantic representation capability under a limited computing budget, which mainly relies on contextual features with a large receptive field, cross-level fusion features, and multi-scale features. Therefore, building on fully convolutional networks, this thesis studies efficient global context awareness schemes, cross-level feature fusion strategies, and multi-scale information capture structures. The specific research contents are as follows.

For global context perception, this thesis proposes a general class-guided asymmetric non-local block, which helps each position in the feature layer adaptively obtain global context information and thereby enhances the category discrimination ability of the feature layer. The core idea of the module is to compute an asymmetric attention map from the coarse semantic segmentation results and to use that attention map as weights for aggregating global features at each position in the feature map. By integrating this module, the thesis designs a class-guided asymmetric non-local network for real-time semantic segmentation, which greatly improves the segmentation accuracy of the network.

For cross-level feature fusion, cross-level features are spatially misaligned, so a conventional fusion module without feature alignment cannot fuse them effectively. This thesis proposes a feature alignment fusion module based on deformable convolution. It effectively resolves the misalignment between cross-level features, so that high-level abstract semantic information and low-level spatial structure information can be integrated more accurately. A real-time semantic segmentation network built with this module reaches a leading level in the field in both segmentation accuracy and inference speed.

For multi-scale information capture, this thesis analyzes the shortcomings of existing cross-scale feature perception modules and, drawing on the data-adaptive nature of the self-attention mechanism, proposes a cross-scale contextual attention module. Experiments show that this module is more sensitive to multi-scale information. A real-time semantic segmentation network built with this module reaches a high level in both accuracy and inference speed.

In summary, building on fully convolutional networks, this thesis examines the factors that affect the performance of real-time semantic segmentation algorithms from three perspectives: global perception schemes, cross-level feature fusion strategies, and multi-scale feature capture. The proposed real-time semantic segmentation methods are evaluated on the Cityscapes and CamVid datasets. The experimental results show that they achieve leading segmentation accuracy while maintaining high inference speed.
Keywords: Fully Convolutional Network, Real-time Semantic Segmentation, Deformable Convolution, Self-attention, Multi-scale
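
The abstract describes the class-guided asymmetric non-local block only at a high level: an attention map is derived from coarse segmentation results and used to weight the aggregation of global features at each position. The NumPy sketch below shows one plausible reading of that idea and is not the thesis's actual implementation. The function name `class_guided_context`, the use of per-class feature centers, the scaled dot-product scoring, and the residual fusion are all assumptions made for illustration. The attention map here has shape N x K (pixels by classes) rather than N x N, which is one way such a block could be "asymmetric" and cheap.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def class_guided_context(features, coarse_logits):
    """Hypothetical class-guided attention sketch (not the thesis's code).

    features:      (C, H, W) feature map
    coarse_logits: (K, H, W) coarse per-class segmentation scores
    Returns the fused feature map (C, H, W) and the attention map (N, K),
    where N = H * W.
    """
    c, h, w = features.shape
    k = coarse_logits.shape[0]
    feats = features.reshape(c, h * w)            # (C, N)
    logits = coarse_logits.reshape(k, h * w)      # (K, N)

    # Assumption: summarize each class by a probability-weighted mean
    # of the pixel features (one "center" vector per class).
    spatial_weights = softmax(logits, axis=1)     # (K, N), rows sum to 1
    class_centers = spatial_weights @ feats.T     # (K, C)

    # Asymmetric attention: every pixel attends to K class centers
    # instead of all N pixels, so the map is N x K rather than N x N.
    scores = feats.T @ class_centers.T / np.sqrt(c)   # (N, K)
    attention = softmax(scores, axis=1)               # (N, K), rows sum to 1

    # Aggregate global context per pixel and fuse it residually (assumed).
    context = attention @ class_centers           # (N, C)
    fused = features + context.T.reshape(c, h, w)
    return fused, attention


# Usage with random inputs: 8 channels, 3 classes, a 4 x 6 feature map.
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 4, 6))
coarse_logits = rng.standard_normal((3, 4, 6))
fused, attention = class_guided_context(features, coarse_logits)
```

With K classes much smaller than the number of pixels N, the attention costs O(N * K * C) instead of the O(N * N * C) of a full non-local block, which is consistent with the abstract's emphasis on limited computing budgets.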