Research On Street Scene Semantic Segmentation Based On Lightweight Neural Networ

Posted on: 2024-04-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Cao
Full Text: PDF
GTID: 2568307130458844
Subject: Electronic information
Abstract/Summary:
Image semantic segmentation is one of the key technologies in computer vision and is widely used in autonomous driving, traffic monitoring, and security. In recent years, lightweight street-scene semantic segmentation algorithms have become usable in more application scenarios by reducing computational complexity and hardware requirements. However, current lightweight street-scene segmentation models do not capture enough low-level detail information under image blur, noise interference, lighting changes, and varying shooting angles, which reduces segmentation accuracy. In this thesis, we improve Segformer, a lightweight Transformer-based neural network, to strengthen the model’s understanding of complex scenes and further increase its inference speed, making it easier to use on mobile devices. The details of the research are as follows:

(1) Segformer’s hierarchical structure produces feature maps at different resolutions, and interpolating the positional encoding across them degrades accuracy. In addition, without fused channel information, the model pays too little attention to spatial information and its computational complexity increases. To address these problems, we improve the feature extraction network: a global pooling layer integrates global spatial information to improve the robustness of the model, and 1×1 convolution integrates channel information to reduce the input channel dimension and the number of parameters, thus reducing computational complexity and enhancing the nonlinear representation of the model. Comparative experiments on the Cityscapes and ADE20K datasets demonstrate that the improved model generalizes better and achieves more accurate segmentation results.

(2) The Segformer network has difficulty fully acquiring effective feature information during feature extraction. To address this, we introduce a hybrid attention mechanism that acquires key feature information along two feature dimensions, channel and space, and uses it to update the bottom-level feature map. Fusing the updated feature map with the top-level feature map enhances the feature extraction capability and improves the model’s ability to acquire low-level feature information. Comparative experiments on the Cityscapes and ADE20K datasets demonstrate that the improved model understands image semantics better and thus further improves segmentation accuracy.

(3) In complex scenes, scale changes between objects and mutual occlusion make detailed feature information difficult to extract, and the connections between pieces of detail information are insufficient. To solve these problems, communication between windows is enhanced by shifted-window self-attention, which can realize relative position encoding; at the same time, a mask mechanism reduces the number of windows and thereby the computational complexity of the model. Comparative experiments on the Cityscapes and ADE20K datasets show that the improved model better balances segmentation accuracy and inference speed.
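The window shifting behind item (3) can be illustrated with a minimal sketch. It assumes a Swin-Transformer-style cyclic shift by half the window size; the abstract does not give the thesis's exact formulation, and the function names `cyclic_shift` and `window_partition` are hypothetical. The sketch covers only the partitioning step, not the attention computation or the mask.

```python
def cyclic_shift(grid, shift):
    """Roll a 2D grid up and left by `shift` cells, wrapping around the edges."""
    h, w = len(grid), len(grid[0])
    return [[grid[(r + shift) % h][(c + shift) % w] for c in range(w)]
            for r in range(h)]


def window_partition(grid, ws):
    """Split a 2D grid into non-overlapping ws x ws windows (row-major order).

    Assumes the grid height and width are multiples of ws.
    """
    h, w = len(grid), len(grid[0])
    windows = []
    for r0 in range(0, h, ws):
        for c0 in range(0, w, ws):
            windows.append([grid[r][c]
                            for r in range(r0, r0 + ws)
                            for c in range(c0, c0 + ws)])
    return windows


# Toy 8x8 "feature map" whose cells hold their own flat index (row * 8 + col).
grid = [[r * 8 + c for c in range(8)] for r in range(8)]

regular = window_partition(grid, 4)                   # plain windows
shifted = window_partition(cyclic_shift(grid, 2), 4)  # shifted by ws // 2

# Cells 27 (row 3, col 3) and 36 (row 4, col 4) are diagonal neighbours that
# fall into different regular windows but into the same shifted window.
print(27 in regular[0], 36 in regular[0])
print(27 in shifted[0], 36 in shifted[0])
```

Alternating the two partitions lets positions on either side of a window border attend to each other in the next layer, while the number of windows per layer stays the same. Because the wrap-around brings cells from opposite image edges into one window, implementations of this scheme typically mask attention between such cells.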
Keywords/Search Tags: Convolutional Neural Network, Semantic Segmentation, Lightweight Neural Networks, Transformer, Shifted Window, Attention Mechanism