
Research On Semantic Segmentation Method Of Traffic Scene Image Based On Deep Learning

Posted on: 2022-04-07
Degree: Master
Type: Thesis
Country: China
Candidate: M H Xuan
GTID: 2492306521994939
Subject: Computer technology
Abstract/Summary:
Image semantic segmentation is an important research direction in computer vision, with wide application in areas such as autonomous driving, drones, and medical treatment. Traditional image segmentation can only partition an image according to low-level features such as color, texture, and shape, so the accuracy it can reach is limited and does not meet practical needs. Image semantic segmentation, by contrast, builds on image classification but shifts the unit of classification from the whole image to the individual pixel. In recent years, deep-learning-based semantic segmentation of traffic scene images has become a research hotspot and has attracted growing attention from scholars. Early deep-learning approaches improved segmentation mainly by increasing the depth or width of the convolutional neural network. Later work replaced this layer stacking with transfer learning, attention mechanisms, conditional random fields, feature fusion, and similar techniques, which improve accuracy while keeping the number of network layers under control. At the same time, to reduce the number of parameters and the amount of computation, many existing models replace standard convolutions with depthwise separable convolutions, group convolutions, and the like. Inspired by this body of work at home and abroad, we study deep-learning-based semantic segmentation of traffic scene images with the aim of improving segmentation accuracy and reducing model complexity. The main research work of this thesis covers the following two aspects.

(1) While extracting features from traffic scene images, a semantic segmentation model loses spatial position information because of repeated downsampling, which lowers segmentation accuracy. To address this, an image semantic segmentation model that fuses multi-level features with spatial and channel attention is proposed. First, a channel attention module is introduced on the semantic information path, which carries high-level features. Starting from the image features extracted by the pre-trained ResNet101 model, the interdependence between channels is modeled explicitly to determine which content of the feature map in each layer should be focused on. Second, a spatial attention module is introduced on the spatial information path, which carries low-level features. A spatial attention matrix is extracted from the preserved spatial position information and applied to the corresponding feature map of the semantic information path to determine which locations should be focused on. Finally, the model is compared with nine existing methods on the CamVid dataset and with seven existing methods on the PASCAL-Context dataset; the results show that the proposed method performs well.

(2) To address the high complexity of existing models, a lightweight image semantic segmentation model based on factorized atrous depthwise convolution is proposed. First, the semantic features of the traffic scene are extracted with an atrous spatial pyramid structure that combines different receptive fields, and the atrous convolution is factorized within the depthwise convolution to reduce the number of parameters and the amount of computation. Second, the feature maps obtained at different stages are fused and upsampled with sub-pixel convolution, which inserts the pixels of the extracted low-resolution feature maps into the final high-resolution output feature map to improve segmentation accuracy. Finally, the model is evaluated on the CamVid dataset and compared with nine existing methods; the results show that the proposed method has relatively few parameters and a relatively low computational cost while still improving segmentation accuracy.
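
The parameter savings behind the lightweight model in (2) can be illustrated with a short calculation. The sketch below is only an illustration, not the thesis's actual architecture: the function names, the channel counts (256 in, 256 out) and the 3 x 3 kernel are assumptions chosen for the example, and biases are ignored. It compares a standard convolution, a depthwise separable convolution, and a depthwise convolution factorized into k x 1 and 1 x k kernels. It also shows that the dilation rate of an atrous convolution enlarges the receptive field without adding weights.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out


def factorized_depthwise_params(c_in, c_out, k):
    """k x 1 then 1 x k depthwise convolutions, followed by a 1 x 1 pointwise convolution."""
    return 2 * k * c_in + c_in * c_out


def effective_kernel_size(k, dilation):
    """Spatial extent covered by a k-tap kernel with the given dilation rate.

    Dilation spreads the taps apart; the number of weights stays the same.
    """
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    # Illustrative sizes only; they are not taken from the thesis.
    c_in, c_out, k = 256, 256, 3
    print("standard:            ", standard_conv_params(c_in, c_out, k))
    print("depthwise separable: ", depthwise_separable_params(c_in, c_out, k))
    print("factorized depthwise:", factorized_depthwise_params(c_in, c_out, k))
    for rate in (1, 2, 4):
        print("dilation", rate, "-> covers", effective_kernel_size(k, rate), "pixels per axis")
```

With these assumed sizes the pointwise 1 x 1 convolution dominates the count in both lightweight variants, so the extra saving from the 1D factorization is small at k = 3 and grows with larger kernels.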
Keywords/Search Tags: Image semantic segmentation, Spatial attention, Channel attention, Atrous spatial pyramid structure, Depthwise separable convolution, 1D-factorized convolution