| With the rapid development of computer vision and autonomous driving technology, more and more internet companies and car manufacturers are deploying autonomous driving systems. Autonomous vehicles use sensors to collect information about the road environment and analyze it comprehensively to support path planning and decision control, so that driving direction and speed can be planned reasonably and traffic accidents reduced. This paper studies road detection and semantic image segmentation, and proposes two improved road segmentation algorithms. The main contents are as follows. Current semantic segmentation models use different structures, such as image pyramids, encoder-decoder structures, and dilated convolution. We select FCN, U-Net, RefineNet, and DeepLabV3+, which are the best-performing segmentation models built on these structures, as baselines for comparison with the improved algorithms. Experiments on the BDD100K dataset show that these models suffer from inaccurate prediction of edges and of target and background regions. We improve the U-Net and DeepLabV3+ networks to address these shortcomings. First, an algorithm for improving DeepLabV3+ is proposed. To address the information loss in the feature extraction stage of the semantic segmentation model, we improve the model's feature fusion. DeepLabV3+ fuses features from parallel dilated convolutions with four different dilation rates, but a larger dilation rate samples pixels more sparsely than standard convolution and therefore loses more detail. This paper therefore proposes a cascaded pyramid pooling method. Compared with the parallel pyramid, the cascaded pyramid uses more pixels in the feature computation and loses less information. To make the network easier to tune and faster to learn and converge, an IBN-Net normalization network is proposed. This network combines batch normalization and instance normalization, and therefore has the advantages of both: it learns features that are invariant to appearance changes such as color and space, while retaining content information. Tests on the dataset show that the network effectively improves segmentation of the drivable road area. Second, to improve the U-Net algorithm, we propose an attention mechanism. The attention mechanism uses a series of attention coefficients to emphasize important information about the target object and to suppress irrelevant details. By selecting where to focus, it produces more discriminative features. For the upsampling stage of semantic segmentation, a max-pooling index structure is proposed. During upsampling, each feature-map value is placed in the upsampled output at the position of the maximum recorded during encoding, and all other positions are filled with zeros. This accurately retains the position information of the original image and avoids information loss during upsampling. Experiments show that the improved network achieves better results in identifying details such as road edges. |
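The claim that a cascaded pyramid uses more pixels than a parallel one can be illustrated with a small counting sketch. It is a simplified 1-D illustration and not the module proposed in this paper: the 3-tap kernel, the rates `[1, 2, 4]`, and the helper names `taps`, `parallel_coverage`, and `cascaded_coverage` are all assumptions made for the example. It counts which input positions can influence a single output position when dilated convolutions are applied side by side versus stacked one after another.

```python
def taps(rate, kernel=3):
    """Input offsets read by one 1-D dilated convolution centred on offset 0."""
    half = kernel // 2
    return {rate * t for t in range(-half, half + 1)}


def parallel_coverage(rates):
    """Offsets read when every branch is applied to the same input (parallel pyramid)."""
    covered = set()
    for rate in rates:
        covered |= taps(rate)
    return covered


def cascaded_coverage(rates):
    """Offsets that reach the output when the layers are stacked (cascaded pyramid)."""
    covered = {0}
    for rate in rates:
        # Each position already covered is itself computed from the taps of the next layer.
        covered = {p + t for p in covered for t in taps(rate)}
    return covered


rates = [1, 2, 4]  # hypothetical dilation rates, chosen only for illustration
parallel = parallel_coverage(rates)
cascaded = cascaded_coverage(rates)

print(sorted(parallel))  # sparse: gaps appear between the sampled offsets
print(sorted(cascaded))  # dense: every offset in the receptive field is used
```

Under these assumptions the parallel arrangement reads 7 of the 9 offsets in its span, while the cascade reads all 15 offsets in its span, which matches the qualitative argument in the abstract. Real 2-D networks with other rates will give different counts.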
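The max-pooling index upsampling described in the abstract can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the implementation used in the paper: it works on a single 2-D feature map stored as nested lists, uses a 2x2 window with stride 2, and the function names `max_pool_with_indices` and `max_unpool` are hypothetical.

```python
def max_pool_with_indices(x, k=2):
    """k x k max pooling with stride k; also returns the (row, col) of each maximum."""
    h, w = len(x), len(x[0])
    pooled, indices = [], []
    for i in range(0, h - k + 1, k):
        pooled_row, index_row = [], []
        for j in range(0, w - k + 1, k):
            # Window coordinates in row-major order; max() keeps the first maximum on ties.
            window = [(r, c) for r in range(i, i + k) for c in range(j, j + k)]
            r, c = max(window, key=lambda rc: x[rc[0]][rc[1]])
            pooled_row.append(x[r][c])
            index_row.append((r, c))
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def max_unpool(pooled, indices, out_shape):
    """Place each pooled value back at its saved position; all other positions stay zero."""
    h, w = out_shape
    out = [[0] * w for _ in range(h)]
    for pooled_row, index_row in zip(pooled, indices):
        for value, (r, c) in zip(pooled_row, index_row):
            out[r][c] = value
    return out


feature = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [6, 2, 3, 1],
]
pooled, indices = max_pool_with_indices(feature)   # encoder side: save the indices
restored = max_unpool(pooled, indices, (4, 4))     # decoder side: reuse them

print(pooled)    # [[4, 5], [6, 7]]
for row in restored:
    print(row)
```

In the restored map each maximum returns to the exact location it came from, so position information survives the pooling and upsampling round trip, while the non-maximum values are replaced by zeros.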