Autonomous driving is a current research hotspot, and scene perception and understanding are core components of autonomous driving systems. Research on complex traffic scene understanding can effectively improve the decision-making ability and intelligence level of these systems and help guarantee driving safety. Complex traffic scene understanding for urban roads mainly includes traffic target detection, traffic scene semantic segmentation, and scene understanding. To address the poor detection of small targets in urban road scenes, the difficulty of expressing target features under varying illumination and weather, and the low efficiency of complex traffic scene understanding, this paper takes in-vehicle vision in urban road environments as its starting point and focuses on methods for traffic target detection, road scene segmentation, and autonomous driving scene understanding, taking full account of algorithm accuracy, speed, and computational resources. The specific research contents are as follows:

(1) A traffic target detection method suitable for urban road scenes is proposed. To address the large differences in traffic target size, the many small targets, and the occlusion encountered in in-vehicle vision, a multi-scale features reassembly YOLO (MFR-YOLO) road traffic target detection method is proposed. Firstly, a small target detection layer is introduced into the feature fusion and detection network to improve the detailed feature representation of small traffic targets. Secondly, a feature reassembly upsampling module is used in the feature fusion network to improve the feature representation of traffic targets of different sizes. Then, an efficient intersection-over-union loss function is introduced to reduce the impact of size differences between traffic targets. Finally, experiments are carried out on the BDD100K dataset. The experimental results show that, compared with the baseline algorithm, the detection accuracy of the proposed method is improved by 3.7% and the detection speed reaches 172.41 frames/s.

(2) A road scene segmentation method for urban roads is proposed. To address the irregular obstacles in complex road scenes, the large variation in environmental factors such as illumination and weather, and the resulting difficulty of scene segmentation, a Mixing middle domain and Fourier domain adapted Transformer (MFFormer) urban road scene segmentation method is proposed. Firstly, the semi-supervised semantic segmentation network structure of DAFormer is introduced to reduce the model's dependence on labeled data. Secondly, treating scene and weather factors separately, an intermediate domain is added between the source image and the target image to optimize the network loss function and improve the model's adaptability to different scenes. Then, exploiting the fact that amplitude and phase information can be exchanged in Fourier domain adaptation, the content of the target image is retained while its style is changed, which enhances the adaptability of the image to the scene. Finally, experiments are carried out on the Cityscapes and ACDC datasets. The experimental results show that, compared with the baseline algorithm, the proposed method improves by 3.92%, 2.01%, 3.99%, and 4.71% in night, fog, rain, and snow scenes, respectively.

(3) A multi-task scene understanding method for urban roads is proposed. To address the low efficiency and high computational cost of autonomous driving scene understanding, a MultiTask Learning YOLO (MT-YOLO) scene understanding method is proposed. Firstly, the feature extraction network and feature fusion network of the target detector are used as the encoder of the scene understanding algorithm, so that multiple tasks share the same encoder, which improves the model's efficiency. Then, according to the characteristics of the semantic segmentation network, two segmentation branches are constructed and the internal details of the network connections are optimized. Finally, single-task and multi-task experiments are carried out on the Cityscapes dataset. The experimental results show that the multi-task scene understanding algorithm performs traffic target detection and road scene segmentation simultaneously, achieving a detection accuracy of 54.4%, a segmentation score of 71.5%, and a speed of 45.85 frames/s with 7.82 M parameters.
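
The Fourier-domain step described in (2) can be illustrated with a minimal sketch. It assumes the low-frequency amplitude swap commonly used in Fourier domain adaptation: the phase of one image (its content) is kept, and the low-frequency amplitude of another image (its style) is substituted. The function name `fourier_style_transfer` and the band-size parameter `beta` are hypothetical choices made for illustration; the procedure actually used in MFFormer may differ.

```python
import numpy as np


def fourier_style_transfer(content_img, style_img, beta=0.05):
    """Keep the phase of content_img; take low-frequency amplitude from style_img.

    Both inputs are float arrays of shape (H, W, C) with identical shapes.
    beta (hypothetical parameter) sets the half-width of the swapped
    low-frequency square as a fraction of the shorter image side.
    """
    if content_img.shape != style_img.shape:
        raise ValueError("images must have the same shape")

    # 2-D FFT over the spatial axes, one transform per channel.
    fft_content = np.fft.fft2(content_img, axes=(0, 1))
    fft_style = np.fft.fft2(style_img, axes=(0, 1))

    # Split into amplitude (style-like) and phase (content-like) parts.
    amp_content = np.abs(fft_content)
    phase_content = np.angle(fft_content)
    amp_style = np.abs(fft_style)

    # Shift so the zero frequency sits at the centre of the spectrum.
    amp_content = np.fft.fftshift(amp_content, axes=(0, 1))
    amp_style = np.fft.fftshift(amp_style, axes=(0, 1))

    # Replace a centred low-frequency square of the content amplitude.
    h, w = content_img.shape[:2]
    b = int(np.floor(min(h, w) * beta))
    ch, cw = h // 2, w // 2
    rows = slice(ch - b, ch + b + 1)
    cols = slice(cw - b, cw + b + 1)
    amp_content[rows, cols] = amp_style[rows, cols]

    # Undo the shift and recombine with the untouched content phase.
    amp_mixed = np.fft.ifftshift(amp_content, axes=(0, 1))
    mixed = amp_mixed * np.exp(1j * phase_content)
    return np.real(np.fft.ifft2(mixed, axes=(0, 1)))
```

With `beta=0` only the zero-frequency term is exchanged, so the output keeps the spatial structure of `content_img` while taking on the per-channel mean brightness of `style_img`; larger values of `beta` transfer progressively more of the global appearance.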