| Research on AI has led to the development of intelligent vehicles. In recent years, the intelligent vehicle market and its penetration rate have expanded quickly. Intelligent vehicles can effectively reduce traffic congestion, energy consumption, and traffic accidents, which is of great significance for coping with the increasingly complicated traffic environment in our country. Perception is one of the three key components of intelligent driving (perception, planning, and control), so the importance of perception algorithms is self-evident. Of the two main sensor solutions, lidar offers high precision and long-range detection, but its cost cannot be neglected, so it is not yet suitable for mass production. In contrast, camera-based methods offer high recognition efficiency and a large market scale, and have been widely used in perception systems. Perception algorithms based on monocular vision take pedestrians, vehicles, traffic signs, and similar objects as recognition targets, and localize and segment them through feature extraction and classification. However, current algorithms perform poorly in adverse environments, lack robustness, rely on small datasets, and struggle to achieve real-time performance. To address these problems, the research contents of this paper are as follows: (1) Because existing object detection algorithms perform poorly under complex conditions, the feature mechanism is studied and an attention mechanism optimized over the channel and spatial dimensions is proposed; an adaptive fusion algorithm is also designed, which achieves better feature fusion through parameters learned by the network itself. Experiments on the BDD100K dataset show that the proposed algorithm effectively improves model performance while maintaining real-time performance. (2) To address the scarcity of existing datasets, a data augmentation method based on random image stitching and random instance clipping is designed, and a semi-supervised pseudo-label generation method based on the KITTI dataset is proposed.
On this basis, a multi-scale pooling module and a multi-dilation-rate module are proposed for instance segmentation, and they improve its performance. (3) To meet the requirements of real-time deployment on embedded processors, lightweight CNN models are designed. Through feature reuse, spatial and channel separation, and grouped computation at the channel level and the stage level, these models significantly reduce the number of network parameters and the amount of computation. Experiments on the NVIDIA Jetson TX2 and Jetson Xavier show that the designed models can run on embedded processors in real time while maintaining accuracy. |
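
The parameter savings behind point (3) can be illustrated with simple arithmetic. The sketch below is not the thesis's network: it assumes that "spatial and channel separation" corresponds to a depthwise separable convolution and that "grouping calculation" corresponds to a grouped convolution. The layer sizes (128 channels, 3 x 3 kernel, 4 groups) and the function names are hypothetical values chosen only for illustration.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution with the given groups (bias omitted)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    return (c_in // groups) * c_out * k * k


def separable_params(c_in, c_out, k):
    """Depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = conv_params(c_in, c_in, k, groups=c_in)  # spatial filtering per channel
    pointwise = conv_params(c_in, c_out, 1)              # channel mixing
    return depthwise + pointwise


# Hypothetical layer: 128 input channels, 128 output channels, 3 x 3 kernel.
standard = conv_params(128, 128, 3)
grouped = conv_params(128, 128, 3, groups=4)
separable = separable_params(128, 128, 3)

print(f"standard:  {standard}")
print(f"grouped:   {grouped} ({standard / grouped:.1f}x fewer)")
print(f"separable: {separable} ({standard / separable:.1f}x fewer)")
```

Under these assumptions, the grouped layer needs a quarter of the weights of the standard layer, and the separable layer needs roughly an eighth. The reductions in the thesis itself depend on its own architecture, which the abstract does not specify.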