| With the rapid advance of science and computing hardware, fields that depend heavily on human labor are turning to technologies that require little or no human involvement. Autonomous unmanned systems, which make automatic and intelligent decisions in place of humans, are attracting growing attention from various research communities because of their novel design and promising applications. The driverless car, at the forefront of this trend, is one of their most important applications: it collects information about the external environment through its sensors and feeds it back to the autonomous system, which controls the driving of the vehicle. The premise of autonomous decision-making is to obtain complete information about the external environment and adjust actions in real time accordingly. A method with strong information acquisition capabilities therefore plays a vital role in autonomous unmanned systems. Because cameras are inexpensive and capture abundant information, computer vision has become an important means of obtaining external information. Computer vision includes research in image classification, object recognition, and instance segmentation. In the driverless car scenario, the visual system must complete detection within a time limit, so fast object detection methods are generally used, such as the classic one-stage detectors YOLO and SSD. Compared with two-stage detection methods, one-stage methods are faster, but their detection accuracy and recall are lower. To improve the detection performance of one-stage methods, this paper proposes YOLO-CBAM, an improved model based on YOLOv3. Its main contributions are as follows: (1) We improve the residual block in the Darknet-53 network by adding a spatial attention module and a channel attention module after the residual block's feature extraction. The attention modules generate two weight maps that rescale the convolutional features so that features related to the target object receive larger weights, which improves the expressive power of the convolutional features. (2) The mean squared error bounding-box regression loss used in YOLOv3 does not represent well how closely a predicted bounding box fits the target. We introduce three geometric factors as penalty terms for bounding-box regression: the IoU (overlap area), the distance between the center points of the two boxes, and the aspect ratios of the two boxes. With these terms, network training converges faster and the regressed bounding boxes fit the targets more closely. On the test set, the YOLO-CBAM model achieves an mAP of 87.9% in object detection, an increase of 5.9 percentage points over the 82.0% of the YOLOv3 model. In the training data set, traffic signs come in many varieties but make up a relatively small proportion of the data, which has a considerable impact on the performance of the object detection network. This paper therefore proposes using a dedicated traffic sign recognition network as a post-processing step after object detection to improve detection performance and the accuracy of traffic sign recognition. By analyzing the performance of various lightweight networks and stacked convolutional networks, we give recommendations on network selection. |
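
The abstract does not name the bounding-box loss, but the three geometric factors it lists (overlap area, center-point distance, and aspect ratio) match what is commonly called the Complete IoU (CIoU) loss. The sketch below assumes that standard formulation and may differ from the thesis's exact implementation. The function name `ciou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are illustrative choices, not taken from the source.

```python
import math


def ciou_loss(box, gt, eps=1e-9):
    """CIoU-style loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumed formulation: 1 - IoU + center_dist^2 / diag^2 + alpha * v.
    """
    bx1, by1, bx2, by2 = box
    gx1, gy1, gx2, gy2 = gt
    bw, bh = bx2 - bx1, by2 - by1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Factor 1: overlap area, measured by IoU.
    iw = max(0.0, min(bx2, gx2) - max(bx1, gx1))
    ih = max(0.0, min(by2, gy2) - max(by1, gy1))
    inter = iw * ih
    union = bw * bh + gw * gh - inter
    iou = inter / (union + eps)

    # Factor 2: squared distance between box centers, normalized by the
    # squared diagonal of the smallest box enclosing both boxes.
    dx = (bx1 + bx2) / 2 - (gx1 + gx2) / 2
    dy = (by1 + by2) / 2 - (gy1 + gy2) / 2
    center_dist2 = dx * dx + dy * dy
    cw = max(bx2, gx2) - min(bx1, gx1)
    ch = max(by2, gy2) - min(by1, gy1)
    diag2 = cw * cw + ch * ch + eps

    # Factor 3: aspect-ratio consistency between the two boxes.
    v = (4 / math.pi ** 2) * (
        math.atan(gw / (gh + eps)) - math.atan(bw / (bh + eps))
    ) ** 2
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + center_dist2 / diag2 + alpha * v


if __name__ == "__main__":
    gt = (0.0, 0.0, 2.0, 2.0)
    print(ciou_loss((1.0, 0.0, 3.0, 2.0), gt))  # partially overlapping box
    print(ciou_loss((3.0, 0.0, 5.0, 2.0), gt))  # disjoint box, farther away
```

Unlike a plain IoU loss, the center-distance term keeps growing as a non-overlapping prediction moves away from the ground truth, so the loss still gives a useful training signal when the IoU is zero. This is one plausible reason for the faster convergence described above.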