In a road scene, the vehicle relies on perception sensors to obtain information about the external environment, and the decision-making module then uses this perception information to control the driving actions of the car. Autonomous driving depends on rich and critical traffic sign information from the outside world. Recognizing traffic signs themselves in natural scenes with complex backgrounds is a relatively mature technology, but recognizing the text on traffic signs, which carries richer information, still poses many challenges. To address the problem of detecting traffic sign text information in the external environment perception system of unmanned vehicles, this paper proposes a two-stage method to detect and recognize the text on traffic signs in autonomous driving scenarios.

In the traffic sign text detection stage, a detection module composed of a traffic sign detection model and an autonomous driving scene text detection model is used to detect traffic sign text. The traffic sign detection model uses a YOLO detector to locate traffic signs in the scene. The scene text detection model uses IDBNet, which combines self-attention feature association with a spatial receptive field enhancement structure to improve detection performance. Specifically, the feature extraction backbone of DBNet is replaced, and an SPP (Spatial Pyramid Pooling) structure, a self-attention (Contextual Transformer) structure, and a two-way attention structure are added, so that the proposed IDBNet strengthens the correlation between features along the pixel, channel, and receptive field dimensions. The traffic sign detection results and the scene text detection results are then subjected to region-of-interest screening to obtain the traffic sign text regions to be recognized. In the traffic sign
text recognition stage, a CRNN network combining a lightweight backbone with the CTC loss function is used to recognize the text in each candidate region, and the recognition results are post-processed to obtain the instructive semantic information in the scene. Experiments were conducted on multiple datasets, including MSRA-TD500, CTW1500, BDD100K, MLT-2017, CTST-1600, and MTWI-2018. The results show that the improved IDBNet reaches an F1 score of 87.9% on the MSRA-TD500 dataset, 3% higher than the original network, and 87.0% on the CTW1500 dataset, 2.6% higher than the original network. The end-to-end detection and recognition F1 score on the CTST-1600 dataset reaches 92.8%. This paper thus designs and implements a complete scheme for detecting and recognizing traffic sign text information in autonomous driving scenarios. Simulation tests of application scenarios show that the method can accurately detect and recognize traffic sign text, giving it high reference value for practical applications.
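The region-of-interest screening step described above, which intersects the sign detector's boxes with the scene text detector's boxes, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the axis-aligned box format, and the 0.7 containment threshold are all illustrative assumptions.

```python
def overlap_ratio(text_box, sign_box):
    """Fraction of the text box's area that lies inside the sign box.

    Boxes are (x1, y1, x2, y2) in pixel coordinates.
    """
    tx1, ty1, tx2, ty2 = text_box
    sx1, sy1, sx2, sy2 = sign_box
    # Intersection rectangle; width/height clamp to zero when disjoint.
    iw = max(0, min(tx2, sx2) - max(tx1, sx1))
    ih = max(0, min(ty2, sy2) - max(ty1, sy1))
    area = max(1e-9, (tx2 - tx1) * (ty2 - ty1))
    return (iw * ih) / area


def screen_text_regions(sign_boxes, text_boxes, thresh=0.7):
    """Keep only text boxes that lie mostly inside some detected sign."""
    return [t for t in text_boxes
            if any(overlap_ratio(t, s) >= thresh for s in sign_boxes)]
```

A text box fully contained in a sign box has an overlap ratio of 1.0 and is kept; scene text far from any sign is discarded before recognition.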
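In the recognition stage, a CTC-trained CRNN emits one class distribution per horizontal time step, and the final transcript is obtained by collapsing that sequence. A greedy CTC decode (argmax per step, merge repeats, drop blanks) can be sketched as below; this is a generic illustration of CTC decoding, not the paper's code, and the label indexing (blank at index 0) is an assumption.

```python
def ctc_greedy_decode(logits, charset, blank=0):
    """Greedy CTC decoding: argmax per time step, collapse repeats, drop blanks.

    logits: per-timestep score lists of shape (T, len(charset) + 1), where
    index `blank` is the CTC blank symbol and index i + 1 maps to charset[i].
    """
    best = [max(range(len(step)), key=step.__getitem__) for step in logits]
    out, prev = [], blank
    for idx in best:
        # Emit a character only on a non-blank label that differs
        # from the previous step (CTC collapse rule).
        if idx != blank and idx != prev:
            out.append(charset[idx - 1])
        prev = idx
    return "".join(out)
```

For example, with charset "stop" the per-step argmax sequence s, s, blank, t, o, o, blank, p collapses to the string "stop".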