As the core objects of the transportation industry, different types of vehicles affect traffic in different ways. Information such as the types, spatial distribution, and number of vehicles plays a critical role in traffic flow estimation and in real-time traffic monitoring, analysis, and control, and is of great significance for realizing intelligent transportation systems. Compared with vehicle detection in video from vehicle-mounted or fixed roadside cameras, vehicle detection in aerial video can obtain more global vehicle information more conveniently, thanks to the wide viewing angle unique to aerial photography. Therefore, to achieve high-precision, real-time detection of dense vehicles in aerial video, this paper studies and improves deep-learning-based target detection and proposes a lightweight aerial vehicle detection algorithm based on an improved SSD model. Based on this algorithm, a corresponding aerial vehicle detection system is designed and implemented. Finally, the effectiveness of the algorithm and system is verified through experiments on the laboratory data sets. The main work and innovations of this paper are as follows:

Aiming at the characteristics of vehicles in aerial video (small target size, large number and high density, uncertain orientation, and varying resolution), a lightweight aerial vehicle detection algorithm based on an improved SSD is proposed. First, from the perspective of the activation function in the model, Swish is improved and the RH-Swish series of activation functions is proposed. By introducing trainable parameters and random parameters that need not be shared across channels, these functions reduce computational cost and make the activation more flexible, while also strengthening the expressiveness and generalization ability of the model. Then the original VGG16 is replaced by an improved CNN model, namely
RHMNet or one of the RHSNet series, as the backbone, which avoids the poor real-time performance and high resource usage of the original model. Next, k-means|| clustering of the aerial training set is used to help determine the scales and aspect ratios of the default boxes. At the same time, the position and default box configuration of each prediction layer are determined by several factors, such as the size of the feature map, the equal-interval scale setting, and the limit on the number of bounding boxes. This overcomes the defect of the original SSD model, in which the scales and aspect ratios of the default boxes in each prediction layer do not match the target sizes. After that, the form of the loss function and the total loss are improved to address the "three major imbalance problems": the imbalance between positive and negative samples, between hard and easy samples, and between categories. In addition, a dual-threshold NMS based on an improved penalty function is introduced in post-processing to reduce false alarms and missed detections. From the perspective of model training and optimization, extensive data augmentation is used, a better-performing optimization algorithm is introduced, and a model initialization method is designed for the new activation function proposed in this paper, in order to improve training efficiency and model accuracy. Moreover, multi-scale training helps determine the best input scale for the model and achieves a good compromise between improving the detection accuracy of dense small targets and ensuring real-time detection. Offline hard example mining further reduces missed detections of "extremely hard positive examples" and false alarms on "extremely hard negative examples".

From the perspectives of the attention mechanism and multi-scale context information enhancement, the algorithms and models proposed above are further enhanced: 1) In the
model enhancement part based on the attention mechanism, an improved channel attention unit (CAU) and spatial attention unit (SAU) are first proposed along the two dimensions of channel attention and spatial attention, and are applied to the building blocks and the overall structure of the backbone. This strengthens the feature extraction ability of the backbone and the expressive ability of the model while reducing model redundancy. Then, for different backbone structures, the two attention mechanisms are integrated through specific connection modes, achieving more refined feature enhancement in both the channel and spatial dimensions. 2) The model enhancement part based on multi-scale context information is developed at two levels: the receptive field module and feature fusion. First, the original RFB is improved, and the lightweight receptive field module series L-RFB is proposed and used for feature extraction and bounding box generation in the detection model. On the one hand, it introduces multi-scale context information into the model; on the other hand, it enlarges the receptive fields of the prediction layers. Then, building on existing feature fusion methods, this paper proposes deconvolution-based and Reshape-based feature fusion methods from the top-down perspective and a pooling-based feature fusion method from the bottom-up perspective, to address the lack of contextual information in shallow layers and the lack of detailed information such as location and edges in deep layers, respectively.

Finally, the aerial video vehicle detection algorithms proposed in the two stages of this paper are implemented on a deep learning framework. Test results on the laboratory data set show that, compared with the original SSD and several of its lightweight variants, the lightweight aerial vehicle detection algorithm and model based on the improved SSD proposed in this paper achieve improvements in both detection accuracy and
real-time performance. Furthermore, the model enhancement algorithm further improves detection accuracy substantially, and the mAP of the enhanced model RHSDet proposed in this paper surpasses that of several mainstream state-of-the-art non-lightweight target detection models, which demonstrates the effectiveness and superiority of the detection algorithm and the model enhancement algorithm in this paper.
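
As a rough illustration of the clustering step mentioned above (using ground-truth box sizes from the training set to guide default box scales and aspect ratios), the following is a minimal sketch and not the paper's implementation. It assumes plain Lloyd-style k-means with random seeding and Euclidean distance on (width, height) pairs; the k-means|| initialization named in the abstract is not reproduced, and the function names `kmeans_boxes` and `scale_and_ratio` are hypothetical.

```python
import random


def kmeans_boxes(boxes, k, iters=50, seed=0):
    """Cluster (width, height) pairs with plain Lloyd k-means.

    Assumption: random seeding and Euclidean distance are used for
    brevity; the k-means|| seeding from the abstract is not reproduced.
    """
    rng = random.Random(seed)
    centers = rng.sample(boxes, k)
    for _ in range(iters):
        # Assignment step: attach each box to its nearest center.
        groups = [[] for _ in range(k)]
        for w, h in boxes:
            j = min(
                range(k),
                key=lambda i: (w - centers[i][0]) ** 2 + (h - centers[i][1]) ** 2,
            )
            groups[j].append((w, h))
        # Update step: move each center to the mean of its group.
        new_centers = []
        for c, g in zip(centers, groups):
            if g:
                mean_w = sum(w for w, _ in g) / len(g)
                mean_h = sum(h for _, h in g) / len(g)
                new_centers.append((mean_w, mean_h))
            else:
                new_centers.append(c)  # keep the center of an empty cluster
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return sorted(centers)


def scale_and_ratio(center):
    """Convert a (w, h) cluster center into a scale and an aspect ratio."""
    w, h = center
    return (w * h) ** 0.5, w / h


if __name__ == "__main__":
    # Hypothetical box sizes in pixels: small square and wider vehicles.
    boxes = [(10, 10), (12, 12), (11, 11), (40, 20), (42, 21), (38, 19)]
    for center in kmeans_boxes(boxes, k=2):
        print(center, scale_and_ratio(center))
```

Each cluster center suggests one default box shape: the square root of its area gives a scale, and width divided by height gives an aspect ratio. How these values are then distributed across prediction layers depends on the feature map sizes and box-count limits described in the abstract, which this sketch does not model.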